David Simon, the creator of HBO's epic series The Wire, has weighed in on the recent disclosure that the National Security Agency has been combing through our cell phone records as part of its anti-terrorism efforts. It's a worthwhile read, particularly coming from the guy who wrote such compelling stories about police surveillance (presumably based on what he saw as a crime reporter for the Baltimore Sun). Basically, his take is that using broad swathes of cell phone data (numbers dialed, minutes used, locations, etc.) is not particularly invasive, is perfectly legal, and has been a regular tool of law enforcement since well before 9/11.
How might this be a useful law enforcement tool? To illustrate, I took the liberty of downloading my own cell usage data from the past month from Verizon. Below is a type of network graph called an "egonet" showing my cell phone conversations during the month of May. I'm at the center (the "ego"), and all the red dots (the "alters") are people to whom I've spoken. (No, the points aren't labeled.) Thicker lines indicate more frequent phone contacts.
You can see that most of the contacts are people I speak to only once or twice. The highlighted (more frequent) connections are my wife, my parents, my brother, a colleague, my kids' elementary school, and a guy who was doing some contract work at my house. Let's just assume that's a typical phone data pattern for a guy in my demographic profile who's not a terrorist. (You'll have to take my word for this.)
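For readers who want to see the mechanics, here is a minimal Python sketch of how an egonet like the one described above could be tallied from call records. The record format (simple caller/callee pairs), the function names, and the sample contacts are all made up for illustration; a real carrier export will look different.

```python
from collections import Counter

def build_egonet(ego, call_records):
    """Count calls between the ego and each alter, ignoring call direction."""
    weights = Counter()
    for caller, callee in call_records:
        if caller == ego and callee != ego:
            weights[callee] += 1
        elif callee == ego and caller != ego:
            weights[caller] += 1
    return weights

def frequent_contacts(weights, min_calls=3):
    """Return the alters who would get a thick line in the graph."""
    return sorted(alter for alter, n in weights.items() if n >= min_calls)

# Hypothetical month of calls as (caller, callee) pairs.
sample_calls = [
    ("me", "wife"), ("wife", "me"), ("me", "wife"),
    ("me", "school"),
    ("contractor", "me"), ("me", "contractor"), ("contractor", "me"),
    ("me", "pizza"),
]

egonet = build_egonet("me", sample_calls)
print(frequent_contacts(egonet))
```

The call counts play the role of line thickness: most alters show up once, and a handful clear the (arbitrary) threshold of three calls.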
Now, if you were able to download the phone usage data for all the nodes depicted above and graph them, you'd have a pretty complex network diagram. It would show some small, dense networks (families, groups of friends) and some loosely affiliated people who have their own connections. Now go one step further: download the phone usage data for every node in that larger network, and imagine the patterns it would show. Now imagine if you could do that for basically every cell phone subscriber in the country.
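That step-by-step expansion is essentially a breadth-first walk outward from one person. Here is a rough sketch in Python, assuming call records arrive as undirected caller/callee pairs; the function name `snowball` and the sample names are hypothetical.

```python
from collections import defaultdict, deque

def snowball(call_records, seed, hops):
    """Return {person: hop distance} for everyone within `hops` calls of `seed`."""
    adjacency = defaultdict(set)
    for a, b in call_records:
        adjacency[a].add(b)
        adjacency[b].add(a)

    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        person = queue.popleft()
        if distance[person] == hops:
            continue  # do not expand past the requested number of hops
        for contact in adjacency[person]:
            if contact not in distance:
                distance[contact] = distance[person] + 1
                queue.append(contact)
    return distance

# Hypothetical records: my contacts, then their contacts.
records = [
    ("me", "wife"), ("me", "school"),
    ("wife", "sister"), ("sister", "friend"),
]

one_hop = snowball(records, "me", 1)
two_hops = snowball(records, "me", 2)
```

With one hop you get my egonet; with two, you pull in my wife's sister; each extra hop widens the circle, which is why the data volume balloons so quickly.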
That's a huge amount of data, and depicting it graphically would pretty much be a waste of ink. Profiles like mine would quickly disappear into background noise. But computers can look for people who rise above the noise. Perhaps someone seems to belong to no local networks but just pops up to make a few phone calls that last less than a minute. Perhaps those calls occur within 24 hours of a bombing attack, or right after an al Qaeda speech is broadcast. Well, that's hardly proof of criminal activity, but it might be enough for investigators to seek a warrant for a wiretap or some other form of surveillance to learn more about the person making the calls.
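To make that kind of screen concrete, here is a toy Python sketch. It is only my guess at the general shape of such a filter, not anything any agency is known to run. It assumes records of the form (caller, callee, start time, duration in seconds), uses the local clustering coefficient as a stand-in for "belongs to no local networks," and the thresholds, names, and event time are invented.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import combinations

def build_adjacency(calls):
    """Undirected contact graph: who has ever been on a call with whom."""
    adjacency = defaultdict(set)
    for caller, callee, _start, _seconds in calls:
        adjacency[caller].add(callee)
        adjacency[callee].add(caller)
    return adjacency

def local_clustering(adjacency, person):
    """Fraction of a person's contact pairs who also call each other."""
    contacts = adjacency[person]
    if len(contacts) < 2:
        return 0.0  # by convention, too few contacts to form any cluster
    pairs = list(combinations(sorted(contacts), 2))
    linked = sum(1 for a, b in pairs if b in adjacency[a])
    return linked / len(pairs)

def flag_atypical(calls, event_time, window=timedelta(hours=24),
                  max_seconds=60, max_clustering=0.0):
    """Flag callers with only short calls, a call near the event, and no cluster."""
    adjacency = build_adjacency(calls)
    outgoing = defaultdict(list)
    for caller, _callee, start, seconds in calls:
        outgoing[caller].append((start, seconds))

    flagged = []
    for caller, made in outgoing.items():
        all_short = all(seconds < max_seconds for _start, seconds in made)
        near_event = any(abs(start - event_time) <= window for start, _s in made)
        isolated = local_clustering(adjacency, caller) <= max_clustering
        if all_short and near_event and isolated:
            flagged.append(caller)
    return sorted(flagged)

# Hypothetical data: a family that all call each other, plus one odd caller.
event = datetime(2013, 5, 20, 12, 0)
sample_calls = [
    ("A", "B", datetime(2013, 5, 20, 9, 0), 300),
    ("B", "C", datetime(2013, 5, 20, 10, 0), 420),
    ("C", "A", datetime(2013, 5, 19, 18, 0), 600),
    ("X", "P", datetime(2013, 5, 20, 13, 0), 40),
    ("X", "Q", datetime(2013, 5, 20, 14, 0), 25),
]

print(flag_atypical(sample_calls, event))
```

In this made-up data only "X" trips all three conditions. Note that the sketch would also flag a perfectly innocent person who made a single quick call that day, which is exactly why a flag like this is a reason to look closer and not proof of anything.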
This is related to another point Simon makes in his post: There's no reason to believe that the government is listening in on all of our phone calls, simply because the task is absurdly vast. What percentage of us are engaged in criminal conspiracies at any given moment? For investigators to somehow monitor all our phone calls to see if we're doing anything wrong is ridiculous: the signal-to-noise ratio is functionally zero. It would be more efficient to just walk door to door asking if we're doing anything illegal.
The big data approach described above avoids the task of monitoring everything at once. It uses network patterns to filter out the noise, find the few individuals who are behaving atypically, and focus on them.
Now, I'm not saying this is how the NSA actually operates; I really don't know. Nor am I saying that this is how it should operate. Just consider this an educated guess as to how a law enforcement organization would use this kind of data if it were available.