Friday, February 28, 2014

Chicago PD's Big Data

Using pseudoscience to justify racial profiling

The Chicago Police Department has ramped up the use of its "predictive analysis" system to identify people it believes are likely to commit crimes. These people, who are placed on a "heat list," are visited by police officers who tell them that they are considered pre-criminals by CPD, and are warned that if they do commit any crimes, they are likely to be caught.
The CPD defends the practice, and its technical champion, Miles Wernick from the Illinois Institute of Technology, characterizes it as a neutral, data-driven system for preventing crime in a city that has struggled with street violence and other forms of crime. Wernick's approach involves searching the data for "abnormal" patterns that correlate with crime. He compares it to epidemiological approaches, stating that people whose social networks have violence within them are also likely to commit violence.
The CPD refuses to share the names of the people on its secret watchlist, and it will not disclose the algorithm that put them there.
This is a terrible way of running a criminal justice system.
Let's start with transparency, because that's the most obviously broken thing here. The designers of the algorithm assure us that it is considering everything relevant, nothing irrelevant, and finding statistically valid correlations that allow them to make useful predictions about who will commit crime. In an earlier era, we would have called this discrimination -- or even witchhunting -- because the attribution of guilt (or any other trait) through secret and unaccountable systems is a superstitious, pre-rational way of approaching any problem.
The purveyors of this technology cloak themselves in the mantle of science. The core tenet of science, the thing that distinguishes it from all other ways of knowing, is the systematic publication and review of hypotheses and the experiments conducted to validate them. The difference between a scientist and an alchemist isn't their area of study: it's the method they use to validate their conclusions.
An algorithm that only works if you can't see it is not science, it's a conjuring trick. My six-year-old can do that trick: she can make anything disappear provided you don't look while she's doing it and don't ask her to open her hands and show you what's in them. Asserting that you're doing science but can't explain how you're doing it is nonsense on its face.
Now let's think about objectivity: the system that the CPD and its partners have designed claims to be objective because it uses numbers and statistics to make its calculations. But -- transparency again -- without insight into how the system runs its numbers, we have no way of debating and validating the way it weighs different statistics. And what about those statistics? We know -- because of transparent, rigorous scholarship, and because of high-profile legal cases -- that police intervention is itself not neutral. From stop-and-search to arrest to prosecutorial zeal or discretion, the whole enterprise of crime statistics is embedded in a wider culture in which human beings with social power and representing the status quo can and do make subjective decisions about how to characterize individual acts.
Put more simply: if cops, judges and prosecutors are more likely to give white people in rich neighborhoods in possession of cocaine an easier time than they give black people in poor neighborhoods in possession of crack (and they do), then your data-mining exercise will disproportionately weight blackness and poorness as being correlated with felonies. Garbage in, garbage out -- there's nothing objective and scientifically rigorous about using flawed data to generate flawed conclusions.
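To make the garbage-in, garbage-out point concrete, here is a minimal Python sketch. Every number in it is hypothetical, chosen for illustration rather than drawn from Chicago data, and `recorded_felonies` is an invented helper, not part of any CPD system. Two neighborhoods have identical rates of possession; the only difference is how often possession ends in a felony charge.

```python
def recorded_felonies(population, offenders_per_1000, charged_per_100):
    """Felony records produced by a neighborhood.

    offenders_per_1000: how many residents actually possess drugs (behavior).
    charged_per_100:    how many of those end up with a felony record
                        (enforcement and prosecutorial discretion).
    All values are hypothetical illustrations.
    """
    offenders = population * offenders_per_1000 // 1000
    return offenders * charged_per_100 // 100


# Same population and same underlying behavior in both neighborhoods.
POPULATION = 10_000
OFFENDERS_PER_1000 = 100

# Assumed enforcement gap: lenient in one place, harsh in the other.
rich_records = recorded_felonies(POPULATION, OFFENDERS_PER_1000, charged_per_100=5)
poor_records = recorded_felonies(POPULATION, OFFENDERS_PER_1000, charged_per_100=20)

# What a data-mining system sees is the record count, not the behavior.
apparent_risk_ratio = poor_records / rich_records

print("felony records, rich neighborhood:", rich_records)
print("felony records, poor neighborhood:", poor_records)
print("apparent risk ratio:", apparent_risk_ratio)
```

A model trained on these records sees four times as many felonies in the second neighborhood, although behavior is identical in both. The correlation it finds measures enforcement, not crime.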
But even assuming that this stuff could be made to work: is it a valid approach to crimefighting?
Consider that the root of this methodology is social network analysis. Your place on the heat-list is explicitly not about what you've done or who you are: it's about who your friends are and what they've done. The idea that people's social circles tell us something about their own character is as old as the proverb "A man is known by the company he keeps." Certainly, it wasn't a new idea to the framers of the Constitution (after all, the typical framer was both a member of a secret society and had recently participated in a guerrilla revolution -- they knew a thing or two about the predictive value of social network analysis).
But the framers explicitly guaranteed "freedom of association" in the First Amendment. Why? Because while "birds of a feather flock together," the criminalization of friendship is a corrosive force that drives apart the bonds that make us into a society. In other words: if the Chicago PD thinks that crime can only be fought by discriminating against people based on their friendships, it needs to get a constitutional amendment before it puts that plan into action.
Finally, this program assumes that its interventions will be positive, and this assumption is anything but assured. The idea that being told that you are likely to commit crimes will prevent you from doing so is no more obvious than the idea that being treated as a presumptive criminal will lead you to commit crimes. What's more, well-known, well-documented cognitive biases (theory blindness, confirmation bias) are alive and well in the criminal justice system: if someone on the blacklist is suspected of doing something minor, we should expect the police, prosecutors and judges to treat them more harshly than they would someone plucked from off the street. If you're already in a machine-generated ethnicity of pre-criminals, society will deal with you accordingly.
What's more, this will lead to more arrests, harsher charges and longer sentences for pre-criminals -- seemingly validating the methodology. It's the Big Data version of witchburning, a modern pseudoscience cloaked in the respectability of easily manipulated statistics and suspicious metaphors from public health.
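A toy Python model of that self-validating loop, under loudly stated assumptions: the charge rates, the incident count and the listing threshold are all invented, and `expected_record` is a hypothetical helper rather than anything the CPD has disclosed. Two people behave identically; one starts on the list because of who their friends are.

```python
from fractions import Fraction


def expected_record(years, on_list_at_start,
                    incidents_per_year=2,
                    charge_rate_off=Fraction(1, 10),
                    charge_rate_on=Fraction(1, 2),
                    list_threshold=1):
    """Expected cumulative charges, year by year, for one person.

    Behavior (incidents_per_year) never changes. Only the chance that a
    minor incident becomes a charge depends on list membership.
    All parameters are hypothetical.
    """
    record = Fraction(0)
    on_list = on_list_at_start
    history = []
    for _ in range(years):
        rate = charge_rate_on if on_list else charge_rate_off
        record += incidents_per_year * rate
        history.append(record)
        # A growing record can itself put someone on the list.
        on_list = on_list or record >= list_threshold
    return history


listed = expected_record(5, on_list_at_start=True)
unlisted = expected_record(5, on_list_at_start=False)

print("listed:  ", [str(r) for r in listed])
print("unlisted:", [str(r) for r in unlisted])
```

After five years the person who started on the list has an expected five charges and the other has one, despite identical conduct. Read naively, that gap looks like proof that the list predicted something.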
