Can Racial Bias Ever Be Removed From Criminal Justice Algorithms?

A planned vote on a proposed tool to predict the risk that a person would pose a threat to public safety in Pennsylvania has stirred a debate over its unintended consequences.

Dozens of people packed into a Philadelphia courtroom on June 6th to voice their objections to a proposed criminal justice algorithm. The algorithm, developed by the Pennsylvania Commission on Sentencing, was conceived of as a way to reduce incarceration by predicting the risk that a person would pose a threat to public safety and helping to divert those who are at low risk to alternatives to incarceration.

But many of the speakers worried the tool would instead increase racial disparities in a state where the incarceration rate of black Americans is nine times higher than that of white people. The outpouring of concern at public hearings, as well as from nearly 2,000 people who signed an online petition from the non-profit Color of Change, had a big effect: While the sentencing commission had planned to vote June 14th on whether to adopt the algorithm, members decided to delay the vote for at least six months to consider the objections and to solicit further input.

Algorithms that make predictions about future behavior based on factors such as a person’s age and criminal history are increasingly used—and increasingly controversial—in criminal justice decision-making. One of the big objections to the use of such algorithms is that they sometimes operate out of the public’s view. For instance, several states have adopted a tool called COMPAS developed by the company Northpointe (now called Equivant), which claims the algorithm is proprietary and refuses to share crucial details of how it calculates scores.

In a striking contrast, the Pennsylvania sentencing commission has been very transparent. Legislation passed in 2010 tasked the commission with developing a risk assessment instrument for use by judges at sentencing “as an aide in evaluating the relative risk that an offender will reoffend and be a threat to public safety,” and to help identify candidates for alternatives to incarceration. Since 2010, the commission has released more than 15 reports detailing the development of the algorithm and has held 11 public hearings to invite feedback. The commission has also altered its proposal over time in response to the community’s input. For example, the Defender Association of Philadelphia and other organizations argued in 2017 that the use of past arrest records as an input factor would be likely to exacerbate racial bias, and this concern was a factor in the commission’s decision to switch to using convictions rather than arrests.

But advocates still have concerns about other elements of the algorithm. For instance, the commission found that its predictions of who was at high risk of being re-arrested for a violent crime (a “crime against a person”) were only 18 percent accurate, and so it decided not to rely on that information. Instead, the instrument predicts general “recidivism,” defined as re-arrest on any misdemeanor or felony charge in Pennsylvania within three years, or recommitment to the Department of Corrections for a technical violation of parole. (The exception is the few cases where a person is given a low risk score for a crime against a person but a high risk score for general recidivism, in which case both scores would be shown.)
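
To make that reporting rule concrete, here is a minimal sketch of the logic as described above; the function name, score labels, and structure are hypothetical illustrations, not the commission’s actual implementation.

```python
# Illustrative sketch only -- labels and structure are assumptions, not the
# commission's actual code or scale.

def scores_to_display(person_crime_risk, general_recidivism_risk):
    """Return the risk labels a judge would see under the rule described above."""
    # By default only the general recidivism score is reported, since the
    # crime-against-a-person prediction was deemed too inaccurate to rely on.
    display = ["general recidivism: " + general_recidivism_risk]
    # Exception: low risk of a crime against a person but high general
    # recidivism risk -- in those few cases both scores are shown.
    if person_crime_risk == "low" and general_recidivism_risk == "high":
        display.insert(0, "crime against a person: " + person_crime_risk)
    return display

print(scores_to_display("low", "high"))     # both scores shown
print(scores_to_display("medium", "high"))  # only the general score shown
```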

There is widespread agreement that the commission deserves a lot of credit here. It’s rare to see the designers of a criminal justice algorithm document their work, including the design process, in such detail. (By comparison, basic information such as the inputs used in COMPAS remains out of public view, after a lawsuit in Wisconsin unsuccessfully challenged its use at sentencing as a violation of the right to due process.) Mark Houldin, policy director at the Defender Association of Philadelphia, said: “The Commission’s approach of publishing reports on each step is really great, and any jurisdiction that is seeking to adopt a tool like this should do exactly that. If the math, assumptions, and the decisions aren’t transparent, there is no way to allow stakeholders and the community to have a say.”

Even so, Houldin doesn’t think the proposed algorithm should be used at all. Given that the commission found it couldn’t predict the risk of violent re-arrest with reasonable accuracy, criminal defense attorney Marni Snyder and other members of the Risk Assessment Task Force (formed by Snyder and Democratic state Representative Brian Sims) say that the commission, along with the legislature, should have reconsidered whether to propose the algorithm. Nyssa Taylor, criminal justice policy counsel for the American Civil Liberties Union of Pennsylvania, says it is also extremely problematic that the definition of recidivism includes technical violations of parole (those in which no new crime is committed, such as missing curfew or failing to notify an officer of a change of address). She pointed out that Pennsylvania has one of the highest parole rates in the country and that technical violations are very common there, constituting more than 50 percent of admissions to Pennsylvania state prisons in 2016. Technical parole violations are often as racially skewed as arrests, she said.

Many speakers at the public hearings also questioned the instrument’s reliance on data that has been heavily influenced by racial disparities in arrests (for example, black people are arrested for marijuana possession at a much higher rate than white people, though the two groups use the drug at an equal rate), as well as by disparities in who is detained on bail and thereby more likely to plead guilty. This means that, even if the algorithm predicts re-arrest equally well regardless of race (as the commission’s racial impact analysis argues), its predictions likely fail to reflect the true offense rate.
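
A deliberately simplified numeric example shows why; the rates below are invented for illustration and are not drawn from the commission’s data.

```python
# Invented numbers for illustration only -- not real Pennsylvania data.
true_offense_rate = 0.10          # assume the same underlying rate in both groups
arrest_prob = {"group_a": 0.60,   # hypothetical chance an offense leads to arrest
               "group_b": 0.30}

for group, p_arrest in arrest_prob.items():
    observed_rearrest_rate = true_offense_rate * p_arrest
    print(f"{group}: true offense rate {true_offense_rate:.2f}, "
          f"observed re-arrest rate {observed_rearrest_rate:.2f}")

# A model fit to the observed re-arrest rates would score group_a as riskier,
# even though both groups offend at exactly the same underlying rate.
```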

Mark Bergstrom, the executive director of the commission, said that he and his colleagues are aware of the concerns about racial bias in arrests and other potential issues affecting the tool’s predictions. That’s why the commission’s proposal includes a safeguard: judges should not use the “high” and “low” risk scores as a factor in determining a sentence. The commission recommends instead that judges order a “pre-sentence investigation” report for those who are given high or low risk scores (roughly outside of one standard deviation of the average score). A pre-sentence investigation report includes “risks and needs” information relevant to a person’s individual case (for instance, mental illness or drug addiction) and can be used as guidance on whether alternatives to incarceration might be appropriate. “Absent that information [from the pre-sentence investigation report], the judge might just assume the worst and give the longer sentence,” Bergstrom said. In short, the commission states that judges should use the tool only to identify when to get more information from a pre-sentence investigation report, not as a factor that directly affects a sentence.
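
In code terms, the triggering rule Bergstrom describes looks roughly like the sketch below; the score values and cutoff are assumptions for illustration, not the commission’s actual scale or specification.

```python
# Illustrative sketch of the "roughly one standard deviation" trigger described
# above; the scores here are made up, not the commission's actual scale.
from statistics import mean, stdev

def needs_presentence_report(score, all_scores):
    """Flag scores far enough from the average to warrant ordering a
    pre-sentence investigation report, rather than letting the score
    directly influence the sentence."""
    mu, sigma = mean(all_scores), stdev(all_scores)
    return abs(score - mu) > sigma  # "high" or "low" relative to the distribution

historical_scores = [4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 8.0]   # hypothetical data
print(needs_presentence_report(9.5, historical_scores))   # True: unusually high
print(needs_presentence_report(6.0, historical_scores))   # False: near the average
```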

However, many critics of the algorithm argue that telling judges not to let risk scores affect sentencing decisions is unlikely to be effective. Several point to a recent study of a pretrial risk assessment tool in Kentucky conducted by Megan Stevenson, a legal scholar at George Mason University. Her research suggests that judges often deviated from the tool’s recommendation, and thus it did not have the intended effect of reducing the state’s jail population over time. One lesson from the study, Stevenson writes, is that “risk assessment tools may not be used as designed.” A statement from AI Now Institute and the Center on Race, Inequality, and the Law, both based at New York University, expressed the concern that the risk assessment scores could present “a false veneer of scientific objectivity, resulting in inappropriately harsher sentences.”

As a result of these and other objections raised in the community’s testimony, the commission delayed the vote for at least six months in order to re-assess. Bergstrom said that the commission would publish a report on the concerns it has received, and an independent evaluation by the Urban Institute will be released in the coming months. The commission is also soliciting proposals for how to improve the instrument and will hold another round of public hearings in December. State Senator Sharif Street, a Democrat from Philadelphia and a member of the Commission on Sentencing, said, “I’d never seen anything that the sentencing commission was doing garner such a level of awareness and opposition from the community at large.”

Advocates celebrated the postponement of the vote and said that they’d continue to oppose the adoption of the tool. Several noted they couldn’t absolutely rule out the possibility that some newly designed algorithm would be appropriate but that it would have to be very clear that it was designed to effectively reduce incarceration and racial disparities. Hannah Sassaman, policy director at the non-profit Media Mobilizing Project and also a Soros fellow working on risk assessment, said, “No tool like this should be introduced into such a high-stakes decision-making process unless we can control it to make sure it is winding down mass incarceration as well as winding down the racial, ethnic, and other bias that exists currently in the system.”

This story originally appeared in New America’s digital magazine, New America Weekly, a Pacific Standard partner site.
