Only Equivant can say, and they’re not revealing the secrets of their algorithm. So the duo developed their own algorithm and made it as simple as possible: “the kind of thing you teach undergrads in a machine-learning course,” says Farid. They found that this training-wheels algorithm could perform just as well as COMPAS, with an accuracy of 67 percent, even when using just two pieces of data: a defendant’s age and their number of previous convictions. “If you are young and have a lot of prior convictions, you are high-risk,” says Farid. “It’s kind of obvious.”
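To make that concrete, here is a minimal sketch of the kind of two-feature model Farid describes: a logistic regression trained on age and prior-conviction count. Everything below—the data, the labels, and the coefficients—is synthetic and purely illustrative; this is not the study’s actual code or data.

    # A minimal sketch, not Dressel and Farid's actual code: logistic
    # regression on two features, a defendant's age and number of prior
    # convictions. All data here is synthetic and purely illustrative.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 1000
    age = rng.integers(18, 70, size=n)
    priors = rng.poisson(2, size=n)

    # Hypothetical ground truth: younger defendants with more priors
    # reoffend more often (invented coefficients, for illustration only).
    p = 1 / (1 + np.exp(0.05 * (age - 40) - 0.4 * (priors - 2)))
    reoffended = rng.random(n) < p

    X = np.column_stack([age, priors])
    X_train, X_test, y_train, y_test = train_test_split(
        X, reoffended, random_state=0)

    model = LogisticRegression().fit(X_train, y_train)
    print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")

Even a toy model like this recovers the pattern Farid describes: young plus many priors means high risk.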
Other teams have found similar results. Last year, a team of researchers led by Cynthia Rudin from Duke University showed that a basic set of rules based on a person’s age, sex, and prior convictions—essentially, an algorithm so simple you could write it on a business card—could predict recidivism as well as COMPAS.
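For a sense of just how small such a model can be, here is an illustrative rule list in that spirit. The cutoffs are paraphrased from the published work of Rudin’s group, not quoted exactly:

    # An illustrative rule list in the spirit of the "business card"
    # model; the exact cutoffs are indicative, not the published model
    # verbatim.
    def predict_reoffense(age: int, sex: str, priors: int) -> bool:
        if 18 <= age <= 20 and sex == "male":
            return True
        if 21 <= age <= 23 and 2 <= priors <= 3:
            return True
        return priors > 3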
The problem isn’t necessarily that COMPAS is unsophisticated, says Farid, but that it has hit a ceiling in sophistication. When he and Dressel designed more complicated algorithms, they never improved on the bare-bones version that used just age and prior convictions. “It suggests not that the algorithms aren’t sophisticated enough, but that there’s no signal,” he says. Maybe this is as good as it gets. Maybe the whole concept of predicting recidivism is going to stall at an accuracy not much better than a coin toss.
Sharad Goel, from Stanford University, sees it a little differently. He notes that judges in the real world have access to far more information than the volunteers in Dressel and Farid’s study, including witness testimonies, statements from attorneys, and more. Paradoxically, that informational overload can lead to worse results by allowing human biases to kick in. Simple sets of rules can often lead to better risk assessments—something that Goel found in his own work. Hence the reasonable accuracy of Dressel and Farid’s volunteers, based on just seven pieces of information.
“That finding should not be interpreted as meaning that risk-assessment tools add no value,” says Goel. Instead, the message is “when you tell people to focus on the right things, even nonexperts can compete with machine-learning algorithms.”
Equivant makes a similar point in a response to Dressel and Farid’s study, published on Wednesday. “The findings of ‘virtually equal predictive accuracy’ in this study,” the statement says, “instead of being a criticism of the COMPAS assessment, actually adds to a growing number of independent studies that have confirmed that COMPAS achieves good predictability and matches the increasingly accepted AUC standard of 0.70 for well-designed risk assessment tools used in criminal justice.”
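For readers unfamiliar with the metric: AUC, the area under the ROC curve, is the probability that a model scores a randomly chosen reoffender above a randomly chosen non-reoffender, so 0.5 is chance and 1.0 is perfect ranking. A quick sketch, using invented risk scores:

    # AUC illustrated with made-up risk scores; label 1 = reoffended.
    from sklearn.metrics import roc_auc_score

    labels = [0, 0, 1, 1, 0, 1, 0, 1]
    scores = [0.20, 0.40, 0.35, 0.80, 0.10, 0.60, 0.50, 0.70]
    print(roc_auc_score(labels, scores))  # 0.5 = chance, 1.0 = perfect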
There have been several studies showing that algorithms can be used to positive effect in the criminal-justice system. “We’re not saying you shouldn’t use them,” says Farid. “We’re saying you should understand them. You shouldn’t need people like us to say: This doesn’t work. You should have to prove that something works before hinging people’s lives on it.”
“Before we even get to fairness, we need to make sure that these tools are accurate to begin with,” adds Dressel. “If not, then they’re not fair to anyone.”