Richard Berk likes to think he knows what criminals will do—even before they know. The statistics professor, who teaches at the University of Pennsylvania, was recently willing to show off his skills. “What is the highest-risk age for re-offending?” he asked. I hazarded the early 20s, and was quickly corrected. “Teens,” he responded. “Actually, the [rate of re-offending] falls off very quickly in the early 20s.” But the trend line doesn’t hold, Berk explained. Violent activity starts to increase again in individuals a decade or so older. “You’re picking up the domestic-violence offenders,” Berk surmised. “They need someone to beat up on, and they’re in their late 30s.”
This sort of behavioral analysis is at the center of Berk’s expanding work as something of a crime predictor—a number cruncher whose algorithms are helping police and corrections officials forecast recidivism. The Pennsylvania Board of Probation and Parole, for instance, has been working with the professor for the past two years.
The state’s prison system, which for years has exceeded its capacity, paroles nearly 10,000 inmates every year. The parole board relies upon interviews with offenders, a survey, recommendations from prison officials, and victim statements to decide whom to free. The cost of mistakes is high: in 2010, six months after he was released on parole, Michael Ballard killed four people in eastern Pennsylvania. After he was caught, Ballard told the investigating officer to “blame the parole board.”
The public did. But against the backdrop of statewide calls for reform, Berk was already quietly working on a fix: an algorithm that could spit out a prediction of how likely it is that a would-be parolee will re-offend. Berk had begun building a similar algorithm for Philadelphia’s criminal-justice system in 2006, the year Philadelphia logged the highest murder rate among major cities. At the time, Philadelphia’s Adult Probation and Parole Department had 295 officers supervising nearly 50,000 individuals. The department asked Berk to predict which of the 50,000 would commit a serious crime within two years. “Our vision was that every single person, when they walked through the door, would be scored by a computer,” says Ellen Kurtz, the department’s director of research. The department would then use the score—low-, medium-, or high-risk—to decide how intensively to supervise released offenders. Officers assigned to low-risk individuals would handle up to 400 cases, and those monitoring high-risk offenders would have about 50.
Drawing from criminal databases dating to the 1960s, Berk initially modeled the Philadelphia algorithm on more than 100,000 old cases, relying on three dozen predictors, including the perpetrator’s age, gender, neighborhood, and number of prior crimes. To develop an algorithm that forecasts a particular outcome (someone committing murder, for example), Berk applied a subset of the data to “train” the computer on which qualities are associated with that outcome. “If I could use sun spots or shoe size or the size of the wristband on their wrist, I would,” Berk said. “If I give the algorithm enough predictors to get it started, it finds things that you wouldn’t anticipate.” Philadelphia’s parole officers were surprised to learn, for example, that the crime for which an offender was sentenced, whether murder or simple drug possession, does not predict whether he or she will commit a violent crime in the future. Far more predictive are the age at which he (yes, gender matters) committed his first crime and the amount of time between his earlier offenses and the latest one: the earlier the first crime and the more recent the last, the greater the chance of another offense.
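For readers curious what “training” on a subset of old cases looks like in practice, here is a minimal sketch in Python. It is not Berk’s algorithm, which the article does not describe in detail. The cases are synthetic, the two predictors (age at first offense, years since the last offense) are the ones the parole officers found surprising, and every function name, cutoff, and probability below is an assumption made up for illustration. The idea is simply to learn re-offense rates from one portion of the cases and check them against cases the model has never seen.

```python
import random


def make_synthetic_cases(n, seed=0):
    """Generate made-up cases for illustration only; this is NOT real data."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        age_first = rng.randint(12, 40)         # age at first offense
        years_since_last = rng.randint(0, 15)   # years since most recent offense
        # Assumed relationship: an earlier first offense and a more recent
        # last offense both raise the chance of re-offending.
        p = 0.05 + 0.5 * ((40 - age_first) / 28) * ((15 - years_since_last) / 15)
        cases.append((age_first, years_since_last, rng.random() < p))
    return cases


def bucket(case):
    """Reduce a case to two yes/no predictors (the cutoffs are hypothetical)."""
    age_first, years_since_last, _ = case
    return (age_first < 18, years_since_last < 3)


def train(cases):
    """Estimate the re-offense rate within each bucket of the given cases."""
    counts = {}
    for case in cases:
        total, hits = counts.get(bucket(case), (0, 0))
        counts[bucket(case)] = (total + 1, hits + int(case[2]))
    return {b: hits / total for b, (total, hits) in counts.items()}


def split(cases, train_fraction=0.7, seed=1):
    """Shuffle the cases, then hold some back for testing."""
    shuffled = cases[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def evaluate(model, holdout):
    """Average gap between trained rates and rates seen in held-out cases."""
    observed = train(holdout)
    gaps = [abs(model[b] - observed[b]) for b in observed if b in model]
    return sum(gaps) / len(gaps)
```

Calling `split(make_synthetic_cases(5000))` gives a training set and a held-out set; `train` on the first and `evaluate` against the second shows how far the learned rates drift on unseen cases. A real system would use many more predictors and a far more flexible model than a four-bucket table.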
Risk assessment in the justice system isn’t new. In 1927, Ernest Burgess, a sociologist at the University of Chicago, drew on the records of 3,000 parolees in Illinois to estimate an individual’s likelihood of recidivism. Today, the LSI-R, a 54-question survey developed in Canada (the same one used by the Pennsylvania parole board), and COMPAS, a similar tool created by a Michigan-based company, are the most popular of hundreds of risk-assessment instruments. But Berk’s methods may represent a significant advance. “I use tens of thousands of cases to build the system, [as well as] asymmetric costs of false positives and false negatives, real tests of forecasting accuracy, the discovery of new forecasting relationships, and yes, machine learning,” he said. How do the old methods stack up against Berk’s? “It’s like comparing a Ford Focus to a Ferrari,” he told me.
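The “asymmetric costs” Berk mentions can be illustrated with a small sketch. Assume, purely for illustration, that wrongly labeling a dangerous person low-risk (a false negative) is judged ten times as costly as wrongly labeling a harmless person high-risk (a false positive). The 10-to-1 ratio and the function names are assumptions, not figures from Berk or the parole board. Under that assumption, the cost-minimizing rule flags someone as high-risk at a much lower predicted probability than a coin-flip cutoff of 50 percent.

```python
def risk_threshold(cost_false_positive, cost_false_negative):
    """Probability above which flagging 'high' has the lower expected cost.

    Flagging costs (1 - p) * cost_false_positive in expectation; not flagging
    costs p * cost_false_negative. Flagging wins when p exceeds this ratio.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def classify(probability, cost_false_positive=1.0, cost_false_negative=10.0):
    """Label a predicted re-offense probability; default costs are hypothetical."""
    threshold = risk_threshold(cost_false_positive, cost_false_negative)
    return "high" if probability > threshold else "low"
```

With equal costs the cutoff is 0.5, so a person with a predicted 20 percent chance of re-offending would be labeled low-risk. With the assumed 10-to-1 costs the cutoff drops to 1/11, roughly 9 percent, and the same person is labeled high-risk. How those costs are set is a policy judgment, not a statistical one.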