Predicting the outcome of Supreme Court decisions has long been a favorite parlor game for political scientists, attorneys, and legal-system junkies.
People have built statistical models, predictive algorithms, and flow charts, and have used machine learning to try to guess what the justices will decide. Some of these models are reliable. Several of them make accurate predictions around 75 percent of the time. Human prognosticators can be even more impressive than that.
One man, deemed by the website FiveThirtyEight to be “the best Supreme Court predictor in the world,” won FantasySCOTUS—yep, like fantasy football but for the high court—several years in a row and boasts an 80 percent accuracy rate for guessing what the justices will do.
But there’s another realm of Supreme Court activity where computers can tease out information that otherwise stays hidden even after a decision is issued: unsigned opinions. Last term, justices issued eight such decisions, meaning, as The New York Times pointed out, that more than 10 percent of the court’s docket was unsigned.
“Presumably meant to correct errors so glaring that they did not warrant extended consideration, they nonetheless illuminated a trend in the court’s work,” Adam Liptak wrote for the Times last year. “In most of them, one of two things happened. Prisoners challenging their convictions lost. Or law enforcement officials accused of wrongdoing won.”
Unsigned decisions (or per curiam decisions, as they’re known in the legal world) have “arguably been abused by courts, by the Supreme Court, and by lower courts,” says William Li, a 2016 computer-science graduate of M.I.T. who has been tracking the high court’s unsigned decisions for years.
“It’s a way of hiding behind a veil of anonymity,” he told me, “a mechanism that kind of removes accountability from them.”
So Li and his colleagues—Pablo Azar, David Larochelle, Phil Hill, James Cox, Robert Berwick, and Andrew Lo—built an algorithm designed to determine which justice wrote unsigned opinions. (Or which justice’s clerks, as is often the case.) Their work began in 2012, amid rumors that John Roberts, the chief justice, had changed his mind at the last minute about the Affordable Care Act—a move that apparently meant he ended up writing most of the majority opinion after having already written much of the dissent. Li and his colleagues wanted to find out if that theory might be true.
They used a combination of statistical data mining and machine learning to glean each justice’s individual writing style, based on years of their signed opinions. The bot works by analyzing a backlog of opinions and plucking out the words, phrases, and sentence structures that characterize a justice’s unique style. The system then assigns a higher weight to those terms, so it knows what to look for when scanning a per curiam decision. Roberts, they learned, uses the word “pertinent” a lot.
“He seems to tend to start sentences with the word ‘here,’ and end sentences with ‘the first place’—as in, ‘in the first place,’” Li said. “Breyer uses, ‘in respect to.’ For Antonin Scalia, one predictive [word] was ‘utterly,’ and starting the sentence off with ‘of course.’ It does seem like there are these kinds of different writing signatures that exist.”
Distinct signatures that are detectable to a computer, but barely noticeable to a human. Consider, for example, the differences between key words the bot identified in decisions by justices Ruth Bader Ginsburg and Sonia Sotomayor. Ginsburg often uses the words “notably,” “observed,” and “stated,” while Sotomayor favors “observes,” “heightened,” and “lawsuits.”
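For readers curious how a writing signature can be turned into a guess, here is a minimal sketch of the general idea. It is not the researchers' model: it swaps in a simple Naive Bayes-style scorer over word n-grams, and the training sentences, the `*_style` labels, and the function names are all invented for illustration. Only the marker phrases ("pertinent," "in respect to," "utterly," "of course") come from the reporting above.

```python
import math
import re
from collections import Counter

def ngrams(text, max_n=3):
    """Yield lowercased word n-grams (lengths 1..max_n) from a text."""
    words = re.findall(r"[a-z']+", text.lower())
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])

def train(signed_texts, max_n=3):
    """signed_texts maps an author label to a list of texts.
    Returns per-author n-gram counts."""
    return {author: Counter(g for t in texts for g in ngrams(t, max_n))
            for author, texts in signed_texts.items()}

def guess_author(model, text, max_n=3):
    """Score each author by summed log-probabilities (add-one smoothing)
    of the n-grams seen in training, and return the top-scoring label."""
    vocab = set().union(*model.values())
    scores = {}
    for author, counts in model.items():
        total = sum(counts.values()) + len(vocab)
        scores[author] = sum(math.log((counts[g] + 1) / total)
                             for g in ngrams(text, max_n) if g in vocab)
    return max(scores, key=scores.get)

# Invented toy sentences, NOT real opinions; they only reuse the marker
# phrases mentioned in the article.
signed_texts = {
    "roberts_style": [
        "Here the pertinent statute controls the question in the first place.",
        "Here the pertinent record supports the agency.",
    ],
    "breyer_style": [
        "In respect to the statute the purpose matters more than text.",
        "The court errs in respect to precedent.",
    ],
    "scalia_style": [
        "Of course the argument is utterly without support in the text.",
        "Of course that reading is utterly implausible.",
    ],
}

model = train(signed_texts)
print(guess_author(model, "Of course the claim is utterly without merit."))
# scalia_style
```

A real system would train on years of signed opinions and would need to separate style from subject matter; this toy version only shows how repeated phrases can tip the score toward one author.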
To test the accuracy of the algorithm’s findings, Li and his colleagues showed it 117 signed opinions (so they knew the correct answer) but withheld authorship. The bot correctly guessed who wrote 95 of them—meaning it was right 81 percent of the time. As for the question of Roberts’s authorship in the Obamacare decision, the computer determined that Roberts almost certainly wrote the majority and Scalia almost certainly wrote the dissent.
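The 81 percent figure is simply the share of held-out opinions the bot labeled correctly. A small sketch of that check follows; the `accuracy` helper and the sample labels are hypothetical, while the 95 and 117 come from the test described above.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the known authors."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Hypothetical labels, just to show the helper in use.
sample_rate = accuracy(["A", "B", "A", "C"], ["A", "B", "C", "C"])
print(f"{sample_rate:.0%}")  # 75%

# The reported result: 95 correct guesses out of 117 signed opinions.
reported_rate = 95 / 117
print(f"{reported_rate:.1%}")  # 81.2%
```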
Li called the accuracy of his model “gratifying,” but not totally surprising—in part because he expected it to work. “Taking the text of any kind of document and trying to predict a known set of authors—this kind of work actually goes back to the 1960s,” he said.
In 1964, for example, the mathematicians Frederick Mosteller and David Wallace used statistical methodology to examine the disputed authorship of the Federalist Papers. Their work, Inference and Disputed Authorship, was featured on the cover of Time magazine. Though their methods were groundbreaking at the time, Mosteller and Wallace were only able to use a minuscule dataset compared with what Li and his colleagues can do today. That’s in large part because of the growing sophistication of machine-learning algorithms and sheer computing power, but it’s also because of the volume of data now available to computer scientists.
“What’s fascinating is, it’s pretty easy to obtain data and—in some vague sense, some sort of hand-waving sense—labeled data, that is labeled by author,” Li said. “It’s reasonably easy to collect all of these Supreme Court opinions and so on.”
This may seem like a banal observation, but it actually signals a profound shift in what computer scientists are able to do. With enough data, perhaps a computer could do more than just predict authorship. Imagine, for instance, a robot justice built to issue decisions in the style of an individual justice. A combination of machine learning and language generation applied to a robust enough dataset might mean that lifetime appointments could, in theory, continue long after human justices die.
Not that this is a good idea. But it is a compelling one. In most dystopian nightmares, robots are wielding machine guns, not gavels. The potential for a robot Supreme Court—or, at least, robots capable enough to generate decisions that mirror human writing—is “in the realm of possibility,” Li says. “But I think we’re still a ways away from generating arguments indistinguishable from human opinions. It’s still a really challenging task for computers. The justice Turing test, if you will, might be difficult to solve.”