By this point of March Madness, with three games left in the men's NCAA basketball tournament, most brackets are busted. My downfall, usually, is that I pick with my heart. (Michigan State all the way!) But even for those who are cool, calculated, stats-obsessed robots in their bracket-building approach, it's hard to accurately guess the outcome of 63 basketball games in a row. Really hard.
This makes sense. "You are dealing with a 40-minute basketball game played by 20-year-olds and officiated by biased referees," said Michael Lopez, an assistant professor of statistics at Skidmore College. "Too many things can happen—indeed, too many things do happen—for anyone to end up on the correct side of the game much more than 75 percent of the time."
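To put that ceiling in rough numbers, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every one of the 63 picks is an independent guess that is right 75 percent of the time. Real brackets don't work that way, since a wrong early pick can knock out later ones, so treat the result as an order-of-magnitude figure rather than a forecast.

```python
# Back-of-the-envelope: chance of a perfect bracket if every pick were an
# independent guess that is right 75 percent of the time.
# Both numbers come from the article; independence is an illustrative assumption.
GAMES = 63
PER_GAME_ACCURACY = 0.75


def perfect_bracket_probability(accuracy: float, games: int) -> float:
    """Probability of getting every one of `games` independent picks right."""
    return accuracy ** games


p_perfect = perfect_bracket_probability(PER_GAME_ACCURACY, GAMES)
expected_correct = PER_GAME_ACCURACY * GAMES

print(f"Expected correct picks: {expected_correct:.1f} of {GAMES}")
print(f"Chance of a perfect bracket: about 1 in {1 / p_perfect:,.0f}")
```

Under those assumptions, even a picker sitting right at the ceiling would expect to miss about 16 games, and a perfect bracket would come up roughly once in tens of millions of tries.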
Even a complex algorithm designed to examine, say, every March Madness game in history, would have its limitations. Because, for one thing, a single basketball game cannot be fully reduced to numbers. But also: From a statistical standpoint, 63 games a year is a tiny number. "Which means that even if a set of probabilities was more accurate than another, it would be difficult to detect any difference in such a small sample size of games," Lopez told me. So even a robot programmed to be a number-crunching basketball genius wouldn't improve much on simpler existing models. Not with the data we're using now, anyway.
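A rough way to see Lopez's point about sample size: treat each game as an independent yes-or-no outcome and ask how far a model's observed hit rate over 63 games could drift from its true accuracy by chance alone. The sketch below uses the standard binomial standard error. The 72 and 75 percent accuracies are hypothetical values chosen for illustration, not figures from his research.

```python
import math

GAMES = 63  # games in one tournament bracket, per the article


def standard_error(accuracy: float, games: int) -> float:
    """Binomial standard error of an observed hit rate over `games` games."""
    return math.sqrt(accuracy * (1 - accuracy) / games)


# Hypothetical true accuracies for two competing models (illustrative only).
simple_model = 0.72
fancy_model = 0.75

se = standard_error(fancy_model, GAMES)
margin = 1.96 * se  # approximate 95 percent margin of error (normal approximation)
gap = fancy_model - simple_model

print(f"Gap between models: {gap:.2f}")
print(f"Margin of error over {GAMES} games: +/- {margin:.2f}")
```

Under these assumptions, the chance wobble in a single tournament (roughly ten percentage points either way) is several times larger than the gap between the two models, so one year of results can't reliably say which is better.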
Lopez has spent a lot of time thinking about this kind of thing. Last year, he and another statistician wrote a paper about the underlying probabilities in college basketball to determine how much comes down to luck. The answer: a lot. The endings of two Elite Eight games last weekend—Michigan State's victory over Louisville and Kentucky's win over Notre Dame—are perfect examples, he said, because sinking a missed free throw and hitting a three-pointer could have changed the outcome in both cases.
"NCAA pools across the country were swung on those two shots," Lopez told me. "Did the people picking Kentucky and Michigan State make better picks? And did the people on Notre Dame and Louisville make bad ones? I would argue that those final shots were just the last in a set of coin flips that eventually decided the outcome. To some extent, the people on Kentucky and Michigan State didn't make the better picks, they just made the luckier ones. And it's really hard to get lucky over and over and over again." (The best way to try is to keep an eye on betting lines in Las Vegas. Or, as Lopez put it to me, "Folks running sportsbooks don't let people bet on sports unless they know they are going to make money in the long run.")
So why does predictive accuracy cap out around 75 percent? That's about the upper limit in college hoops as well as in professional basketball, professional soccer, professional football, and college football, according to a 2013 paper about using machine learning to predict game outcomes.
"It is difficult to determine why this is the case," wrote the authors of that paper. Maybe, they guessed, it's a limitation of the kind of data that statisticians tend to use, which don't usually account for qualities like experience, leadership, or luck. "It is also possible, however, that there is simply a relatively large residue of college basketball matches that is, in the truest sense of the word, unpredictable."
That second possibility seems more likely to Albrecht Zimmermann, one of the co-authors on the 2013 paper. "I am convinced that there's basically a (relatively) strong element of chance," Zimmermann told me. And to complicate matters, from a data scientist's perspective, it's hard—if not impossible—to explore alternatives. "We can rarely go back and play the same match again," Zimmermann said. But there is still, perhaps, better data to be gathered. The NBA's game-tracking system, SportVU, records precisely how players move across the court and generates a mind-boggling trove of data at a time when some teams are still producing pencil-on-paper shot charts. Here's how Grantland's Kirk Goldsberry explained the first time he opened a SportVU file:
All I could see was an ocean of decimal points, trailing digits, and hundreds of XML tags sporadically interleaved among them. Right away, it was obvious this was the “biggest” data I had ever seen. I’ll always remember my surprise when it occurred to me that everything on my screen amounted to only a few seconds of player action from one quarter of one game.
One of the biggest promises of such a system is that it might enable people to "evaluate defensive performances in exciting new ways," according to the authors of a paper about defensive metrics in professional basketball that was presented at this year's MIT Sloan Sports Analytics Conference. Translating a team's defense into "conveniently countable" numbers offers only a glimpse of its actual skill.
While steals, blocks, and rebounds do provide some useful proxies for defensive skills, they represent small discrete signals within the perpetual broadcast of defensive play. Therefore, characterizations which rely on these event types are vulnerable to many forms of uncertainty—in short, such characterizations are unreliable.
And given that team statistics are "basically aggregated player statistics," Zimmermann told me, "any improvement there should help quite a bit with predictive accuracy... It's a bit of a cliché to say that the player-tracking data will revolutionize basketball analytics but that doesn't make the statement less true."
In the meantime, we're stuck with the not-bad-but-no-guarantees predictive modeling that's been in use for a long time. And "no matter how good of a predictive model that one builds," Lopez concluded in his research, "an immense amount of luck is also needed to win an NCAA tournament pool."
Which means it might not be popular to pick No. 7 Michigan State to win it all this year. But that doesn't mean they won't.