As he enters this week’s annual Australian Open for the 16th consecutive year, Switzerland’s living legend Roger Federer holds a plethora of records: He’s the owner of 17 Grand Slam singles titles, 302 weeks ranked No. 1 in the world, and 36 straight Grand Slams in which he reached the quarterfinals or better.
However, Federer also holds the dubious distinction of having the worst record among players active since 1990 in so-called “Simpson’s Paradox” matches–those where the loser of the match wins more points than the winner.
On the surface, his 4-24 record in such matches may seem hard to reconcile with the rest of his stellar statistics. A deeper inquiry, however, reveals mathematical proof of Federer’s unequaled in-match competitiveness over the course of his career.
But first, some background on this arithmetic oddity.
Simpson’s Paradox is a statistical quirk in which a trend that appears in separate groups of data reverses or disappears when the groups are combined. In tennis, a derivative of the paradox is seen in the small percentage of matches where a player wins more individual points than the opponent but loses the overall match. This anomaly is an artifact of tennis’s distinctive scoring system. Its “best of N” format (best of three sets, usually, or best of five sets in some men’s professional matches) follows a point-game-set-match hierarchy with neither a running score nor a clock. The results can sometimes be peculiar. The only point the winning player must win is the last one.
Simpson’s Paradox can happen at both the game level and the point level in tennis. The former would be a match where the score is, for example, 0-6, 7-5, 7-5: the winner takes 14 games to the loser’s 16. Such matches are exceedingly rare in tennis. The latter, matches in which the winner wins less than 50 percent of the total points played, occur with some regularity and can be analyzed on a per-player basis.
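The game-level arithmetic behind the 0-6, 7-5, 7-5 example can be checked with a short Python sketch. The function names and the convention of listing each set as a pair of game counts are assumptions made for this illustration; they do not come from the study.

```python
def sets_and_games(score):
    """Return ((sets_a, sets_b), (games_a, games_b)) for a completed match.

    `score` lists each set as (games_won_by_a, games_won_by_b).
    """
    sets_a = sum(1 for a, b in score if a > b)
    sets_b = sum(1 for a, b in score if b > a)
    games_a = sum(a for a, _ in score)
    games_b = sum(b for _, b in score)
    return (sets_a, sets_b), (games_a, games_b)


def is_game_level_paradox(score):
    """True when the player who won more sets won fewer total games."""
    (sets_a, sets_b), (games_a, games_b) = sets_and_games(score)
    if sets_a > sets_b:
        return games_a < games_b
    return games_b < games_a


# The score line from the text, written from the match winner's side.
example = [(0, 6), (7, 5), (7, 5)]
print(sets_and_games(example))         # ((2, 1), (14, 16))
print(is_game_level_paradox(example))  # True
```

The winner takes two sets to one while trailing 14 games to 16, which is exactly the reversal described above.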
In a recent academic article in the International Journal of Performance Analysis in Sports, Jeff Sackmann, Ben Wright, and I investigated the incidence of point-level Simpson’s Paradox in tennis. In a data set composed of more than 61,000 men’s ATP and Grand Slam matches dating back to 1990, we found that about 4.5 percent exhibited these paradoxical characteristics. We then looked at the outlier players with the best and worst respective records to put our results in context.
At one end of the spectrum was American player John Isner. At 6’10”, Isner unleashes one of the most intimidating serves in tennis history. He is also often remembered as the winner of the longest match in the history of tennis, an 11-hour epic at Wimbledon in 2010 that ended with a 70-68 fifth-set win over Frenchman Nicolas Mahut. A quick inspection of the box score, however, shows that Mahut won 24 more points than Isner. A review of Isner’s career record in two dozen similar matches (that is, matches in which the winner won fewer points than the loser) revealed an impressive 19-5 record.
Isner’s success in these odd matches was unsurprising. His playing style combines a dominating serve with one of the weakest service returns among top-100 players. The result is lopsided point-level score lines, frequent tiebreakers, and a certain degree of energy-conserving tanking when returning serve. (In tennis, “tanking” occurs when players opt to exert less effort than they are capable of.) In some cases, a player may tank an entire match. The more common scenario is strategic tanking within a match for one or more short time spans. The latter is something Isner himself acknowledged when interviewed by Andrew Lawrence for a Sports Illustrated piece in 2013:
The strength of Isner's game is his serve, which has topped out at 149 mph. “I need to have as much energy as possible in my service games,” says Isner, who relies on motion stretching and light weight training to keep his right arm strong. On defense, though, he picks his spots. Kind of. “If I'm up a break in a set, I can just ride out my serve,” he says. “That doesn't necessarily mean that I'm tanking the return games, but it gives me the opportunity to conserve energy for the service game, knowing that I have that break in hand.”
At the other end of the Simpson’s Paradox spectrum was, of course, Roger Federer. In completed matches, he was 4-24 in contests where the winner prevailed on less than 50 percent of the total points. Federer’s winning percentage in these matches (14.29 percent) was the worst among all 72 players in the sample who participated in at least 20 matches of this type during their careers. This result surprised us, as it differed wildly from the records of other players who had similarly won multiple Grand Slam singles titles. Andre Agassi, Rafael Nadal, Pete Sampras, Sergi Bruguera, Marat Safin, Lleyton Hewitt, Yevgeny Kafelnikov, and Gustavo Kuerten were all .500 or better in Simpson’s Paradox matches. Jim Courier was the only other multiple Grand Slam champion below .500 in such matches, with a non-alarming 11-15 record.
So we probed Federer’s matches further. Our analysis was revealing.
There are two non-mutually exclusive explanations for Federer’s curious results. First, given Federer’s decade-plus run as one of the top-ranked players in the world, it is possible that his opponents, playing as underdogs, chose to adopt a high-risk yet upset-friendly style. Knowing that their normal playing strategy would be unsuccessful against the near-unbeatable Federer, his opponents may have decided to select the optimal strategy for an upset, one described by Brian Skinner in a 2011 Journal of Quantitative Analysis in Sports article as follows:
When facing a heavily-favored opponent, an underdog must be willing to assume greater-than-average risk. In statistical language, one would say that an underdog must be willing to adopt a strategy whose outcome has a larger-than-average variance.
In tennis, such a go-for-broke strategy would involve hitting aggressive second serves and high-risk/high-reward returns when Federer is serving.
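The logic of the variance argument can be illustrated with a toy calculation in Python. This is a sketch under stated assumptions, not Skinner’s model: it supposes the underdog’s final margin is normally distributed, with a fixed average deficit and a spread that grows as the underdog takes more risk. The function name and the numbers are hypothetical.

```python
import math


def upset_probability(deficit, spread):
    """P(underdog's margin > 0) when the margin ~ Normal(-deficit, spread**2).

    Assumed toy model: `deficit` is how far behind the underdog finishes on
    average, and `spread` is the standard deviation of the outcome.
    """
    z = -deficit / spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Same average deficit, two levels of risk (hypothetical values).
conservative = upset_probability(deficit=1.0, spread=0.5)
aggressive = upset_probability(deficit=1.0, spread=2.0)
print(round(conservative, 3), round(aggressive, 3))  # 0.023 0.309
```

With the average deficit held fixed, widening the spread raises the chance of an upset, which is the trade-off the quoted passage describes.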
Second, we can infer that Federer does not engage in any short-term strategic tanking while playing. Unlike some of his peers, Federer doesn’t shirk. Even when he loses, the matches are rarely lopsided and almost every individual game is competitive. A nuanced analysis of the chair umpire’s point-by-point score sheet in Simpson’s Paradox matches would reveal that Federer often wins his service games by a 40-0 or 40-15 count, frequently loses his return games after one or more deuces, and drops tightly contested tiebreakers when the set score reaches 6-6. In this subset of matches, Federer is just a victim of a scoring system where not all individual points are created equal.
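A hypothetical set shows how that pattern produces a point-level paradox. The Python sketch below invents a set in which player A holds serve at love six times, player B holds six times after a single deuce, and B takes the tiebreak 7-5. None of these numbers come from a real match.

```python
# Hypothetical set, invented for illustration.
# Each entry is (points_won_by_a, points_won_by_b) in one game.
service_games_a = [(4, 0)] * 6  # A holds at love six times
service_games_b = [(3, 5)] * 6  # B holds six times after one deuce
tiebreak = [(5, 7)]             # B wins the tiebreak 7-5

all_games = service_games_a + service_games_b + tiebreak

# Total points won by each player across the set.
points = (sum(a for a, _ in all_games), sum(b for _, b in all_games))

# Games won by each player; the tiebreak counts as the thirteenth game.
games = (sum(1 for a, b in all_games if a > b),
         sum(1 for a, b in all_games if b > a))

share_a = points[0] / sum(points)
print(points)             # (47, 37)
print(games)              # (6, 7)
print(f"{share_a:.1%}")   # 56.0%
```

Player A loses the set 6-7 while winning 47 of the 84 points played, because the points A piles up in easy holds and long return games count for nothing once each game is decided.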
And what does it all mean, in the end? It means that Federer’s dismal record in these quirky matches, somewhat ironically, is yet another data point in support of the empirically driven conclusion that he is the greatest tennis player ever.