The umpires and referees officiating professional sports make mistakes. But usually, nobody notices an official until that official screws up in some high-profile way. Last year's work dispute between the NFL and its referees, for example, was quickly resolved once replacement referees blew a last-second call that indisputably changed the outcome of the game. And earlier this month, MLB's Angel Hernandez made news when he ruled a home run a double, even after viewing a replay that clearly showed he was incorrect, a call that may have cost the Oakland Athletics a victory.
Speaking after the game, MLB executive Joe Torre acknowledged that Hernandez's judgment was wrong, but said there would be no action to change the game's outcome. He fell back on a common refrain used to defend erroneous calls: Umpires are human, humans make mistakes, and these mistakes add a certain charm to the game. As Torre said last year when responding to a different series of blown calls, "The game is imperfect. I don't know why we want everything to be perfect. Life isn't perfect, and this is a game of life."
The technology exists to fix many officiating problems. Jake Simpson recently suggested an infrastructure change to MLB that would alleviate the effects of errors by the on-field umpires. It's a perfectly good solution, but it has virtually no chance of being implemented. Torre and MLB have been adamant that there's not much appetite for additional replay and that such replay would slow the game down while adding little value.
Torre's right, but for the wrong reason. Fans do appreciate the human element, and it's not the blown calls that enrage fans so much as it is the poor personal conduct by the unapologetic umpires who make them. MLB doesn't have an umpire competence problem. It has an umpire behavior problem. Fans can often tolerate errors or rudeness, but not the combination of the two; they tend to be more accepting of a blown call when the offending umpire does not make a spectacle of himself. Rather than immediately citing reasons why instant replay is not the solution, MLB could mollify much of its criticism simply by mandating better behavior.
Consider the separate cases of Angel Hernandez and Jim Joyce. Hernandez, told by everyone—including his bosses—that his call in the Indians-Athletics game was wrong, refused recorded interviews and said there wasn't enough evidence to overturn the blown call. Jim Joyce, meanwhile, is the umpire who blew the final out of Armando Galarraga's "perfect" game in June 2010. Upon seeing replays after the game, a teary-eyed Joyce took responsibility for his mistake: "It was the biggest call of my career, and I kicked it. I just cost that kid a perfect game."
Galarraga forgave Joyce and received a good-sportsmanship Corvette for his troubles. The two apparently even co-authored a book. Arguably, the class shown by Joyce and Galarraga after the blown call was a better human-interest story than a perfect game would have been.
Why was Joyce so easily forgiven while Hernandez was vilified? One reason may be that Joyce is generally viewed by players as a good umpire, whereas Hernandez is seen as the opposite. But another is likely that Joyce showed humility after making a mistake.
On May 22, I conducted a one-question survey using Amazon's Mechanical Turk to investigate what factors determine the public's reaction to a blown call. I asked a panel of 280 Americans to view an image of a game-changing close play at home plate, and to then read varying descriptions of the umpire's ensuing actions. The survey used the image below, in which the runner is clearly "out."
The scenario, as described to the respondents, is that the runner's team is losing by a single run, and that his being tagged out ends the game. If the runner is called safe, his team goes on to win the game thanks to the incorrect ruling. If the runner is called out, his team loses. A fictitious umpire name was used to avoid priming fans who might associate certain feelings with a given MLB umpire.
Each description had two variables: a competence variable and a behavior variable. One part of the description addressed the umpire's competence by giving one of two outcomes: Either he had made the correct call—out—or erroneously ruled the runner safe. The other part addressed the umpire's behavior in a post-game press conference. The umpire was either apologetic and polite or defensive and impolite. For example, some respondents saw the image above and then a description of the "incorrect/impolite" scenario, which read,
Earlier this season, the Philadelphia Phillies played a home game against the Pittsburgh Pirates. Down 3-2 with 2 outs in the 9th inning, Phillies runner John Mayberry attempted to score on a single. He was ruled safe by umpire Jason Williams, tying the game. Replays showed after the game that Williams was incorrect. The Phillies, down to their last out, went on to win the game.
Asked about the call after the game, Williams blew up. "How many of you out there can do my job," a red-faced Williams shouted. "None of you. You need me out there, and I'm the one you all bother every time there's a close play. I'm not going to sit here and be told how to do my job by people who can't do my job!"
After viewing the image and reading the descriptions, respondents then rated the umpire involved in the play between 1 (very bad) and 5 (very good). Multiple responses by the same user, as well as responses completed in less time than it would reasonably take to read and assess the scenario, were excluded from the final data.
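For readers who want to see that screening step concretely, here is a minimal Python sketch. It is not the procedure actually used for this survey: the field names, the 20-second reading threshold, and the decision to drop every submission from a repeat respondent (rather than keep the first) are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Response:
    worker_id: str    # hypothetical respondent identifier
    condition: str    # e.g. "incorrect/impolite"
    rating: int       # 1 (very bad) to 5 (very good)
    seconds: float    # time spent on the question


# Assumed threshold; the article does not state the cutoff that was used.
MIN_SECONDS = 20.0


def screen(responses, min_seconds=MIN_SECONDS):
    """Keep one-time respondents who took at least min_seconds."""
    submissions = Counter(r.worker_id for r in responses)
    return [
        r for r in responses
        if submissions[r.worker_id] == 1      # assumption: drop all repeats
        and r.seconds >= min_seconds          # drop implausibly fast answers
        and 1 <= r.rating <= 5                # guard against malformed ratings
    ]


# Made-up rows, for demonstration only.
raw = [
    Response("w1", "correct/polite", 5, 41.0),
    Response("w2", "incorrect/impolite", 1, 6.5),    # too fast
    Response("w3", "incorrect/polite", 3, 35.0),
    Response("w3", "correct/impolite", 4, 38.0),     # repeat respondent
    Response("w4", "correct/impolite", 3, 52.0),
]
kept = screen(raw)
print([r.worker_id for r in kept])
```

Keeping a repeat respondent's first answer instead would be an equally defensible choice; what matters is that the rule is fixed before looking at the ratings.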
There are thus four possible combinations of umpires' on-field calls and post-game behavior: correct/polite, correct/impolite, incorrect/polite, and incorrect/impolite. It stands to reason that the highest-rated umpire would be both correct and polite and the lowest-rated incorrect and impolite. The most important question, though, is which umpire finishes second—in other words, do fans value demeanor more highly, or competence?
As it turns out, the people polled value the two characteristics almost equally. The correct and polite umpire is ranked highest, and the wrong and rude umpire ranks lowest, but the two variations in between are virtually indistinguishable. On a 5-point scale, their averages are less than a quarter of a point apart, a difference that is not statistically significant. Only the umpire who was incorrect and rude fell on the wrong side of "no opinion." The full data can be viewed here.
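To make the comparison between the two middle groups concrete, the sketch below computes each group's mean rating and a Welch's t statistic for the gap between them. This is an assumption about method: the test actually used for the survey is not stated here, and Welch's t-test is simply one common choice for comparing two group means. The ratings in the example are invented for illustration and are not the survey data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    # Squared standard error of each sample mean.
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Invented ratings on the 1-5 scale, NOT the survey's results.
correct_impolite = [3, 4, 3, 2, 4, 3, 3, 4]
incorrect_polite = [3, 3, 4, 2, 3, 4, 3, 3]

gap = mean(correct_impolite) - mean(incorrect_polite)
t_stat, dof = welch_t(correct_impolite, incorrect_polite)
print(f"gap in means: {gap:.3f}, t = {t_stat:.3f}, df = {dof:.1f}")
```

A small gap with a t statistic close to zero is what "virtually indistinguishable" looks like in numbers; turning t into a p-value requires a t-distribution table or a statistics library.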
So, it seems likely that had Angel Hernandez not continually insisted that a clear home run was actually a double, and had he shown contrition for the mistake, the blowback would have been significantly less severe. Indeed, MLB accidentally stumbled into the same lesson two weeks ago, when it suspended an umpire for allowing an illegal pitching change. The mistake was acknowledged, and though it went uncorrected in the scorebook, it was punished, and no controversy ensued.
To be sure, this experiment only scratches the surface of the psychology of umpire evaluation. But it helps support the idea that MLB could tamp down public scorn by reining in offensive umpires. While fans may want competent officials with sparkling personalities, they'll settle for Jim Joyce moments when calls are blown.
These changes need not be drastic. For example, MLB could be more proactive in preventing umpires from physically pursuing arguing managers or players—an action that escalates the dispute and further slows a game. MLB's fining of umpire Tom Hallion for cursing at a player is a good thing, but such oversight must be more consistently applied. Mandating a proper code of umpire conduct would effectively preserve both the speed of the game and its human fallibility.