Can Google Predict the Impact of Racism on a Presidential Election?

A provocative new study argues that Google searches for racial epithets can be synced with election results to reveal what Americans truly think.

Since 1982, political pollsters and Democrats have worried about the tendency of African American politicians to underperform on Election Day relative to their last known standing in non-partisan and credible polls. Dubbed the Bradley effect, after the Los Angeles mayor who lost his bid for the California governor's mansion despite being ahead in polls, or the Wilder effect, after the Virginia governor who narrowly became that state's first black executive after polls showed him with a sizable lead, the theory predicts that white voters' concern over appearing racist will cause them to overstate their willingness to vote for a black politician when queried by pollsters.


In 2008, concern over the possibility of the effect contributed to Democratic pre-election anxiety (it's almost always Democrats who worry about it, since most African Americans who run in statewide general election contests are Democrats). "In recent days, nervous Obama supporters have traded worry about a survey -- widely disputed by pollsters yet voraciously consumed by the politically obsessed -- that concluded racial bias would cost Mr. Obama six percentage points in the final outcome," reported Kate Zernike in an October 2008 Week in Review piece in The New York Times. "He is, of course, about six points ahead in current polls. See? He's going to lose."

"How much we are under-representing people who are intolerant and therefore unlikely to vote for Obama is an open question," Andrew Kohut, the president of Pew Research Center, told the paper. "I suspect not a great deal, but maybe some. And 'maybe some' could be crucial in a tight election."

Obama, as we all know, went on to win, becoming the country's first black president and claiming victory with a margin of more than 7 percentage points over John McCain.

Now the concern that Obama might lose because he's black is back, thanks to the provocative article "The Effects of Racial Animus on a Black Presidential Candidate: Using Google Search Data to Find What Surveys Miss" (PDF), by Seth Stephens-Davidowitz, a doctoral candidate in economics at Harvard University.

The New York Times' Sunday Review featured a fascinating infographic by Stephens-Davidowitz, and he unpacked his research in an accompanying blog post examining how he used Google searches done prior to the 2008 election to gauge racist sentiment in certain geographic areas and correlate that with Obama's eventual vote share.

Some key excerpts from the study itself:

How can we know how much racial animus costs black candidates if few voters will admit such socially unacceptable attitudes to surveys? I use a new, non-survey proxy for an area's racial animus: Google search queries that include racially charged language. I compare the proxy to an area's votes for Barack Obama, the 2008 black Democratic presidential candidate, controlling for its votes for John Kerry, the 2004 white Democratic presidential candidate. Previous research using a similar specification but survey proxies for racial attitudes yielded little evidence that racial attitudes affected Obama. Racially charged search, in contrast, is a robust negative predictor of Obama's vote share. My estimates imply that continuing racial animus in the United States cost Obama 3 to 5 percentage points of the national popular vote in 2008, yielding his opponent the equivalent of a home-state advantage country-wide.

In short, were there no racism in America, Stephens-Davidowitz appears to be arguing, Obama's strong finish would have been an epic blow-out.

Here's more about Stephens-Davidowitz's research method -- warning: racial epithets ahead -- and findings:

The baseline proxy that I use is the percentage of an area's total Google searches from 2004-2007 that included the word "nigger" or "niggers." I choose the most salient word to constrain data-mining. I do not include data after 2007 to avoid capturing reverse causation, with dislike for Obama causing individuals to use racially charged language on Google. My regression analysis includes 196 of 210 media markets, encompassing more than 99 percent of American voters.

The epithet is a common term used on Google. During the period 2004-2007, there were roughly the same number of Google searches that included the word "nigger(s)" as there were Google searches that included words and phrases such as "migraine(s)," "economist," "sweater," "Daily Show," and "Lakers." (Google data are case-insensitive.) The most common searches including the epithet (such as "nigger jokes" and "I hate niggers") return websites with derogatory material about African-Americans. The top hits for the top racially charged searches are nearly all textbook examples of antilocution, a majority group's sharing stereotype-based jokes using coarse language outside a minority group's presence. This was determined as the first and crucial stage of prejudice in Allport's (1979) classic treatise. From 2004-2007, the searches were most popular in West Virginia; upstate New York; rural Illinois; eastern Ohio; southern Mississippi; western Pennsylvania; and southern Oklahoma.

I find that racially charged search is a large and robust negative predictor of Obama's vote share. A one standard deviation increase in an area's racially charged search is associated with a 1.5 percentage point decrease in Obama's vote share, controlling for John Kerry's vote share. The statistical significance and large magnitude are robust to controls for changes in unemployment rates; home-state candidate preference; Census division fixed effects; prior trends in presidential voting; changes in Democratic House vote shares; swing state status; and demographic controls. The estimated effect is somewhat larger when adding controls for an area's Google search volume for other terms that are moderately correlated with search volume for "nigger" but are not evidence for racial animus. In particular, I control for searches including other terms for African-Americans ("African American" and "nigga," the alternate spelling used in nearly all rap songs that include the word) and profane language.
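To make the specification in that excerpt concrete, here is a minimal sketch of the kind of regression it describes: Obama's 2008 vote share regressed on Kerry's 2004 share plus a standardized racially charged search rate. Everything in it is an assumption for illustration. The six media markets, their numbers, and the helper names `solve_linear_system` and `ols` are invented, and the outcome is generated from known coefficients (including the 1.5-point figure quoted above) so the fit can be checked. It is not the study's data or code, and it omits all of the additional controls the author lists.

```python
# Illustrative sketch only: synthetic numbers, not the study's data or code.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical media markets:
# (Kerry 2004 share in percent, racially charged search rate in standard deviations)
markets = [
    (48.0, -1.2), (55.0, -0.5), (41.0, 0.3),
    (38.0, 1.4), (52.0, 0.9), (45.0, -0.9),
]

# Synthetic outcome built from known coefficients, so the fit is checkable.
TRUE_INTERCEPT, TRUE_KERRY, TRUE_SEARCH = 5.0, 1.0, -1.5
obama_2008 = [TRUE_INTERCEPT + TRUE_KERRY * k + TRUE_SEARCH * z for k, z in markets]

intercept, kerry_coef, search_coef = ols(markets, obama_2008)
print(f"points of Obama vote share per SD of search: {search_coef:.2f}")
```

Because the synthetic outcome contains no noise, the fit simply recovers the coefficients it was built from. With real data, the coefficient on the search rate would carry a standard error, and its size would depend on which controls are included.
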

The results imply that, relative to the most racially tolerant areas in the United States, prejudice cost Obama between 3.1 percentage points and 5.0 percentage points of the national popular vote. This implies racial animus gave Obama's opponent roughly the equivalent of a home-state advantage country-wide. The cost of racial prejudice was not decisive in the 2008 election. But a four percentage point loss by the winning candidate would have changed the popular vote winner in the majority of post-war presidential elections....
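One way to read the phrase "relative to the most racially tolerant areas" is as a voter-weighted comparison: each area's search rate is measured against the lowest-scoring area, multiplied by the per-standard-deviation effect, and averaged by number of voters. The sketch below follows that reading. It is an assumption about the arithmetic, not the author's published procedure, and the three markets, their voter counts, and the function name `national_cost` are hypothetical.

```python
# Illustrative sketch only: hypothetical markets, and one possible reading
# of how a per-area effect aggregates into a national vote-share cost.

# (market name, number of voters, racially charged search rate in SD units)
markets = [
    ("Market A", 2_000_000, -1.0),
    ("Market B", 1_000_000, 0.0),
    ("Market C", 1_000_000, 2.0),
]

# Percentage points of vote share lost per standard deviation of search,
# taken from the figure quoted in the excerpt above.
POINTS_PER_SD = 1.5


def national_cost(markets, points_per_sd):
    """Voter-weighted average loss relative to the lowest-scoring market."""
    baseline = min(z for _, _, z in markets)
    total_voters = sum(voters for _, voters, _ in markets)
    weighted_loss = sum(
        voters * points_per_sd * (z - baseline) for _, voters, z in markets
    )
    return weighted_loss / total_voters


cost = national_cost(markets, POINTS_PER_SD)
print(f"estimated national cost: {cost:.1f} percentage points")
```

The result is driven by the baseline. Comparing against the single most tolerant area yields a larger national cost than comparing against, say, the median area would.
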

A large cost of race in the general election is consistent with some scholars' estimates that, in light of the immensely unpopular incumbent Republican president, Obama substantially underperformed in the 2008 general election (Lewis-Beck et al., 2010; Tesler and Sears, 2010). It also can explain why white male Democratic candidates consistently outperformed Obama in hypothetical general election polls (Jackman and Vavreck, 2011). And it can explain why House Democrats' vote gains from 2004 to 2008 were significantly larger than Obama's gain relative to Kerry.

But for all that, it's not totally clear from Stephens-Davidowitz's findings what the Electoral College impact of racism was or would be. The popular vote is obviously critical in a presidential election, but its effect is mediated by the Electoral College. It cannot come as a shock to anyone that Obama is not seen as the cat's meow in places like West Virginia, southern Mississippi or southern Oklahoma. And even a large racial cost in those states would have had no impact on Obama's general election prospects, because he was always going to lose those states. Meanwhile, racism in places like upstate New York and rural Illinois, as documented in Google searches, may be culturally and politically significant and yet still pretty much irrelevant to Obama's reelection prospects, as any Democrat who's so weak he can't even win New York or Illinois is someone heading into a blow-out loss nationwide.

Where racial animus might intersect with the Electoral College enough to matter on Election Day -- eastern Ohio, western Pennsylvania, parts of Florida -- is something to contemplate. Still, if Obama loses, it will be hard to argue that those well-known swing states and regions don't also have unusually significant economic problems that might turn them away from any incumbent president running on fundamentals as historically weak as Obama's. Some researchers will point to Obama's race as a factor if he loses -- but even more will point to the biggest and best-known electoral predictor of all: the strength of the economy.