Armed With Facebook 'Likes' Alone, Researchers Can Tell Your Race, Gender, and Sexual Orientation

But the deeper aspects of your personality remain hard to detect.

Students in a health education class at Woodrow Wilson High School in Washington, DC, submit their personality problems to a panel, which leads a class discussion. The girl in the foreground is the subject of discussion. (Library of Congress, 1943)

Have you Facebook-liked The Godfather, The Daily Show, "Morgan Freeman's Voice," To Kill a Mockingbird, and (bizarrely) curly fries? If you said yes to each of those, then you really need to lay off liking things on Facebook. Also, you're probably intelligent, at least as would be established on a basic personality test.

With remarkable accuracy, researchers from the University of Cambridge and Microsoft have been able to discern people's gender, sexuality, age, race, and political affiliation, based solely on their Facebook likes. With significantly less accuracy, they've also tried to predict certain personality traits -- e.g., intelligence, satisfaction with life, emotional stability, conscientiousness -- and though such traits were harder to predict, the researchers were able to come up with lists of "most predictive Likes" for each. It's possible to see how, with a much larger corpus, even certain subtleties of personality could be recognized deep within the idiosyncratic data of Facebook likes.

The authors, Michal Kosinski, David Stillwell, and Thore Graepel, say the results demonstrate "how accurate and potentially intrusive such a predictive analysis can be." By default, Facebook likes are public information, but you can change that with a few moments' effort.

The most easily predictable variables were ethnicity -- underscoring the segregation that runs through society, online and off -- and gender. "African Americans and Caucasian Americans were correctly classified in 95 percent of cases, and males and females were correctly classified in 93 percent of cases," the authors write.

Religion and political leanings were also pretty plain, with predictions accurate in 82 percent and 85 percent of all cases, respectively. Men's sexual orientations were predictable 88 percent of the time; women's less so, at 75 percent, "which," the authors observe, "may suggest a wider behavioral divide (as observed from online behavior) between hetero- and homosexual males."

One fascinating detail emerged from the researchers' efforts to tell whether a person's parents had stayed together until he or she turned 21. Although their calculations had pretty low accuracy for that variable (60 percent), the authors were surprised it was detectable through Facebook likes even a little bit. What makes a kid with divorced parents different from one whose parents are together? In the case of Facebook likes, it's that, as the authors explain, "individuals with parents who separated have a higher probability of liking statements preoccupied with relationships, such as 'If I'm with you then I'm with you I don't want anybody else.' "

This is not the first time research has shown that Facebook behavior accurately represents a person's life. Earlier this year researchers at UC San Diego found that they could predict who a person's closest friends were based on their patterns of Facebook interaction. As social-media theorist Nathan Jurgenson said of that study at the time, "The notion that the Internet is, or ever really was, some other, cyber, space, is wrong headed." That is to say, of course researchers can make predictions about you based on what you do on Facebook; you live online, leave clues about your life there, and researchers can study them, much like they could study your offline actions and draw conclusions about you based on that data.

The case of this new study is quite the same. Is it so surprising that researchers can "predict" that a man who has Facebook-liked the No H8 Campaign, the Human Rights Campaign, Adam Lambert, and Ellen DeGeneres is gay? Or that someone who Facebook-likes The Bible, Jesus Christ, and Christian music is a Christian? Or, say, that those who like Joe Biden and health-care reform are Democrats?

Okay, those are extreme examples, and clearly the algorithm was able to make predictions with information far less obvious than that, but Jurgenson's point still holds: Just as we humans can make judgments about another person's intelligence, sexuality, or political leanings based on a scattershot set of clues -- entertainment preferences we know them to have or opinions we know them to hold -- a computer can do much the same. Its inputs may come by way of Facebook likes, but its process is familiar.