What It Means That Computers Can Tell These Smiles Apart, But You Can't

Computer algorithms can now discern the meaning behind humans' facial expressions. 

Image via Mohammed Hoque, Daniel McDuff, and Rosalind Picard.

The four people above are taking part in an experiment. In the screenshots shown here, each person is smiling once out of delight (reacting to a picture of an adorable baby) and once out of frustration (after being made to complete an online form that keeps malfunctioning). Awww vs. argh: same general expression, totally different emotion.

So which is which? Who's smiling out of joy, who out of annoyance?

If you're not totally sure, you're not alone. We humans need context and narrative to be able to discern the meanings of our fellow humans' facial expressions. We're sensitive to subtleties. That's one thing that makes us different from machines.

Except ... when it's not. In a paper just published in IEEE Transactions on Affective Computing, Mohammed Hoque, Daniel McDuff, and Rosalind Picard share a system that allows computers to become as sensitive as -- and, in fact, even more sensitive than -- humans.

The team, members of MIT's Affective Computing Group, combined two insights to arrive at their algorithm. First, genuine smiles tend to build slowly and linger, while frustrated smiles tend to appear and disappear quickly. Second, the musculature of fake smiles tends to differ from that of genuine ones: hence "thin" smiles, "stiff" smiles, etc.
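To make the first insight concrete, here is a minimal sketch, in Python, of how a classifier might use smile dynamics alone. This is not the MIT team's actual system (which learned its features from video data); the smile-intensity input, frame rate, and every threshold below are invented for illustration.

```python
def classify_smile(intensity, fps=30):
    """Guess 'delight' or 'frustration' from a per-frame smile-intensity
    signal in [0, 1] (e.g. from a hypothetical lip-corner tracker)."""
    peak = max(intensity)
    if peak == 0:
        return "no smile"
    # Onset speed: seconds from first visible movement to near-peak.
    start = next(i for i, v in enumerate(intensity) if v > 0.1 * peak)
    apex = next(i for i, v in enumerate(intensity) if v >= 0.9 * peak)
    onset_seconds = (apex - start) / fps
    # Duration: how long the smile stays above half its peak intensity.
    duration_seconds = sum(1 for v in intensity if v >= 0.5 * peak) / fps
    # Genuine smiles tend to build slowly and linger;
    # frustrated smiles tend to appear and vanish quickly.
    if onset_seconds > 0.5 and duration_seconds > 1.0:
        return "delight"
    return "frustration"
```

A slow ramp-up followed by a long plateau would be labeled delight; a quick flash of a smile, frustration. The real system would need many more features than these two, which is exactly the paper's point about machines matching human sensitivity on a narrow task.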

Those types of smiles are often involuntary. When Hoque and his colleagues asked study participants to feign frustration, 90 percent of them did so without smiling. But when the researchers presented their subjects with a task that caused genuine frustration -- filling out an online form, only to find their information deleted after they pressed the "submit" button -- 90 percent of them ended up smiling. Frustratedly.

The algorithm Hoque and his colleagues developed accounts for that expressive difference. And it does so quite effectively. The team's computer-based system was able to distinguish the frustrated smiles from the delighted ones 92 percent of the time. The success rate for humans asked to do the same: 50 percent -- no better than random guessing.

The most immediate and obvious application of the team's findings would be to aid people diagnosed with Autism Spectrum Disorder. Emotion-reading computer programs could help people with autism assess and interpret other people's facial expressions -- difficulty reading faces being one of the biggest impediments to social interaction.

But what about the broader implications? First, a hope ... then, a caveat.

On the one hand, as the paper puts it, the team's findings could be used "to develop automated systems that recognize spontaneous expressions with accuracy higher than the human counterpart." Facial recognition is now a fairly common technology, used in everything from Facebook to city streets. Emotion recognition is the next logical step in that progression -- a field that could bring a whole new meaning to "sentiment analysis."

Emotion-analyzing computers may mean that, soon, the line that divides "human" from "machine" could become just a little bit thinner. When machines can understand people's weird, expressive subtleties -- the little tics and tricks that give us so much of our expressive uniqueness -- the "IRL interaction > digital interaction" argument loses just a bit of its traction. At the moment, services like Skype and FaceTime and Google+ Hangouts are valuable not just because they help us to communicate across geographic divides, but also because they help us to communicate across semantic divides. They replace LOLs and emoji with laughter and faces. Compared to their text-based alternatives, they allow for communication that is, in every sense, more meaningful.

So computers empowered with emotion-reading abilities could -- could -- have implications for communicating, for marketing, for the way we think about machines in the first place. That's the hope.

And here's the caveat: Consider how narrow the MIT study actually is in the task it's asking of machines. Computers outperformed humans in this one specific task of emotional identification. What happens, though, when more layers of complexity -- more faces, more types of smiles, more situations -- are added to the mix? Humans would likely once again outperform the computers. The beauty and the downfall of algorithms is their narrowness: they're wonderfully systematic and terribly inflexible. Deep Blue may beat you at chess; challenge it to Candy Land, though, and victory will be yours.

One goal of the Affective Computing Group's general research, Hoque points out, is to "make a computer that's more intelligent and respectful." And while today's paper points out how achieving that goal may be possible, it also highlights how crazily incremental the progress toward it will have to be. The MIT team has developed a system that can tell the difference between frustrated smiles and joyful ones in a given set of circumstances. That's remarkable. But to create computers that can just read emotions, as a general, human-like thing, they'll have to delineate a huge array of mental states as expressed through a huge array of human faces. They'll have to parse the connections among those emotions.

And that's no small task. Creating "automated systems that recognize spontaneous expressions with accuracy higher than the human counterpart" will be incredibly hard. We humans, after all, are not known for our lack of complexity. So while emo-computers may be possible, they are also many painstaking steps away. For now, for better or for worse, humans are still the best judges of humankind.

For the image at top, (a), (d), (f), and (h) depict instances of frustration; (b), (c), (e), and (g) depict instances of delight. For what it's worth, your correspondent got three of the four wrong.