What Happens When We Turn the World's Most Famous Robot Test on Ourselves?

For years the Turing Test has been used to compare humans with computers. Now sociologists are using it to compare humans with each other.


This weekend marks the centenary of Alan Turing's birth. Turing was one of the greatest computer scientists of all time. In a 1950 paper that outlined what has come to be known as the Turing Test, he offered a way out of endless philosophical speculation about whether computers could ever be classed as 'intelligent.' He said that if human judges ask interview questions of a hidden computer and a hidden person and cannot tell the difference after five minutes, the computer should be considered intelligent. Nowadays, programmers compete yearly for the Loebner Prize, which is won by the computer that is most often mistaken for a human.

But the Turing Test's application is no longer limited to questions of artificial intelligence: Social scientists too are getting in on the action and using the test in a completely new way -- to compare different human subjects and their ability to pass as members of groups to which they do not belong, such as religious and ethnic minorities or particular professional classes. With the Turing Test, sociologists can compare the extent to which subjects can understand people who are different from them in some way.

In the words of sociologists, what they're now studying is called "interactional expertise." The easiest way to understand what interactional expertise entails is to contrast it with a more common idea, contributory expertise. Contributory experts are the typical array of professionals (physicists, chemists, lawyers, economists, musicians etc.) who develop specialized knowledge and skill through formal education and long experience.

Interactional experts, by contrast, are not primary practitioners. They learn about a field primarily by talking with the people who have acquired contributory expertise. The new claim is that linguistic socialization enables interactional experts to acquire enough tacit knowledge to see the world from a contributory expert's perspective. Their existence defies the cliché that understanding a person necessitates walking a mile in his or her shoes. Interactional experts can do more than talk the talk -- they can 'walk the talk' or, really, 'talk the walk' by offering authoritative technical judgments, making inside jokes, and raising devil's advocate questions that revolve around ideas normally known only to specialists.

Wherever deep interdisciplinary collaboration takes place across technical fields, interactional expertise is used. Science and technology journalists (if they go very deeply into specialist subjects) may become interactional experts, while others can be found among sociologists and historians of science and technology. Project managers need interactional expertise to excel at their jobs because it puts them in a position to understand and talk with different technical groups in a way that will generate respect. Activists may develop interactional expertise, but they often find that powerful insiders are primed to use their lack of credentials against them.

While interactional expertise is not a new phenomenon, it is a new concept. Now, using a variation of Turing's test, researchers are beginning to show what interactional expertise can do.

* * *

Turing based his test on the Imitation Game, a parlor game in which men pretended to be women (or women pretended to be men) and the judge had to determine who said what. In the mid-1990s, sociologist Harry Collins started to use the idea as a research tool. In the very first experiments, Collins hypothesized that female judges would be better than male judges at spotting men pretending to be women. But Turing's inspiration embodied the outdated, gender-divided society of his time, a point the design had not recognized, and the experiment failed to reveal any meaningful differences.

In the 2000s, however, Collins, together with colleagues at Cardiff University, tried new Imitation Games to study three things: 1) whether color-blind people could pass an imitation game as color-perceivers; 2) whether those without perfect pitch -- the ability to recognize and name a musical note just as most of us can recognize and name a color -- could pretend they had perfect pitch; and 3) whether blind people -- at least those who had lost their sight in early childhood -- could pretend to be sighted. The new subject matter gave rise to interesting results that followed a particular pattern.

Consider the blind. They spend their whole lives immersed in sight-dominated societies that speak sight-dominated languages. Based on interactional-expertise theory, their exposure to this language should enable them to make the same judgments as sighted people, even where they are discussing things they have never seen, such as the bounce of a tennis ball, its relationship to the line and how hard it is to call it 'in' or 'out'. By contrast, because sighted people lack immersion in blind society, their attempts to pass as blind should come across as more caricatured than authentic. Rather than extrapolating from blind people's actual discussions of their experiences, sighted people are inclined to imagine by subtraction, guessing, unconvincingly, what it would be like to go through life without seeing.

The other cases yielded similar results: the color-blind were better able to pass as color-perceivers than vice versa, while those with perfect pitch were better able to pass as lacking it than those without perfect pitch were at pretending to have it. This reversal of polarity between the color-blind and the "pitch-blind" was exactly what would be expected: in each case, the group that passed more convincingly was the one immersed in the other group's language. These experiments were a proof of concept, establishing the Imitation Game as a research tool that can reveal interactional expertise in action.

In the next experiment, Collins revisited his own inspiration for thinking about interactional expertise. For decades Collins has been doing sociological research on an international group of scientists trying to detect gravitational waves. Eventually he was struck by the fact that although he was not a physicist himself, never took part in any experiments, and didn't help with the writing of any physics papers, he was quite good at talking physics with the physicists. So he participated in an Imitation Game, asking a gravitational-wave physicist to pose technical questions while he and another gravitational-wave physicist answered them. After some stylistic editing, the complete set of questions and competing answers was sent out to nine other gravitational-wave physicists, who were asked to identify who was who. Seven said they could not figure it out, while two pinpointed Collins as the physicist. Nature was sufficiently impressed to include a one-page news account, "Sociologist Fools Physics Judges."

Puzzled by the 2-0 result (since a 50/50 split was expected to be the best possible outcome), Collins discussed the situation with the judges. He learned that their minds were made up when the real physicist gave a textbook answer to one of the questions, while Collins worked out a correct but rather different answer. The judges mistakenly assumed that only Collins could have given the textbook response, so the newly worked-out physics answer had to have come from what we can now call the contributory expert rather than the interactional expert. They were wrong; interactional expertise had shown itself to be surprisingly powerful.

Success at passing in the Imitation Game shouldn't be seen as confidence trickery, hoaxing, or any other kind of pulling the wool over the judge's eyes. Confidence tricks work by giving the judge -- or mark -- such a strong reason for wanting to believe the trickster that they subconsciously repair any mistakes in the performance. In such cases, the trickster can do a poor job and get away with it. In the Imitation Game, however, judges know from the outset that one of the players is pretending. They are alert to the slightest mistake and they ask difficult questions that can be answered only if the interactional expert has genuine expertise.

* * *

That the earliest imitation games compared men and women has greater significance than might be first apparent: Andrew Hodges' biography suggests that Turing had gender identity on his mind because he was gay. Back then homosexuality was a crime in Britain and, in 1954, at the age of 41, the brilliant Turing was hounded to suicide, eating a cyanide-laced apple.

In Turing's time, the lack of understanding of homosexuality among the wider population would have enabled few straights to pass as gay in imitation games. Nowadays, however, heterosexual people's increased knowledge and understanding of homosexual cultures would enable at least a few more of them to succeed. If only we could go back to the 1950s, multiple imitation games could be used to test this idea. We obviously can't, but something similar in fact is being tried.

Collins and his collaborators are developing a new and much more complex incarnation of the Imitation Game under sponsorship of the European Research Council. They run games on topics like pretending to be gay, pretending to be a Christian, pretending to be a member of an ethnic minority, and so forth, in different regions of Europe. The idea is to find out if the game can be used as a tool for measuring differences in the extent to which these groups are mutually understood in different societies.

At Cardiff University, students found it easier to pretend to be gay than Christian. A measure called the 'identification ratio', or 'IR', was developed to make numerical comparisons. The IR is the number of right answers minus the number of wrong answers, divided by the total number of trials, so a lower IR means the pretenders were harder for judges to spot. For the blind pretending to be sighted the IR was 0.13; for the sighted pretending to be blind the IR was 0.86. For straight students pretending to be gay, the IR was 0.4; for secular students pretending to be active Christians, the IR was 0.7. This gives some indication of how secular a country Britain has become.
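For readers who want to see the arithmetic, here is a minimal sketch of the identification ratio in Python. The function name and the treatment of "can't tell" verdicts (counted as trials but as neither right nor wrong) are assumptions made for illustration, not details taken from the researchers' own scoring.

```python
def identification_ratio(right, wrong, unsure=0):
    """Return (right - wrong) / total trials.

    Assumption: judges who could not decide still count toward the
    total number of trials, but add nothing to right or wrong.
    """
    total = right + wrong + unsure
    if total == 0:
        raise ValueError("at least one trial is required")
    return (right - wrong) / total


# The gravitational-wave game described earlier: of nine judges,
# none correctly spotted Collins, two wrongly picked him as the
# physicist, and seven could not tell.
print(round(identification_ratio(right=0, wrong=2, unsure=7), 2))  # -0.22
```

On this reading, an IR near 1 means judges almost always spot the pretender, an IR near 0 means they do no better than chance, and a negative IR means the pretender was picked as the real thing more often than not.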

Though this result was striking, with the numbers of students involved, the difference between 0.4 and 0.7 isn't quite statistically significant. The research is now moving on from tests with quasi-controls where big differences could be expected, to cross-national comparisons of a single condition where differences are going to be smaller and far larger samples are required. Since playing large numbers of imitation games is time-consuming and organizationally difficult, the experiments have been taken apart and redesigned with a production-line ethos in mind. Initially, some games are played to produce sets of good judge-questions and good expert-answers. Then, the sets of questions are presented to much larger numbers of pretenders. Finally, the sets of questions and pretender-answers are recombined with original expert-answers, and sets of completed dialogues are sent to fairly large numbers of judges. This is a better way to test for differences between populations because a representative sample of pretenders is key. Initial results suggest that the technique may work, but it is in its early days.
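The recombination step in that production line can be sketched in a few lines of Python. This is only an illustration of the idea: the `Dialogue` class, the `build_dialogues` function, and the coin-flip placement of the expert's answers are all hypothetical, not a description of the software the researchers use.

```python
import random
from dataclasses import dataclass


@dataclass
class Dialogue:
    """One item sent to a judge: questions plus two anonymous answer sets."""
    questions: list
    answers_a: list
    answers_b: list
    expert_is: str  # "a" or "b"; hidden from the judge, kept for scoring


def build_dialogues(questions, expert_answers, pretender_answer_sets, seed=0):
    """Pair each pretender's answers with the original expert answers."""
    rng = random.Random(seed)
    dialogues = []
    for pretender_answers in pretender_answer_sets:
        # Randomize which slot holds the expert so position gives nothing away.
        if rng.random() < 0.5:
            dialogues.append(
                Dialogue(questions, expert_answers, pretender_answers, "a"))
        else:
            dialogues.append(
                Dialogue(questions, pretender_answers, expert_answers, "b"))
    return dialogues
```

Because the questions and expert answers are reused, each additional pretender costs only one new set of answers, which is what makes the larger samples feasible.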

Many consider themselves experts after reading a popular science book, watching a TV program that covers a technical issue, or looking up a complex topic on the Internet. A good question to ask is how the new knowledge compares with interactional expertise. Could you pass an Imitation Game? If not, when a debate arises, are you really qualified to advocate for one technical point of view over another?

The author and editors would like to acknowledge Harry Collins' generous assistance in helping get the article into its final form.

Evan Selinger is supported by The National Science Foundation (Award # 1140190, RCN-SEES: Sustainable Energy Systems). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.