Since 1991, the Turing Test has been administered at the so-called Loebner Prize competition, an event sponsored by a colorful figure: the former baron of plastic roll-up portable disco dance floors, Hugh Loebner. When asked his motives for orchestrating this annual Turing Test, Loebner cites laziness, of all things: his utopian future, apparently, is one in which unemployment rates are nearly 100 percent and virtually all of human endeavor and industry is outsourced to intelligent machines.
To learn how to become a confederate, I sought out Loebner himself, who put me in touch with contest organizers, to whom I explained that I’m a nonfiction writer of science and philosophy, fascinated by the Most Human Human award. Soon I was on the confederate roster. I was briefed on the logistics of the competition, but not much else. “There’s not much more you need to know, really,” I was told. “You are human, so just be yourself.”
Just be yourself has become, in effect, the confederate motto, but it seems to me like a somewhat naive overconfidence in human instincts—or at worst, like fixing the fight. Many of the AI programs we confederates go up against are the result of decades of work. Then again, so are we. But the AI research teams have huge databases of test runs for their programs, and they’ve done statistical analysis on these archives: the programs know how to deftly guide the conversation away from their shortcomings and toward their strengths, know which conversational routes lead to deep exchange and which ones fizzle. The average off-the-street confederate’s instincts—or judge’s, for that matter—aren’t likely to be so good. This is a strange and deeply interesting point, amply proved by the perennial demand in our society for dating coaches and public-speaking classes. The transcripts from the 2008 contest show the humans to be such wet blankets that the judges become downright apologetic for failing to provoke better conversation: “I feel sorry for the humans behind the screen, I reckon they must be getting a bit bored talking about the weather,” one writes; another offers, meekly, “Sorry for being so banal.” Meanwhile a computer appears to be charming the pants off one judge, who in no time at all is gushing LOLs and smiley-face emoticons. We can do better.
Thus, my intention from the start was to thoroughly disobey the advice to just show up and be myself—I would spend months preparing to give it everything I had.
Ordinarily this notion wouldn’t be odd at all, of course—we train and prepare for tennis competitions, spelling bees, standardized tests, and the like. But given that the Turing Test is meant to evaluate how human I am, the implication seems to be that being human (and being oneself) is about more than simply showing up.
To understand why our human sense of self is so bound up with the history of computers, it’s important to realize that computers used to be human. In the early 20th century, before a “computer” was one of the digital processing devices that permeate our 21st-century lives, it was something else: a job description.
From the mid-18th century onward, computers, many of them women, were on the payrolls of corporations, engineering firms, and universities, performing calculations and numerical analysis, sometimes with the use of a rudimentary calculator. These original, human computers were behind the calculations for everything from the first accurate prediction, in 1757, for the return of Halley’s Comet—early proof of Newton’s theory of gravity—to the Manhattan Project at Los Alamos, where the physicist Richard Feynman oversaw a group of human computers.
It’s amazing to look back at some of the earliest papers on computer science and see the authors attempting to explain what exactly these new contraptions were. Turing’s paper, for instance, describes the unheard-of “digital computer” by making analogies to a human computer:
The idea behind digital computers may be explained by saying that these machines are intended to carry out any operations which could be done by a human computer.
Of course, in the decades that followed, we know that the quotation marks migrated, and now it is “digital computer” that is not only the default term, but the literal one. In the mid-20th century, a piece of cutting-edge mathematical gadgetry was said to be “like a computer.” In the 21st century, it is the human math whiz who is “like a computer.” It’s an odd twist: we’re like the thing that used to be like us. We imitate our old imitators, in one of the strange reversals in the long saga of human uniqueness.
Philosophers, psychologists, and scientists have been puzzling over the essential definition of human uniqueness since the beginning of recorded history. The Harvard psychologist Daniel Gilbert says that every psychologist must, at some point in his or her career, write a version of what he calls “The Sentence.” Specifically, The Sentence reads like this:
The human being is the only animal that ______.
The story of humans’ sense of self is, you might say, the story of failed, debunked versions of The Sentence. Except now it’s not just the animals that we’re worried about.
We once thought humans were unique for using language, but this seems less certain each year; we once thought humans were unique for using tools, but this claim also erodes with ongoing animal-behavior research; we once thought humans were unique for being able to do mathematics, and now we can barely imagine being able to do what our calculators can.
We might ask ourselves: Is it appropriate to allow our definition of our own uniqueness to be, in some sense, reactive to the advancing front of technology? And why is it that we are so compelled to feel unique in the first place?
“Sometimes it seems,” says Douglas Hofstadter, a Pulitzer Prize–winning cognitive scientist, “as though each new step towards AI, rather than producing something which everyone agrees is real intelligence, merely reveals what real intelligence is not.” While at first this seems a consoling position—one that keeps our unique claim to thought intact—it does bear the uncomfortable appearance of a gradual retreat, like a medieval army withdrawing from the castle to the keep. But the retreat can’t continue indefinitely. Consider: if everything that we thought hinged on thinking turns out to not involve it, then … what is thinking? It would seem to reduce to either an epiphenomenon—a kind of “exhaust” thrown off by the brain—or, worse, an illusion.
Where is the keep of our selfhood?
The story of the 21st century will be, in part, the story of the drawing and redrawing of these battle lines, the story of Homo sapiens trying to stake a claim on shifting ground, flanked by beast and machine, pinned between meat and math.
Is this retreat a good thing or a bad thing? For instance, does the fact that computers are so good at mathematics in some sense take away an arena of human activity, or does it free us from having to do a nonhuman activity, liberating us into a more human life? The latter view seems to be more appealing, but less so when we begin to imagine a point in the future when the number of “human activities” left for us to be “liberated” into has grown uncomfortably small. What then?
Alan Turing proposed his test as a way to measure technology’s progress, but it just as easily lets us measure our own. The Oxford philosopher John Lucas says, for instance, that if we fail to prevent the machines from passing the Turing Test, it will be “not because machines are so intelligent, but because humans, many of them at least, are so wooden.”
Beyond its use as a technological benchmark, the Turing Test is, at bottom, about the act of communication. I see its deepest questions as practical ones: How do we connect meaningfully with each other, as meaningfully as possible, within the limits of language and time? How does empathy work? What is the process by which someone enters into our life and comes to mean something to us? These, to me, are the test’s most central questions—the most central questions of being human.
Part of what’s fascinating about studying the programs that have done well at the Turing Test is seeing how conversation can work in the total absence of emotional intimacy. A look at the transcripts of Turing Tests past is, frankly, a sobering tour of the various ways in which we demur, dodge the question, lighten the mood, change the subject, distract, burn time: what shouldn’t pass for real conversation at the Turing Test probably shouldn’t be allowed to pass for real conversation in everyday life either.
When the Turing Test was first proposed in 1950, it was a hypothetical: technology was nowhere near the point at which a practical test was possible. But, as we know, it got there; the first conversational computer program to attract significant notice and attention was Eliza, written in 1964 and 1965 by Joseph Weizenbaum at MIT. Modeled after a Rogerian therapist, Eliza worked on a very simple principle: extract key words from the users’ own language, and pose their statements back to them. (“I am unhappy.” “Do you think coming here will help you not to be unhappy?”) If in doubt, it might fall back on some completely generic phrases, like “Please go on.” This technique of fitting the users’ statements into predefined patterns and responding with a prescribed phrasing of its own—called “template matching”—was Eliza’s only capacity.
A look at an Eliza transcript reveals how adeptly such an impoverished set of rules can, in the right context, pass at a glance for understanding:
User: Men are all alike.
Eliza: In what way?
User: They’re always bugging us about something or other.
Eliza: Can you think of a specific example?
User: Well, my boyfriend made me come here.
Eliza: Your boyfriend made you come here.
User: He says I’m depressed much of the time.
Eliza: I am sorry to hear you are depressed.
User: It’s true. I am unhappy.
Eliza: Do you think coming here will help you not to be unhappy?
User: I need some help, that much seems certain.
Eliza: What would it mean to you if you got some help?
User: Perhaps I could learn to get along with my mother.
Eliza: Tell me more about your family.
The results were stunning: many of the people who first talked with Eliza were convinced that they were having a genuine human exchange. In some cases, even Weizenbaum’s own insistence to the contrary was of no use. People asked to be left alone to talk “in private,” sometimes for hours, and returned with reports of having had a meaningful therapeutic experience. Meanwhile, academics leapt to conclude that Eliza represented “a general solution to the problem of computer understanding of natural language.”
One of the strangest twists to the Eliza story, however, was the reaction of the medical community, which decided Weizenbaum had hit upon something both brilliant and useful. The Journal of Nervous and Mental Disease, for example, said of Eliza in 1966:
Several hundred patients an hour could be handled by a computer system designed for this purpose. The human therapist, involved in the design and operation of this system, would not be replaced, but would become a much more efficient man.
The famed scientist Carl Sagan, in 1975, concurred:
I can imagine the development of a network of computer psychotherapeutic terminals, something like arrays of large telephone booths, in which, for a few dollars a session, we would be able to talk with an attentive, tested, and largely non-directive psychotherapist.
As for Weizenbaum, appalled and horrified, he did something almost unheard-of: an about-face on his entire career. He pulled the plug on the Eliza project, encouraged his own critics, and became one of science’s most outspoken opponents of AI research. But the genie was out of the bottle, and there was no going back. The basic “template matching” skeleton and approach of Eliza has been reworked and implemented in some form or another in almost every chat program since, including the contenders at the 2009 Loebner Prize competition. The enthusiasm—as well as the unease—about these programs has only grown.