On Tuesday, Google showed off Duplex, a new service the company is testing that allows Google Assistant to call establishments on a user’s behalf to make a dinner reservation or schedule a haircut. The voice synthesis in these calls is jaw-dropping.
Apart from a few millisecond-long mistakes, Duplex sounds like a human, complete with mmms and uhhs and cheery colloquialisms. The ability of the AI to respond to real, messy language and unexpected sequences is also incredible. Generating conversational human sentences in real time can probably be considered a “solved problem,” as people around here like to say.
The audio suggests that computer voices just skipped right past 2001: A Space Odyssey, and it turns out robots won’t sound like an overlord, but a, uhh, Millennial.
The way Google presented the technology encouraged people to think about themselves as powerful users, casting magic bots out across the world to do our bidding. But it’s the other side of the interaction that deserves attention.
What new skills will service workers develop to identify and respond to voice bots calling on behalf of a “client” (as Duplex put it on one call)? Will the bots get equal service? How will these systems be used to generate better spam? Will the true privilege of high-wage work be the satisfaction of working with other (real) people, while low-wage workers increasingly interact with bots and screens?
This moment we’re in right now—where humans and bots find themselves in an unprecedented admixture—is one more step in the automation of different kinds of human labor. In the quiet, white-collar automation that swept the world in the last quarter of the 20th century, the messiness of human processes required many intermediate steps in the transition from paper and human to computer and computer. Much of what service work used to be was automated over the last few decades. Now, computers make the decisions, and the main role of the human is to deliver this information after pressing some keys on a computer.
Think of a car-rental counter. You’ve booked the reservation online, outfitting the car and contract exactly as you wanted to. That system has told the various workers what they need to get ready and issued the contract and rung up the total. The human involved has only one real job: to run the upsell script before you get in your car and go.
Of course, humans work around the design of the system. They tell jokes, shade competitors, hand out upgrades to people they like, dispense advice about the city, press their lips tightly when a customer says something stupid. But the system merely wants them to run the upsell script. That’s the job. They do the same thing over and over, under time pressure, acting with the grim knowledge that quantitative metrics will be the primary means by which their performance will be judged.
The playwright and author Barbara Garson captured many of these dynamics in a chapter on airline-reservation clerks in her 1988 anthropological study, The Electronic Sweatshop. Garson, encountering this as a new phenomenon, is aghast:
“In a feat of standardization even more phenomenal than McDonald’s fry-vat computer, the airlines have found ways to break down human conversation into predictable modules that can be handled almost as routinely as a bolt or a burger.”
She finds workers pushing to up their sales bookings and reduce their AHU (After Hang-Up) time stats. They are people who understand that the money they make is a direct result of their ability to hit the quantitative targets by which they are judged, no matter the human experience of the person on the other side of the telephone line. The skilled, long-term reservation agents who knew “all the company’s routes, fares, and policies” and could use that knowledge to help customers by understanding their needs were becoming obsolete. They didn’t know how to optimize themselves for this new, robotic world.
In the end, almost everyone lost their jobs anyway, as most travel booking moved to websites where the work was shifted onto customers who—now that the airline-reservation agents acted like robots—would much rather interact with a web form than have a phone call with a know-nothing trying desperately to sell them something in the shortest possible time.
As with airline reservations, so it goes with many other similar services. Who likes to call a business? Not most young people. We’d much rather hit a button on an app or—soon, I guess—dispatch a bot to talk to the harried, underpaid employee who has been tasked with answering the phone that day.
Automation begets automation. In that sense, Google Duplex feels not like something new and amazing (though it is also that), but something old and stultifying. For decades now, we’ve been forcing human service workers to act like robots. This makes many service interactions unpleasant enough that people want to avoid them, so now, Google will provide everyone with a robot that can act like a human. Finally, technological capitalism has generated the correct match for the robotic service worker: a robot service worker.
It’s almost hilarious, a George Saunders short story. It’s almost tragic, a Samuel Beckett play.
Maybe none of it matters. Who needs small talk? All we’re doing is exchanging information, flipping a bit on a calendar from zero to one. So why not let an AI do that errand? There are bigger thoughts to think, children to spend quality time with, exercise to do, community groups to volunteer for, code to write.
But maybe small talk has a purpose. The urban theorist (and hyper-observer of the city) Jane Jacobs thought that all these dumb interactions were the social fabric. She wrote in The Death and Life of Great American Cities:
“The sum of such casual, public contact at a local level—most of it fortuitous, most of it associated with errands, all of it metered by the person concerned and not thrust upon him by anyone—is a feeling for the public identity of people, a web of public respect and trust, and a resource in time of personal or neighborhood need. The absence of this trust is a disaster to a city street.”
After Jacobs published her book in 1961, most American cities suburbanized, cutting people off from this kind of casual contact with the people around them. But at the same time, they connected through other means, the telephone primary among them.
Already, the push-button service framework, Uber for everything, has eroded this last bastion of local chitchat. Google Duplex will simply extend that trend even to those businesses that have not given in to the computerization of their reservation systems. And I don’t except myself from this trend: I hate making these little phone calls as much as anyone else.
But if Jacobs is right that simply making conversation with our fellow human beings—“most of it associated with errands”—is what generates trust in the world around us, then what happens when no one is ever quite sure if Alexis or Duplexis is calling?