This is a story about how the future gets weird.
It's about how humans interact with each other, and machines, and systems that can only properly be called cyborg.
Let's start, though, with a man sitting on a couch. His phone rings. It's a telemarketer for a home security service.
"This is Richard, how are you today?" asks the telemarketer. His voice is confident and happy. His accent is classic American. Perhaps he grew up in Nebraska.
Richard continues, "I'm just calling you with a very special offer. My company, the Home Security Company, is giving away a free wireless home security system and in-home installation."
The man on the couch tries to claim he's busy, but the telemarketer parries, "I know you're busy, but this'll just take a few minutes," then soldiers on.
They go back and forth for several minutes before the telemarketer successfully pushes him down the sales funnel to a specialist who will set up an in-home visit.
Such conversations happen millions of times a year, but they are not what they appear. Because while a human is picking up the phone, and a human is dialing the phone, this is not, strictly speaking, a conversation between two humans.
Instead, a call-center worker in Utah or the Philippines is pressing buttons on a computer, playing through a marketing pitch without actually speaking. Some people who market these services call this "voice conversion" technology. Another company says it's "agent-assisted automation technology."
My own wordplay for it would be automatonation, after the great mechanical inventions of the 18th century that simulated complex processes through the artful combination of men and machines. Semi-autonomous telemarketing connects nicely with the developments it parallels in the drone world. "Ventriloquistic telemarketing" has a nice, multisyllabic ring, too.
But perhaps the best term is "cyborg telemarketing." As one experienced manager in the Philippines told me, "Basically, the agent is just the driver but the system has its own life. The agents work as ears and hands of the system." At its best, computer system and operator merge like a character from the movie Avatar and his or her steed.
How does this work, in practice?
Let's look at one company that provides this product, Avatar Technologies, which advertises itself as "Outsourcing Without the Accent." They created the video above.
The Avatar interface looks like this:
While the man on the couch was just sitting there talking, the Avatar agent would have been sitting in the Filipino city of Jaro Iloilo, staring at an interface. The keyboard has hotkeys that can play different sound clips that were recorded by a perfect English speaker. Here are two samples Avatar provides on its website, "Dale Harris" from the US and "Samantha" from Australia:
Avatar is not alone in selling these services, though they are the newest company in the field, having been founded in the summer of 2012.
I found three other companies that sell cyborg call-center software or services like this, all of which are based in Utah. CallAssistant is headquartered in Logan, Utah, but has several call centers including a small one in the Philippines; Perfect Pitch Technologies has offices in Lehi, Utah and Albay in the Philippines; KomBea is also based in Lehi.
While each place has obviously engineered its own solution, the basics of the companies' systems are similar. All the audio is pre-recorded and it's triggered live in response to what a call receiver says. In some cases, a single call-center worker will run two or even three calls at the same time.
The audio breaks down into two categories. The first contains the more scripted bits of salesmanship, which, in CallAssistant technology, are mapped to the number keys. The second consists of the little conversational asides that help make the conversation feel more natural. Hit "L," for example, and the voice laughs. Hit the equals sign, and the voice says, "Exactly."
More sophisticated systems have multiple responses for each type of question that might be asked by a call receiver. So, for example, if a customer asks about the company of the telemarketer, one might press Ctrl+C. On the first press, the voice would say, "The name of my company is XYZ." On the second press, the voice would change tone and wording: "Um yeah I'm from XYZ company?" On the third press, you might get, "It's XYZ company and we offer ABC."
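To make the mechanics concrete, here is a minimal sketch in Python of how such a soundboard might map keys to clips and rotate through variants on repeated presses. It is an illustration, not any vendor's actual software: the class name, the key assignments for the scripted lines, the use of text strings in place of audio files, and the choice to wrap back to the first variant after the last one are all assumptions.

```python
from collections import defaultdict

# Hypothetical key maps. The strings stand in for pre-recorded audio files.
SCRIPT_KEYS = {  # scripted sales lines on the number keys (assignments assumed)
    "1": ["This is Richard, how are you today?"],
    "2": ["I'm just calling you with a very special offer."],
}
ASIDE_KEYS = {  # conversational asides
    "L": ["(laughs)"],
    "=": ["Exactly."],
}
VARIANT_KEYS = {  # one question type, several recorded answers
    "Ctrl+C": [
        "The name of my company is XYZ.",
        "Um yeah I'm from XYZ company?",
        "It's XYZ company and we offer ABC.",
    ],
}


class Soundboard:
    """Maps each key to a list of clips and cycles through them per press."""

    def __init__(self, *key_maps):
        self.clips = {}
        for key_map in key_maps:
            self.clips.update(key_map)
        self.presses = defaultdict(int)  # how often each key has been hit

    def press(self, key):
        variants = self.clips[key]
        # Assumption: after the last variant, start again from the first.
        clip = variants[self.presses[key] % len(variants)]
        self.presses[key] += 1
        return clip
```

An agent session would then amount to a sequence of calls such as board.press("1") or board.press("Ctrl+C"), with the returned clip played down the line.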
In other implementations, the telemarketing humans can step in with their own voices to answer queries. KomBea has gone the farthest with this idea. They practice what they call "accent neutralization."
"I can promise you that 99 percent of the people do not know that the agent just shifted from pre-recorded to a live voice and back to pre-recorded audio," KomBea's CEO, Art Coombs, told me.
All these different measures are part of making the human-cyborg conversation feel "natural," even though it is anything but. A company that helps firms stay on the right side of American telemarketing regulations looked at CallAssistant's "Echo" technology, and made this point in a glowing review.
"The Echo technology merely substitutes sound files for the agent’s voice (although the agent can also interject with his or her own voice at any time) and assures positive interactive experiences for the consumer. CallAssistant’s agents interact with callers by selecting the appropriate audio file responses..." the company wrote. "The customer experiences a completely natural conversation complete with laughing, positive affirmation, and most importantly, natural interaction."
Let's just put it out there: This is a creepy system. On the Internet, no one may know you're a dog, but on the phone? It just seems wrong. What good could spring from a bunch of conversations in which one member is ventriloquizing through a machine?
And yet, in the course of reporting this story, in talking with a half dozen people here and in the Philippines, I've come to reconsider that initial reaction.
Let's imagine that cyborg telemarketing makes call-center workers happier, call receivers more satisfied, and the sales companies more successful. Everyone wins.
If that were the case, wouldn't it make sense to support soundboard-assisted calls, even if they offend our sensibilities about what human-to-human contact should be like?
* * *
For a moment, I want to back up to talk about how I stumbled into this world. A couple weeks ago, Time's Washington bureau chief got a call from a telemarketer named "Samantha West" that he suspected was not human.
After some funny tests, like asking the suspected bot what the main ingredient in tomato soup was, Time concluded that the telemarketer, who was working on behalf of an insurance company, was not human. What really got to people was that this "robot" kept denying that it was a robot, which you can hear in this sound clip:
When I reached out to people in the telemarketing industry, they expressed skepticism that the system found in the Time recordings could be a computer. Instead, I was told it was likely someone working with a soundboard. I wrote up my story as, "The Only Thing Weirder Than a Telemarketing Robot." This week, Time confirmed with the insurance company that they were using a soundboard-assisted telemarketing firm.
Ever since I started this project, I've been looking for the company that created Samantha West. I strongly suspect that Avatar created West, but I can't prove it. Multiple attempts to contact people at different levels within the company have gone unheeded.
For what it's worth, the three Utah-based companies denied that West was their creation. All three said that their systems would not work in the way recorded by Time. Their agents would tell the people being called exactly what was going on. Perfect Pitch even drops a disclaimer on people at the beginning of the call.
"Every phone call that is made with our technology, we are open about this," Perfect Pitch's Munns told me. "We actually proactively tell them that we are using pre-recorded audio."
"In the KomBea world," Coombs told me, "if someone were to say, 'Am I talking to a robot?' the agent has the ability, either with a pre-recorded message or their live voice, to say, 'You are talking to a live person, but to ensure the information is accurate, I'm using pre-recorded audio messages.'"
"I don't know if you know anything about Six Sigma," Coombs asked rhetorically. "But a human being is at best a 2-sigma machine. Which means that humans get things right 92 to 93 percent of the time. If you think about that, if I take a 100 calls, that means that 7 to 8 of those callers don't get the right information, not because I'm trying to mislead but because I got in a fight with my wife or I hate this call center job or I'm tired and I made a mistake."
All three CEOs expressed frustration that the first time their fledgling industry has seen the light of day, it is in the context of a shady operation that wasn't even executed well. "Whoever created Samantha West is not good," Coombs said.
"We're not trying to be deceptive," Coombs concluded. "What we're trying to do in the industry is combine the human intelligence of a human being with the accuracy and consistency of technology."
"We think it's wrong to play that game that was played in that recording," Munns said. And he wanted it noted, it's very rare that their agents are asked if they are recordings or robots.
I asked if he had a name for when that happens. "We don't because it does not happen," he responded. "We get it maybe once every three or four months. It's built to work very well."
These guys want standards put in place that require the disclosure of pre-recorded audio, in one form or another. Being businessmen, they would prefer self-regulation, but one can easily see the FCC, perhaps, stepping in to make sure that happens.
I've contacted the FCC for their position on soundboard-assisted calls, but have not heard back.
Perhaps this is ironic, or merely interesting, but if you look at the marketing that all of the companies do, they push the idea that their systems can help people who want to do telemarketing stay within the bounds of the existing American regulations. CallAssistant's Bills pointed out that a nationwide debt-settlement company is being sued by the Consumer Financial Protection Bureau for deceptive telemarketing calls made by an old-school telemarketing company.
"Consumers thought they were getting something for free, a trial, and they kept getting billed," he said. "One of my agents can't do that. The offer is going to be made exactly as it is intended to be. You take away the ability of someone to misrepresent something. You take away the ability to omit required disclosures."
To hear these companies tell it, if you take away a lot of abilities, perhaps what's left is a more ideal telemarketing call experience.
For example, the data that they can capture with these systems is far less noisy than what anyone could collect from a traditional call center. That has allowed CallAssistant, for example, to reduce the number of times that an agent tries to keep you on the line after you've already said no, which is by far the most annoying thing telemarketers do.
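One way to picture how clean per-call data could turn into a hard limit on rebuttals is the sketch below. It is a guess at the general idea, not CallAssistant's implementation: the function and class names, the shape of the statistics, and the minimum conversion rate of one sale per thousand attempts are all assumptions.

```python
def cap_from_conversions(stats, min_rate=0.001):
    """Return how many rebuttals per call are worth allowing.

    stats[i] is an (attempts, sales) pair for the (i+1)-th rebuttal in a call.
    Counting stops at the first rebuttal whose conversion rate drops below
    min_rate (an assumed threshold).
    """
    cap = 0
    for attempts, sales in stats:
        if attempts == 0 or sales / attempts < min_rate:
            break
        cap += 1
    return cap


class RebuttalLimiter:
    """Refuses to play another rebuttal once the per-call cap is reached."""

    def __init__(self, max_rebuttals):
        self.max_rebuttals = max_rebuttals
        self.used = 0

    def try_rebuttal(self):
        # True means the rebuttal clip may be played; False means the
        # system blocks it, whatever the agent presses.
        if self.used >= self.max_rebuttals:
            return False
        self.used += 1
        return True
```

The point of the design is that the limit lives in the software rather than in the agent's judgment, so no amount of button pressing can push past it.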
"It became obvious to us we were wasting time with rebuttals that were never going to go anywhere. We were doing too many rebuttals. Not only are you wasting time talking to people if it is only going to convert 1 out of 1000 times. It's also irritating consumers," Bills said. "An agent could just sit there and pound someone with rebuttals, but our system won't allow them to."
Over and over in these conversations, I wondered how workers responded to the system. Did they feel that they were actually connecting with people, as they sat there pushing the equal sign to say, "Exactly"? Wasn't there something fundamentally wrong about that?
Then I checked my Twitter account and favorited several tweets. I retweeted a couple, too.
* * *
Bills was his system's original client. He'd been in the movie distribution business, running a call center to do marketing. "Call centers are like water," Bills said. "They flow to the place of the lowest labor cost. Hence the Philippines and even domestically, call centers are clustered around as low a labor cost as you can get."
"We had the same problems every other call center has," he said. "It's low-paid, low-skilled, entry-level people that frankly aren't going to be there a lot."
The company got to thinking about building their own system, which eventually became Echo, the product CallAssistant sells. "We gotta come up with a way that every agent can be as good as our best ones. We didn't have any biologists on staff, so we couldn't clone anybody," he said. "We did have technologists, though."
The soundboard technology they came up with "really transformed our business. Our conversions improved. Our average order improved. Our complaints dropped off dramatically. Our return-to orders dropped off dramatically."
As important, it changed the feeling of the call center, too. They no longer had to go around banging a gong for every sale, or offering gift certificates for the day's high earner.
"The impact on the people was really dramatic. It was one of the things we didn't expect," Bills told me. "In outbound sales, it knocked our turnover from 400 percent a year to 135 to 140 percent. And it dramatically changed the characteristics of employing people."
To be clear, 140 percent turnover is about on par with the fast-food industry. The paragons of employee retention keep their numbers in the single digits. These are still hard jobs.
But maybe this technology makes it a little bit easier.
"It creates detachment," Bills said. "What we see is that our employees, when they have a successful outcome of the call, they take pride in operating the system effectively. When it doesn't work, they say, 'Ahhh that wasn't me.' It doesn't beat people up in the same way."
The machine absorbs some of the "emotional wear and tear" that comes with the job. CallAssistant can even employ people full time because the "shift fatigue" that hits other outbound telemarketing firms doesn't set in in quite the same way.
* * *
Though no one quite puts it this way, the number-one selling point for the soundboard technology is obvious to Filipino telemarketers: Americans' xenophobia. We want to hear from people who sound just like us.
In the course of reporting this story, I was contacted by an experienced Filipino call-center employee and manager who has worked with one of the companies mentioned in this story. Because talking about the industry could negatively impact her future employability, I agreed to her request for pseudonymity. She chose the name Andrea Marie Ugarte.
In an email, Ugarte ticked off the reasons for using voice technology.
1. Accent - some of our agents though can speak English really well have problems with their accents and it's unfortunate fact that when Americans hear foreign accented agents, they just hang up on us.
2. Quality - the technology improves our quality by a great margin as they deliver 100 percent correct message - no deviation or misleading statements.
3. Productivity - it improves our productivity by more than 100 percent. Since reps do not need to speak and just press buttons, they can handle 2-3 PCs at the same time.
What's not to love?
Marco Edward Calilao is a Perfect Pitch account manager. He's the Foursquare mayor of the company's Philippines office, and maintains several other prominent social media presences. He likes to post photos from his childhood on Facebook, along with portraits of himself in drag.
I asked him, over email, to tell me what the experience of working at a soundboard call center is like. How do people feel about it?
"Based on feedbacks and observations, working on a Non-Voice company such as Perfect Pitch is fun not to mention that there is less stress on the part of the reps, all they just need to do is familiarize themselves on the options that they need to press to have a natural conversation and make sure they have excellent comprehension to fully understand the prospect," Calilao wrote. "They do not need to have an 'American' accent unlike with the usual Inbound or Outbound call center where reps are using their own voice or do their own talking."
Think about how rough it would be to be told by some single-language-speaking, first-world jerk that you, a college-degreed, up-and-coming Filipino youth, were annoying because of your accent. Now imagine being told that hundreds of times a day. What kind of anxiety might you start to feel each time you opened your mouth?
No wonder people like hitting the button that says, "Hello, I'm Richard!" in perfect Nebraskan English.
"As a manager, I really love this tech as it really helped our company a lot in doubling the production and I didn't have to worry a lot if my reps are saying the right things... The onboarding process is also faster by 100 percent (works well if you want to start making money)," Ugarte wrote to me. "In a 'normal' voice campaign, it takes two weeks up to four weeks to train reps. With this tech, if the reps do not have any experience yet with the system, the training takes up to one to two weeks."
Working at a place like this might feel like playing a very strange massively multiplayer online game.
Especially when your company sells the fun aspect so hard. Avatar recruits people by having resume parties at clubs ("Applicants get to party at Flow Super club after submitting their resume onsite with our Human Resource department.") and supporting a local beauty queen contest linked to an indigenous ritual ("Avatar Technologies, a proud sponsor of Miss Iloilo Dinagyang 2014").
Their YouTube channel shows them holding an American Idol-style competition to celebrate their one-year anniversary. Watch that video. It opens with these words flashing on the screen: "Level Up. Step Up. Stand Out. Work with the pros and be a pro. Come Join the fastest growing, highest paying, most exciting call center in Iloilo." Then it cuts straight to the Idol-imitation tryouts. About 15 seconds in, behind singing employees, a sign is taped on the wall. It reads, "English Only Policy!"
* * *
My first job was as a telemarketer. I wanted to buy a car, so when I was 15, I worked for a summer doing business-to-business outbound sales.
We sold software to manage the paperwork about chemicals used at factories, which are known as material-safety data sheets. Our script ran on a series of linked Word documents, and we were told to stick to reading what was on the screen. My manager, whose name was Jim, asked that we call him Jimbo, and required that we all attend "Wiener Wagon Fridays" at a local hot dog stand. (The Wiener Wagon special was a chili dog piled with Fritos.)
Despite these perquisites, it was a brutal job. Call after call. Hour after hour. Sales did not go well, and the southern paper-mill plant managers that I managed to get on the phone were not swayed by my lispy, northwestern English. At lunch, eating teriyaki chicken out of a styrofoam container from the deli by the bus station felt like heaven compared with the slog of the day.
No one lasted very long; some didn't even make it to the end of summer. Only a middle-aged woman named Kimberly, who always wore long-sleeved, cream-colored blouses, managed to sell anything. If only we could have cloned her!
But we could never have imagined that we wouldn't need to. A future in which telemarketers didn't use their own human voices, but rather computerized systems of pre-recorded audio that let them fake the most-effective accent, was unthinkably weird.
But the present order is equally strange. A world in which mass telemarketing exists, driven by the dictates of the global economic system, means that something like 1.5 million people have to sit in call centers doing telemarketing, and more than 10 million others have to take inbound calls.
Right now, the cyborg systems are making millions of calls, but Coombs estimates that there are "less than 10,000 desks" occupied by agents using these technologies.
Munns says he's heard that dozens of companies are trying to pull off something like what these cyborg market leaders have.
And if what these men have told me about their call centers is true, we should want this kind of technology to roll out to all the call centers from Vancouver, Washington (where I worked) to Manila.
And yet, I said to Bills, for that to happen, people would have to ignore how their telemarketing sausage was made.
"I think this is the way it all should be done," he responded. "We need to get the entire universe here to accept how the sausage is made. But once you do that, the entire experience for inbound or outbound, it's better."
Could Americans abandon the idea that when they hear a voice on the phone, it matches up exactly with a person in the world, if it meant a better experience for them and a better work life for the people making phone calls?
"They are like robots already," Munns said, referring to call agents who have to read scripts all day, following a protocol set out by their managers. "The software can improve the experience people are already having."
Often, when we look around our world at the technologies we have, it's hard to imagine the series of steps that got us to where we are. If we end up with this weird future of semi-autonomous telemarketing, let this story show why it made sense at the time.