After all, one of the most crucial skills of a good assistant is communication: he or she has to be able to listen and reply to you effectively for anything else to make much difference. Haptic commands, fingers on a keyboard or screen, can be a clunky way to have a conversation, Huffman points out, "whereas voice is much more natural." That's not merely a matter of convenience. Voice lends itself to a kind of dialogue -- to an interaction with a device that seems to take place on relatively human terms -- much more readily than fingers do. Siri may be far from perfect, but it (she?) is onto something big in that respect. Voice, Huffman says, is "just a much more powerful way to interact with a mobile device."
That kind of interaction, for the most part, has only recently been a possibility. While, sure, we've had incremental steps along the way -- Dragon dictation software, Siri and her predecessors -- engineers have only recently had the tools to convert "voice" into "interface." "It's really the first time in history," Huffman says, that the necessary technological elements have come together to allow a person to talk to a computer -- and, by extension, to talk to the stuff that the computer implicitly contains. First, you needed voice recognition capability that could keep up with the idiosyncrasies and the speed of typical speech. Then, you needed the intelligence aspect -- natural language processing and understanding -- that could convert sounds into words into meaning. Finally, you needed a knowledge graph: the interconnected and structured data that could offer people the answers they were looking for.
In 2010, Google acquired Metaweb, a firm that maintained "an open database of things in the world." That database, Freebase, at the time offered information about more than 12 million of those things, including "movies, books, TV shows, celebrities, locations, companies and more." Under Google, it now offers much more. And the database has provided Google, in turn, with interconnected and structured data that, along with Wikipedia and other sources, now informs Google's Knowledge Graph -- a way for Google to apply the logic of the semantic web to everyday consumer products. It's a way of knowing things, even if you don't know the right questions to ask. It's a way, Metaweb's founder told Alexis last year, of "going sideways through the web."
So then. Combine that resource with voice recognition and natural language processing, and you have a powerful new way to interact with knowledge and the people who make use of it. And then combine all that with new forms of mobile hardware -- voice-controlled Google Glass, for example -- and you have a whole new interface, a whole new way to make sense of the world as you navigate it. Google Now, as Alexis put it when the service debuted, "is the base layer of the Glass video, or of any Google AR future. It's the servant that trains itself. It's the bot that keeps you from having to use your big clumsy thumbs." So the search box on a page -- that elegant little object that made Google what it is today -- will soon, Google is betting, be surpassed by something more intuitive and human-friendly. It'll be a shift from search box to voice box, from query to conversation. The layer of information that has hummed in the background of our computer screens and our smartphones and our lives, Google believes, will soon be brought to the fore, delivered by a friendly personal assistant who is smart and always hard-working and who knows you almost as well as you know yourself. And whose name, in this case, is Google.