The problem, as Schalk sees it, is that humans and computers are both able to do far more than the interface between them allows.
“Computers are very fast, they’re extremely sophisticated, and they can process data of gargantuan complexity in just a fraction of a second,” he told me. “Humans are very good at other things. They can look at a scene and immediately know what’s going on. They can establish complex relationships. The issue that is now emerging is an issue of communication. So the underlying problem and question is, how do humans, who are extremely powerful and complex, interact with their increasingly complex and capable environments? Robots are extremely complex. Computers are extremely complex. Our cellphones are extremely complex.”
At this point in technological history, interfaces are built so that computers can do as much as possible within the limitations of a human’s sensorimotor systems. Given what many people use computers for, this arrangement works out well, even great. Most of the time, people are reading, writing text, and looking at or clicking on pictures and video. “For that, keyboards and mice (and trackpads, and to a lesser extent, voice control, which I think is still not so ubiquitous due to its relative unreliability) are still cheap, robust, and well-suited to the task,” says Kaijen Hsiao, a roboticist and the CTO of Mayfield Robotics, located just south of San Francisco. For others, though, traditional interfaces aren’t enough.
“If I’m trying to explain to a computer some complex plan, intent, or perception that I have in my brain, we cannot do that,” Schalk says.
Put simply, it’s a communication issue that’s even more challenging than human-to-human communication—which is itself complex and multi-faceted. There’s always some degree of translation that happens in communicating with another person. But the extra steps required for communicating with a machine verge on prohibitively clunky.
“And when you’re trying to explain that same thing to a computer, or to a robot, you have to take this vivid imagery [from your head], and you have to translate this into syntactic and semantic speech, thereby already losing a lot of the vividness and context,” Schalk says. “And then you’re taking the speech and you’re actually translating this into finger movements, typing those sentences on a computer keyboard. It’s completely ridiculous if you think about it.”
On a practical level, for most people, this ridiculousness isn’t apparent. You have to write an email, you use your keyboard to type it. Simple.
“But if you just, on a very high level, think about how pathetic our interaction with the environment has become, compared with where it used to be, well, that’s a problem, and in fact that problem can be quantified,” Schalk says. “Any form of human communication doesn’t really [travel at] more than 50 bits per second; that’s either perceived speech or typing. So that’s basically the maximum rate at which a human can transmit information to an external piece of technology. And 50 bits per second is not just inadequate. It is completely, grossly pathetic. When you think about how many gigabytes per second a computer can process internally and what the brain can process internally, it’s a mismatch of many, many, many orders of magnitude.”
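To put a rough number on that mismatch, here is a back-of-the-envelope sketch in Python. It reads the ceiling Schalk cites as 50 bits per second and, purely as an illustrative assumption, takes one gigabyte per second as the machine's internal rate; Schalk gives no exact figure for the computer, and the brain's internal rate is left out because the article does not quantify it.

```python
import math

# Ceiling on human-to-machine communication, read as 50 bits per second.
HUMAN_BITS_PER_SECOND = 50

# Assumption for illustration only: a machine moving 1 gigabyte per second.
MACHINE_BYTES_PER_SECOND = 1e9


def mismatch(human_bps: float, machine_bytes_per_s: float) -> tuple[float, float]:
    """Return (ratio, orders of magnitude) between machine and human rates."""
    machine_bps = machine_bytes_per_s * 8  # 8 bits per byte
    ratio = machine_bps / human_bps
    return ratio, math.log10(ratio)


ratio, orders = mismatch(HUMAN_BITS_PER_SECOND, MACHINE_BYTES_PER_SECOND)
print(f"The machine is about {ratio:.1e} times faster")
print(f"That is roughly {orders:.1f} orders of magnitude")
```

Under those assumptions the gap comes out to about eight orders of magnitude, and every additional gigabyte per second on the machine side only widens it.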