Here’s a fun experiment: Next time you’re on a crowded bus, loudly announce, “Hey Siri! Text mom, ‘I’m pregnant.’” Chances are you’ll get some horrified looks as your voice awakens iPhones in nearby commuters’ pockets and bags. They’ll dive for their phones to cancel your command.
But what if there were a way to talk to phones with sounds other than words? Unless the phones’ owners were prompted for confirmation and realized what was going on in time to intervene, they’d have no idea that anything was being texted on their behalf.
Turns out there’s a gap between the kinds of sounds that people and computers understand as human speech. Last summer, a group of Ph.D. candidates at Georgetown and Berkeley exploited that gap: They developed a way to create voice commands that computers can parse—but that sound like meaningless noise to humans. These “hidden voice commands,” as the researchers called them, can deliver a message to Google Assistant-enabled Android phones nearby through bursts of what sounds like scratchy static.
For the commands to work, the speaker that broadcasts them has to be nearby: The researchers found that commands became ineffective at a distance of about 12 feet. But that doesn’t mean someone has to be conspicuously close to a device for their hidden-command attack to succeed. A message could be encoded into the background of a popular YouTube video, for example, or broadcast on the radio or TV.