What This Clever Attack on Speech Recognition Means for Your Next Solution


My kids constantly tell me I’m deaf. I tell them they are just mumbling and need to speak up. Somewhere in there lies the truth.

Most of the time what they are saying isn’t what I want to hear anyway. It usually involves me opening my wallet or driving them somewhere.

But what if what was unheard was also actively malicious?

Hackers silently attack speech recognition

That’s exactly what a group of security researchers in China invented. The team from Zhejiang University created a silent acoustic technique they called DolphinAttack. How fitting.

I’m not going to dig into the science here, but you can happily spend a few hours exploring the details of their paper. You could also check out this TechCrunch article that provides a nice summary of how it was done.

At a high level, they were able to issue inaudible commands to most popular home automation and speech recognition systems. This includes Siri, Google Now, Samsung S Voice, Huawei HiVoice, Cortana, and Alexa.

How bad is it?

Their paper explains that “By leveraging the nonlinearity of the microphone circuits, the modulated low frequency audio commands can be successfully demodulated, recovered, and more importantly interpreted by the speech recognition systems.”

They transmitted ultrasonic tones carrying embedded voice commands, undetectable to the human ear. Think of a dog whistle.
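The core trick is easy to sketch in code. The toy simulation below is my own illustration, not the researchers’ actual setup: it amplitude-modulates a 400 Hz “command” tone onto a 25 kHz carrier (so the transmitted signal contains only ultrasonic energy), then models the microphone’s nonlinearity as a simple square-law term. After an ordinary low-pass filter, the audible tone reappears — the microphone itself has demodulated the hidden command.

```python
import math

FS = 96_000       # sample rate (Hz), high enough to represent the carrier
CARRIER = 25_000  # ultrasonic carrier frequency (Hz), inaudible to humans
TONE = 400        # stand-in "voice command": a single audible tone (Hz)
N = int(FS * 0.1) # 100 ms of signal

def tone_level(signal, freq):
    """Magnitude of a single DFT bin: how much of `freq` is in `signal`."""
    re = sum(x * math.cos(2 * math.pi * freq * n / FS) for n, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * n / FS) for n, x in enumerate(signal))
    return math.hypot(re, im) / len(signal)

# 1. Amplitude-modulate the "command" onto the ultrasonic carrier.
#    All the transmitted energy sits near 25 kHz -- nothing audible.
m = 0.8  # modulation depth
transmitted = [
    (1 + m * math.cos(2 * math.pi * TONE * n / FS))
    * math.cos(2 * math.pi * CARRIER * n / FS)
    for n in range(N)
]

# 2. Model the microphone's nonlinearity as a small square-law term.
#    Squaring an AM signal recreates the baseband command.
received = [s + 0.5 * s * s for s in transmitted]

# 3. Low-pass filter (simple moving average): the rest of the audio
#    pipeline keeps only the audible band.
W = 48  # roughly 2 kHz cutoff at a 96 kHz sample rate
recovered = [sum(received[i:i + W]) / W for i in range(N - W)]

print(f"400 Hz level in transmitted signal: {tone_level(transmitted, TONE):.4f}")
print(f"400 Hz level after nonlinearity:    {tone_level(recovered, TONE):.4f}")
```

Running this shows essentially zero energy at 400 Hz in the transmitted signal, and a strong 400 Hz component after the nonlinearity — exactly the “demodulated, recovered” behavior the paper describes, in miniature.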

By doing so, they were able to silently issue various commands to the devices including wake phrases such as “Alexa” or “OK Google” and multi-word requests such as “unlock the front door”.

Their paper further explains that they were able to issue commands that “include activating Siri to initiate a FaceTime call on iPhone, activating Google Now to switch the phone to the airplane mode, and even manipulating the navigation system in an Audi automobile.”

Unnerving as this is, especially the Audi example, there are caveats to the attack’s feasibility. For one, the attacker needs to be within a few feet of the device for the inaudible commands to work.

What does this mean for you?

But what does this expose in terms of future attack vectors? What happens as technology continues to advance and the Internet of Things becomes pervasive? The beauty of science can also be the basis for clever misuse and potential criminality.

Now more than ever, technology solution design and development requires out-of-the-box thinking. Not only does traditional security need to be baked in from the beginning, but time must also be spent thinking through the non-traditional, and downright far-fetched, what-ifs.

By examining the unbelievable, technology providers can implement countermeasures much earlier in the process. Perhaps voice biometrics would have made it into an earlier product release if DolphinAttack had been known, or at least anticipated, sooner.

Too often I see solution delivery targeted toward the “happy path”. Don’t get caught up focusing solely on the features and benefits. Spend serious time contemplating all that could possibly go wrong. It’s much better to explore and address worst-case scenarios in development than to respond to them frantically in production.

While there are limits to how far you need to take your design accommodations, don’t let today’s features silently turn into tomorrow’s failures.