This holiday season has seen a virtual explosion in the sales of digital assistants – Amazon Echo, Google Assistant, etc. In fact, to the industry’s surprise, the Amazon Echo Dot has sold out for the season! These amazing inventions place a world of knowledge, media and home control within reach of any human voice: they answer questions, report the weather, retrieve personal information, and play media.
How Digital Assistants Work
The technology taps into the resources of the Internet and, as a second-generation product, makes it easy to order goods and have them delivered directly to your door without touching a computer or going to a store. Their operating systems are triggered by a keyword such as “Alexa,” “Echo,” or “Hey, Google.” Once activated, they begin securely processing your voice to deliver services via the cloud resources developed by Amazon (Alexa or Echo), Google (Hey, Google), Apple (Siri) or Microsoft (Cortana).
Security Concerns
While the built-in microphone is always on, it does not process, store or record your voice until the keyword activates it, at least according to the manufacturers. This raises some serious questions and security concerns about the future of this technology.
Let’s take for example an Amazon Echo. Commands like:
- “Play classic rock.”
- “What is the weather?”
- “Set an alarm for 6 am.”
- “Order laundry detergent.”
- “What is in my calendar today?”
…are all valid when prefixed with “Alexa.” Other commands like, “Record a voice memo,” or “Record the next 10 minutes of audio,” are not. But could they eventually be valid? Since these devices are always listening, could they be used to document, record or transcribe conversations (the commands are transcribed in Alexa’s history online) at the request of a simple voice command by an authorized party? There is no verbal password for privileged commands like, “What is in my calendar today?” Should there be?
While the manufacturers advertise the security of these systems, I have several security concerns about these Internet of Things devices, which could potentially be leveraged against us just as Internet cameras, faulty routers, and insecure Bluetooth door locks have been.
Audio Hijacking
Digital assistants listen for audio input and respond. They have no concept of a particular user’s voice, so they cannot tell one speaker from another. This leaves them vulnerable to undesirable voice commands.
- Unauthorized Users – Simply put, these are unapproved users who activate the system to get information. Since some of that information could be sensitive, like an owner’s calendar, unauthorized users can use a digital assistant to reveal personally identifiable information.
- Third Party Audio Sources – Third-party audio sources, like televisions and answering machines, can trip the keyword on a digital assistant. They could therefore be used maliciously to order merchandise or create unwanted events on the device. For example, consider an answering machine that plays a message out loud while it is being recorded, for screening purposes. If an attacker knows a digital assistant is nearby, simple commands like “Alexa, set an alarm for 3:30am” or “Order product X” can be executed. Also, if a TV show had the line “Alexa, order product Y,” how many devices would automatically execute the purchase? Third-party sources do not need authorization to issue a command, and commands can be injected by any audio source within earshot of the digital assistant.
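The two cases above share one root cause, which the following minimal sketch tries to illustrate. It is not how any vendor's firmware actually works; `WAKE_WORD`, `handle_transcript`, and the `source` parameter are hypothetical names invented for this example. The point is only that a handler keyed on the wake word alone treats the owner, a television, and an answering machine identically.

```python
from typing import Optional

WAKE_WORD = "alexa"  # hypothetical wake word for this sketch


def handle_transcript(transcript: str, source: str) -> Optional[str]:
    """Return the command to execute, or None if the wake word is absent.

    `source` is accepted only to show that it is never consulted: every
    audio source within earshot is treated the same way.
    """
    words = transcript.lower().replace(",", " ").split()
    if not words or words[0] != WAKE_WORD:
        return None
    # Everything after the wake word is taken as the command.
    return " ".join(words[1:]).rstrip(".")


# The same command is accepted regardless of who, or what, spoke it.
owner = handle_transcript("Alexa, set an alarm for 3:30am.", "owner")
machine = handle_transcript("Alexa, set an alarm for 3:30am.", "answering machine")
```

In this sketch `owner` and `machine` hold the same command string, which is the behavior the answering machine and television scenarios rely on.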
New or Undocumented Audio Features
It is well known that these devices have hidden commands that are in development, used for diagnostics, or simply not documented. For example, the Amazon Echo has a built-in equalizer that can be controlled with verbal bass and treble commands. If audio recording, voice memos or other audio commands ever become available, there are currently no provisions to disable specific commands to stop spying, inappropriate recording or eavesdropping.
Government Warrants
While audio recording is currently not available in these devices, it is technically possible, and command history is already available. Law enforcement can already access text messages and voice mail with a warrant. There are currently no legal provisions governing how law enforcement may request this information, a gap similar to the recent disputes over unlocking a cell phone, nor any governing the activation of audio monitoring (if it ever becomes possible) for wiretapping, surveillance, or national security purposes.
Implications to Privacy
When you decide to use any digital assistant, patterns in commands, purchasing and frequency are all data mined for future advertising and targeted marketing. This information is shared with internal departments and third-party firms to assist in their consumer profiling. We accept a complex license agreement as soon as we begin using these digital assistants, and we must be aware that we are giving up a level of privacy at that moment. For most people the convenience alone makes this acceptable, but some are unaware of what is going on behind the scenes.
Lack of Privileged Command Structure
The convenience of asking for the weather or traffic to work is a fantastic time saver. Simple commands are generic, but others are obviously more sensitive. Asking for your calendar on Amazon’s device (if linked to a supported calendar in the Alexa app) can reveal sensitive information. Unfortunately, the only password and privilege controls for commands reside in the application used to set up the device. Users need to be aware that there is no concept of least privilege within these digital assistants, and once configured, anyone can use these commands. If one of these devices is located in your office, then once it is turned on (Alexa or Google) or the parent device is logged in (Siri or Cortana), access to PII is very possible with anyone’s voice, with no method for privilege elevation or authentication.
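To make the missing control concrete, here is a minimal sketch of what a privileged command structure could look like. It is purely illustrative and assumes things these devices do not offer today: `PRIVILEGED_COMMANDS`, `authorize`, and the idea of an enrolled spoken PIN are all hypothetical. Generic commands pass through, while sensitive ones require an extra credential.

```python
import hmac
from typing import Optional

# Hypothetical split between generic and sensitive commands.
PRIVILEGED_COMMANDS = {
    "what is in my calendar today",
    "order laundry detergent",
}


def authorize(command: str, spoken_pin: Optional[str], enrolled_pin: str) -> bool:
    """Allow generic commands freely; require a matching PIN for sensitive ones."""
    normalized = command.lower().strip(" .?")
    if normalized not in PRIVILEGED_COMMANDS:
        return True  # least privilege: no credential needed for generic requests
    if spoken_pin is None:
        return False  # sensitive command with no credential is refused
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(spoken_pin.encode(), enrolled_pin.encode())
```

A spoken PIN can of course be overheard, so this is only a starting point; the design choice it illustrates is the separation of generic commands from those that expose PII or spend money.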
While these systems are “hear” to stay, the laws, usage and potential misuse are questions we will need to address in the coming years. My biggest concerns are the lack of privileged controls over information, future features for audio recording, and government access to history and potential backdoors. Traditional information technology solutions can benefit from least privilege access, password management, and security best practices. Unfortunately, these devices cannot. It is in our best interest to know whether these IoT devices are on our corporate networks, using tools like the free Retina IoT Scanner, and to consider securing access when possible to mitigate these next-generation privileged threats.

Morey J. Haber, Chief Security Officer, BeyondTrust
Morey J. Haber is the Chief Security Officer at BeyondTrust. He has more than 25 years of IT industry experience and has authored four books: Privileged Attack Vectors, Asset Attack Vectors, Identity Attack Vectors, and Cloud Attack Vectors. He is a founding member of the industry group Transparency in Cyber, and in 2020 was elected to the Identity Defined Security Alliance (IDSA) Executive Advisory Board. Morey currently oversees BeyondTrust security and governance for corporate and cloud based solutions and regularly consults for global periodicals and media. He originally joined BeyondTrust in 2012 as a part of the eEye Digital Security acquisition where he served as a Product Owner and Solutions Engineer since 2004. Prior to eEye, he was Beta Development Manager for Computer Associates, Inc. He began his career as Reliability and Maintainability Engineer for a government contractor building flight and training simulators. He earned a Bachelor of Science degree in Electrical Engineering from the State University of New York at Stony Brook.