Amazon accidentally sent 1,700 private voice files to an unauthorized customer in Germany in response to a request for personal data. The data allowed a German magazine to identify and track down the person whose voice was recorded on the files, according to a published report.
Under the General Data Protection Regulation (GDPR), customers can request access to the personal data that companies like Amazon, Facebook, Twitter and others have collected about them. In August, one customer in Germany did just that, asking Amazon.de–the German version of the service–for records of the data the company has on file about him, according to German magazine c’t.
Amazon fulfilled the request, sending him a download link to a 100MB ZIP file. But the company gave the customer more than he bargained for: 1,700 WAV files and a PDF cataloging unsorted transcripts of a stranger’s voice commands, recorded by the Alexa-controlled Echo speaker.
The recordings, made in a male stranger’s living room, bedroom and even his shower, were so personal and specific that the magazine was able to identify and eventually contact the person whose voice was heard. The episode shows how easy it is to uncover someone’s identity when a company mishandles personal data, especially data recorded in that person’s own home.
Fly on the wall
“We were able to navigate around a complete stranger’s private life without his knowledge, and the immoral, almost voyeuristic nature of what we were doing got our hair standing on end,” the magazine wrote, adding that it also at times heard the voice of a female companion giving Alexa commands.
Researchers said the sounds contained in the files–such as alarms, Spotify commands and public-transport inquiries–revealed a lot about the victims’ habits, jobs and personal tastes. Moreover, the first customer said that when he notified Amazon of the error in November, the company didn’t respond. It merely deleted the files from the server and appeared to take no steps to notify customers of the voice-file breach until after c’t contacted Amazon.
“A company like Amazon would be crazy to treat sensitive personal data with anything but the utmost care,” according to the article in c’t. “This is the worst case scenario that consumer and data protection activists have been warning us about.”
The situation could also be financially dire for Amazon if it’s true the company didn’t inform customers or regulators of the breach in a timely way. The GDPR, which took effect in May, requires companies to notify the authorities within 72 hours of becoming aware of a data breach, or face penalties based on company revenue. For Amazon, this means fines that could be in the billions.
Reason, resolution and ramifications
In an e-mailed statement, Amazon attributed the “isolated incident” to “human error,” and said it’s taking the necessary steps under the GDPR to handle the breach. “We have resolved the issue with the two customers involved and have taken steps to further improve our processes,” a spokesperson said. “We were also in touch on a precautionary basis with the relevant regulatory authorities.”
Researchers at c’t suggested that Amazon could avoid such issues if it deleted voice files recorded by Echo in a timely way instead of saving them in the cloud. The company says in its data-privacy FAQ that it saves the files to support development of its voice- and language-recognition systems, and that it gives customers the ability to access and delete their voice files.
While the company does appear to be cleaning up its mess–albeit after prompting from c’t and the original customer who made the data request–the incident is still troubling, especially at a time when data privacy is such a hot-button issue.
News of the incident comes on the heels of a string of data-privacy imbroglios at Facebook, which acknowledged that a software flaw exposed information on millions of customers. Facebook was in defense mode yet again this week after a New York Times article blew the whistle on partnerships that gave high-tech companies extensive access to user data, apparently without obtaining specific consent.