This is disturbing: Google’s contractors who transcribe audio clips collected by its AI-based Google Assistant can listen in on sensitive information about users, including names, addresses, and details about their personal lives, Belgian outlet VRT News reports. Following the report, Google conceded that it “partner[s] with language experts around the world” and that 0.2% of all audio snippets collected are reviewed. It also said that it reviews the collected audio “whether you’re speaking English or Hindi”. This indicates that Indian users of Google Assistant might also have their audio reviewed by Google’s transcribers. Google has been promoting its assistant in India via hoardings and even television advertisements.
MediaNama has reached out to Google with the following questions:
- How many clips are generated every day (in India)?
- What is the percentage of those clips that are heard by language experts?
- Is this program going to stop or will Google continue to let transcribers listen to audio clips?
- Do you have a team of such “language experts” who transcribe the audio of Indian users?
0.2% of all audio clips is not a small number
Google said that its transcribers review around 0.2 percent of all audio snippets collected by Google Assistant. Here’s the problem with that: In India, Google Home’s market share is around 39% as of now. Android, which carries the Google Assistant on phones, has the largest market share among phones in India. Even if, for example, 1 million conversations are recorded each day, which is a conservative assumption, it would mean that transcribers have access to 2,000 audio snippets each day, and these are the numbers for India alone. Imagine the numbers for markets like the US and Europe where more people use Google Assistant each day. The VRT News report claimed that Google has thousands of transcribers spread across the world.
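The arithmetic behind this estimate can be checked with a short sketch. The 0.2 percent review rate is the figure Google stated; the 1 million daily conversations is only the hypothetical assumption used above, not a reported number, and the function name is ours for illustration.

```python
def reviewed_clips(total_clips: int, review_rate_percent: float) -> int:
    """Return how many clips are reviewed at a given percentage rate."""
    # Convert the percentage to a fraction, then round to a whole clip count.
    return round(total_clips * review_rate_percent / 100)


# Hypothetical daily volume for India (an assumption, not a reported figure).
daily_clips_india = 1_000_000
# Review rate stated by Google: 0.2 percent of all audio snippets.
review_rate_percent = 0.2

print(reviewed_clips(daily_clips_india, review_rate_percent))
```

Changing `daily_clips_india` shows how the number of reviewed clips scales with usage: every additional million recordings a day adds another 2,000 clips to the reviewers' pool.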
VRT also reported that Google’s claim that it anonymises audio clips before review isn’t entirely true. A transcriber told the outlet that it often isn’t clear what is being said, so transcribers then have to look up every word, address, personal name or company name on Google or on Facebook. This way, the identity of the person in the audio clip can be discovered without much effort.
Spooky AI assistants
This isn’t the first story that shows how our interactions with virtual assistants may not be as private and secure as we might believe. In April this year, we reported that thousands of Amazon employees around the world listen to users’ voice recordings captured on Alexa-powered Echo speakers. Amazon workers listen to the audio clips, which they then transcribe, annotate and feed back into the software to improve Alexa’s voice recognition ability and to help it understand commands better. The teams are based out of Amazon offices in Boston, Costa Rica, India and Romania, with each reviewer working nine hours a day and going through 1,000 audio clips per shift.