Microsoft bucks trend, maintains contractor reviews of voice recording

A wave of privacy scandals has led several major companies to end or pause their programs for human reviews of voice recordings. But Microsoft has bucked the trend.

Last week, a whistleblower went to the press to reveal that Microsoft relied on employees and contractors to review recordings made by its Skype Translator call platform and its Cortana voice assistant. The company had documentation informing users that audio recorded by its services might be reviewed to improve systems for language processing, but there was no explicit mention that the reviews would be done by humans.

In response to the outcry, Microsoft has revised its privacy policy, FAQs, and other language to clarify that there are people who will listen to captured audio. The company’s privacy policy now states that “Our processing of personal data for these purposes includes both automated and manual (human) methods of processing.” The FAQ pages for Skype Translator and Cortana have also been updated to explain that Microsoft employees or contractors might transcribe and review recordings. Both FAQs note the privacy protections Microsoft has for those activities, which it also presented when the initial reports about its review program were published.

Microsoft is taking a different approach from some of the other companies working on voice technology, most of which have also faced backlash over how they make or review recordings. Apple said it would end its human review program and will let Siri users choose whether they participate in grading. Google has temporarily stopped human reviews as well. Amazon has also launched a way for Alexa users to opt out of human review programs.

Privacy watchdogs may protest that, given the exposé about its Skype and Cortana reviews, Microsoft’s protections are not sufficient. But there are reasons why the company might not want to end the practice completely. Anyone who has used a voice assistant knows that they are imperfect tools at best. Each of them has weak spots, like struggling to understand proper names, incorrectly parsing a complicated request, or just activating at the wrong time. Machine learning can get freakishly smart on its own, but given the idiosyncrasies of human speech, it makes sense that voice tools need more human intervention to become more accurate. When a computer system fails to work, how could another computer system be expected to catch the error?

This is the type of question tech companies have always faced when working on something new. But as so often happens, the businesses are discovering the flaws in their creations at the same time as users do. Voice assistants can be useful or entertaining in a range of situations, but they still put active microphones into omnipresent devices. If companies want voice to take off, then they need to convince consumers that they aren’t abusing or misusing that constant access. Being clear about their goals for voice tech and launching better privacy protections are a part of that process.

https://arstechnica.com/?p=1552477