Earlier this week, videoconferencing company Zoom made headlines for a recent terms of service update that implied that its customers’ video calls could be used to train AI models. Those terms said that “service generated data” and “customer content” could be used “for the purpose of product and service development,” such as “machine learning or artificial intelligence (including for the purposes of training or tuning of algorithms and models).”
Zoom Chief Product Officer Smita Hashim attempted to clarify in a blog post that “[Zoom does] not use audio, video, or chat content for training our models without customer consent,” that Zoom customers own data like meeting recordings and invitations, and that “service generated data” referred to telemetry and diagnostic data and not the actual content of customers’ calls.
Perhaps sensing that a blog post written separately from the terms of service was inadequate, Zoom today updated both the terms of service and Hashim’s blog post, and each now contains the same statement in bolded text:
Zoom does not use any of your audio, video, chat, screen sharing, attachments or other communications-like Customer Content (such as poll results, whiteboard and reactions) to train Zoom or third-party artificial intelligence models.
According to Hashim’s updated blog post, the new wording doesn’t reflect a policy change; it was added “based on customer feedback” to make Zoom’s policies “easier to understand.”
The new blog post also makes it clear that “enterprises and customers in regulated verticals like education and healthcare” often have their terms of service written and updated separately from the public ones that cover “online customers” (that is, individual end-users who use Zoom independently of a large organization). These organizations often have their own strict data privacy requirements for both business and legal reasons, and they would need different terms of service to ensure that those requirements were being met.
Following this year’s explosion of high-profile generative AI projects, multiple services have made changes to either prevent data from being used to train AI models or to specify what data can be used and when. Reddit and the site once known as Twitter have limited third-party API access to their platforms out of concern that human-generated data was being used for AI training (at least, that’s part of the official explanation); Twitter also blamed AI for recent changes to the number of tweets users could view in a single day. Several groups of artists have also sued companies like OpenAI, alleging that AI models trained on their words and images are “industrial-strength plagiarists” that are “powered entirely by [artists’] hard work.”
https://arstechnica.com/?p=1960446