On Wednesday, Stability AI announced it would allow artists to remove their work from the training dataset for an upcoming Stable Diffusion 3.0 release. The move comes as Spawning, an artist advocacy group, tweeted that Stability AI would honor opt-out requests collected on Spawning’s Have I Been Trained website. However, the details of how the plan will be implemented remain incomplete and unclear.
As a brief recap, Stable Diffusion, an AI image synthesis model, gained its ability to generate images by “learning” from a large dataset of images scraped from the Internet without consulting any rights holders for permission. Some artists are upset about it because Stable Diffusion can generate an unlimited quantity of images that potentially rival the work of human artists. We’ve been following the ethical debate since Stable Diffusion’s public launch in August 2022.
To understand how the Stable Diffusion 3 opt-out system is supposed to work, we created an account on Have I Been Trained and uploaded an image of the Atari Pong arcade flyer (which we do not own). After the site’s search engine found matches in the Large-scale Artificial Intelligence Open Network (LAION) image database, we right-clicked several thumbnails individually and selected “Opt-Out This Image” in a pop-up menu.
Once the images were flagged, we could see them in a list of images we had marked as opt-out. We didn’t encounter any attempt to verify our identity or our legal control over the images we supposedly “opted out.”
Other snags: For an image to be removed from the training data, it must already be in the LAION dataset and must be searchable on Have I Been Trained. And there is currently no way to opt out large groups of images at once, or to opt out the many copies of the same image that might be in the dataset.
The system, as currently implemented, raises questions that have echoed in the announcement threads on Twitter and YouTube. For example, if Stability AI, LAION, or Spawning undertook the huge effort of legally verifying ownership to control who can opt images out, who would pay for the labor involved? Would people trust these organizations with the personal information necessary to verify their rights and identities? And why attempt to verify them at all when Stability’s CEO says that legally, permission is not necessary to use the images?
Also, putting the onus on artists to register for a site with a non-binding connection to either Stability AI or LAION and then hope that their requests get honored seems unpopular. In response to statements about consent by Spawning in its announcement video, some people noted that the opt-out process does not fit the definition of consent in Europe’s General Data Protection Regulation, which states that consent must be actively given, not assumed by default (“Consent must be freely given, specific, informed and unambiguous. In order to obtain freely given consent, it must be given on a voluntary basis”). Along those lines, many argue that the process should be opt-in only, and all artwork should be excluded from AI training by default.
Currently, it appears that Stability AI is operating within US and European law to train Stable Diffusion using scraped images gathered without permission (although this issue has not yet been tested in court). But the company is also making moves to recognize the ethical debate that has sparked a large protest against AI-generated art online.
Is there a balance that can satisfy artists and allow progress in AI image synthesis tech to continue? For now, Stability CEO Emad Mostaque is open to suggestions, tweeting, “The team @laion_ai are super open to feedback and want to build better datasets for all and are doing a great job. From our side we believe this is transformative technology & are happy to engage with all sides & try to be as transparent as possible. All moving & maturing, fast.”