Once OpenAI figured out what happened, data was restored, OpenAI said. But the NYT alleged that the only data that OpenAI could recover did “not include the original folder structure and original file names” and therefore “is unreliable and cannot be used to determine where the News Plaintiffs’ copied articles were used to build Defendants’ models.”
In response, OpenAI suggested that the NYT could simply take a few days and re-run the searches, insisting, “contrary to Plaintiffs’ insinuations, there is no reason to think that the contents of any files were lost.” But the NYT does not seem happy about having to retread any part of model inspection, continually frustrated by OpenAI’s expectation that plaintiffs must come up with search terms when OpenAI understands its models best.
OpenAI claimed that it has consulted on search terms and been “forced to pour enormous resources” into supporting the NYT’s model inspection efforts while continuing to avoid saying how much it’s costing. Previously, the NYT accused OpenAI of seeking to profit off these searches, attempting to charge retail prices instead of being transparent about actual costs.
Now, OpenAI appears to be more willing to conduct searches on behalf of NYT that it previously sought to avoid. In its filing, OpenAI asked the court to order news plaintiffs to “collaborate with OpenAI to develop a plan for reasonable, targeted searches to be executed either by Plaintiffs or OpenAI.”
How that might proceed will be discussed at a hearing on December 3. OpenAI said it was committed to preventing future technical issues and was “committed to resolving these issues efficiently and equitably.”
It’s not the first time OpenAI deleted data
This isn’t the only time that OpenAI has been called out for deleting data in a copyright case.
In May, book authors, including Sarah Silverman and Paul Tremblay, told a US district court in California that OpenAI admitted to deleting the controversial AI training data sets at issue in that litigation. Additionally, OpenAI admitted that “witnesses knowledgeable about the creation of these datasets have apparently left the company,” authors’ court filing said. Unlike the NYT, book authors seem to suggest that OpenAI’s deleting appeared potentially suspicious.
https://arstechnica.com/tech-policy/2024/11/tech-problems-plague-openai-court-battles-judge-rejects-a-key-fair-use-defense/