OpenAI Scores Initial Win in Copyright Lawsuit Over AI Training Data
A New York judge dismisses a copyright lawsuit over OpenAI’s AI training, but the news outlets behind it can file a new complaint.
On Thursday, a federal judge in New York threw out a copyright infringement case against OpenAI, handing the company a significant win. The lawsuit alleged that OpenAI illegally used news stories from Raw Story and AlterNet to train its AI models. The case, part of a larger wave of copyright lawsuits against AI companies, centered on claims that OpenAI’s ChatGPT used protected content without permission.
U.S. District Judge Colleen McMahon said that Raw Story and AlterNet did not show enough harm to support the case, but she gave the plaintiffs a chance to file again. Even so, McMahon said she was “skeptical” about whether they could show a “cognizable injury” in a new complaint.
The lawsuit, first filed in February, alleged that OpenAI used thousands of stories from Raw Story and AlterNet without permission to train its language model. It also alleged that ChatGPT reproduces parts of these copyrighted works when users prompt it to, which could violate the rights of the content’s authors.
This action differed from similar suits because it did not allege direct copyright infringement. Instead, it claimed that OpenAI broke the law by removing copyright management information (CMI) from the articles. Siding with OpenAI, Judge McMahon said that the claimed harm, using the articles to train ChatGPT without paying for them, did not reach the level of harm needed for the case to go forward.
After the ruling, Matt Topic, Raw Story’s lawyer, said he was confident that the plaintiffs could amend the complaint to address the court’s concerns. OpenAI has not yet commented on the decision.
Authors, artists, and publishers who hold the rights to their works are increasingly suing AI firms, arguing that generative AI models use those works without permission and without compensating the rightful owners. The New York Times was among the first big media outlets to sue OpenAI, with the hope of defining legal boundaries for AI training data.
Judge McMahon’s decision shows how strained copyright disputes become where AI training data is concerned, and it suggests that new legal concepts may be needed to address such conflicts. OpenAI has avoided liability in this case for now, but similar cases will likely test where the balance between AI development and IP rights lies.