Copyright and AI: US publishers sue OpenAI and Microsoft

In addition to the New York Times, eight other publishers are suing OpenAI and Microsoft. The lawsuit concerns both the training data and the output of the AI services.

A person's head covered by an AI-labeled dark cloud

(Image: photoschmidt / Shutterstock.com)

This article was originally published in German and has been automatically translated.

The publishers accuse OpenAI and Microsoft of using copyrighted material without permission to train their AI models. They also say the AI services draw on information from the publishers' articles to generate answers without attributing it. Eight US publishers have therefore filed a lawsuit with the district court in New York. The New York Times is already suing the AI providers on similar grounds, as are a number of news websites. For OpenAI, the decisions could mean all or nothing. First, if the courts find in favor of the plaintiffs, the articles and content in question would have to be deleted from the AI models - which is technically impossible. Second, the AI services might lose access to new content, i.e. articles from publishers, which would enormously reduce a chatbot's knowledge and therefore its usefulness.

The new plaintiffs are the Daily News, Chicago Tribune Company, Orlando Sentinel Communications Company, Sun-Sentinel Company, San Jose Mercury-News, DP Media Network, ORB Publishing and Northwest Publications. They are suing Microsoft Corporation and OpenAI Inc, including all of its subsidiaries. The public complaint begins by stating that the defendants have used millions of the publishers' copyrighted articles, without permission and without payment, "for the commercialization of their products with generative AI, including ChatGPT and Copilot". According to the complaint, OpenAI was once a non-profit organization, but its valuation of 90 billion US dollars shows that this is no longer the case. The complaint adds that OpenAI has also increased Microsoft's value.

The lawsuit also states that OpenAI pays for everything else it needs, such as computing power and employees - but not for the training material. It cites Sam Altman's statement that training the AI models would not have been possible without this copyrighted material. OpenAI's CEO told the House of Lords: "Restricting training data to public domain books and drawings created more than a century ago might be an interesting experiment, but would not produce AI systems that meet the needs of today's citizens."

In fact, OpenAI has already entered into partnerships with some publishers under which it pays for the training material. For example, it pays Axel Springer Verlag, the AP news agency and - as was recently announced - the Financial Times millions for the right to use their articles. Numerous other publishers have blocked the relevant crawlers from their sites.

In the lawsuit, the publishers also point to their role in society and in democracy. They say they provide neighborhoods with important information and work to ensure that taxpayers' money is not wasted and that politicians fulfill their duties. It costs them millions to send real people to real places and events to report. The publishers speak of plagiarism, but also warn that the hallucinations of AI services diminish the credibility of the press. Local news is the backbone of a democracy, they say.

Beyond these accusations, the plaintiffs also say that while OpenAI and Microsoft scrape and store all content from the internet, their own products are locked down and must be paid for. Finally, they say the case is not about a battle between old and new technology. "This lawsuit is about Microsoft and OpenAI not being entitled to use copyrighted newspaper content to build their new billion dollar businesses without paying for that content. As this litigation will show, the defendants must both obtain consent from publishers to use their content and pay fair value for that use."

The allegations made by the New York Times, which has been suing OpenAI for some time, sound very similar. The newspaper also accuses the AI company of having ChatGPT reproduce specific articles that were behind a paywall. This was the reason why OpenAI cut ChatGPT's connection to the internet for a short time. OpenAI argues that the method the New York Times used to get the articles completed amounted to a hack: it entered the beginning of an article as a prompt and received the rest in response. The news websites The Intercept, Raw Story and AlterNet are also jointly suing OpenAI. Authors and artists are likewise challenging several AI providers on similar grounds.

(emw)