New York Times sues Microsoft, OpenAI for alleged copyright infringement

The New York Times (NYT) Company has taken legal action against Microsoft Corporation and various entities associated with OpenAI for alleged infringement of its intellectual property.

The lawsuit, filed in the United States District Court for the Southern District of New York, alleges that Microsoft and OpenAI used NYT’s copyrighted material to train their artificial intelligence models without authorization, leading to copyright infringement and unfair competition.

‘Vital to our democracy’

In the complaint, NYT describes independent journalism as “vital to our democracy” and “increasingly rare and valuable” before asserting that, for over 170 years, it has invested heavily in providing “deeply reported, expert, independent journalism,” a service made possible through “the efforts of a large and expensive organization.”

Central to NYT’s allegations is the claim that Microsoft and OpenAI’s generative artificial intelligence (GenAI) tools, including Bing Chat and ChatGPT, were developed using large language models (LLMs) trained on millions of NYT’s copyrighted articles and other works. The complaint alleges that these AI tools can generate outputs that “[recite] Times content verbatim, closely summarize it, and mimic its expressive style.”

The lawsuit brings multiple claims against the defendants, including copyright infringement, vicarious and contributory copyright infringement, and violation of the Digital Millennium Copyright Act. The NYT alleges that the defendants’ actions constitute a “free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment.”

According to the complaint, “Defendants’ unlawful use of The Times’s work to create artificial intelligence products that compete with it threatens The Times’s ability to provide that service.” It also accuses the defendants of engaging in willful infringement, stating:

“Defendants’ infringing conduct alleged herein was and continues to be willful and carried out with full knowledge of The Times’s rights in the copyrighted works.”

Seeking relief, the NYT demands statutory damages, compensatory damages, restitution, permanent injunctions against further infringement, and destruction of all AI models and training sets incorporating its works.

Potentially historic case

As the case proceeds, it will likely prove to be a crucial moment in determining generative AI’s relationship to copyright law.

IP and AI attorney Cecilia Ziniti called the suit “historic” in a thread on X, saying it was likely “the best case yet alleging that generative AI is copyright infringement.”

Ziniti emphasized the crucial issues of “access and substantial similarity” in the case, noting that ChatGPT’s outputs closely resemble NYT’s content, which makes up a large part of the Common Crawl dataset on which the model was trained. She also highlighted Exhibit J from the lawsuit, which uses color coding to demonstrate substantial overlap between ChatGPT’s outputs and NYT’s articles.

In her analysis, Ziniti also pointed out that while OpenAI has established content agreements with other media outlets, such as Politico, it lacks one with NYT. She argued that this apparent oversight could create legal challenges, as it might suggest OpenAI’s intentional disregard for certain intellectual property rights.
