OpenAI, the developer of the chatbot ChatGPT, has said it would be "impossible" to create advanced AI tools such as ChatGPT without access to copyrighted material. The statement comes amid mounting pressure on AI firms over the content they use to train their products.

OpenAI's acknowledgment that copyrighted data is indispensable comes as tools like ChatGPT and image generators such as Stable Diffusion are trained on vast troves of internet data, much of which is protected by copyright. Last month, the New York Times sued OpenAI and Microsoft, a key investor in OpenAI, accusing them of the "unlawful use" of its content to develop their products.
In a recent submission to the House of Lords communications and digital select committee, OpenAI emphasized the breadth of copyright coverage: "Because copyright today covers virtually every sort of human expression... it would be impossible to train today's leading AI models without using copyrighted materials." The company argued that restricting training data to out-of-copyright works would produce inadequate AI systems that fail to meet the needs of today's citizens.

Addressing the New York Times lawsuit, OpenAI said it respects the rights of content creators while relying on the legal doctrine of "fair use" as a defense, asserting in its submission that it firmly believes "legally, copyright law does not forbid training."
The legal challenges faced by OpenAI are not isolated incidents; several notable authors, including John Grisham, Jodi Picoult, and George RR Martin, have previously sued OpenAI, alleging "systematic theft on a mass scale." Additionally, Getty Images is pursuing legal action against Stability AI, the creator of Stable Diffusion, for alleged copyright breaches. In the US, music publishers, including Universal Music, are suing Anthropic, an Amazon-backed company behind the Claude chatbot, accusing it of misusing copyrighted song lyrics to train its model.
In its House of Lords submission, OpenAI also expressed support for independent analysis of its security measures and advocated "red-teaming" of AI systems, in which third-party researchers emulate rogue actors to test product safety. OpenAI is among a group of companies collaborating with governments on safety testing of their most powerful AI models, a commitment established at a global safety summit in the UK last year.