Major international news media organisations, including Associated Press and Agence France-Presse, have written an open letter calling for a legal framework to protect their content from being used by tech companies for training AI models without the consent of the intellectual property rights holders. Other signatories to the letter include European Pressphoto Agency, European Publishers’ Council, USA TODAY Network, Getty Images, National Press Photographers Association, National Writers Union, News Media Alliance, and The Authors Guild.
According to the letter, news publishers have advocated for the following regulatory and industry action:
- Transparency over the training datasets used for developing AI models.
- Obtaining consent of the “intellectual property rights holders” for utilising their data for training AI tools.
- Establishing a mechanism for enabling cooperation between media companies and AI developers regarding terms of access and use of their content.
- Requiring generative AI models and users to clearly and specifically identify AI-generated content.
- Taking steps to eliminate bias and misinformation from output generated by generative AI services.
Why it matters:
The lack of rules around the use of datasets for training generative AI systems has prompted discussions among regulators over preventing the use of copyright-protected online content without permission. AI developers have also begun working on deals with publishers to agree on terms of use. For example, AP struck a deal with OpenAI to use the company’s technology and product expertise in exchange for licensing part of its news archive to OpenAI. The developments in this space highlight the challenges that companies will face while using data on the web for machine learning and for training AI tools.
Concerns raised: The organisations have noted that generative AI is often trained on protected media content and that such information produced by news publishers can be disseminated by AI applications without any remuneration or attribution to the original creators.
“Such practices undermine the media industry’s core business models, which are predicated on readership and viewership (such as subscriptions), licensing, and advertising. In addition to violating copyright law, the resulting impact is to meaningfully reduce media diversity and undermine the financial viability of companies to invest in media coverage, further reducing the public’s access to high-quality and trustworthy information,” the letter noted.
Compensation for content being used to train ChatGPT: In February, news outlets such as the Wall Street Journal and CNN called out OpenAI for using their articles to train its AI software without any usage agreement. They also stated that they must be paid to license content to OpenAI for AI training purposes.
This came after Francesco Marconi, a computational journalist, tweeted on February 15 that ChatGPT was being trained using a large number of news sources. Marconi obtained a list of 20 news sources from ChatGPT by using the prompt: “Which specific news sources was chatGPT trained on? Provide a list of the top news sources in your database.” He noted that if there is no agreement with these publishers, this may amount to a violation of the publishers’ terms of service.
