After artists and coders, news outlets up against OpenAI for using their articles to train ChatGPT

This issue has also been flagged by the creative industry, amid increasing use of ChatGPT and focus on artificial intelligence

In the latest criticism of ChatGPT, news organisations have called out OpenAI for using their articles to train the artificial intelligence (AI) software without any licensing agreement, Bloomberg reported. News outlets such as the Wall Street Journal and CNN have stated that OpenAI must pay to license their content for AI training purposes.

Jason Conti, general counsel for News Corp’s Dow Jones unit, which publishes the Wall Street Journal, told Bloomberg that the news firm’s work should be used for AI training only after a license is acquired from the company, and that there is currently no such deal between OpenAI and Dow Jones. Similarly, an anonymous source from CNN said that this use violates the network’s terms of service and that the company plans to reach out to OpenAI to discuss the matter further.

There is little clarity over how data from the internet is used for machine learning or for training AI tools. This has raised concerns about copyright infringement in the creative industry. We delve deeper into this in our report on Google’s MusicLM and the copyright issues associated with the AI model.


Who revealed the news sources first?


Francesco Marconi, a computational journalist who has previously worked with the Wall Street Journal, tweeted on February 15 that ChatGPT is trained using a large number of news sources. Marconi obtained the list of 20 news sources from ChatGPT itself by using the prompt: “Which specific news sources was chatGPT trained on? Provide a list of the top news sources in your database.”


In addition to the Wall Street Journal and CNN, the list included major publishers such as the New York Times, Reuters, Al Jazeera, BBC News, The Guardian, the Associated Press, The Economist and Bloomberg, among others. Marconi stated that if there is no agreement with these publishers, the use may amount to a violation of the publishers’ terms of service. Given the long list of news sources the AI bot mentioned, one can anticipate the legal challenges OpenAI may soon find itself drowning in if other publishers follow suit.

Why it matters:


The use of ChatGPT sparked discussions about the AI tool plagiarising papers, reports and other reference materials in the publishing sector. By claiming compensation for their work being used for AI training, news outlets now join the band of artists and coders who are already suing companies for scraping their creations without a deal. The developments in this space will be worth noting to understand the challenges that companies will face while using data on the web for machine learning and for training AI tools.

Lawsuits against AI systems in the news:

In early February, Getty Images sued Stability AI, the creator of Stable Diffusion, an open-source AI model that generates images from text prompts, for “brazen infringement of Getty Images’ intellectual property on a staggering scale”. Getty Images has claimed that the AI company copied over 12 million images from its database without permission or compensation “as part of its efforts to build a competing business”.

In 2022, programmers and copyright lawyers filed a class action lawsuit against Microsoft, GitHub and OpenAI, alleging that GitHub Copilot uses “long sections of licensed code” without crediting the original coders. Copilot is a GitHub product that works as an AI-based coding assistant and is trained on large “public repositories” of code from the web, much of which is licensed.

This post is released under a CC-BY-SA 4.0 license. Please feel free to republish on your site, with attribution and a link. Adaptation and rewriting, though allowed, should be true to the original.




MediaNama is the premier source of information and analysis on Technology Policy in India. More about MediaNama, and contact information, here.

© 2008-2021 Mixed Bag Media Pvt. Ltd. Developed By PixelVJ
