In the latest criticism of ChatGPT, news organisations have called out OpenAI for using their articles to train the Artificial Intelligence (AI) software without any usage agreement, Bloomberg reported. News outlets such as the Wall Street Journal and CNN have stated that OpenAI must pay to license their content for AI training purposes.
Jason Conti, general counsel for News Corp’s Dow Jones unit, which publishes the Wall Street Journal, told Bloomberg that the news firm’s work should be used for AI training only after a license has been acquired from the company, and that no such deal currently exists between OpenAI and Dow Jones. Similarly, an anonymous source at CNN said that using its articles to train ChatGPT violates the network’s terms of service and that the company plans to reach out to OpenAI to discuss the matter further.
There is not much clarity over how data from the internet is used for Machine Learning or training an AI tool. This has raised concerns of copyright infringement in the creative industry. We delve deeper into this in our report on Google’s MusicLM and the copyright issues associated with the AI model.
Who revealed the news sources first?
Francesco Marconi, a computational journalist who has previously worked with the Wall Street Journal, tweeted on February 15 that ChatGPT is trained on a large number of news sources. Marconi obtained a list of 20 news sources from ChatGPT itself using the prompt: “Which specific news sources was chatGPT trained on? Provide a list of the top news sources in your database.”
In addition to the Wall Street Journal and CNN, the list included major publishers such as the New York Times, Reuters, Al Jazeera, BBC News, The Guardian, Associated Press, The Economist and Bloomberg, among others. Marconi stated that if there is no agreement with these publishers, the training may amount to a violation of their terms of service. Given the long list of news sources the AI bot mentioned, OpenAI could face a wave of legal challenges if other publishers follow suit.
Why it matters:
The use of ChatGPT sparked discussions about the AI tool plagiarising papers, reports and other reference materials in the publishing sector. By claiming compensation for their work being used for AI training, news outlets now join the band of artists and coders who are already suing companies for scraping their creations without a deal. The developments in this space will be worth noting to understand the challenges that companies will face while using data on the web for machine learning and for training AI tools.
Lawsuits against AI systems in the news:
In early February, Getty Images sued Stability AI, creator of Stable Diffusion, an open-source AI model that generates images from text prompts, for “brazen infringement of Getty Images’ intellectual property on a staggering scale”. Getty Images has claimed that the AI company copied over 12 million images from its database without permission or compensation “as part of its efforts to build a competing business”.
In 2022, programmers and copyright lawyers filed a class action lawsuit against Microsoft, GitHub and OpenAI, alleging that GitHub Copilot uses “long sections of licensed code” without crediting the original coders. Copilot is a GitHub product that works as an AI-based coding assistant and is trained on large “public repositories” of code from the web, many of which are licensed.
This post is released under a CC-BY-SA 4.0 license. Please feel free to republish on your site, with attribution and a link. Adaptation and rewriting, though allowed, should be true to the original.
