2023-10-09
BBC to block ChatGPT from scraping data off its websites

With this move, the BBC joins the ranks of other news outlets, such as the New York Times, CNN, Reuters, and the Australian Broadcasting Corporation (ABC), that have blocked OpenAI’s web crawler from accessing their websites.

“We do not believe the current ‘scraping’ of BBC data without our permission in order to train Gen AI models is in the public interest,” Rhodri Talfan Davies, the Director of Nations at the BBC, recently said in a blog post. He stated that the BBC has taken steps to prevent web crawlers like those from OpenAI and Common Crawl from accessing its websites.
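The blog post does not say how the BBC implements the block. One common mechanism is a robots.txt file that names the crawlers to exclude, and the minimal sketch below assumes that approach. `GPTBot` and `CCBot` are the user-agent tokens published by OpenAI and Common Crawl; the rules, the example.com URL, and the helper name `is_allowed` are illustrative assumptions, not the BBC's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules (an assumption, not the BBC's real file):
# disallow OpenAI's and Common Crawl's crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL; any path on the site is covered by "Disallow: /".
    page = "https://www.example.com/news/article"
    for agent in ("GPTBot", "CCBot", "SomeOtherBot"):
        print(agent, is_allowed(agent, page))
```

Note that robots.txt is advisory: it works only if the crawler chooses to honor it, and it does nothing for copies of the content hosted on other sites.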

At this point, the copyright issues associated with OpenAI’s generative AI tool ChatGPT are common knowledge. Many experts have explained that scraping public data to train AI tools raises ethical issues, since the authors of that data often have not given informed and meaningful consent to its use.

While others (like the Authors Guild and Getty Images) have sought legal remedies to protect their creative work from copyright infringement, news platforms have taken an alternative approach by limiting OpenAI’s access to their content. It remains to be seen how successful this will be, given that content cross-posted to other sites or to syndication platforms would still be discoverable by the web crawler.

Besides stating that it will not allow web crawlers on its websites, Davies also outlined three principles that will shape the BBC’s work with generative AI:

  • Will act in the best interests of the public: He says that the company will explore how it can deliver greater value to its audience. At the same time, he explained, the BBC will also attempt to mitigate the challenges created by generative AI, “including trust in media, protection of copyright and content discovery.” He said that the BBC will work with tech companies, media operators, and regulators to “champion safety and transparency in the development of Gen AI and protection against social harms.”
  • Will prioritize talent and creativity: Davies says that the BBC will work with reporters, writers, and broadcasters to explore how they can use generative AI. He gave an assurance that the company will “always consider the rights of artists and rights holders when using Generative AI.”
  • Will be open and transparent: He mentioned that the company will be transparent with its audience when Generative AI output features in its content and services. He explained that the BBC will never rely solely on AI-generated research in its output. 

Davies said that the company intends to use generative AI in multiple projects. “These projects will assess how Gen AI could potentially support, complement or even transform BBC activity across a range of fields, including journalism research and production, content discovery and archive,” he explained. 

