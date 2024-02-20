Reddit has signed a deal with an unnamed artificial intelligence (AI) company that allows the company to train its AI models on Reddit’s data. The deal, which is worth $60 million on an annualized basis, could still change because Reddit’s plans to go public are still in the works, according to a report by The Verge.

While news of the deal is recent, Reddit’s intent to paywall access to the platform’s data isn’t. In June 2023, Reddit changed its application programming interface (API) policy, introducing a new premium access point to Reddit’s Data API for third-party apps (those built on top of Reddit, and used by moderators on the platform) that require higher data usage limits. The policy change was consistent with an interview in which the company’s CEO and co-founder, Steve Huffman, said he did not want to give Reddit’s data away for free to large companies. He pointed out that people share a lot of personal details on the platform, which could help large language models (LLMs, such as those behind ChatGPT) generate better results.

According to an October 2023 report by The Washington Post, Reddit met with top generative AI companies to negotiate payment for its data. The company was considering blocking search crawlers from Google and Bing if a deal couldn’t be reached. Although doing so would have prevented the platform from being discovered through searches, the company was willing to accept that trade-off.

Why it matters:

Accessing data without paying for it is proving problematic for AI firms, especially given the legal trouble faced by OpenAI (the company behind ChatGPT). Last year, the company faced five copyright lawsuits for using writers’ content without proper compensation. More recently, The New York Times sued OpenAI and Microsoft for training their AI models on copyrighted articles. Reddit’s deal could encourage other social media platforms to charge AI firms for access to their data for training AI models.

