2024-02-28
User Data for Sale? Automattic in Talks with OpenAI and Midjourney: Report

Automattic assures user control amid potential data deals with AI companies, but questions linger over privacy.

Automattic, the parent company of Tumblr and WordPress.com, is in talks with OpenAI and Midjourney to sell user data, according to a 404 Media report. It is unclear what type of data would be sold to the two artificial intelligence (AI) companies. However, internal documents reveal that a query run to prepare data for OpenAI and Midjourney compiled a huge number of user posts that it wasn’t supposed to include. The report states that it is unclear whether these user posts were sent to the AI companies or whether the internal document discussed the process of the data being scrubbed. None of the companies involved has confirmed the deal.

Automattic wouldn’t be the first company to sell its data to an AI company. Earlier this month, it was reported that Reddit signed a $60 million-a-year deal to make its content available for training Google’s AI models.

Automattic’s statement on user choice:

While the deal has not been confirmed, Automattic has added a statement to its website saying that it is “working directly with select AI companies as long as their plans align with what our community cares about: attribution, opt-outs, and control.” It assures users that its partnerships with AI companies will respect opt-out settings. Automattic plans to regularly update AI partners about people who newly opt out and to ask that their content be removed from past sources and future training.

The company says it currently blocks AI platform crawlers by default. It also offers a setting on WordPress.com and Tumblr to discourage search engines from indexing a site, which it claims signals search engines not to crawl that content or include it in search results.

Why are we seeing deals between platforms and AI companies?

Accessing data without paying for it has landed OpenAI in legal trouble several times in the recent past. Last year, the company faced five copyright lawsuits for using writers’ content without proper compensation. More recently, OpenAI and Microsoft were sued by The New York Times, with the publication claiming that ChatGPT and Microsoft’s Bing were generating output that recites Times content verbatim, closely summarizes it, and mimics its expressive style. All this has deprived the Times of subscription, licensing, advertising, and affiliate revenue, the publication stated.

Given the legal trouble caused by using data without some sort of agreement with publications and creators, it makes sense for OpenAI and other AI companies to enter deals similar to the one Reddit reportedly signed and the one Automattic is reportedly negotiating.

