Section 2.1 of the new policy states that:
“We may use the information we collect and publicly available information to help train our machine learning or artificial intelligence models for the purposes outlined in this policy.”
Elon Musk tweeted a clarification that this will only include public data, not direct messages or anything private.
The new policy goes into effect on September 29, 2023.
In July, Twitter imposed various restrictions on its platform to stop AI services like Google Bard and ChatGPT from scraping its data to train their models. Elon Musk even threatened to sue Microsoft over this. Twitter now appears to want to leverage all that data for its own advantage.
The major concerns with scraping data to train AI models are copyright, attribution, and privacy. Although data shared on the platform is publicly available, it loses its provenance when it is scraped to train models. The scraped data might be copyrighted or contain personal data. For example, Elon Musk has been encouraging journalists to publish directly on X. What if this data is used to train AI models without proper attribution to these journalists? Or what if you share some artwork on the platform that is later used to train an AI image-generating model?
One X user pointed out that we might be overreading this development and that the new clause is just boilerplate language for collecting data to improve ad targeting. But because of how broadly the clause is worded, concerns linger:
The ads are only provided as one example and are not the only thing they are using your information for. That doesn't matter because "we may use your information to train artificial intelligence models" is vague enough to include using your posted art to train AI art models.
— Moepi (@itzmoepi) September 4, 2023
Separately, X’s new policy also states that it may collect biometric information “for safety, security, and identification purposes.” While the policy does not define biometric information, the term generally covers facial data, fingerprints, iris scans, and similar identifiers. It would be worrying if this data were used by X to train AI models to build facial recognition technology, but we hope that will not be the case, since X says it will only use publicly available data for AI training purposes.