Twitter “may” label tweets containing synthetic and manipulated media, including videos, audio and images, and will remove such content if it is “deceptively shared” and poses “serious harm”, the company said on February 4. The policy will come into effect on March 5, 2020. According to a video shared by Twitter Safety, users will be able to get more information from “reputable sources” on a piece of content labeled as “manipulated media”. In October 2019, Twitter had said that it was working on a new policy to “address synthetic and manipulated media” on the platform and had sought public comments on it.

How does Twitter categorise manipulated media? Twitter will determine if a piece of content has been “significantly and deceptively altered or fabricated” by assessing:

  • Whether the content has been substantially edited in a manner that fundamentally alters its composition, sequence, timing, or framing
  • Whether any visual or auditory information (such as new video frames, overdubbed audio, or modified subtitles) has been added or removed
  • Whether media depicting a real person has been fabricated or simulated

If a piece of content fits any of the above criteria, Twitter “may” label it. Twitter will not label or remove media that have been edited in ways that do not fundamentally alter their meaning, such as retouched photos or colour-corrected videos. It will determine whether media have been significantly manipulated using its own technology or through partnerships with third parties. If it cannot “reliably determine” whether media have been altered or fabricated, it “may” not label or remove them.

  • While Twitter hasn’t explicitly named deepfakes (videos or audio edited using AI/ML), it said that “synthetic and manipulated media take many different forms and people can employ a wide range of technologies to produce these media”.
  • It isn’t clear what would happen if a piece of content gets mislabeled. Is there any recourse for such cases? We have reached out to Twitter for more information.

Determining the context of manipulated media: Twitter said that it will assess the context in which media are shared and whether that could lead to “confusion or misunderstanding” or reflects a “deliberate intent” to deceive people. Other factors it will use to assess context are:

  • The text of the tweet accompanying or within the media
  • Metadata associated with the media
  • Information on the profile of the person sharing the media
  • Websites linked in the profile of the person sharing the media, or in the tweet sharing the media

Crackdown on manipulated media if they pose ‘serious harm’: Twitter will crack down harder on manipulated media if they are presented as truth or are “likely to impact public safety or cause serious harm.” Specific harms include:

  • Threats to the physical safety of a person or group
  • Risk of mass violence or widespread civil unrest
  • Threats to the privacy or ability of a person or group to freely express themselves or participate in civic events, such as:
    • Stalking or unwanted and obsessive attention
    • Targeted content that includes tropes, epithets, or material that aims to silence someone
    • Voter suppression or intimidation

Twitter clarified that it would “err toward removal in borderline cases that might otherwise not violate existing rules for Tweets that include synthetic or manipulated media”. It also said that it is “more likely” to remove content under this policy if it finds that “immediate harms” are likely to result from the content’s presence on Twitter. We have reached out to them to understand the time frame for assessing “immediate harms”.

How the labelling and removal process will work: Twitter said that it would provide “additional context” on tweets containing such manipulated media. It “may”:

  • Apply a label to the content
  • Show a warning to people before they share or like the content
  • Reduce the visibility of the content on Twitter and/or prevent it from being recommended
  • Provide a link to additional explanations or clarifications, such as in a Twitter Moment or a landing page

Twitter said that it will take all of the above actions in “most cases”.

To assess whether media need to be removed, the platform provided the following chart:

[Chart: How the removal process will work]
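Since the chart itself isn’t reproduced here, the decision logic it describes can be approximated from the criteria in the policy text. Below is a minimal, illustrative sketch in Python; the function name, parameters, and outcome strings are our own reading of the article, not Twitter’s actual chart or implementation:

```python
# Illustrative only: a rough model of how the three criteria described in the
# policy might combine. All names and outcomes are assumptions, not Twitter's.

def assess_media(significantly_altered: bool,
                 deceptively_shared: bool,
                 likely_serious_harm: bool) -> str:
    """Return the action the policy suggests Twitter 'may' take."""
    if significantly_altered and deceptively_shared and likely_serious_harm:
        # Deceptive, manipulated media posing serious harm: strongest response
        return "likely removed"
    if significantly_altered and deceptively_shared:
        return "likely labeled"
    if significantly_altered:
        # Manipulated but not shared deceptively: labeling is discretionary
        return "may be labeled"
    if deceptively_shared:
        # Unaltered media shared deceptively (e.g. under a false caption) fall
        # back on context assessment; the policy leaves the outcome open
        return "context assessed"
    return "no action"

# Example: an unedited clip shared with a caption later found to be false
print(assess_media(significantly_altered=False,
                   deceptively_shared=True,
                   likely_serious_harm=False))  # -> "context assessed"
```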

A few questions: Since the removal of manipulated media is contingent on the media being deceptive and posing “serious harm”, we have asked Twitter whether an 18-second clip shared by BJP’s Tajinder Pal Singh Bagga, in which he claimed that slogans against Hindus were raised at a protest at the Gateway of India, would be labeled or removed. The video shared by Bagga was a clipped portion of a longer video and wasn’t edited further, though the caption accompanying the video was found to be false by Alt News.

  • What would happen if a video of an anti-CAA rally is captioned as a pro-CAA rally? Will such videos be labeled or removed?

Big tech attempts to battle deepfake videos and manipulated media

  • Twitter’s policy on dealing with manipulated media comes after Facebook released its own set of rules regulating deepfakes and other forms of doctored videos on its platform. YouTube, too, reiterated how it would deal with doctored videos, especially those related to the US presidential election.
  • In September 2019, Facebook and Microsoft announced the Deepfake Detection Challenge (DFDC) to produce technology that can be used to detect deepfake videos.
    • In October 2019, Amazon Web Services said that it would work with Facebook and Microsoft on the challenge and contribute up to $1 million in AWS credits to researchers and academics over the next two years.
  • In September 2019, Google released a large dataset of visual deepfakes to help researchers directly support deepfake detection efforts.