Fifteen human and digital rights organisations have released an updated version of the Santa Clara Principles. The Santa Clara Principles, which were first released in 2018, laid down minimum requirements of transparency and fairness in social media companies’ content moderation processes.
The new principles expand on those requirements and ask companies to consider human rights, cultural contexts, clarity, state involvement, and algorithmic fairness in enforcing their policies, while making disclosures about the same. The principles also have a few recommendations for governments on how they deal with social media companies.
According to Wired, the principles were endorsed by various social media companies, including Facebook, YouTube, Twitter, Reddit, and GitHub, after they were first released. The new principles could lead to new transparency and content moderation policies being followed by social media companies.
What are the new principles?
Human rights and due process
Companies should consider human rights and due process in their content moderation processes, use automated processes only when there is high confidence in their accuracy, and provide users with clear avenues to appeal moderation decisions and actions, the principles say. They should do so by:
- Clearly outlining to users how they integrate human rights considerations in their content moderation policies.
- Informing users of the extent to which the company uses automated processes in content moderation and how it incorporates human rights considerations into them.
- Explaining how the company has considered the importance of due process in the enforcement of its rules and policies, and how it has maintained their fairness and integrity.
Clear rules and policies
Content moderation rules should be clear, precise and available in an easily accessible location for users. These should have the following:
- Types of content prohibited by the company with detailed guidance and examples.
- Types of content against which other actions can be taken, such as algorithmic downranking, with detailed guidance and examples for each type of content and action.
- The circumstances under which the company will suspend a user’s account, whether permanently or temporarily.
Cultural competence
The principles say ‘Cultural Competence’ means that those making content moderation and appeal decisions understand the language and the social, political, and cultural contexts of the posts. This should be implemented by:
- Providing access to rules, policies, notices, and appeal proceedings in the user’s language.
- Ensuring that moderation decisions are made by those familiar with the relevant language or dialect, and with sufficient awareness of any relevant regional or cultural context; and
- Reporting data that demonstrates their language, regional, and cultural competence for the users they serve, such as numbers that demonstrate the language and geographical distribution of their content moderators.
State involvement in content moderation
The principles ask that companies recognise that state involvement in content moderation poses risks to user rights. They particularly flag state involvement in the creation of a company’s rules or policies, as well as requests from government bodies such as courts and law enforcement agencies. They ask that users be informed when their content is taken down as a result of state action, and told which law required it. Specifically, they ask that users be given the following information:
- Details of any rules or policies, whether applied globally or in certain jurisdictions, which seek to reflect requirements of local laws.
- Details of any formal or informal working relationships and/or agreements the company has with state actors when it comes to flagging content or accounts or any other action taken by the company.
- Details of the process by which content or accounts flagged by state actors are assessed, whether on the basis of the company’s rules or policies or local laws.
- Details of state requests to action posts and accounts.
Integrity of the moderation processes
The principles raise concerns about the abuse, reliability, and effectiveness of automated content moderation systems, and recommend that companies take the following steps:
- Clearly outline what controls users have access to which enable them to manage how their content is curated using algorithmic systems, and what impact these controls have over a user’s online experience.
- Give users a high level of confidence that content moderation decisions are made with care and consideration to human rights. Users should also have a high-level understanding of the moderation process.
- Actively monitor the quality of their decision-making to ensure high confidence levels. Companies are also encouraged to publicly share data about the accuracy of their systems and to open their processes and algorithmic systems to periodic external auditing.
Principles for State actors
Removing barriers to transparency: The principles ask that governments allow companies to reveal the number of requests for information or content takedowns, except where prohibiting such disclosure has a clear legal basis and achieves a legitimate aim. They further ask governments to remove barriers, and not introduce new ones, that prevent companies from complying with the principles.
Being transparent themselves: Governments should reveal the number of content moderation actions they were involved in, such as content takedowns or account suspensions, along with the legal basis for each. This should account for all state actors, including subnational bodies.
The principles also ask that governments consider how they can make social media companies more transparent, in accordance with these principles, through regulatory or non-regulatory mechanisms.
Updates to the first principles
The new principles are split into foundational and operational principles. The latter provide ‘more granular expectations for the largest or most mature companies with respect to specific stages and aspects of the content moderation process’, according to the authors. The operational principles formed the previous iteration of the Santa Clara Principles, but have now been significantly expanded.
Data to be reported
Apart from the data mentioned in the recommendations above, the principles ask that the following data be reported quarterly, in an openly licensed and machine-readable format.
The new principles recommend that companies report the number of pieces of content and accounts actioned, by region or country and by rule, along the following categories:
- Total number of pieces of content actioned and accounts suspended
- Number of successful and unsuccessful appeals to such decisions
- Number of successful and unsuccessful appeals to decisions taken by automated systems
- Number of posts or accounts reinstated by the company without an appeal, thus recognising an error
- Numbers related to enforcement of hate speech policies, by targeted group or characteristic
- Numbers related to content removals and restrictions made during crisis periods, such as during the COVID-19 pandemic and periods of violent conflict.
On involvement of state actors:
The principles ask for special disclosures, over and above the earlier principles on state involvement in moderation. These should be broken down by country:
- The number of demands or requests made by state actors for content or accounts to be actioned
- The identity of the state actor for each request
- Whether the content was flagged by a court order/judge or other type of state actor
- The number of demands or requests made by state actors that were actioned and the number of demands or requests that did not result in actioning.
Data related to flagging of posts
The principles say that in order to prevent abuse of a company’s flagging mechanisms, companies should also report:
- Numbers of posts flagged during a period
- Number of such flags traced to bots
- Total number of accounts and pieces of content flagged, broken down by:
i) Alleged violation of rules and policies
ii) Source of the flag (state actors, trusted flaggers, users, automation, etc.)
Information related to automated systems:
Flagging the increased involvement of automated systems in content moderation, the principles ask for additional disclosures on the following areas:
- When and how automated processes are used (whether alone or with human oversight), and the categories and types of content they are used on
- The key criteria used by automated processes for making decisions
- The confidence/accuracy/success rates of automated processes, including changes over time and differences between languages and content categories
- The extent to which there is human oversight over any automated processes, including the ability of users to seek human review of any automated content moderation decisions
- The number (or percentage) of successful and unsuccessful appeals when the content or account was first flagged by automated detection, broken down by content format and category of violation
- Participation in cross-industry hash-sharing databases (which contain digital fingerprints of content already known to violate rules) or other initiatives, and how the company responds to content flagged through such initiatives.
The previous iteration required data to be disclosed along only six dimensions: the total number of posts and accounts flagged, and the total number of those suspended, with this information further broken down by location, source of flag, format of content, and category of rule violated.
How should a user be served a notice?
The new principles require companies to include the following in a notice sent to a user when their account or content is actioned:
- The user should be specifically informed if the content was removed because it violated local law, and not the network’s policies
- The notice should be in the language of the original post, or of the user’s interface.
- In case of content removal, a notice should be placed at the original location of the post. Where appropriate, the principles say, other relevant individuals, such as flaggers or group administrators, should also be sent a notice.
The previous iteration of the rules laid down the following requirements which have been retained:
- Information such as the URL, or an excerpt of the content, to allow the user to identify the content that was actioned.
- The specific clause that was violated should be included in the notice.
- The user should be informed of channels to appeal the decision, and any time limits or other conditions therein.
- The notice should also be in a form where it is still accessible, even if the user’s account is suspended or terminated.
- How the content was flagged, whether by a state actor, automated system, trusted flagger, etc.
- The individual who reported the content should also be provided with a log of the result of the moderation process for that content.
The process of appeals
The new principles add that companies should:
- Ensure that the person deciding the appeal is familiar with the language or cultural context of the content under action.
- Apply the principle of proportionality, prioritising appeals against the most severe restrictions, such as content removal and account suspension.
- Take into consideration that there could be special circumstances where the appeals process may have to be expedited, for example because the content is time-sensitive or because the affected user may be the target of an abusive takedown scheme.
The previous iteration of the rules laid down the following requirements which have been retained:
- The appeals process should be clear and accessible, and users should be provided with a timeline and the ability to track its progress.
- A person or a panel of persons who were not involved in the initial decision should take up the appeal.
- Users should be allowed a chance to present additional information during the appeal.