We missed this earlier: A bill introduced in the US Senate would require social media platforms to disclose information about their most popular content, algorithm usage, content moderation policies, and violations of those policies.
“Recent events, including the leak of documents as part of the ‘Facebook Papers,’ have underscored the public value of this data and a need for legislation that increases the platforms’ transparency,” Senators Chris Coons, Rob Portman, and Amy Klobuchar said while announcing the Platform Accountability and Transparency Act (PATA).
The bill looks to provide researchers and the public with more information about the workings of platforms as they gain more relevance in everyday life, according to a press release. It also proposes to establish the Platform Accountability and Transparency Office, which would vet, approve, and facilitate the disclosure of information to researchers studying platforms. The Office would also be responsible for coming up with guidelines that ensure cybersecurity and privacy are maintained while disclosing such information.
Why it still matters: There have been recent calls for platforms to pull back the curtain on their algorithms; for instance, Elon Musk has advocated for the same with respect to Twitter. In the past, the Ministry of Electronics and Information Technology (MeitY) also wrote to Meta asking for details of its algorithms. The bill also remains relevant in the research context, as platforms themselves have indicated a willingness to release certain detailed information.
What are platforms required to disclose under the bill?
Details of advertisements: The bill requires platforms to disclose the following information concerning advertisements, within a year of its enactment:
- Name and ‘unique identifier’ of the advertiser
- Copy of the ad shown to users
- Targeting criteria selected by the advertiser and criteria used to deliver the ad
- Metrics about the extent of dissemination or engagement with the ad, including dates active and ad spending
- Metrics about the audience reached with the ad, their demographic or geographic data
- Whether the ad was determined to violate platform policies
Popular and prevalent content: Platforms have to proactively share samples of certain public content, weighted by the engagement it received, as per the bill. The following information should be shared about content that either originated from or was spread by a major public account:
- The underlying content itself, including any public uniform resource locator (URL) linking to the content
- Metrics about the extent of dissemination of or engagement with the content
- Metrics about the audience reached with the content
- Information about whether the content has been determined to violate the platform’s policies
- Information about the extent to which the content was recommended by the platform or otherwise amplified by the platform’s algorithms
- Information about the user accounts responsible for the content (including whether such accounts posted content deemed violating in the past)
The bill defines a ‘major public account’ as one with at least 25,000 followers or whose content is viewed by at least 100,000 users per month.
Content violating platform policies: The following information regarding content moderation should be provided regularly:
- Statistics regarding the amount of content that violated the platform’s policies, broken down by:
- the violated policy
- the action that was taken in response to the violation
- the methods the platform used to identify the violating content (such as AI, user report, human moderator review, or other means)
- the extent to which the content was recommended or otherwise amplified by platform algorithms
- the extent to which users chose to follow the account that originated or spread the violating content, and whether the platform had recommended that account to them
- Statistics regarding the number of times violating content was viewed by users and the number of users who viewed it
- Estimates of the prevalence of violating content (measured by the number of impressions it received), broken down by the factors listed above
- Representative examples of violating content should also be shared with researchers
Metrics and workings of algorithms: The bill says that the following information should be released at least semi-annually:
- A description of all product features that made use of algorithms during the reporting period
- Summary of signals and features used as inputs to the described algorithms, including an explanation of all user data incorporated into these inputs, ranked based on the significance of their impact on the algorithms’ outputs
- Summary of data-driven (AI/ML) models used in the described algorithms, including the optimisation objective of such models (such as predictions of user behavior or engagement)
- Summary of metrics used by the platform to score or rank content
- Summary of metrics calculated by the company to assess product changes or new features, with an assessment of their relative importance in company decision-making
- A description of significant datasets in the platform’s possession related to content or users of the platform, enforcement of content policy, or advertising, as necessary or appropriate to inform and facilitate researcher data access requests
- Significant changes during the reporting period from the last report
What are the rules for researchers?
The proposed Platform Accountability and Transparency Office (PATO), established under the Federal Trade Commission (FTC), will create regulations for the sharing and security of all information shared under the bill, including data given to researchers. The US National Science Foundation (NSF) will vet research applications and the data they need from platforms.
- While considering such a proposal, the NSF will consult with the PATO to note any cybersecurity or privacy challenges that will need to be addressed in the latter’s regulations.
- Once a project is approved, the PATO will lay down and inform the platform of cybersecurity and privacy safeguards that it will have to follow while sharing such data. These safeguards could include encrypting the data or delinking it from individuals’ data.
- After this, the platform can submit its comments about PATO’s cybersecurity and privacy safeguards within 20 days.
- The PATO shall issue a final set of safeguards within 20 days.
- Once the research is complete, a pre-publication version must be submitted to the PATO and the platforms at least 30 days prior to the research being ‘publicly released’.
- Within 15 days, the PATO or the platforms can review the research and raise objections if it violates any applicable laws or the Commission’s regulations, or if it would release any personal or confidential information or trade secrets. If no objections are raised, the research can proceed.
- In case of objections, the researcher can resubmit a modified version of the research to both parties within 120 days. In case of objections raised by platforms, researchers can contest them further.
- In case the researcher contests the platform’s objection, the PATO will decide on the matter within 50 days of receiving the researcher’s contestation.
What are the carve-outs for researchers?
The bill proposes a ‘safe harbour’ provision for researchers using data facilitated under it for their research projects: no action can be taken against such researchers under any local, state, or federal (US) law for violating platform policies simply by using the data provided to them under this bill, as long as they comply with the privacy and cybersecurity safeguards laid down by the PATO.
NSF to publish information on the selection process for researchers: The NSF has to establish a process for soliciting research applications, publish a list of its criteria for identifying qualified research projects, and publish a list of its criteria for identifying the data necessary to conduct research under the bill; all within six months of the bill’s enactment.
The data that is identified by researchers should be feasible for the platform to provide; be proportionate to the needs of the qualified researchers to complete the qualified research project; and not cause the platform undue burden.
Research should be approved or exempt from institutional review board: A research project cannot be approved by the NSF unless the institutional review board (IRB) of the university that the researcher is affiliated with has approved the project, or the project is exempt from or falls outside the criteria for IRB review. An IRB is an independent body instituted by universities in the US that seek government aid for research projects.
Platforms must provide continued access to data: “Platforms must enable qualified researchers to preserve access to qualified data and information (research data to be supplied for approved projects) as necessary to carry out qualified research projects (i.e researcher projects approved under the bill),” the bill says.
Terms for using data accessed by researchers: Under the bill, researchers cannot attempt to re-identify or access any personal information from the data provided to them, and can use the data only in accordance with the terms of their research project, the PATO’s safeguards, and local, state, and federal information-sharing and privacy laws. If a researcher is found to be in violation of these terms, they can face action under the applicable local, state, or federal laws.
What if the platform does not comply?
If a platform violates any provision of the bill, the FTC can file a civil action in a US district court, which could result in fines of up to $10,000 for each violation. Researchers can also file such an action, or urge the FTC to file it, if a platform does not provide them with the data they need for a research project approved under the bill.
This post is released under a CC-BY-SA 4.0 license. Please feel free to republish on your site, with attribution and a link. Adaptation and rewriting, though allowed, should be true to the original.
- Explained: Facebook’s tussle with researchers studying its algorithms and political ad data
- Meta says it will publish data on targeted ads from June
- Summary: What are the new Santa Clara Principles on transparency from social media platforms?