- Issues with definition of communities: The definition of a community is too broad and ambiguous. Besides, several digital communities, which would produce data, may not even be real world communities.
- Need to seek more accountability from data trustees: Trusts don’t necessarily ensure transparency, and as a result, data trustees need to be held to a higher standard. They should publish annual transparency reports highlighting the high value datasets (HVD) they create, and should be open to judicial scrutiny.
- High value datasets may have privacy implications: Since the definitions of a data trustee and a high value dataset are too broad, high value datasets might contain personally identifiable information (PII), giving rise to privacy concerns.
- Will data sharing actually help small companies? It appears that the report in the current version may end up favouring large software companies, as opposed to smaller firms. On the other hand, domestic oligopoly isn’t a viable alternative to challenge digital colonialism.
- A separate law should govern sovereign access to non personal data: As per the current report, the Non Personal Data Authority won’t adjudicate on data access for sovereign purposes. However, there is a need for a separate law to govern that.
“I think there are a couple of references still remaining to eminent domain and that’s where I start getting really uncomfortable because using a land rights framework, or a forest rights framework for data can actually set consumers back. This is because so many of the data communities aren’t real world communities. It is not like they are people of a particular tribe. They are just somebody in a dataset somewhere, who are vested with rights and expected to exercise them,” said Malavika Raghavan, a lawyer and researcher.
Raghavan was speaking at MediaNama’s January 15 discussion on the revised report by the Committee of Experts on a Non-Personal Data Governance Framework. The discussion was hosted with support from Facebook and Microsoft.
Lack of clarity on definition of communities, role of data trustees
- Definition of a community ‘ambiguous’: The way the community concept is defined in this report, if rights are to be vested in communities, makes it a “slightly ambiguous concept” that might get too wide, because it covers geography, social interest, and anybody bound by any interest, said Raghavan. “Many of these data driven communities aren’t even aware that they are a community because you do not know that your data is of high value, because you all own a particular kind of car, for instance. That kind of community, organically creating its own data trustee, registering it as a Section 8 company and then using that to protect its data is a pretty difficult thing for me to wrap my head around,” she added.
- ‘Community can’t be as concrete as an individual’: A community is defined in the report, but there is of course never such a concrete boundary of a community as of an individual, said Parminder Jeet Singh of IT for Change, a member of the Committee of Experts. “As far as who can set up high value datasets goes, actors who have a stake in that high value dataset and can get involvement of enough community actors into making that high value dataset, and gets an acceptance from the Non Personal Data Authority, can create a high value dataset,” he added. Itihaasa Research and Digital’s Dayasindhu N said that interesting opportunities might arise when businesses, especially Indian businesses, combine multiple high-value datasets to get insights on the market, or on a segment that they are looking to enter.
- Data trusts don’t always ensure transparency: “Trusts don’t always equal transparency,” said Raghavan. “A trust is often something that is created as an intermediate step, which does help liability management and risk management, but if done in a way that is not empowering for the trustees or the people who are beneficiaries of the trust, it in fact can be problematic,” she added. When an attendee asked how the gap between citizens and data trustees would be bridged and the information asymmetry reconciled, Raghavan said she currently didn’t see any safeguards for this, and that “somebody acting on behalf of a community can actually end up entrenching that distance”.
- In response, Singh contended that data trusts and a community rights approach reduce the government’s overbearing role in the entire process, while saying that some amount of mandatory data sharing will always take place. “The role of the government has been reduced by the community data, and community rights framework. Even the EU is asking for certain kinds of data, there is mandatory data sharing. Some mandatory data sharing is going to take place. Whenever mandatory data sharing takes place you will always say that the government is exercising its eminent domain rights. I think by putting a community framework it reduces that fear, it gives us more chances to fight back. It does not completely remove that fear, however,” said Singh.
- Need to ask more questions on functioning of data trustees: Singh also said that more questions should be asked about how community trustees will manage their activities. “One of the biggest things to be done next is to write a letter about the governance structure of the community trustees who manage their activities. And, I think my personal view is that a paragraph should be added to that,” Singh said.
- Data trustees should ensure transparency and accountability: “On data trustees, I’d say, your ownership and your other kind of operational links with any other groups, and specially those who might benefit from the creation of this high value dataset should be tested, because you don’t want that kind of conflict of interest,” said Subhashish Bhadra of the Omidyar Network. To make data trusts better, Bhadra suggested:
- There should be transparency; a data trustee should publish some sort of annual report, highlighting which high value datasets it created, who had access to them, and whether it denied any requests.
- There should be an independent judicial arm to hear matters on how a data trustee asks for data.
- The data trustee should not be discriminatory towards incoming data requests.
Privacy implications of high value datasets: “I think the role of the data trustee and the definition of the high value datasets is extremely open winded. Everything can basically be a high value dataset. And that would have been okay if it was only data which had nothing to do with the community or generated by the community, and was all just industry data. The issue is that a lot of that information does have PII underlying it, or it is generated in relation to anonymisation,” Raghavan said.
‘Public good’ too broad a term
- ‘Public good’ needs to be better defined: Public good in the report is a broad term, as per Raghavan. “It can’t just be public good for something totally disconnected from non-personal data sharing or the related objectives of this legislation. And that is why I think that needs to be clarified,” she said.
- Trilegal’s Jyotsna Jayaram reiterated this assessment, and called for better definition of public interest and public good. “I completely appreciate that something like public good or public purpose ought to be vague and broad, because you want it to evolve with the changing needs of the community, and the several considerations. But given that here, public good is the ground on which you will need to mandatorily share data, I think at the very least, it should not be as broad and as exhaustive and left to anybody’s guess”, she said.
- ‘Public good is an ever-evolving notion’: On the other hand, Singh said that public interest has always been open, and that the whole idea of the political process is that it defines public interest in different situations.
“Even private resources can be expropriated in the name of public interest. There is something called standard essentials patenting—some standards, if they are essential in public interest can be made available on a need basis. Now, what is public interest is in that context determined by that standards body or those bodies around it. In public interest, again compulsory licensing against patents can be done. Now what is public interest is always a time to time determination,” Singh said.
- Dayasindhu said that defining public good is a balancing act. “It is a difficult balance between making it broad so that you want to include as many public good purposes, and at the same time, define certain examples of public good. This will have to evolve, as there would be newer domains of public good which emerge,” he said.
- “Certain kinds of infrastructures are considered public interest—certain kinds of transportation data, certain kinds of weather data. If data in a sector is considered of a cross-cutting infrastructural nature, it can be considered to be in public interest,” said Singh.
Does sovereign access to non personal data require a separate law?
“Conveniently or inconveniently, the committee has said that we are not talking about the sovereign access to non personal data. Now two things can be read about it: one is it is wrong,” said Singh. He added:
“Non-personal data is very powerful data. It is powerful in the hands of corporations and it is powerful in the hands of the state. The intelligence which non-personal data gives, the patterns which non-personal data gives is also very powerful. Therefore, there needs to be a check and balance, vis-à-vis what can be shared with the government and what the government can ask for” — Parminder Jeet Singh, IT for Change on the government’s unabated access to non personal data under the revised report
Singh explained that if the committee were to say that the Non Personal Data Authority (NPDA) would adjudicate on sovereign purposes, it would have to enter into some kind of principles development. “You cannot just set up an authority and say this authority will deal with what is happening. You will have to lay down principles on what it will deal with and how it will deal with those things. Since the committee did not enter into that, it simply had to say that this NPD Authority would not, on its own, be making those decisions because this committee gives no basis or principles or guidelines to do that,” Singh said.
“My personal opinion is that this is very important area, I think the law on data sharing for the domestic industry, and the high value data sets, for what I call a digital industrialisation purpose—which is the main purpose of this committee—should be treated in a separate way than sovereign purposes, but both requires laws.” — Parminder Jeet Singh, IT for Change on whether government access to non personal data should be restricted
However, Singh said that sovereign access to non personal data isn’t the only aspect that the Committee of Experts has not dealt with. “It is not only sovereign data, there is another kind of data which is the data that platforms like Amazon and Flipkart have on users and sellers, or food delivery platforms have on restaurants. This is a very important class of non personal data which the committee doesn’t deal with,” he said. In the EU, by contrast, the latest Digital Markets Act says that restaurants will have access not only to their own data held by a platform, but also to aggregate non personal data, Singh added.
Data sharing: Data colonialism vs domestic oligopolies
- ‘Mandating data sharing not under eminent domain’: “The report very explicitly states that mandating data sharing is not under eminent domain. Data sharing is proposed under a community rights framework and an explicit connection has been made to different kinds of community resource management. So it is a community resource and anybody who holds the resource would be a trustee,” Singh explained.
- The new Data Governance Act of the EU lays down that if you are a data sharing service and you declare yourself as a data sharing service, you have to be neutral, and this becomes a voluntary system which is stamped for quality by a government institution, Singh said. Google and Amazon have their own data sharing frameworks, which are far better than anything that an independent or government system could put up.
“The society’s biggest data hoarders are currently a few global companies which are all US or Chinese, and all other countries including the EU are becoming like third world or developing countries, or colonies as the CEO of Naspers said when he was visiting India. If people across the world become dependent on these few companies then they will get de-industrialised, or de-digitalised.” — Parminder Jeet Singh, IT for Change
- Will data sharing even help smaller firms? Raghavan said that perhaps only large Indian software firms may benefit from the current data sharing mechanism. “My concern is the people who will be able to benefit from this, the current way that it is set up is basically large Indian software firms who want to access other firms’ data sets, because having spoken to smaller Indian firms, they are very worried about this as well,” Raghavan said.
- Raghavan also said that we shouldn’t give rise to domestic oligopolies just because we want to solve the problem of data colonialism. She also said that in a lot of countries there is data sharing infrastructure, but by no means is it always a public data sharing infrastructure where the public itself sets it up. “I recommend the ACCC’s [Australia’s antitrust regulator] super interesting take on ensuring that market power, innovation happen and consumers are safe. That’s probably an interesting model that we’ve missed a trick on in that if you want to incentivise voluntary data sharing in a safe and secure way, but you also don’t want to create these concentrations of powers that you know is clearly the concern over here,” Raghavan said.