Date: 2020-01-20

When it comes to data, the primary aim of the government is to earn money, and “the only way they can earn is when they actually sell personal data in a non-personal way”, said a speaker at MediaNama’s roundtable discussion on non-personal data. Speculating about the intentions of the committee of experts that is working on non-personal data, especially Avanti Finance CTO Lalitesh Katragadda, a speaker said that the committee wants to “ring fence the Indian data economy so that data that is useful for India’s national development can be used by Indians”. They likened it to the rise of similar debates in other parts of the world, including Germany (“data sovereignty” and “community data”) and France. Another participant highlighted the few stated objectives of the government when it comes to data governance: economic growth, its governance, and privacy protection.

(Note: The discussion was held under the Chatham House Rule; quotes have not been attributed to specific people. Quotes are not verbatim and have been edited for clarity and to preserve anonymity. Also note that this discussion took place before the PDP Bill, 2019, was made public.)

Why does the government want control over non-personal data?

Some private data is valuable to the community: Globally, there is now an understanding that some companies dominate the global data market. From that, a speaker highlighted, has emerged an idea that “some of this data is valuable to the community”. While how this community data is defined remains up for debate, the idea that “some data that private companies hold can be used for social good/purpose” was potentially the primary goal behind creating the committee of experts, they said.

What kind of data does the government seek to control through non-personal data governance?

Data from IoT devices: Remarking that the “government wants to create a governance framework(s) for all data”, a speaker said that the government wants “personally sourced data which is not sourced from personally identifying information”, which means data from sensors and IoT devices. Thus, according to the speaker, “a lot of the argument is about how such data has to be governed, regulated and perhaps shared with and used by start-ups”.

Factors to consider while governing non-personal data

A participant speculated that the committee on non-personal data would probably propose a policy that “incentivises return of certain kinds of data sets to a marketplace of sorts”.

Use must govern policy: A number of participants agreed that the governance of non-personal data should be determined by how the data is used. That should include establishing how the data would or should be used, who its owner is, what kind of risks it poses to users, and what thresholds apply to those risks.

However, a lawyer said that “it is not about who owns the property, but what uses are made of that property which should determine how it is regulated”, with respect to data collected by instrumentation in smart cities. They clarified that even though the basic principle of jurisprudence, governed by John Locke’s theory of labour, says that “anybody who puts effort into collecting certain information or collecting certain data becomes the owner”, in this case, it should be regulated by use.

Benefits of processing non-personal data

A certain level of granularity can help in emergencies: In certain situations, more personally identifiable data is more useful, a person said. Taking the example of a flood, they described a scenario in which a government passes a law saying it needs location information at an aggregated level for the lake for the last 24 hours. “We will store it in a particular manner, and once the situation is sorted, will delete the information. In emergency situations, certain kinds of location data, if governed well, is of immense use to society. And if that data is deleted immediately after that and not used for anything else, it is a positive net gain,” they said.

Harms of processing non-personal data

Biased, inaccurate data collection can lead to ‘horrific’ outcomes: If the aim is to create a kind of public commons of non-personal data, we need to be especially careful about the kind of inherent biases that may be built into databases, a participant warned. Cautioning that the use of AI and ML would mean that a human might not even look at some of these decisions, they said that “if these data sets are not properly curated, if the data that is fed into these databases is not accurate, significant decisions could be taken about entire populations and communities with inaccurate and biased data”.

A speaker commented on how a lot of aggregated data is sold by PR companies. The trends that then emerge are used to profile and target specific people for a whole host of purposes.

Group privacy remains ignored: A speaker cited the Sidewalk Labs project in Toronto to highlight how community data can be used for behaviour modelling and to profile communities. “Google’s open infrastructure and technology arm, Sidewalk Labs, entered into a public-private partnership with a neighbourhood in Toronto. Under this project, Google embedded passive sensors within urban infrastructure to decide how streets and traffic management is going to happen, what kind of neighbourhoods get developed, how do you locate the most essential communities to allocate resources to and so on,” they explained. While this may not look at individual data, data is still being processed to formulate public policy; group privacy is not considered, and such schemes allow for greater state surveillance as well.

What can be done to prevent some of these harms?

A lot of the harms associated with non-personal data are also possible with PII, a speaker said. To that end, they suggested some ‘Lean Data Practices’ that can be implemented to mitigate these risks. These include norms around data collection, data storage, data processing, and deletion of useless data.

Do not share certain personally identifiable information (PII) with the government: A speaker suggested that we could come up with a list of PII that should never be present in data sets that are shared with the government.

How will consent work?

A speaker pointed out that thus far, consent has not been taken into account when considering anonymised data. Another participant said that as we consider consent, it all boils down to who owns the data.

Consent doesn’t come into the picture: Since data is irreversibly anonymised by the data processor, it becomes the IP of the processor, and thus the question of consent does not really arise, argued one speaker.

Without consent, there can be other harms: Taking the instance of demographic information, a person asked if they could refuse to consent to it. “I don’t want things like my religion being used even in an aggregated way because aggregated anonymised non-personal data, which was originally PII, can lead to significant harms,” they said. They took examples of databases in Andhra Pradesh and Telangana that can be analysed for religion and yield gram panchayat-wise percentage of Hindus, Muslims, etc.

