“Personal data can properly be subject to Indian legal reach and protections only when it is in India,” said IT for Change, an NGO, in its response to the additional comments sought by the Ministry of Electronics and Information Technology (MEITY) on the Personal Data Protection Bill. The ministry had privately sought responses to fresh questions on the data protection bill from select stakeholders – a development we made public last month.

Some stakeholders have responded to the consultation despite not being contacted by MEITY for responses. S. Gopalakrishnan, Joint Secretary, MEITY, had said then that there would be no public consultations, and that the ministry had only sought clarifications from some people. Note that the questions asked by MEITY didn’t appear to be clarifications, and some of them covered new points of discussion not covered in the data protection bill consultation.

In its responses, IT for Change welcomed data localisation, claiming that “localised data has immense legal and economical value”. It also said that any framework surrounding community data should lay a general claim of “Indian data being first for Indians and its various communities”.

Here are detailed notes from IT for Change’s responses:

Why personal data needs to be stored in India

Data as an economic asset: ‘Once out of a country, a country retains no claim or control over data’

  • When personal data is anonymised, it becomes community data of, and about, the corresponding community from where it arises; this is a collective asset, with much commercial value, said IT for Change. “The first step, therefore, is to localise Indian data – personal as well as community, and then to explore its economic value and create wealth out of it,” it added.
  • It currently suits the two dominant digital global powers for there to be no legal frameworks around the economic value of data and digital intelligence; it is those who are losing or suffering outflow of data, and its enormous value, that need to seek a legal framework around the economic value of data.
  • India’s draft e-commerce policy rightly argued for localising community data. But there is no point in localising anonymised community data when the personal data from which it is derived is allowed to flow out. That renders community data localisation meaningless.
  • Formulations like ‘Data for Development’ advanced in the draft e-commerce policy, and by BRICS nations at the recent G20 meeting in Osaka, Japan, asserting a country’s right to use its data for its development, become meaningless if the data does not even stay within the country.

Private entities collect more data than the government

Soon enough, privately collected digital data will be a necessary requirement for policy making, governance and public services in practically every area, the organisation noted. “Consider, for instance, smart traffic management, which will require community transportation data that is only with Google and Uber/Ola. Without localising such data, it is unclear how it can be mandated to be shared,” it said.

  • The organisation argued that there is a misconception that data collected by the government is the most important kind, and that it is this data that should primarily be used for India’s development and wealth creation. “Data with the governments is useful, but it is today not even a fraction of the valuable data that is continuously collected in real time by various privately-owned digital platforms,” it noted.
  • In the urban transport sector, can one even begin comparing the data collected by Google and Uber with that the government has? Even in sectors like education, Google and Microsoft are siphoning off much more data than education authorities can ever imagine gathering.
  • The same is, or will soon be, the case with health, agriculture, town planning, employment, migration, and so on. Much of this most valuable data immediately goes out of India. How, then, is India planning to create wealth from its data, and prevent its misuse?

Data can be retained close to origin points and still be used to train AI

  • An important argument made against data localisation is that it will prevent India from obtaining the value of data at global scale, because that requires merging Indian data with data from other places. This is especially important in the context of AI. This has to be an important consideration, and ways must be found to mitigate any loss of ‘intelligence’ for India and Indians. Technical workarounds are emerging whereby data can be retained close to its points of origin and still be used to train AI, such as federated learning and edge computing, the organisation noted.
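
To illustrate how data can stay at its point of origin and still contribute to a shared model, here is a minimal sketch of federated averaging, one common form of federated learning. It is not drawn from IT for Change's submission: the function names, the one-parameter model and the two synthetic "sites" are all assumptions made for illustration. Each site trains on its own data, and only the resulting model weight is sent to the coordinator.

```python
from typing import List, Tuple

# Hypothetical setup: each "site" holds private (x, y) pairs that never leave it.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps for the model y = w * x on one site's data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(w_global: float, sites: List[Dataset], rounds: int = 50) -> float:
    """Coordinator loop: only model weights are exchanged, never raw records."""
    for _ in range(rounds):
        # Each site starts from the current global weight and trains locally.
        local = [(local_update(w_global, data), len(data)) for data in sites]
        total = sum(n for _, n in local)
        # New global weight: average of local weights, weighted by site size.
        w_global = sum(w * n for w, n in local) / total
    return w_global


# Synthetic data following y = 3x, split across two sites (illustrative only).
sites = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]
w = federated_average(0.0, sites)
print(f"learned weight: {w:.4f}")
```

In this toy setting the coordinator recovers the shared relationship without ever seeing either site's records. Real deployments add safeguards such as secure aggregation, since model updates can themselves leak information.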

Data as a ‘collective social resource’: What is community data, who owns it, and how to regulate it

What is community data?

The organisation lays down the following as constituents of what can be classified as community data:

  1. Data taken from publicly owned sources, like from various public infrastructures.
  2. Data taken from natural environment or sources, like weather, soil, vegetation and genetic data.
  3. AI training data, for instance, for self-driving vehicles vacuumed up from Indian roads, cars and car drivers.
  4. Behavioural data about Indians – and their various sub-groups – of a million different kinds, taken for instance from social media.
  5. E-commerce and other data collected by digital platforms in different sectors, which is anonymised.

Since such data arises from a given community, it can be considered as a collective social resource of that community. Further, such data’s main value is in that it contributes systemic intelligence about a group or community. It should be an easily acceptable moral principle that a group or community should primarily own and control systemic intelligence about and over it, argued the organisation.

Framework for managing and regulating community data

“A community data framing is important not just for the rights of Indians in general, but also the economic rights of its various groups and communities, including different economic groupings. How different economic actors and groups will fare in a digital economy is in significant ways tied to what rights they have over various data,” argued the organisation. It proposes two alternative approaches for creating a framework around community data law:

  • It should be based primarily on property rights of data collectors, over which public interest exceptions can be added as needed.
  • Or it should anchor in primary economic rights of the group or community whose data is collected.

The framework will have to make the following things clear:

  • What kinds of data are covered under community rights frameworks, and what are not?
  • What kind of limited data rights or incentives may be provided to the involved private players, and how, as per the circumstances and community/public interest needs?
  • Which bodies should ideally represent group/community ownership in any given case, and how will that ownership be exercised?

The organisation also noted that the framework will have to form a legal basis for:

  • Localisation of community data.
  • Mandating the sharing of data.
  • Effective regulatory powers over digital platforms whose business is primarily based on employing such data.

Who owns community data?

E-commerce platforms use data about goods listed on them by third-party sellers and manufacturers to replace the latter, taking on the same trading and/or manufacturing themselves in a manner that out-competes the original suppliers, said the organisation. It then asked: who owns the data about goods put on e-commerce platforms, the platform or the original traders and manufacturers?

  • “Similar issues can be raised about farm data – whether drone pictures, or data about soil, local pests, or micro-climate. Do farmers owning the farms collectively own such data, or those who collected the data in various ways and would control digitally-enabled agriculture services employing it?” asked the organisation.

Data Sharing: ‘Extensive national sharing and availability of data is necessary’

IT for Change argued that “extensive national sharing and availability of data is necessary for any chance of India succeeding in a digital and AI economy”. Almost all national AI strategies, including those from the UK, France and the NITI Aayog, treat data sharing as central to AI success, it noted. “Global digital corporations’ main strategy is to maintain exclusive access to data, as their key resource. It is absolutely vain to keep imagining – as most AI strategies unfortunately do – that these main hoarders of data will voluntarily share their data, or even sell it in some open and fair data market,” it said.

The organisation said that extensive data sharing can be ensured by:

  • A framework law/policy on community data which provides the legal and policy basis for mandating data sharing.
  • Localising all such data so that any mandate for sharing is possible to actually apply meaningfully.
  • Determining various rights and privileges vis-à-vis different types of data, which may have different kinds or proportions of community-hood and private-hood, and setting up a framework, and perhaps also a dedicated body, for this purpose.
  • Developing actual mechanisms for safe sharing of such data among the concerned community. “The government of India is already working on some such data infrastructures in areas ranging from transport, to agriculture, health and education, although these efforts have not quite been pulled into an explicit framework of community data policy and data infrastructures programs,” noted the organisation.

Should the DPA regulate all non-personal data?

  • At this point, perhaps a single Data Protection Authority (DPA) may be fine. The government could consider setting up another independent executive body for promoting data sharing. Such an agency would work directly and immediately under the DPA’s remit to ensure that its activities are in accordance with data protection laws and best practices, said IT for Change.
