The government can sell ‘select’ citizens’ data to private companies and data analytics firms for “commercial use” and “generating insights” for “profits”, according to the Economic Survey 2018-19, which was tabled in Parliament on Thursday. The survey also says that “in thinking about data as a public good, care must also be taken to not impose the elite’s preference of privacy on the poor, who care for a better quality of living the most”. The survey, which has an entire section on data titled “Data of the people, by the people, for the people”, pitches for creating a Central Welfare Database of citizens by merging different databases maintained by separate Ministries and departments. Here’s what the document said about selling citizens’ data to private companies.

“The private sector may be granted access to select databases for commercial use. Consistent with the notion of data as a public good, there is no reason to preclude commercial use of this data for profit. Undoubtedly, the data revolution envisioned here is going to cost funds. Although the social benefits would far exceed the cost to the government, at least a part of the generated data should be monetised to ease the pressure on government finances. Given that the private sector has the potential to reap massive dividends from this data, it is only fair to charge them for its use”

Data could also be sold to data analytics firms:

“Datasets may be sold to analytics agencies that process the data, generate insights, and sell the insights further to the corporate sector, which may in turn use these insights to predict demand, discover untapped markets or innovate new products.”

Central Welfare Database and Data Access Fiduciary Architecture

The need to create a central database arises from the fact that “data collection in India is highly decentralised”, according to the survey. While the government already holds a rich repository of administrative, survey, institutional and transactions data about citizens, this data is scattered across numerous government bodies; hence the need to bring it all together, it says. The survey acknowledges that the idea of providing the government with such comprehensive information about every citizen may “sound alarming”, but says the objective is only to use this data in a “more efficient way”. In addition, the database itself will not collect any data – it will only be updated as ministries collect more data.

The survey proposes the following set-up:

  • Data Access Fiduciary Architecture: Each government department is responsible for making the data it holds available as a data provider, and for treating private data and public data with the “standards they require”.
  • “Data requestors” can get access to this data through the Data Access Fiduciary.
    • Data requestors may be public or private institutions but can only access the data if they have appropriate user consent.
  • The Data Access Fiduciary itself will have no visibility into the data due to end-to-end encryption.
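
The consent rule in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survey does not publish an interface, so the class `DataAccessFiduciary`, its methods, the `school_department` provider and all identifiers are hypothetical, and the "ciphertext" is a placeholder rather than real encryption.

```python
from dataclasses import dataclass, field


@dataclass
class DataAccessFiduciary:
    """Hypothetical broker: checks consent and relays opaque payloads."""

    # Maps (citizen_id, requestor) to the dataset names the citizen consented to.
    consents: dict = field(default_factory=dict)

    def grant(self, citizen_id, requestor, dataset):
        self.consents.setdefault((citizen_id, requestor), set()).add(dataset)

    def request(self, citizen_id, requestor, dataset, provider):
        if dataset not in self.consents.get((citizen_id, requestor), set()):
            raise PermissionError("no user consent on record")
        # The provider is assumed to return bytes already encrypted for the
        # requestor; the fiduciary only relays them and never decrypts.
        return provider(citizen_id, dataset)


def school_department(citizen_id, dataset):
    # Stand-in for a data provider; real encryption is out of scope here.
    return f"<ciphertext:{dataset}:{citizen_id}>".encode()


fiduciary = DataAccessFiduciary()
fiduciary.grant("citizen-1", "bank-a", "school_records")
payload = fiduciary.request("citizen-1", "bank-a", "school_records", school_department)
```

A requestor without a consent entry, say a hypothetical `insurer-b`, would get a `PermissionError` instead of a payload.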

What the govt’s proposed central database repository looks like

“The data system envisioned here involves predominantly data that people share with government bodies with fully informed consent or is data that is legally sanctioned to be collected by the state for an explicit purpose such as tax collection, or delivering welfare.”

Data the repository will contain

The government will collect four distinct sets of data about people:

  • Administrative data: Birth and death records, pensions, tax records, marriage records, etc.
  • Survey data: Census data, National Sample Survey data, etc.
  • Institutional data: Public school data on pupils, public hospital data on patients, etc.
  • Transactions data: e-National Agriculture Market data, Unified Payments Interface data, etc.

How the central database would function

This repository’s efficiency, according to the document, would depend largely on three features:

  • Any government ministry would be allowed to view the complete database, but could manipulate only the data for which it is responsible.
  • The database would be updated in real time, but in such a way that one ministry’s engagement doesn’t affect other ministries’ access to it.
  • The database should be secure with no “room for tampering”.
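
The first of these features, read-all but write-own, could look roughly like the following. This is a minimal sketch, not the survey's design: `CentralDatabase`, the field names and the ministry names are all hypothetical, and real-time updates and tamper-proofing are not modelled.

```python
class CentralDatabase:
    """Hypothetical store where every ministry reads all, but writes only its own fields."""

    def __init__(self, field_owners):
        self.field_owners = dict(field_owners)  # field name -> owning ministry
        self.records = {}  # citizen_id -> {field: value}

    def view(self, ministry, citizen_id):
        # Any ministry may read the complete record; a copy is returned so
        # that callers cannot modify the stored data through it.
        return dict(self.records.get(citizen_id, {}))

    def update(self, ministry, citizen_id, field, value):
        # Only the ministry responsible for a field may change it.
        if self.field_owners.get(field) != ministry:
            raise PermissionError(f"{ministry} does not own {field}")
        self.records.setdefault(citizen_id, {})[field] = value


db = CentralDatabase({"tax_paid": "finance", "school": "education"})
db.update("finance", "citizen-1", "tax_paid", 1200)
db.update("education", "citizen-1", "school", "GHS Pune")
record = db.view("education", "citizen-1")  # education sees the tax field too
```

Here the education ministry can read `tax_paid`, but a call such as `db.update("education", "citizen-1", "tax_paid", 0)` raises `PermissionError`.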

“There have been some discussions around the “linking” of datasets – primarily through the seeding of an Aadhaar number across databases such as PAN database, bank accounts, mobile numbers, etc. When one adds an Aadhaar number to an existing database such as a database of bank accounts, it is only one more column that is added to the table. The linking is so to speak “one-way” … This doesn’t mean that the UIDAI or government can now read the bank account information or other data related to the individual”
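
To make the “one more column” description concrete, here is a small sketch using Python's built-in `sqlite3` module. The table `bank_accounts`, its columns and the placeholder number are invented for illustration; the survey does not describe any actual schema. The sketch shows only the mechanics of seeding. Who can read the table afterwards depends on access controls, which are not modelled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical existing database of bank accounts.
conn.execute(
    "CREATE TABLE bank_accounts (account_no TEXT PRIMARY KEY, holder TEXT, balance INTEGER)"
)
conn.execute("INSERT INTO bank_accounts VALUES ('AC001', 'A. Kumar', 5000)")

# "Seeding": the identifier is added as one more column on the same table.
conn.execute("ALTER TABLE bank_accounts ADD COLUMN aadhaar_no TEXT")
conn.execute(
    "UPDATE bank_accounts SET aadhaar_no = '0000-1111-2222' WHERE account_no = 'AC001'"
)

# List the column names of the table after seeding.
columns = [row[1] for row in conn.execute("PRAGMA table_info(bank_accounts)")]
```

After this runs, `columns` holds the three original column names followed by `aadhaar_no`; the existing rows and their other fields are unchanged.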

Ensuring Privacy

The paper states that the benefits of treating data as a public good can be realised within the legal framework of data privacy. This is how the government proposes to ensure privacy after the creation of a central repository:

  • People can opt out of divulging data to the government, but only “where possible”.
    • Opting out of administrative data would not be possible since the government sees this data as being critical to enforcement of rights.
    • People can choose to opt out of institutional, survey and transactions data.
  • Even if there is no viable private market choice of certain public services, the choice to share the individually linked data from such services will always be with the citizen.
    • Furthermore, “immutable access logs” would be provided to people so that they can track who has seen their data and why.
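
The survey does not say how “immutable access logs” would be built. One common way to make a log tamper-evident is to chain entries with hashes, sketched below. The class `AccessLog` and all names are hypothetical, and a hash chain only detects later edits; it does not by itself prevent them.

```python
import hashlib
import json


class AccessLog:
    """Hypothetical append-only log where each entry commits to the previous one."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body):
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def append(self, citizen_id, viewer, purpose):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"citizen_id": citizen_id, "viewer": viewer,
                "purpose": purpose, "prev_hash": prev_hash}
        self.entries.append({**body, "hash": self._digest(body)})

    def verify(self):
        # Recompute every hash and check that each entry links to the one before it.
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != self._digest(body):
                return False
            prev_hash = entry["hash"]
        return True

    def for_citizen(self, citizen_id):
        # What a citizen would see: who accessed their data and why.
        return [(e["viewer"], e["purpose"]) for e in self.entries
                if e["citizen_id"] == citizen_id]


log = AccessLog()
log.append("citizen-1", "health-ministry", "vaccination reminder")
log.append("citizen-1", "bank-a", "loan eligibility")
intact_before = log.verify()
log.entries[0]["purpose"] = "edited later"  # simulate tampering with an old entry
intact_after = log.verify()
```

In this sketch `intact_before` is `True` and `intact_after` is `False`, because the edited entry no longer matches its stored hash.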

The paper makes only one passing reference to the Data Protection Bill:

“Even if not explicitly mentioned every time data is talked about in this chapter, it is assumed that the processing of data will be in compliance with accepted privacy norms and the upcoming privacy law, currently tabled in Parliament.”

The need for a central data repository; the whole is larger than the sum

The survey says that the need for a government-driven data revolution is motivated by three key characteristics:

  • Data is more useful when it is married with other data – “the whole is larger than the sum”.
  • Data needs to cover a critical mass of individuals and firms so that comparisons and correlations can be assessed to generate useful policy insights.
  • The private sector may not have the risk appetite or the capital to make the necessary investments required for generating data that has the three characteristics stated above.

“Even if private sector were to put such rich data together, this would result in a monopoly that would reduce citizen welfare, on the one hand, and violate the principle of data by citizens, and, therefore, for citizens.”

Data as a public good

The survey repeatedly refers to treating data as a public good; in all, the idea is mentioned 13 times. It says that since storage costs have decreased from $61,050 per gigabyte in 1981 to less than $3.48 today, it is important for the government to intervene and start treating “data as a public good for the poor in the society”. It says that since data is generated by the people, it should also be used for the people, hence the section's title: “Data of the people, by the people, for the people”.

  • The survey states that by being able to retrieve authentic data and documents instantly, governments can improve targeting in welfare schemes and subsidies, reducing both inclusion and exclusion errors.
  • Citizens would be the largest group of beneficiaries of the proposed data revolution as they would no longer need to run from pillar to post to get “original” documents.
  • It would help citizens to piece together their financial lives.