By Varunavi Bangia
Technology brings a world of new possibilities and, at the same time, a need for new defences. Certain technological innovations of the past few decades have often evaded legal scrutiny. This has left a gap in the regulatory understanding of what needs protection and from what. To understand the gap between legal regulation and technological advancement, one must grapple with the reality of how the data economy operates. Presently, the starting point of data protection regulation across jurisdictions is informed consent for the collection of data. However, the data economy has moved far beyond monetising individual personal data in the form in which it is collected from the source. Inferential analytics, big data, and knowledge discovery have challenged the starting point of data protection law, i.e. a narrow definition of what constitutes “personal data”. Scholars have pointed out that such knowledge discovery leads to “deindividualization of the person”, which is the tendency to treat people as members of certain groups based on characteristics associated with a group instead of treating people as individuals based on their inferred characteristics. In other words, the emphasis has shifted from drawing inferences about an individual through their behavioural patterns to drawing inferences about individuals for the purpose of grouping them into categories. The collection of massive amounts of data from several diverse sources, coupled with sophisticated analytical capabilities, makes profiles more complete and more invasive. Moreover, anonymised data sets may be cross-referenced with other data sets to enable re-identification. Even if information is initially obtained through consent, individual streams of data may be combined with other streams of data or insights in ways that potentially go beyond the initial stated purposes to which individuals may have consented.
In the age of big data analytics, the most consequential harms to an individual’s privacy arise from the application of inferential data for discriminatory purposes.
Is non-personal data not personal?
The major challenge with the current conceptualisation of personal data as “data relating to a natural person who is directly or indirectly identifiable” is that it is limited to personally identifiable information and excludes anonymised data. This narrow formulation of personal data, however, forms the exclusive domain of regulation under data protection law. Once the data has been anonymised or processed and generalised, individuals do not have any control over its application because it ceases to be personally identifiable. However, the impact of such use is felt equally, if not more critically, by individuals because profiling gives corporations, governments, and other entities an opportunity to make discriminatory decisions. For instance, the information that an individual belongs to a category of people susceptible to a certain disease may result in stigmatisation and discrimination, or in denial of access to insurance or certain jobs, often without the individual even being informed that the reason for such denial is their categorisation as a person susceptible to a disease.
Further, while these “inferences” are included within the definition of what constitutes “personal data”, the ambit of inferential data generated is far broader than what is protected. Inferences drawn from personally identifiable information are regarded as personal data. Profiling, however, can also be done without identifying an individual and without using any personal data. In fact, even irreversibly de-identified or anonymised data can be used to build profiles. The definition of profiling in the Data Protection Bill 2021, however, limits itself to processing of personal data that analyses or predicts aspects concerning the behaviour, attributes, or interests of a data principal. Therefore, whatever protection relates to the profiling of data is limited to profiling of only personal data, rendering data protection law inapplicable to an entire class of profiling. It is argued that inferential data should be categorised as personal data not on the basis of the data it uses to draw inferences but on the basis of whether or not the inference is itself of a nature that can be used to identify a person or a group, and similarly as sensitive personal data, based on whether it relates to a sensitive attribute or not.
Big data challenges the fundamental distinction between personal and non-personal data. The massive challenge that regulators across the globe face is the inability to neatly separate non-personal data from personal data. While the government has moved in the right direction, firstly by proactively regulating non-personal data as well, and secondly by seeking to regulate it under the same statute, much is left to be desired when it comes to the actual content of the regulation. The difference in the understanding of personal and non-personal data is what has prompted different approaches to regulating them. While personal data has been seen as the subject of data protection laws, non-personal data has been seen as an economic resource of untapped value and potential. Further, these two objectives have been dichotomised such that privacy protection necessarily comes at the cost of economic development.
Regulatory regimes must, therefore, move beyond treating personal and non-personal data as mutually exclusive. I argue here that non-personal data must be categorised under two main heads. Firstly, data that is anonymised personal data, or that can be used directly or indirectly to identify a person and/or a group (hereinafter referred to as human non-personal data). Secondly, non-personal data that is non-human in nature both at the time of collection and in its use (hereinafter referred to as non-human non-personal data). I argue that human non-personal data should be categorised as personal data.
Moving beyond the “individual”
We must fundamentally alter our conception of the subjects of the right to privacy from protecting individuals to protecting groups. This must come with the understanding that members of a ‘group’ often have no knowledge of one another, let alone ties of loyalty or organisational support structures, and therefore stewardship models of collective rights would not work. Instead, what comes first is the interest or purpose of the profiling entity in creating a group by clustering people based on shared (existing or perceived) behavioural patterns and attitudes. This may or may not overlap with pre-existing groups. Then comes the violation of the group’s privacy as a group, actionable as the violation of an individual’s privacy as a direct consequence of being associated with that group (and not as a consequence of their individual identity). Finally, we need to identify the remedy available to the group, through its members, for the violation of the right. What gives the group the ability to redress profiling is the interested, self-serving practice of profiling and not the “ontological status” of the group as one that predates the profiling interest. The focus must, therefore, be on conceptualizing a right available to a group not merely because each individual in that group has an independent right to privacy, but as a right that belongs to the group as a group.
The focus also needs to shift from considering privacy as a function of confidentiality and control over personal information to a function of protection from the consequences of bias and discrimination enabled by tools of social surveillance. Therefore, collective privacy needs to be described as a tool to limit the harms that a group faces due to discriminatory and invasive data processing. The most important regulatory intervention is to ensure that collective rights are neither subordinated to nor seen as in conflict with individual rights. They must be viewed as an independent right available to each individual by virtue of being a member of a generated group profile, irrespective of whether or not they directly or indirectly associate with those attributes.
As an alternative to the traditional notion of collective rights, a conception of “categorical privacy” has been recommended. It applies to data that was originally personally identifiable but, after processing and aggregation, can no longer identify individual persons while still retaining identifiers of group identity. When such identifiers are disclosed, the information has the potential to have adverse consequences similar to those of a violation of an individual’s privacy. In addition to this individualised post-facto assessment, scholars have also recommended that governments focus on creating disincentives against wrongful uses of aggregation and profiling technologies. It is recommended that, as an ex-ante measure, regulators implement certain forms of impact assessment studies to check for bias and discrimination. The emphasis must be on hard accountability, robust regulatory oversight, and the ability to audit algorithmic decisions and their impact on society. These prior impact assessment studies must focus not only on privacy and security but also on the ethical and social consequences of data processing.
In Indian constitutional jurisprudence, the understanding of individual privacy as an expression of one’s identity, which is the first step in conceptualizing collective privacy rights, has not only been recognised explicitly in the plurality opinion given by Chandrachud J. in Puttaswamy v. Union of India, but has also been subsequently applied in Navtej Johar v. Union of India. Therefore, it is evident that the theoretical basis for envisioning a group privacy right exists in the Indian constitutional framework. The Court also recognises the harm of aggregation, especially the aggregation of information that may not be personally identifiable or “private” in isolation, but that, combined with swathes of other data, can help in creating detailed profiles of the behaviours and attitudes of individuals. Therefore, even from the regulatory perspective, the focus should be on regulating aggregated data and inferential data. The Court also emphasises that data mining and knowledge discovery result in the “creation of new knowledge” about individuals of which they are themselves unaware. This is an extremely important observation that reflects the reality of evolving technology, and one that was unfortunately missed by the Srikrishna Committee Report and the subsequent Parliamentary Committee Reports as well. However, despite the Court’s forewarnings about the potential insufficiency of safeguards in force in other jurisdictions due to big data capabilities, the Personal Data Protection Bill 2018 and all subsequent versions of the Bill, including the most recent Data Protection Bill 2021, fail to address critical concerns that arise in the age of big data and inferential analytics. The recognition by the Union Government and the Joint Parliamentary Committee that non-personal data is an important subject of regulation and must be regulated by the same framework as personal data did not translate into robust protections against the harm from misuse of non-personal data.
This is because of a lack of understanding of the value of non-personal data to corporations in creating group profiles.
Further elaboration on how these group rights need to be developed, their jurisprudential underpinnings, and whether these rights can be developed in the Indian framework will be published in a full-length paper, which is forthcoming in August 2022.
This research was supported by an unrestricted scholarship pursuant to the Facebook India Tech Scholars Program 2021-2022. The work was conducted under the research mentorship of Software Freedom Law Center and without any oversight from Meta. The views expressed herein are those of the author(s) and are not necessarily those of Meta or Software Freedom Law Center.
Varunavi Bangia is a 5th year student of BA LLB (Hons) at West Bengal National University of Juridical Sciences (WBNUJS), Kolkata
 Vedder, A. KDD: The challenge to individualism. Ethics and Information Technology 1, 275–281 (1999).
 Vedder, A. KDD: The challenge to individualism. Ethics and Information Technology 1, 275–281 (1999).
 Ira S. Rubinstein, Big Data: The End of Privacy or a New Beginning?, 3(2) International Data Privacy Law 77 (2013); See Paul Ohm, ‘Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization’ (2010) 57 UCLA Law Review 1701
 Nathaniel A. Raymond, Beyond “Do No Harm” and Individual Consent: Reckoning with Emerging Ethical Challenges of Civil Society’s Use of Data in Taylor, L., Floridi, L., van der Sloot, B. eds. (2017) Group Privacy: new challenges of data technologies. Dordrecht: Springer p.92.
 Section 3(33) Data Protection Bill, 2021.
 §2(37) Data Protection Bill 2021.
 Sandra Wachter, Data protection in the age of big data, 2 NAT. ELECTRON. 6, at 7 (2019).
 Ira S. Rubinstein, Big Data: The End of Privacy or a New Beginning?, 3(2) International Data Privacy Law 77 (2013).
 §2(37) Data Protection Bill 2021.
 Ira S. Rubinstein, Big Data: The End of Privacy or a New Beginning?, 3(2) International Data Privacy Law 77 (2013); Hildebrandt, M. (2008). Defining Profiling: A New Type of Knowledge? Profiling the European Citizen, 17–45.
 Luciano Floridi, Group Privacy: A defence and an Interpretation in Taylor, L., Floridi, L., van der Sloot, B. eds. (2017) Group Privacy: new challenges of data technologies. Dordrecht: Springer p.109.
 Alessandro Mantelero, From Group Privacy to Collective Privacy: Towards a New Dimension of privacy and Data Protection in the Big Data era in Taylor, L., Floridi, L., van der Sloot, B. eds. (2017) Group Privacy: new challenges of data technologies. Dordrecht: Springer p.183.
 Vedder, A. KDD: The challenge to individualism. Ethics and Information Technology 1, 279(1999).
 Vedder, A. KDD: The challenge to individualism. Ethics and Information Technology 1, 280(1999)
 See H.R.2231 – Algorithmic Accountability Act of 2019 116th Congress (2019-2020).
 Alessandro Mantelero, The future of consumer data protection in the E.U. Rethinking the “notice and consent” paradigm in the new era of predictive analytics. Comp. Law & Sec. Rev. 30 (2014): 643-660; Alessandro Mantelero, From Group Privacy to Collective Privacy: Towards a New Dimension of privacy and Data Protection in the Big Data era in Taylor, L., Floridi, L., van der Sloot, B. eds. (2017) Group Privacy: new challenges of data technologies. Dordrecht: Springer p.189.
 Justice KS Puttaswamy and Anr v. Union of India and Ors, Writ Petition (Civil) No. 494 of 2012 DY Chandrachud J. ¶84.
 Navtej Johar and Ors v. UOI, Writ Petition (Criminal) No. 76 of 2016, Dipak Misra J. ¶129,132.
 Justice KS Puttaswamy and Anr v. Union of India and Ors, Writ Petition (Civil) No. 494 of 2012 DY Chandrachud J ¶175; see also Daniel J. Solove, Understanding Privacy 70 (2008).
 Justice KS Puttaswamy and Anr v. Union of India and Ors, Writ Petition (Civil) No. 494 of 2012 DY Chandrachud J ¶174; See also Christina P. Moniodis, “Moving from Nixon to NASA: Privacy’s Second Strand- A Right to Informational Privacy”, Yale Journal of Law and Technology (2012), Vol. 15 (1), at page 154.
Note: The headline of the story was updated on July 19th for greater clarity.
This post is released under a CC-BY-SA 4.0 license. Please feel free to republish on your site, with attribution and a link. Adaptation and rewriting, though allowed, should be true to the original.