On 5th October, MediaNama held a #NAMAprivacy conference in Bangalore focused on privacy in the context of Artificial Intelligence, the Internet of Things and the issue of consent, supported by Google, Amazon, Mozilla, ISOC, E2E Networks and Info Edge, with community partners HasGeek and the Takshashila Institution. Part 1 of the notes from the discussion on AI and privacy is here. Part 2:

In a matter of decades, algorithms have gone from solving simple math equations to processing immense volumes of data to throw up analytics that even the creators of the algorithms have trouble dealing with. So what do you do when these algorithms create and process some of our most sensitive data? How do we regulate them, if we can do so at all?

How algorithms deepen bias

Beni Chugh, a Research Associate at the IFMR Finance Foundation, explained how algorithms can amplify prejudices that people already have: “Our data is our window to our informational privacy. All around us, we see that aggregation, deep machine learning, algorithms, AI and the like are now the templates of how businesses work. So how does it impact the customer, and is it really a point of tension? The point of tension is only this: that it amplifies certain biases that humans already had. Data networks, and the way algorithms are designed in stages, tend to be highly path-dependent. Therefore some of these biases tend to get amplified over stages, and they become structural biases that can lead to discrimination, which may not always be nefarious or intentional, but it does happen, and, even more unfortunately, it is very hard to detect.

“So that is one kind of harm and challenge. In the absence of traditional financial data, financial access has really been the poster child for the marvels of technology: we’ve been able to generate credit for those who were left out. However, now that we have so much more data about them, are we at risk of excluding the excluded once more?

Regulating outcomes

Every regulation has to come from a sense of what is fair and what is unfair; how is that possible with algorithms, which often run opaquely and with unclear outcomes? Chugh suggested that the constitution and the law provide a decent foundation for understanding what is fair or unfair, citing the example of usury, lending at very high interest rates, which was outlawed because the consensus was that it was unfair.

But stepping back, should fairness be evaluated while an algorithm is fed data, or while it is processing that data, or while it is spitting out results?

Pranesh Prakash, Policy Director at the Centre for Internet & Society (CIS), argued: “I wouldn’t like to define fairness itself; rather, I would point out that perhaps we shouldn’t really be looking for fairness in the algorithm, but rather in outcomes. It is the outputs of algorithms that are used. Algorithms in and of themselves don’t do things; it’s what the algorithms spit out which is then used. Are we using those outputs in a fair manner? At the end of the day it boils down to human beings and how we as a society look at fairness, not to the algorithms themselves. With algorithms, what we need is awareness. We shouldn’t look at algorithms as black boxes where things happen, we don’t really know what, and something comes out. If that is the situation, then we shouldn’t ever be using that kind of algorithm in institutions where we need notions of fairness. For instance, while providing credit, we can’t use black-box algorithms. While delivering justice in a court setting, we can’t use black-box algorithms.”

Sunil Abraham, also from CIS, said: “Unlike something much simpler from a regulatory standpoint, like child pornography, AI is a full-spectrum regulatory question. With child pornography, it is an absolute prohibition for everybody, so it’s a very simple regulatory regime. With AI, we need absolute prohibition in some cases and absolute forbearance in others. In some cases we need absolute prohibition because we can’t use algorithms which we don’t understand, but in other cases transparency would be sufficient: having open source code.”

Regulating it all

Asked which needed to be regulated — input, process, or outcome — Chugh replied, “The answer is a very sad all of the above.”

“Every jurisdiction, given its own ethical understanding, arrives at a common public policy stance. For India, it may make sense not to use caste data for commercial purposes, so maybe the government is the sole custodian of caste data. In that instance, there may be merit in limiting something right at the collection stage, because once data is collected it is very difficult to ensure that it’s never used. Genetic information could be another example of something that is never collected.

“Then you come to the processing stage, and you realize that certain intended business uses are themselves not legitimate. There are new studies coming out which can actually predict the sexual orientation of a person just by looking at their photographs. Do we find any social or business merit compelling enough to identify such private personality traits just by looking at photographs? So that’s the kind of processing principle you will need there: what is it that you’re processing the data for? And ultimately outcomes, because systems are not perfect. There will be unintentional biases which may become visible only over time, and we may find some merit in regulating for certain kinds of outcomes and saying they are not permissible.”

Who is being regulated

Vinayak Hegde, a big data consultant associated with ZoomCar, pointed out that regulation would target only those institutions which can deal with regulation sufficiently: “Who are your adversaries? It’s the question of the individual against the state or a corporation. Who’s the powerful one, and how is the balance of power changing, really fast? The adversaries are well-funded and have no incentive to listen to you. In such a world, the question for me honestly is, and we often have this conversation, are we moving to a world that looks like George Orwell’s 1984, or one like Aldous Huxley’s Brave New World? And the answer is maybe a bit of both. So for me, the best that we can do is this: we are on a slippery slope, but we can make it less slippery. Because the problems are very fundamental: what fairness is, is very hard to define. The second thing is that we don’t even know how algorithms work. We are increasingly moving to a world where the cost of producing data is very low and the cost of processing it is really low, but historically we have done really badly at agreeing on certain principles, which is why we have constant wars going on.”

Alok Prasanna Kumar from the Vidhi Centre for Legal Policy said: “We shouldn’t be focused on ‘oh, but private companies are so powerful’. Actually, the government is very powerful. Do you want the government to put in place a horribly discriminatory system of denying people benefits just because it uses an algorithm? I don’t think so. I think we should really be worried about how the law regulates the individuals and institutions who create and use algorithms, not the algorithms themselves. There’s no way the law is going to tell you, this is a good algorithm and this is a bad algorithm.”

Regulation is possible

Alok Prasanna Kumar also pointed out that similarly complex industries have already been regulated successfully: “Pharmaceutical drugs, in the sense that given the complexity involved in discovering a molecule and putting it to use, there were questions like, ‘can regulators really do it?’ They can. You get it right sometimes, you get it wrong sometimes, but it’s possible to do it; there’s no question of impossibility. Today, we have the US FDA which comes in and inspects factories in India and we’re perfectly okay with it, so much for jurisdiction and all of that, and there are global treaties for it.

“Likewise, as Beni [Chugh] mentioned, who knows, tomorrow we may have a data treaty, that all countries sign up to. I think there are several models for us to choose from. Technology will always be something that the law looks back on and goes, ‘oh no, what do we do about this?’ but in any reasonably functioning society, we’ll figure out the answers depending on what we want to be done.”

Sunil Abraham from CIS pointed out: “If there is a requirement that you must have a discrimination impact assessment as part of the regulatory regime, it’s just like an environmental impact assessment. Before you launch the dam or the hotel, you have to demonstrate that you have checked whether your algorithms have any kind of disparate impact on protected classes.

“We have managed this for the environment, which is much more complex. If you think about it from the perspective of datasets, types of data and types of impact, this is just asking, ‘does the algorithm have a different impact on upper and lower caste people?’ That is much simpler to prove as a techie than proving that the dam will do this to the forest, or the dam will do something else. It’s a much simpler question to answer, according to me.”
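To make that concrete, here is a minimal sketch of what such an automated check might look like. It is a hypothetical illustration, not something presented at the conference: it applies the common “four-fifths” rule of thumb, under which the favourable-outcome rate for a protected group should be at least 80% of the rate for the reference group.

```python
# Minimal sketch of a disparate impact check (hypothetical illustration).
# outcomes: 0/1 decisions (1 = favourable, e.g. credit granted)
# groups: the group label for each decision, aligned with outcomes

def disparate_impact_ratio(outcomes, groups, protected, reference):
    def favourable_rate(group):
        decisions = [o for o, g in zip(outcomes, groups) if g == group]
        return sum(decisions) / len(decisions)
    return favourable_rate(protected) / favourable_rate(reference)

# Made-up decisions for two groups, purely for illustration.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
groups   = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

ratio = disparate_impact_ratio(outcomes, groups, protected="B", reference="A")
print(f"Disparate impact ratio: {ratio:.2f}")  # 0.67 in this example
if ratio < 0.8:  # the "four-fifths" rule of thumb
    print("Potential disparate impact: flag for human review")
```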

Formats of regulating algorithms

Beni Chugh from the IFMR Finance Foundation gave an idea of what regulating algorithms might look like: “Principle-level understanding is essential, but I think a regulatory toolkit will first need principle-level abstractions, the first step to which was the Supreme Court privacy judgement, and there is a lot of thinking around what should and should not be permissible. But the ideal toolkit will also have technology fixes and data security fixes. There is no one single answer. So we could come in and say that the principle-level understanding is privacy by design and baked-in privacy, and that’s where the data security people become absolutely indispensable, because they ensure that we are able to deal with that in varying contexts.”

Chugh acknowledged that most of this would have to be self-regulation that data controllers take upon themselves, including “designing algorithms for de-biasing algorithms”.
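As an illustration of what de-biasing an algorithm might involve, here is a minimal sketch of one well-known technique: reweighing training examples so that the protected attribute and the outcome label are statistically independent in the weighted data (after Kamiran and Calders). This is an assumption-laden example, not what the panel proposed; the data and names are hypothetical.

```python
# Minimal sketch of the reweighing de-biasing technique (hypothetical data).
from collections import Counter

def reweighing_weights(labels, groups):
    """Per-example weights that make group and label independent when applied."""
    n = len(labels)
    label_counts = Counter(labels)
    group_counts = Counter(groups)
    joint_counts = Counter(zip(groups, labels))
    # weight = P(group) * P(label) / P(group, label): under-represented
    # (group, label) pairs get weights above 1, over-represented below 1.
    return [
        (group_counts[g] / n) * (label_counts[y] / n) / (joint_counts[(g, y)] / n)
        for g, y in zip(groups, labels)
    ]

labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]   # e.g. 1 = loan repaid
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]
weights = reweighing_weights(labels, groups)
print(weights)  # pass as sample weights when training a classifier
```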

Akash Mahajan from Appsecco had a skeptical view of regulation: “Policy is awesome. Enforcement is not. Process is brilliant; there is hardly anyone who’d disagree that process works. But only when it works. The real world is just bonkers.”

*

#NAMAprivacy Bangalore:

  • Will Artificial Intelligence and Machine Learning kill privacy? [read]
  • Regulating Artificial Intelligence algorithms [read]
  • Data standards for IoT and home automation systems [read]
  • The economics and business models of IoT and other issues [read]

#NAMAprivacy Delhi:

  • Blockchains and the role of differential privacy [read]
  • Setting up purpose limitation for data collected by companies [read]
  • The role of app ecosystems and nature of permissions in data collection [read]
  • Rights-based approach vs rules-based approach to data collection [read]
  • Data colonisation and regulating cross border data flows [read]
  • Challenges with consent; the Right to Privacy judgment [read]
  • Consent and the need for a data protection regulator [read]
  • Making consent work in India [read]