- Non-personal data sharing should be voluntary and not mandatory
- Let culture drive behaviour, not compliance
- A universal framework should not be created for regulating non-personal data
- Not all start-ups have the resources to undertake massive data cataloguing exercises
- Raw data in itself is not useful and there is a need for curation
- Financial incentives are not the need of the hour as they do not address concerns
“The biggest concern that came out was that one cannot apply a universal framework for all data sets and use cases; this will backfire because universal standards cannot be created for all scenarios,” said Zainab Bawa, Chief Operating Officer, Hasgeek, when asked about concerns surrounding the non-personal data framework proposed by the Kris Gopalakrishnan committee.
She was speaking at MediaNama’s ‘Regulating Non-Personal Data’ event held on February 18, 2022. The session, Start-ups and Non-Personal Data, saw participation from the following speakers: Karthik Raghupathy from PhonePe, Anal Ghosh from Google Maps, Sijo Kuruvilla George from Alliance of Digital India Foundation, and Zainab Bawa from Hasgeek, with MediaNama’s editor Nikhil Pahwa as moderator.
The Committee in its revised report on the draft Non-Personal Data Framework directed that all public and private entities should share data at an aggregate level for public purposes. The impact of an NPD governance framework will be sweeping as India is witnessing accelerated growth in the start-up ecosystem, and start-ups count among the largest generators of non-personal data.
This event was organised with support from Google, PhonePe, Amazon, Meta, and Microsoft.
Concerns highlighted around Non-Personal Data
“The definition of a start-up seems to be a uniform definition whereas it is not uniform in real life.” — Zainab Bawa
Bawa added that companies which are starting out are not aware whether the data they’re collecting is useful or valuable, suggesting that they only come to understand it after a couple of years. Such early-stage data is, therefore, not useful from either a societal or a business perspective, according to Bawa. She was also against the binary classification of data into personal and non-personal.
Here are some of the other concerns raised:
Lack of resources: “It’s evident that a company like PhonePe has resources to take into account considerations such as what data to share, the social ramifications, and the business implications. Can you imagine a company with 12 people spending time on all these considerations? I would never be doing business development. The uniform sentiment among most startups was that there is no consultation with any of the stakeholders in terms of implementation,” Bawa revealed.
Data trustees may not be reliable: There is concern among start-ups about the framework which has been put into place with respect to data trustees, Bawa revealed. “These trustees are not necessarily stewards of rights as stewards of data itself. Do you really expect a trustee to have the trust of a community?” she commented, adding that she was not sure how this implementation will take place in real life. A Data Trustee is either a government organisation or a non-profit private organisation (like a company, society, or trust) that is responsible for the creation, maintenance, and data-sharing of high-value datasets in India.
Implementation will be burdensome: Bawa was of the opinion that this massive pool of data will result in a very technical implementation. “Do we have some reasonable standards in place to do this kind of implementation? Where is the interface? What is the thought process about the interface from which one will access this data? It is very nice of Mr. (Amar) Patnaik to say: let’s make a marketplace. Where is the test prototype to understand how this access will take place?” she questioned.
No strategy on handling metadata: “Metadata can reveal the strategy of a company, do companies want to share this metadata?” Bawa asked. She said that, from a real-life implementation point of view, two things are needed when one has to maintain so much data and metadata:
- The data needs to be catalogued and data cataloguing is a massive exercise; not every organisation has the ability to maintain data catalogues.
- Companies need computing resources and power to maintain the data. They will have to sink a lot of time in maintaining this data.
Data is distributed in nature: “Non-personal data tries to centralise data creation and aggregation. It is evident that NPD has been conceptualised in a way where it overlooks that there are communities which have no geographical boundaries,” Bawa suggested.
Wafer-thin margins: Raghupathy said that digital payments companies are operating on extremely low margins. “Now, if the economics of our business are artificially suppressed due to laws, data becomes a core part of our IP as much as the tech stacks,” Raghupathy said, adding that forcing companies to disclose data puts shackles on a business model which is already suppressed.
No IPR protection: Ghosh said that IP is important for companies in addition to the data being an asset. He added that intellectual property involves processes that include how data is collected, the way it is stored, and the way it surfaces, among other things. “We will need to evaluate what IP has been put into place in the data value chain, and then see whether there is a way to share data and in what form. It needs to be voluntary because mandatory makes it arbitrary,” he explained.
George said that the availability of data for start-ups to build and innovate is “a very good idea”.
“Every business has a right to stay in business. The prime responsibility of business is for creation of IP. In this particular scenario, mandatory data (sharing), if it is infringing on their rights to protect IP, who’s in the best position to decide what is best for the business? It is the business itself. Whatever they are mandated to share should be more by exception and not by default.” — Sijo Kuruvilla George
- Guard against perverse incentives: George cited the example of how, during the British Raj, the government created an incentive to catch snakes because their numbers were high, but people apparently started breeding snakes instead. “The moment you make it mandatory, it goes against the ethos of whatever we stand for in start-up culture and the economy. It is fairly non-negotiable that sharing should be voluntary,” he said.
- Hold big-tech companies to a higher standard of accountability: George said that such companies should be held to a high standard from the perspective of data-sharing practices. But he argued that forcing them to open up their IP and making it available to everyone goes against the established norms of a strong IP regime.
- Framework is a great starting point: George explained that we are in the early stages of a data regime. “There is a lot of negative talk about data sharing and data selling. The public associates databases with a shady business so having a framework around facilitating data sharing in a soft-touch regulated framework is a great starting point to move towards a regulated trust-based industry.”
- Let culture drive behaviour: “Culture is the ultimate driver of behaviour, not compliance. When it comes to sharing data and innovation, culture should be the way to look at it. It can’t be financial incentives. The government should take a hands-on approach,” George asserted in his remarks.
How should companies deal with raw data?
In its report, the expert committee suggested that raw data should be freely available. There was consensus among panellists that raw data in and of itself will not be of much use. Here is what they said:
Karthik Raghupathy, PhonePe: Raghupathy said that sharing data for the sake of data sharing without enough context is not very helpful. “There is a need for curation and commentary. Even simple things, like an active user, for example, just varies, the definition varies across companies, what an active user is. Sharing data without context, from multiple sources, leads to more confusion than clarity.”
Anal Ghosh, Google Maps: Ghosh opined that raw data is a general statement that lacks the end use case, the context, and the background. “There is subjectivity involved in how you’re defining and there can’t be a universal definition of raw data that cuts across all types of non-personal data,” he said, batting for a collaborative approach with concerned organisations.
Sijo Kuruvilla George, ADIF: He objected to the word ‘free’. He was not in favour of taking away the choice from companies in terms of agreements that they can enter. “It’s actually coercion. Free, for some reason, has a captivating value from a policymaking perspective but free doesn’t necessarily mean good quality services; there is a flip side. These should be choices that people have in terms of whether they want to give it for free,” he advised.
Zainab Bawa, Hasgeek: “Data is a liability. When the government says: ‘share data in whatever form’, has the government thought about security? Has it been thought that data can eventually go in contradiction to data protection?” she asked.
Data as public property
Parminder Jeet Singh of IT for Change argued that everybody accepts that data is the most important resource of the present. Every epoch has had a governance regime around its central resource, whether it was land, intellectual property, industrial capital, or data. He suggested that voluntary sharing is not possible, as was the case with land, where the land ceiling act came in and, with it, India’s industrial development. He said that data, by law, is not intellectual property.
How did the panellists respond?
Zainab Bawa, Hasgeek: “It’s unfair to compare the two assets because land is fixed in ground; data is not a fixed asset. There is a concern that NPD comes with this assumption that data is fixed in time, in place. It is not the case because communities are not fixed in time and place. Data is something that’s mouldable, acquires value, even land by itself acquires value over a period of time by circulation; regulations have tried to fix the idea of land into ground in the past. We’re doing the same thing with data right now. It neither fuels innovation nor does it create marketplaces,” she said.
Sijo Kuruvilla George, ADIF: “The challenge is that we want to have innovation. I have not seen instances where we have achieved innovative open societies through coercion. I don’t think we have reached that reality from a data standpoint, to classify data as a public asset or a public good,” he said. He was not in favour of a sweeping law mandating everything.
Karthik Raghupathy of PhonePe: “There are a lot of negative ramifications of sharing data openly. We would like to work with policymakers in an inclusive and transparent process to address needs and various use cases so that companies can be encouraged to give back and foster innovation. This notion of an inclusive and transparent process in developing a framework would satisfy some of the concerns that have been raised versus it being done in a sweeping manner,” he suggested.
Anal Ghosh of Google Maps: “The point that data needs to be free and open because it’s a core asset in today’s world, is being too generic about it. It has only been effective when you have taken a collaborative approach. Not only in terms of sharing data, but also in terms of defining the framework, there needs to be an inclusive, and collaborative approach. Data is fungible, data is evolving, it’s not a fixed asset, it acquires value, a lot of data at the start is not valuable,” Ghosh explained.
Should companies share data after time has elapsed?
George said that it should be left to the companies to make such decisions. He provided the example of the trade secrets framework where there is no mandated need for anybody to disclose secrets. He remarked that it should be considered only in case there is a public need for it. The public good should not be broadly defined, as per George.
Route of incentivisation for data sharing
- Raghupathy said that he was not in favour of incentivisation. He said that if sharing is part of a company’s culture then it will not need any incentives.
- Ghosh said that it should be inherent in the DNA of big companies, as they are amassing data, to start seeing how their data can serve the community; it needs to be a part of the journey from the start.
Should organisations have access to data from big-tech companies?
George said: “Most founders I know are absolute hoarders of data. The amount of data most young companies try to hoard and what they put to tangible use, the proportion is very low. We had an issue where academic institutions and researchers were not getting quality data. A lot of the data is sitting with private companies. We are seeing scenarios where researchers are leaving universities to go to private companies to be able to work on such datasets. It can have a lot of ripple effects if universities are able to access data and build on top of it, as they will have more safeguards.”
George was of the opinion that merely opening up data is not going to be sufficient. He recommended that policy should enable creation of IP which will provide an impetus for distribution of data. He explained that for good actors, a good framework provides protection. “Good actors were struggling but it’s a free market for bad actors, and at the same time, even if you’re a good actor, the probability of you being perceived as a bad actor is high,” he said.