Data buckets containing sensitive information used by companies like Swiggy, Gromor Finance, and a Mysore-based health startup have been exposed and are publicly available, Business Standard reports. grayhatwarfare, a third-party site, posted a dump of links to these publicly available buckets. The exposure raises significant questions about data protection hygiene at companies, even as the Justice Srikrishna Committee is yet to table its report on recommended data protection norms for India.

The leaked information includes diagnostic reports by doctors, job offer letters, bank statements, and much more, according to the BS report. Srikanth Lakshmanan, a programmer, discovered the existence of Indian companies’ data in the buckets, the report says. About 1.8 million people’s data is reported to have been exposed.

Swiggy told Business Standard that it takes data security very seriously. HireXP, whose buckets exposed Swiggy data, denied to the newspaper that any of the leaked files were real, and said that the documents were ‘dummy’ letters. BS noted that a lot of the documents contained detailed summaries of employee histories and pay.

What are data buckets?

Data buckets are a cloud-computing construct, where companies store documents and files on large servers maintained elsewhere. Amazon Web Services is a common provider, and many businesses like Swiggy use it to store their data in such buckets for ready retrieval. This is especially useful, for example, when Swiggy wants to show users pictures of dishes in a restaurant, but doesn’t have the server infrastructure in-house to manage millions of such image requests every day. Since data buckets let you link to every public resource (like a food photo from a restaurant) with a standard image URL, they are very useful when a web service is scaling.
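
To make the "standard URL" point concrete, here is a minimal sketch of how a link to a public object in an Amazon S3 bucket is commonly formed, using the virtual-hosted-style address. The bucket name, region and file path below are hypothetical placeholders for illustration, not names taken from any of the companies mentioned, and the helper function is our own rather than part of any AWS library.

```python
from urllib.parse import quote


def public_object_url(bucket: str, region: str, key: str) -> str:
    """Build a virtual-hosted-style S3 URL for an object key."""
    # quote() percent-encodes unsafe characters such as spaces,
    # and leaves "/" untouched by default, so folder-like keys survive.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


# Hypothetical bucket, region and key, for illustration only.
url = public_object_url(
    "example-food-photos", "ap-south-1", "restaurants/42/masala dosa.jpg"
)
print(url)
```

If the object is publicly readable, anyone who has or guesses such a URL can fetch the file directly. That is convenient for food photos and a problem for offer letters or bank statements.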

In this case, Swiggy et al should have stored private data behind password-protected servers that should have ideally been restricted to the company. Instead, they ended up putting those files on buckets that are used for public resources, which meant that anyone looking hard enough could find those files. And they did.

Amazon Web Services said in a comment to BS that most buckets are set to extremely private by default, indicating that these companies had made them publicly available, for unknown reasons.
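
One way a bucket ends up public is through an access control list (ACL) that grants permissions to everyone. The sketch below is a rough illustration, not a complete audit: it assumes a dictionary shaped like the response S3 returns for a bucket's ACL, and flags grants made to the predefined "all users" or "any authenticated AWS user" groups. The sample data is invented, and the function name is our own. Bucket policies can also make data public and are not covered here.

```python
# Predefined S3 grantee groups that extend access beyond the bucket owner.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTH_USERS_URI = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"


def public_grants(acl: dict) -> list:
    """Return the permissions an ACL grants to the public groups above."""
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        is_group = grantee.get("Type") == "Group"
        if is_group and grantee.get("URI") in (ALL_USERS_URI, AUTH_USERS_URI):
            found.append(grant.get("Permission"))
    return found


# Invented sample: the owner has full control, and everyone can read.
sample_acl = {
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
            "Permission": "READ",
        },
    ]
}

print(public_grants(sample_acl))
```

An empty result means this particular ACL does not open the bucket to the public groups. Any other result is worth a closer look.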

A common problem

Publicly available data buckets are a common source of such exposures, says Kiran Jonnalagadda*, a director at HasGeek. “The history of such breaches is very long. It’s just that it’s finally now in the news,” he said. “Most of these companies don’t have proper breach handling mechanism, nor do they have internal detection or knowledge, and secondly they don’t know what to do when it happens.”

“One common problem that a lot of companies have is that they have no idea how much infrastructure they have on AWS,” Jonnalagadda said. This makes it that much more likely that some buckets that are supposed to be private will be publicly available. “It’s very simple to fix this, but it’s also very simple to make this mistake.”

*Kiran Jonnalagadda is a founding member of the Internet Freedom Foundation, of which MediaNama editor Nikhil Pahwa is chairman and a fellow co-founder.