A Chinese company with links to that country’s government has been monitoring public figures across the world, including 10,000 Indians, reported a consortium of media organisations, including India’s The Indian Express, Australia’s Financial Review and the Australian Broadcasting Corporation (ABC), Italy’s Il Foglio and the United Kingdom’s The Daily Telegraph. Zhenhua Data, headquartered in Shenzhen, has reportedly built a global database by collating information from social media IDs, news reports and other publicly available data.

However, reaction to these news reports has been lukewarm. Many commentators have noted that Zhenhua’s activities are not as alarming as they have been made to seem (more on this below).

What is the story?

Zhenhua Data, with links to the People’s Liberation Army (PLA) and the Chinese Communist Party (CCP), has reportedly been collecting information on various Indian public figures, including Prime Minister Narendra Modi; President Ram Nath Kovind; chief ministers Mamata Banerjee, Ashok Gehlot, Amarinder Singh and Uddhav Thackeray; Cabinet ministers Rajnath Singh and Nirmala Sitharaman; former and incumbent military officials; senior members of opposition parties such as the Congress’ Sonia Gandhi; and industrialists Ratan Tata and Gautam Adani. Over 10,000 Indians were reportedly being targeted by the firm.

The database was leaked to an academic, Christopher Balding, a US national working in Vietnam. Balding later gave members of the media consortium access to the database for the story.

How, and what, data was collected?

Zhenhua Data, according to The Indian Express, has built an Overseas Key Information Database (OKIDB) which monitors a subject’s digital footprint across social media platforms. It reportedly built an “information library” by scraping content from sources such as social media platforms, news, forums, research papers, patents and bidding documents. The company also built what it calls “relational data” by tracking the subject’s friends, family and followers on social media platforms. It reportedly used artificial intelligence tools to collect “private information about movements such as geographic location.” It is interesting to note that the data collected by Zhenhua Data is already available in the public domain.

The ABC’s report notes that Zhenhua Data’s main database, which was leaked to Balding, has more than 2.4 million names and profiles, including those of 35,000 Australians. The information includes dates of birth, addresses, marital status, photographs, political associations, relatives and social media IDs. Zhenhua reportedly collates data from Twitter, Facebook, LinkedIn, Instagram, TikTok, news stories, criminal records and records of corporate misdemeanours. The ABC, too, notes that much of the information is collected (“scraped”) from openly available material. However, it adds that some profiles have information which “appears” to have been sourced from confidential bank records, job applications and psychological profiles. It speculated that the company could have sourced such information from the “dark web”. This is perhaps the only claim in any report that non-public, confidential data was also collected.

The Financial Review reported that the OKIDB ascribes a numerical ranking to each person. This ranking is reportedly used by China’s security agencies to better understand influential figures in a country. Zhenhua Data reportedly has 20 information collection centres around the world.

The ABC also reported that only about 10% of the database of 2.4 million records of individuals was recovered by Internet 2.0, a Canberra cybersecurity firm, which worked with journalists on the story. Of the 250,000 records recovered, 52,000 were of Americans, 9,700 British, 5,000 Canadians, 2,100 Indonesians, 1,400 Malaysians and 138 from Papua New Guinea.
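As a rough check on these figures, the short Python sketch below works out what share of the full database was recovered and how the listed nationalities compare with the recovered total. The numbers are those reported by the ABC; the variable and function names are illustrative and not taken from any of the reports.

```python
# Figures as reported by the ABC; names below are illustrative only.
TOTAL_RECORDS = 2_400_000      # reported size of the full database
RECOVERED_RECORDS = 250_000    # reported number of records recovered

# Nationalities listed in the report for the recovered records.
recovered_by_country = {
    "United States": 52_000,
    "United Kingdom": 9_700,
    "Canada": 5_000,
    "Indonesia": 2_100,
    "Malaysia": 1_400,
    "Papua New Guinea": 138,
}


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


recovered_share = share(RECOVERED_RECORDS, TOTAL_RECORDS)
listed_total = sum(recovered_by_country.values())

print(f"Recovered share of database: {recovered_share:.1f}%")
for country, count in recovered_by_country.items():
    print(f"{country}: {share(count, RECOVERED_RECORDS):.2f}% of recovered records")
print(f"Listed countries account for {listed_total:,} recovered records")
```

The recovered records work out to a little over 10% of the database, consistent with the report. The six nationalities listed add up to 70,338 records, under a third of the 250,000 recovered, so the list is evidently not exhaustive.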

What is ‘Hybrid Warfare’?

According to multiple reports, Zhenhua claims it is a pioneer in “hybrid warfare”, a term used to refer to unconventional forms of warfare. The company talks of manipulating reality via social media, and of how it will use big data for the “great rejuvenation of the Chinese nation”.

Christopher Balding, in a statement posted on his personal website on Monday, tried to explain the importance of the “Zhenhua Data Leak”. He said the data confirms activities that China had long been believed to engage in, but of which there had been no proof.

Balding said the data leak had proven that even experts on China had “radically” underestimated the country’s investment in monitoring and surveillance tools dedicated to controlling and influencing not just its domestic citizens but also assets outside China. “The world is only at the beginning stages of understand how much China invests in intelligence and influence operations using the type of raw data we have to understand their targets.”

So, was this ‘hybrid warfare’?

Almost all reporting on the Zhenhua Data leak indicates that the company, though it might be working with Chinese government agencies, is only collecting data that is freely available in the public domain. Reacting to the reports, several commentators noted that Zhenhua Data’s actions were not as alarming as they have been made to look, and that terms like “hybrid warfare” were exaggerations. Others noted that the collected data would only qualify as open source intelligence (OSINT): data that is collected from publicly available sources, as opposed to covert or clandestine sources.

Jeremy Kirk, executive editor of Australia-based Information Security Media Group, noted that his assessment of the dataset suggested that none of the data was “necessarily sensitive”. “If you put it on social media and your privacy settings are open, well, you’ve been warned for ages that this was a bad idea,” he said on Twitter.

In a subsequent story published in Data Breach Today, Kirk said that he had first received a tip on OKIDB in late December 2019 or early January, but he hadn’t followed it up. He noted that the database had been left unsecured on the internet, and said Zhenhua Data was not very good at securing its own data.

While admitting that the database itself was “impressive” for its size, Kirk said the data in it didn’t appear sensitive. He wrote: “All of it seemed to be public. For example, there were bits from U.S. Navy press releases announcing deployments of ships, some of which had been translated into Mandarin.” He added that he didn’t see anything that raised alarms. “There was a huge amount of data, some obvious link to China but nothing that was nefarious.” He added that though Zhenhua had used phrases such as “hybrid warfare”, it was just “marketing puffery”.

Kirk noted his discomfort with the Australian Financial Review’s headline that the database was a “social media warfare database”. He said that if someone had put any personal information on social media or on the internet, it was bound to be scraped. “Zhenhua Data feels like a company that has done what countless other Western companies have done in the age where data is the new oil: collect it and sell it. It wasn’t trying to hide.”

Maya Mirchandani, a former journalist who currently works with the Observer Research Foundation, noted that countries and intelligence agencies across the world collect data to track anyone with influence. In a tweet, she indicated that Zhenhua’s data collection was simply open source intelligence (OSINT), which was neither new nor unique.

Indeed, there exist several companies that offer OSINT tools to both governments and private citizens. They collate openly-available data to prepare analytics for a variety of purposes. Prominent examples include Maltego, Shodan and Recon-ng, and many others that work in the areas of cybersecurity.

Company denies allegations: Meanwhile, a representative of Zhenhua Data told the Guardian that the report was “seriously untrue”. The representative said that the company was only engaged in “data integration”, not collection, adding that all the data was already public on the internet. She also denied that the database had information on 2 million people.

The representative also told the Guardian that Zhenhua Data did not have links to the Chinese government or military, and that its customers were research organisations and business groups.

Were Zhenhua Data’s actions legal?

Based on a cursory reading of the Information Technology Rules, 2011, the collection of publicly available data by Zhenhua Data does not seem to break any laws. The Rules place restrictions on the usage of “sensitive personal data or information”, which is defined as that which consists of fields such as passwords, financial details, medical records, biometric information and sexual orientation, among others. They specifically state that information that is freely available or accessible in the public domain, or furnished under the Right to Information Act, 2005 or other laws, would not be regarded as sensitive.

Similarly, the proposed Personal Data Protection Bill, 2019, also allows for the processing of publicly available personal data, and it does not require the consent of the data principal (the person to whom the data relates).