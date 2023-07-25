
The Gap Between Responsibility and Liability of AI

In India, open-source AI could prove crucial for building AI-based tools tailored to local needs. At the same time, regulations should be put in place to prevent questionable use cases.

The end of last week was tricky for us at MediaNama: we shifted servers and, once we went live, were inundated with bot activity that sent RAM usage off the charts. What was interesting was that a substantial share of that activity came from bots looking to scrape content from MediaNama for training AI. A noteworthy bot was from ByteDance, but there were several others with “AI” as a suffix in their names. It’s likely that we’re not alone here, and that most news publications get this too. That’s not what surprised us.

What surprised us was that, over the weekend, even after we edited our robots.txt file to disallow all bots (even those for search), the AI bot activity continued. For the uninitiated, robots.txt is a web standard: a simple text file you place on your web server to tell bots which parts of your website they should not crawl.
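To make the mechanism concrete, here is a minimal sketch of a "disallow everything" robots.txt and of how a well-behaved crawler would consult it, using Python's standard `urllib.robotparser`. The URL and the bot name `ExampleAIBot` are hypothetical, and this is not MediaNama's actual file. The check runs inside the crawler, not on the server, which is why a bot that skips it is never stopped.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from the whole site.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL and bot names, used only for illustration.
    url = "https://example.com/some-article"
    for agent in ("Googlebot", "ExampleAIBot"):
        print(agent, "allowed:", is_allowed(DISALLOW_ALL, agent, url))
```

A compliant crawler performs a check like this before each request and backs off when the answer is no; nothing in the file itself enforces that.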

A couple of months ago, when a popular AI company made a high-profile visit to India, I asked an executive at a private meeting about copyright issues and whether they have a robots.txt exclusion mechanism. They carefully evaded the question.

Here’s the thing: just because something is public or publicly available doesn’t mean that it isn’t protected by copyright or privacy law. This is at the heart of the issues with Clearview AI using social media images to train its facial recognition software, and with publishers and content creators suing OpenAI for using copyrighted material as training data.

To their credit, Google began a process earlier this month to figure out a robots.txt-style exclusion for AI. But while robots.txt is a web standard respected by search engine bots, what we experienced at MediaNama last week suggests that it isn’t necessarily respected by other bots.

While I support the idea of open-sourcing AI (it is essential for the development of AI, and especially for ensuring that Indians can participate in building AI-based tools for India, for example by allowing someone to write code in their local language instead of English), I’m reminded of a comment someone made at that discussion: that open-sourcing AI is like giving everyone a nuclear weapon. While that is an extreme position, there is a need for balance here. If tools can be used to translate Telugu into Bodo, they can also create deepfakes that could lead to violence in a country lacking adequate law enforcement resources. The uncanny valley is a thing of the past.

There’s also talk of watermarking AI-generated content, but there are also tools for removing watermarks. Something similar happened with DRM: having tools that prevent content from being copied doesn’t mean that tools for removing those technological safeguards don’t exist. Bots can always choose not to respect the robots.txt exclusion, and that will mean that some parts of the web will choose not to be public.

Should the tools be blamed? Is open-sourcing AI enabling an exponential growth in problematic use cases? While it’s easy to blame the enablers, the fact remains that technology tools cannot be held liable for how we use them. However, they do tend to be seen as enablers, and thus there’s a legal gap between the responsibility and the accountability of platforms. With time, the law will seek to fill that gap. Just wait for the volume of harms to escalate.

I doubt that we have solutions right now. I definitely don’t think a solution lies in preventing open-sourcing. Centralisation of AI with a few players will lead to an uncompetitive oligopoly, and the tech will leak anyway. At the same time, I do think it’s necessary to have these debates so that we can identify solutions. There will be no silver bullet, and we’ll need regulation to address issues related to intellectual property, competition, electoral interference, and a multitude of user harms. What shape those regulations take, and whether there should be a UN body or local legislation, is something that still needs to be fleshed out.

P.S.: On the robots.txt exclusion, I queried ChatGPT and Bard a few weeks ago about how they might be excluded. Here’s what I got:

  1. Use a robots.txt file. A robots.txt file is a text file that tells search engines which parts of your website they can and cannot crawl. To prevent Bard from crawling your website, you can add the following lines to your robots.txt file:
    Code snippet:
    User-agent: Bard
    Disallow: /
    User-agent: ChatGPT-User
    Disallow: /
  2. You can also use a content delivery network (CDN) to make it more difficult for ChatGPT to crawl your website. A CDN is a network of servers that deliver content to users from the closest server. This can help to improve the performance of your website and make it more difficult for ChatGPT to crawl your website.
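As a rough illustration of what per-bot rules like the ones quoted above would do, the sketch below feeds them to Python's standard `urllib.robotparser`. The user-agent names `Bard` and `ChatGPT-User` are taken as-is from the chatbots' answers and are not verified here as the tokens any real crawler sends; `SomeOtherBot` and the URL are hypothetical. The point is that such rules name specific bots only, and they bind only bots that choose to check them.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted in the chatbot answers (user-agent names unverified).
QUOTED_RULES = """\
User-agent: Bard
Disallow: /

User-agent: ChatGPT-User
Disallow: /
"""


def check_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Map each user-agent name to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # "SomeOtherBot" is a made-up name standing in for any bot not listed.
    result = check_agents(
        QUOTED_RULES,
        ["Bard", "ChatGPT-User", "SomeOtherBot"],
        "https://example.com/some-article",
    )
    for agent, allowed in result.items():
        print(f"{agent}: {'allowed' if allowed else 'disallowed'}")
```

Any bot not named in the file, and with no `User-agent: *` rule to fall back on, is treated as allowed, so a site that wants to exclude every crawler would need a wildcard rule rather than a list of names.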

Written By

Founder @ MediaNama. TED Fellow. Asia21 Fellow @ Asia Society. Co-founder SaveTheInternet.in and Internet Freedom Foundation. Advisory board @ CyberBRICS
