Generative AI for Bharat: Notes from a conference by Full Stack Capital

Most visual models don’t have enough Indian data and they do a poor job of generating Indian visuals.

Last Friday, Full Stack Capital held a workshop and a conference on Generative AI for Bharat. We didn’t discuss regulation of Artificial Intelligence there, but I was on a Mirror Now panel about regulating AI (where there was a substantial amount of scaremongering).

The slides from the workshop, by Varshul CW, Co-founder of Dubverse, which works on Indian text to speech, are available here. 

Some notes to consider from the workshop and the conference, put together by Anurag Saxena, Founder of Digital Economy Foundation and COO at EasyGov, Aashay Sachdeva from Rebright Partners, and me (Nikhil Pahwa, Founder of MediaNama):

  1. AI language models in Indian languages and for Indian contexts are scarce. Most visual models don’t have enough Indian data and they do a poor job of generating Indian visuals.
    Anurag adds: For the same reason, AI doesn’t give a global audience accurate information about India in many areas. For example, if one asks ChatGPT4 to create an image of a school in India, most results show poor school infrastructure. This limits our soft power as a country and limits the visibility of India’s development to a global audience.

[Image: example from the latest Stable Diffusion model]

  1. Access to Indian-language tokens in existing tools like ChatGPT, for experimentation, is more expensive than for English.
    Anurag adds: API access for Hindi content costs almost double that for English, and up to eleven times as much for regional Indian languages. This inhibits deployment of AI in Indian languages.
    Aashay adds: Check out the number of tokens used by ChatGPT for English vs your local language: https://platform.openai.com/tokenizer
  2. There’s support among businesses for the approach Japan is considering for AI regulation, especially the idea that copyright should not apply to the creation of language models.
  3. Indic languages are being seen as a business moat for generative AI.
  4. The AI stack for India is not ready. There’s a supply problem in terms of language solutions. It’s broken across modalities, whether text, visual or audio, and, for many Indian languages, even in text to speech. Speech to text and text to speech remain big opportunities, especially for regional Indian languages. There’s a saying in North India: कोस कोस मे बदले पानी, चार कोस मे बदले वाणी। (Water changes every kilometer, dialect changes every four kilometers).
    Aashay adds: A few open-source initiatives are under way to bridge the gap. The AI4Bharat lab at IIT Madras has Indic-specific models across modalities and multiple datasets, and is collaborating with government agencies. They plan to release their own LLM soon.
  5. Cost of supply is coming down. Means of production are changing.
  6. We’re still struggling with answers in terms of use cases.
    Aashay adds: Across our portfolio companies and the wider ecosystem, we are seeing the number of use cases explode in healthcare, education, content creation and more. They are currently limited only by cost, or by the quality of results not being on par with other languages.
  7. Open sourcing of market data will help.

Anurag adds: Democratization of data helps in scaling business. One can monetize through the increase in transactions. However, the data should be freely shared. (Nitin of OfBusiness spoke about this, as he intends to do the same in the next phase of growth.)

  1. Vertical applications of AI, in legal, medical and other domains, will help move the adoption curve.
  2. AI can give non-English speakers access to no-code programming, giving Bharat the opportunity to create.
    Anurag adds: Anybody can write down instructions in the form of a paragraph in any language, and AI models will be able to write code. Thus, the dependency on knowing the English language will reduce.
  3. Marginal cost of software development will drop. 

Anurag adds: Domain experts will have an edge as the cost of technology development will reduce significantly. Many builders will solve complex problems with limited technology resources.
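
The token-cost point in the notes above can be illustrated with a minimal Python sketch. It assumes a byte-level BPE tokenizer, which starts from UTF-8 bytes: Devanagari characters take three bytes each, against one for ASCII, so Hindi text starts at a disadvantage before any merges are applied. The function names and the token counts in the last line are hypothetical, chosen only for illustration. Byte counts are a rough proxy, and actual token counts should be checked with the tokenizer linked above.

```python
# Rough proxy for why Indian-language text costs more through token-priced APIs.
# Assumption: the tokenizer is byte-level BPE, so it starts from UTF-8 bytes.
# Real token counts depend on the learned merges; measure them with the tokenizer.


def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per non-whitespace character."""
    chars = [c for c in text if not c.isspace()]
    return sum(len(c.encode("utf-8")) for c in chars) / len(chars)


def relative_cost(tokens: int, english_tokens: int) -> float:
    """How many times more a prompt costs than its English equivalent,
    assuming the same price per token for both."""
    return tokens / english_tokens


english = "Water changes every kilometer"
hindi = "कोस कोस मे बदले पानी"

print(utf8_bytes_per_char(english))  # ASCII letters: one byte each
print(utf8_bytes_per_char(hindi))    # Devanagari code points: three bytes each

# Hypothetical token counts, chosen only to mirror the "eleven times" figure.
print(relative_cost(tokens=1100, english_tokens=100))
```

Because pricing is per token, any inflation in token count for the same sentence translates directly into a higher bill, which is the effect Anurag and Aashay describe.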

This post is released under a CC-BY-SA 4.0 license. Please feel free to republish on your site, with attribution and a link. Adaptation and rewriting, though allowed, should be true to the original.

Written By

Founder @ MediaNama. TED Fellow. Asia21 Fellow @ Asia Society. Co-founder SaveTheInternet.in and Internet Freedom Foundation. Advisory board @ CyberBRICS

MediaNama’s mission is to help build a digital ecosystem which is open, fair, global and competitive.

MediaNama is the premier source of information and analysis on Technology Policy in India. More about MediaNama, and contact information, here.

© 2008-2021 Mixed Bag Media Pvt. Ltd. Developed By PixelVJ