Perhaps the time for Indic languages is now: mobile Internet and e-commerce are driving Internet penetration to towns that earlier didn’t find much need for it. India, as someone mentioned during #NAMA Indic, is an English-tolerant country, and our devices don’t mandate Indic language support. However, we don’t understand why this country needs to wait for a generational change for more people to get access to the Internet. The language barrier needs to be addressed.
#NAMA: The Digital Future of Indic Languages was a discussion that should have happened when we first tried to organize it, in 2009, but it’s never too late to do the right thing. It’s not a topic that is particularly “hot”, but it is an important one.
We’re thankful to Google India for supporting it. Do read our coverage of the compelling discussions that took place. Below is a summary of the key points made:
Devices & Fonts
– Parity is the word: if a device can be used in English, it should be just as easy for everyone to use it the same way in Indic languages.
– People don’t have money for desktops. Tablets and mobile phones and soft keyboards have made a difference.
– Google has released some exceptionally high quality Indic fonts.
– Work is required on Indic language interfaces.
– Fonts don’t go back to legacy devices.
– Fonts don’t render properly on different screen sizes, across devices.
“What we develop for a bigger size device is not suitable for lower size device. Sizes change from one tablet to another. They’re not fitting well with designs and interfaces. If we can standardize on screen sizes and font rendering…it is more severe in Indian languages than in English.” – Ravi Hegde, Group Editor of Udayavani.
– Demand for Indic language apps on mobile exists: a Hindi language phonebook pre-installed on 5 Micromax devices has seen 450,000 new users in the last month, without the app being available on the Play Store.
– A switch to Web-based operating systems (Firefox OS) might help address font issues, via webapps and HTML5.
– Typography and fonts will help grow content creation, and CSS3 has helped.
– Amazon is looking at enabling Indic language fonts on Indic.
– Keypads for Indic need to be standardized. People will get confused with different layouts from different handset manufacturers or soft keypad firms.
– We have approached Indic language technology in a very fragmented manner. Need to look at quality of content and consumption, ability to create content easily with tools and access process for these tools.
– Can’t base Indic language keyboards on how English behaves. The design of Indian language technologies and tools has to be done from the ground up. Swipe will not necessarily work on an Indian language keypad, since Indian characters combine to form different shapes.
– A central repository of fonts is needed, because publishers find it difficult to find fonts.
– A font for print will not work on the web and will not work on mobile.
– No one wants to install a font to view text.
– Google is working on Input mechanisms in Indic.
– Adding Quillpad (Indic) input to a publisher’s website increased average number of comments from 15 to 300.
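The point above about Indian characters combining to form different shapes can be seen at the Unicode level. The following is a minimal sketch, not something presented at the event: the Devanagari sequence and the helper name `code_points` are chosen purely for illustration. It shows that what a reader sees as one shape is stored as several code points, which is one reason input methods designed around one-key-one-letter English do not carry over directly.

```python
import unicodedata

def code_points(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Illustrative example (an assumption, not from the discussion):
# KA + VIRAMA + SSA, which fonts usually draw as a single conjunct shape.
conjunct = "\u0915\u094D\u0937"

for cp, name in code_points(conjunct):
    print(cp, name)
```

A keypad or swipe gesture therefore has to produce a sequence of consonants, joiners and vowel signs, while the font decides the final shape on screen.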
Publishers and advertising
– Traffic is growing. Growth for Manorama online over last year is around 50-60%. OneIndia does 460 million pageviews.
– Most people thought that only English-speaking people cared for fashion etc. Everyone wants good phones and cars. Growth in Indic vertical sites is indicative of that. Consumption of lifestyle, gadgets etc. has increased.
– Publishers have legacy platforms which they use to publish data into the digital platform. They need pluggable solutions for publishing to the web.
– Reverie is processing 70 to 80 million records through its transliteration API.
– Publishers don’t work with Ad networks because of issues regarding payments and timeframes, and lower rates.
– Direct advertisers willing to pay same CPM rates for Indic sites, if they have high traffic.
– Adwords contextual advertising doesn’t support Indic languages, which is a challenge for bloggers. No takers for Kannada Adwords.
– Clickthrough rates for Indic ads are 2x to 4x of English.
– Getting Indic language creatives is tough, but some advertisers like Cathay Pacific have advertised in Indic, for example, when targeting Hyderabad.
– Telecom operator billing works better for subscription. Newshunt has seen 3-3.5 million e-books downloaded in a short time, mostly Indic. Hundreds of e-books have crossed the 10,000 mark.
– URLs in Indic cannot be shared, according to one publisher.
– At times, companies don’t send out ads in Indic languages because they worry about whether the text will render on receiving device.
– E-commerce players not advertising on Indic despite 4x click-through rates because they’re unable to close the loop without translated websites.
– Indic language mobile apps for e-commerce firms are in the works.
– There is a need for an Indic language ecosystem approach to e-commerce: from advertising to landing page to payment gateway.
– Alibaba-sized companies will not be possible without Indic. Tier 1 and 2 cities are covered, and for Tier 3, we need Indic sites.
– Need to avoid archaic language translation. Machine translation needs to improve, to avoid the use of Sanskritised Hindi (languages) which no one understands.
– E-commerce will find it tricky to have user reviews and comments in multiple Indic languages because no one understands all languages. Translating comments to the language being displayed is important.
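On the point above that URLs in Indic cannot be shared: the discussion did not say why. One plausible factor, and this is an assumption on our part, is that non-ASCII characters in a URL path are percent-encoded when the link is copied or sent, which turns a short readable word into a long string of escapes. A minimal sketch using Python's standard `urllib.parse`, with a hypothetical article path:

```python
from urllib.parse import quote, unquote

# Hypothetical article path: "/" followed by the Hindi word for "news".
path = "/\u0938\u092e\u093e\u091a\u093e\u0930"

# Percent-encode the UTF-8 bytes of every non-ASCII character.
encoded = quote(path)

print(encoded)
print(len(path), "characters become", len(encoded))

# Decoding restores the original text, so the link itself still works.
decoded = unquote(encoded)
```

The link remains valid, but it is no longer readable or easy to type, which may be part of what the publisher was describing.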
Social vs Search
– English publishers get 60% traffic from Google Search. Indic publishers get 20%.
– Indic publishers get most of their traffic directly and via Facebook.
– Search based discovery has improved over the last two years, but is still not good enough. “It is nowhere near where it should be.”
– Google has been working on improving search capabilities over the past couple of years.
– Google shut Google News Kannada.
– Traffic from Google News is seasonal: if a publisher’s story reaches the top, it is great traffic. If it doesn’t, there’s hardly any.
– Facebook doesn’t yet have Indic input mechanisms, but saw Hindi double after it began showing registration pages in Indic languages on the basis of carrier, rather than the browser (which showed English).
– Facebook began exposing locale data to advertisers to allow them to target users in Indic languages.
Indic language comments and legal issues
– Publishers face legal challenges that arise from abusive comments in Indic languages, and do not have mechanisms to prevent these.
– Publishers aren’t aware that Facebook pages have an option for filtering words in Indic languages.
– Google has a spam and word filter for search suggest.
– Publishers want an open sourced comment system for filtering comments in Indic languages (and want Google to do this).
– Plustxt messaging app saw a 10x increase after enabling four Indic languages – Hindi, Tamil, Kannada and Malayalam.
– Google used to have a transliteration API, for which there is now a sunset date, in a couple of months. That is a problem.
YouTube and Indic languages
– YouTube allows metadata in Indic, but publishers don’t use it.
– English content on YouTube in India is number 4 in terms of consumption. Telugu, Tamil, Hindi are the top three.
– In the last 6 months, Google has received queries from Indic authors and publishers wanting to create YouTube channels for archiving content.
Government Policy & Wishlist
– Schools teaching how to use the Internet is making a difference.
– Government has legacy data which it wants to convert to indexable, searchable data. It hasn’t found good quality solutions for that.
– Government needs to mandate Indic languages on devices, and is working on a policy. Bangladesh is an example.
– Need language dictionaries and thesauruses. There is no way of monetizing this. Government has invested in it, and has spent Rs 50-60 crore on Indic language computing, OCR and transliteration. These investments will become redundant, and need to be made public and open source before they do.
– Success of government investments needs to be measured in terms of how many people use it. Accessibility is important, millions need to use these. People who don’t know English risk becoming third class digital citizens. Need to move from investing in technology to making it available.
– Government should not buy devices that don’t have Indic support, even if they can’t mandate Indic capability for imported handsets.
– All government tenders should also be published in Indic languages. Kerala government has content in Indic, but Karnataka doesn’t.
– All the e-governance should be done using a 3 language formula: State language, Hindi and English.
– Any law that is not published in a language that I understand should not be applicable to me. If I don’t understand the law, how can I abide by it?
– Parliamentary proceedings should be available in Indic languages.
– DAVP limit of minimum 5 million pageviews should be reduced for Indic languages.
– Central Government should allocate 70% of DAVP spends to Indic language sites for 3 years. State Government should allocate 90% of spends to Indic language sites.
– 95% of TV is in Indic languages, while 95% of the Web is in English.
– Google has just launched Maps in Indic.
– Google needs to create a data report on the Indic languages on the web. Lack of data is a problem. Big gap between comScore and Google Analytics numbers for Indic publishers, and comScore doesn’t track mobile.
– More agencies and advertisers need to advertise in and on Indic. Indic now has big traffic, but little advertiser interest.
– Google needs to create more awareness of Indic tools.
– Bhasha Niti is recommending a multi-year language policy, including programming in Indic languages.
– For all the Indic languages, the number of people who make more than 5 edits 5 times in a month is in double digits and less than 100. Malayalam tops, and Hindi is high, but Tamil, Kannada and Telugu are low.
– Telecom operators want to make services that are popular in English available in Hindi. When they tried with one, “In about 2 weeks, the subscriber user base jumped 3x”.
#NAMA Indic: The Digital Future of Indic Languages was supported by Google India.