Wikipedians Digitizing Out-Of-Copyright Texts In Eight Indian Languages

In a painstaking process, Wikipedians are digitizing out-of-copyright Indian-language texts, trying to address the comparative paucity of Indic language texts online. Wikisource is a repository of documents and archived material that serves as a reference source for Wikipedia, and a means of improving access to information sources. Of the 64 languages Wikisource is available in, 8 are Indian: Tamil, Malayalam, Telugu, Kannada, Sanskrit, Marathi, Bengali and Gujarati. What's particularly notable about this digitization is that the texts are being typed out by volunteers on their own time, one word at a time.

How It Began

Users were adding bhajans of Mirabai to Wikipedia, but according to Wikipedia's policies, recipes, poems and song lyrics belong on Wikibooks or Wikisource, Noopur Raval, Communications Consultant (India Program) at the Wikimedia Foundation, told MediaNama. One user raised this issue, and following discussions, it was decided to create a Wikisource for Gujarati. The first text to be digitized, though, was Rachnatmak Karyakram, a book by Mahatma Gandhi. The project, involving the digitization of 60 pages, took six volunteers a week. This was followed by another project, the digitization of Gandhi's autobiography, with a group of 13 people typing out the book over a month.

Identification & Prioritization Of Texts For Digitization

Selection of texts for digitization is entirely community driven: the community decides what is important. Editors put up a notice for the project, and user participation is sought. For example, the Gujarati Wikisource editors chose…

Written By

Founder @ MediaNama. TED Fellow. Asia21 Fellow @ Asia Society. Co-founder SaveTheInternet.in and Internet Freedom Foundation. Advisory board @ CyberBRICS

MediaNama is the premier source of information and analysis on Technology Policy in India.

© 2008-2021 Mixed Bag Media Pvt. Ltd. Developed By PixelVJ
