In a painstaking process, Wikipedians are digitizing out-of-copyright Indian-language texts, trying to address the comparative paucity of Indic-language texts online. Wikisource is a repository of documents and archived material that serves as a reference source for Wikipedia and as a means of improving access to information sources. Of the 64 languages Wikisource is available in, 8 are Indian: Tamil, Malayalam, Telugu, Kannada, Sanskrit, Marathi, Bengali and Gujarati. What is particularly notable about this digitization is that the texts are being typed out by volunteers on their own time, one word at a time.

How It Began

Users were adding bhajans of Mirabai to Wikipedia, but according to Wikipedia's policies, recipes, poems and song lyrics belong on Wikibooks or Wikisource, Noopur Raval, Communications Consultant (India Program) at the Wikimedia Foundation, told MediaNama. One user raised this issue, and following discussions, it was decided to create a Wikisource for Gujarati. The first text to be digitized, though, was Rachnatmak Karyakram, a book by Mahatma Gandhi. The project, involving the digitization of 60 pages, took six volunteers a week. It was followed by another project, the digitization of Gandhi's autobiography, with a group of 13 people typing out the book over a month.

Identification & Prioritization Of Texts For Digitization

Selection of texts for digitization is entirely community-driven: the community decides what is important. Editors put up a notice for the project and seek user participation. For example, the Gujarati Wikisource editors chose…
