
Speaking at the IAMAI Conference on Mobile Content & Services, Dr. Nadeem Akhtar of the Centre of Excellence in Wireless Technology (CEWiT) said that a new standard for sending SMS in Indian languages is likely to be implemented soon. “The new scheme was proposed to the standards body 3GPP, and it has been accepted at the first level. We are expecting a final approval in the next couple of weeks. There’s a meeting from the 16th-18th (of September), and it’s likely to be accepted. It might take 2-3 years, optimistically, for the new schemes to be accepted in the market,” Dr. Akhtar told MediaNama.

Need For The New Standard

“Indic languages are complex, and the current standard specifies an encoding scheme which is Unicode based, which is 2 bytes per character. You can’t get more than 70 characters, or 7-8 Indic words, in an SMS today, less than half the SMS size for English. The operators who deploy this in India use picture messaging, and send it out as an image, as an EMS. The receiver also has to be able to display that. The image is so large that it has to be split into 2-3 messages. The cost of sending that 160 character message is thus tripled, and it’s Rs. 3 for a Hindi SMS in case of Indian operators. The other issue is of keypad layouts – every vendor has his own way of Indic keys, there is no standardization.”
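
The character counts quoted above follow from simple arithmetic. Here is a minimal sketch, assuming the 140-byte SMS payload that is mentioned in the comments below; the function name `max_chars` is purely illustrative.

```python
PAYLOAD_OCTETS = 140  # assumed SMS payload size, per the 140-byte limit cited in the comments


def max_chars(bits_per_char: int) -> int:
    """Return how many fixed-width characters fit in one SMS payload."""
    return (PAYLOAD_OCTETS * 8) // bits_per_char


print(max_chars(7))   # 160 (7-bit text, as used for English)
print(max_chars(16))  # 70  (2 bytes per character, as used for Indic text today)
```

At 2 bytes per character, the same payload holds less than half as many characters, which is the gap the new scheme is meant to close.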

CEWiT focused on the Indic encoding part and defined a 7-bit encoding scheme, which allows 155 characters in an Indic SMS, comparable with an English SMS. This has been done for 10 major Indic scripts – Bengali, Gujarati, Hindi, Kannada, Malayalam, Oriya, Punjabi, Tamil, Telugu and Urdu – and it also supports transliteration. Akhtar suggests that with 10 language tables, all 22 official languages in India can be supported. Dr. Akhtar said that CEWiT hasn’t tried to patent the standard, because it is a government funded entity, and this is in public interest.
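
One plausible reason for 155 rather than 160 characters is that a few bytes of each message are set aside to name the language table in use. The sketch below is an assumption, not the CEWiT specification. It packs 7-bit codes into bytes the way GSM text messages do, and reserves a hypothetical 4-byte header (`HEADER_OCTETS`); the 140-byte payload comes from the comments below.

```python
PAYLOAD_OCTETS = 140  # SMS payload size cited in the comments
HEADER_OCTETS = 4     # ASSUMPTION: bytes reserved to identify the language table


def pack_septets(codes):
    """Pack 7-bit codes into bytes, filling each byte from its low bits up."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for code in codes:
        if not 0 <= code < 128:
            raise ValueError("each code must fit in 7 bits")
        acc |= code << nbits
        nbits += 7
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial byte, zero-padded
    return bytes(out)


def fits(num_chars: int) -> bool:
    """True if num_chars 7-bit codes plus the assumed header fit in one SMS."""
    packed = pack_septets([0] * num_chars)
    return HEADER_OCTETS + len(packed) <= PAYLOAD_OCTETS


print(fits(155), fits(156))
```

Under these assumptions, 155 codes pack into 136 bytes and fill the payload exactly once the header is added, while a 156th code would spill over.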

“More than Peer to Peer, we believe that push services will drive the uptake. Because now you have more space, you can push out more information. There’s 1/3rd cost reduction for them as well. Public service messages can now be in local languages – at least all handsets will be able to receive it,” Dr. Akhtar said.

Challenges & Need For Handset Manufacturers & Telcos

Most of the challenges are related to the acceptance of encoding among handset manufacturers. “Keypad layouts vary from device to device. We need vendor support, in 6 months, one year, two years – there needs to be support for this encoding in new handsets. This will incentivize people to use these handsets. There’s value for the handset manufacturers, particularly for rural areas. The manufacturers have to figure out how they will implement the solution. The challenge is that with the new handsets coming out, it should have both sending and receiving – encoding and decoding of the messages.”

“The bigger challenge is with the devices already in the market, which will not support this scheme. For them, we are trying to find a way by which we can convert from the new scheme to the legacy scheme. You have one mapping, and it converts for receiving. However, legacy devices will not be able to send using the new system, that conversion can be done on the network. Instead of the legacy support in the device, you can also do the conversion on the network, but that still means that 3 messages are being sent, and costs are higher. We feel that a terminal (Handset) based solution is most important.”
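
The "one mapping" conversion described above can be pictured as a table lookup. This is a rough sketch only: the table contents and the name `to_legacy` are invented for illustration and are not the code assignments in the proposed standard. It assumes the legacy scheme is the 2-bytes-per-character Unicode encoding mentioned earlier.

```python
# HYPOTHETICAL per-language tables: these 7-bit code assignments are made up
# for illustration and do NOT come from the proposed standard.
TABLES = {
    "hi": {1: "\u0928", 2: "\u092E", 3: "\u0938", 4: "\u094D", 5: "\u0924", 6: "\u0947"},
    "ta": {1: "\u0BB5", 2: "\u0BA3", 3: "\u0B95", 4: "\u0BCD", 5: "\u0BAE"},
}


def to_legacy(lang: str, codes: list) -> bytes:
    """Convert 7-bit codes for one language into 2-byte-per-character text."""
    table = TABLES[lang]                     # pick the table for this language
    text = "".join(table[c] for c in codes)  # 7-bit code -> Unicode character
    return text.encode("utf-16-be")          # 2 bytes per character here


message = to_legacy("hi", [1, 2, 3, 4, 5, 6])
print(message.decode("utf-16-be"))  # नमस्ते
print(len(message))                 # 12
```

The same code value maps to a different character in each table, which is how a 7-bit code can serve several scripts. The converted message is also larger than the original, which is why network-side conversion still costs extra messages.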

8 Comments until now.

Shashi + September 2nd, 2009 (#):

How can one support so many languages and characters in 7 bits? If the code is 7 bits long, it can support only 128 characters, which is barely sufficient for 2 regional languages.

Ankit + September 2nd, 2009 (#):

@Shashi: probably using some sort of code pages.

This standard seems pointless. The 140 characters per SMS limit is arbitrary (I feel). Wouldn't it be better to increase the limit, and use Unicode, than defining yet-another-encoding-scheme, whose implementation will require device changes?

Nadeem + September 2nd, 2009 (#):

A few points that need to be clarified:

1. The 7-bit code is used for each language, so there are separate character tables for each language.

2. The 140 byte limit is not arbitrary. Short messages are transported on signalling channels normally used for control messages, and that limits the payload.

Serge + September 3rd, 2009 (#):

try out 44 languages more:
http://ok-board.com

@kaippally + September 4th, 2009 (#):

This is regressive and downright stupid. This sort of short term thinking is what brought Indian language computing to such a dilapidated state. It took over 10 years for Indic Unicode to stabilize to the standard we see today.

Instead of changing the GSM SMS limitation, our short-sighted geniuses are designing an ALL NEW encoding standard. Yeah, that's right, let's cut the feet to fit the shoe.

First there were a hundred different ASCII – HACK fonts to download just to read a newspaper. And now we have yet another encoding scheme to add to the confusion.

Is the GSM standard such a huge brick wall that Indian developers can't propose an expansion?

Shweta + September 5th, 2009 (#):

As mentioned in the article, today Indic language messages are sent using images. Then what are the existing standards we are talking about? Is there any standard existing for Indic languages now? Is Unicode a standard for SMS in English and other languages, or is it just for one?

some1 + October 13th, 2009 (#):

Unicode is not used for English … English SMS uses GSM 7-bit encoding, while Indian languages presently use Unicode 16-bit encoding … hence the reduction in the number of characters to 70, as opposed to 160 in English.

ardibehest + February 4th, 2010 (#):

So what's new? See the 7-bit ISCII code chart, available as a BIS document; everything is accommodated on it. The BIS document was first framed in '88 and then in '92. So the claim that all 10 languages can be accommodated should be emended, since ISCII supports not only languages but SCRIPTS.
My advice: look at what has already been designed.