gnokii charset problems

From: Osma Suominen
Subject: gnokii charset problems
Date: Sat, 31 May 2003 11:27:35 +0300 (EEST)


I inspected the gnokii source to see how it handles different charsets
and the locale. There are some problems. Here are the notes I prepared
offline (I'm doing most of my work offline, as I'm travelling and have a
connection only sporadically).

So here is a description of what gnokii does, what's wrong with it and
how it probably should be fixed. My assumption is that the charset
associated with the locale should be used when handling input and
output; what happens inside gnokii and between the phone and gnokii is
another issue and can be done in whatever way works best.

Enjoy. Develop an itch. Fix. ;)


Character set conversions used by gnokii when handling text SMS messages
(xgnokii not examined, only gsm-sms.c in libgnokii and CLI gnokii)


When sending a message:

        - assume Latin1 input regardless of locale (NOT OK)
        - if input contains a char not in the Latin1/DA map, set
          encoding to UCS2 (NOT OK)
                -> gsm-sms.c translates from <locale> to UCS2 (OK)
        - if requested encoding is DA, convert from Latin1 to DA (NOT OK)
        - if requested encoding is UCS2, convert from <locale> to UCS2 (OK)


When reading a message:

        - print user_data[0].u.text to stdout; no charset conversion (OK)
        - if encoding is UCS2, convert from UCS2 to <locale> (OK)
        - if encoding is DA, convert from DA to Latin1 (NOT OK)


gnokii.c assumes input to be in Latin1 regardless of locale. In a
non-Latin1 locale, a character that should be UCS2 encoded can have a
byte value that happens to match a Latin1/DA mapping, so DA is used
instead of UCS2.
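
To make the collision concrete, here is a small sketch. ISO-8859-5 is
just my choice of example charset and the function names are made up;
none of this is gnokii code. The byte 0xE4 is a Cyrillic letter in
ISO-8859-5 but is "a with diaeresis" in Latin1, and the latter is in
the default alphabet.

```c
#include <stdint.h>

/* Latin1 maps every byte to the code point with the same value. */
static uint32_t latin1_to_unicode(unsigned char b)
{
    return b;
}

/* Partial ISO-8859-5 decoder (illustration only): ASCII plus the main
 * Cyrillic block 0xB0..0xEF. Returns 0 for bytes it does not cover. */
static uint32_t iso8859_5_to_unicode(unsigned char b)
{
    if (b < 0x80)
        return b;
    if (b >= 0xB0 && b <= 0xEF)
        return 0x0410u + (uint32_t)(b - 0xB0);
    return 0;
}
```

With these, latin1_to_unicode(0xE4) gives U+00E4, which has a DA
mapping, while iso8859_5_to_unicode(0xE4) gives U+0444, which does not
and would need UCS2.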

If DA encoding is specified when calling gn_sms_send, input is assumed
to be in Latin1 regardless of locale.

If a message is DA encoded, output is in Latin1 regardless of locale.
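
For the output side, a rough sketch of what a DA to <locale> conversion
could look like, using only standard C. The function name, the table
name and the '?' fallback are my assumptions, not existing gnokii code.
It also assumes wchar_t holds Unicode code points (true on glibc), and
the escape code 0x1B is simply shown as a space instead of handling the
extension table.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* GSM default alphabet (basic table) to Unicode, indexed by septet.
 * 0x1B (escape to the extension table) is mapped to a space here. */
static const wchar_t da_to_unicode[128] = {
    L'@',   0x00A3, L'$',   0x00A5, 0x00E8, 0x00E9, 0x00F9, 0x00EC,
    0x00F2, 0x00C7, L'\n',  0x00D8, 0x00F8, L'\r',  0x00C5, 0x00E5,
    0x0394, L'_',   0x03A6, 0x0393, 0x039B, 0x03A9, 0x03A0, 0x03A8,
    0x03A3, 0x0398, 0x039E, L' ',   0x00C6, 0x00E6, 0x00DF, 0x00C9,
    L' ',   L'!',   L'"',   L'#',   0x00A4, L'%',   L'&',   L'\'',
    L'(',   L')',   L'*',   L'+',   L',',   L'-',   L'.',   L'/',
    L'0',   L'1',   L'2',   L'3',   L'4',   L'5',   L'6',   L'7',
    L'8',   L'9',   L':',   L';',   L'<',   L'=',   L'>',   L'?',
    0x00A1, L'A',   L'B',   L'C',   L'D',   L'E',   L'F',   L'G',
    L'H',   L'I',   L'J',   L'K',   L'L',   L'M',   L'N',   L'O',
    L'P',   L'Q',   L'R',   L'S',   L'T',   L'U',   L'V',   L'W',
    L'X',   L'Y',   L'Z',   0x00C4, 0x00D6, 0x00D1, 0x00DC, 0x00A7,
    0x00BF, L'a',   L'b',   L'c',   L'd',   L'e',   L'f',   L'g',
    L'h',   L'i',   L'j',   L'k',   L'l',   L'm',   L'n',   L'o',
    L'p',   L'q',   L'r',   L's',   L't',   L'u',   L'v',   L'w',
    L'x',   L'y',   L'z',   0x00E4, 0x00F6, 0x00F1, 0x00FC, 0x00E0
};

/* Convert n unpacked DA septets to a NUL-terminated string in the
 * current locale's charset. Characters the locale cannot represent
 * become '?'. Output is truncated if it does not fit.
 * Returns the number of bytes written, not counting the NUL. */
size_t sketch_da_to_locale(const unsigned char *da, size_t n,
                           char *out, size_t outsz)
{
    mbstate_t st;
    char tmp[MB_LEN_MAX];
    size_t used = 0;
    size_t i;

    if (outsz == 0)
        return 0;
    memset(&st, 0, sizeof st);
    for (i = 0; i < n; i++) {
        size_t len = wcrtomb(tmp, da_to_unicode[da[i] & 0x7F], &st);
        if (len == (size_t)-1) {        /* not representable */
            tmp[0] = '?';
            len = 1;
            memset(&st, 0, sizeof st);  /* reset conversion state */
        }
        if (used + len >= outsz)        /* keep room for the NUL */
            break;
        memcpy(out + used, tmp, len);
        used += len;
    }
    out[used] = '\0';
    return used;
}
```

The caller would have to call setlocale(LC_ALL, "") once at startup;
otherwise the program runs in the "C" locale and anything outside
ASCII will typically come out as '?'.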


gn_char_def_alphabet should be made locale-aware; this would fix
gnokii.c.

gsm-sms.c should convert from <locale> to DA (or Latin1 on the 3110
series) and back.
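
As a starting point, a locale-aware check could decode the input with
the locale's charset first and only then look at the default alphabet.
The sketch below is not the current gn_char_def_alphabet; all names in
it are hypothetical. It assumes wchar_t holds Unicode code points, it
only knows the basic DA table (no extension table), and the 160
character limit is just a convenient buffer size.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

enum sketch_encoding { SKETCH_ENC_DA, SKETCH_ENC_UCS2 };

/* Non-ASCII characters of the basic GSM default alphabet. */
static const wchar_t da_extra[] =
    L"\u00A3\u00A5\u00E8\u00E9\u00F9\u00EC\u00F2\u00C7\u00D8\u00F8"
    L"\u00C5\u00E5\u0394\u03A6\u0393\u039B\u03A9\u03A0\u03A8\u03A3"
    L"\u0398\u039E\u00C6\u00E6\u00DF\u00C9\u00A4\u00A1\u00C4\u00D6"
    L"\u00D1\u00DC\u00A7\u00BF\u00E4\u00F6\u00F1\u00FC\u00E0";

static int in_default_alphabet(wchar_t c)
{
    if (c == L'\n' || c == L'\r')
        return 1;
    /* Printable ASCII is in the basic table except these nine. */
    if (c >= 0x20 && c < 0x7F)
        return wcschr(L"[\\]^`{|}~", c) == NULL;
    return c != L'\0' && wcschr(da_extra, c) != NULL;
}

/* DA if every character has a DA mapping, UCS2 otherwise. */
enum sketch_encoding sketch_pick_encoding(const wchar_t *text)
{
    for (; *text != L'\0'; text++)
        if (!in_default_alphabet(*text))
            return SKETCH_ENC_UCS2;
    return SKETCH_ENC_DA;
}

/* Same check for a string in the current locale's charset.
 * Looks at the first 160 characters at most.
 * Returns 0 on success, -1 if the input is not valid in the locale. */
int sketch_pick_encoding_mb(const char *input, enum sketch_encoding *enc)
{
    wchar_t buf[161];
    size_t n = mbstowcs(buf, input, 160);

    if (n == (size_t)-1)
        return -1;
    buf[n] = L'\0';     /* mbstowcs does not terminate when it fills up */
    *enc = sketch_pick_encoding(buf);
    return 0;
}
```

After setlocale(LC_ALL, ""), a caller would pass the message text to
sketch_pick_encoding_mb and use the result as the encoding. Because the
decision is made on decoded characters, a byte like 0xE4 is judged by
what it means in the user's charset, not by what it would mean in
Latin1.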


DA              Default Alphabet (as per GSM spec, 128 characters)
Latin1          ISO Latin-1 aka ISO-8859-1 charset
UCS2            Unicode encoding with 16-bit wide characters
<locale>        Character set specified in locale

*** Osma Suominen *** address@hidden *** ***
