Aragon Gouveia wrote:
So I take it this means that if one were writing a locale-aware application,
the application's ability to function predictably is very much up to the
platform and system on which it runs? i.e. one can't rely on just ensuring
gettext is installed correctly...
Yes. gettext does not replace the system's locales. If you are on a system
with broken locales, then either you have a localedef command (like on
glibc or Solaris systems), or you are hosed (that's the case on most
other systems, including *BSD, Cygwin, mingw).
I use FreeBSD primarily
You might want to try GNU/kFreeBSD instead: a glibc system with FreeBSD
kernel - and so it supports 'localedef'.
And be aware that the <ctype.h> functions are meaningless in multibyte locales
Does this apply to all systems? I use FreeBSD primarily, and their locales
are named, for example, "ja_JP.UTF-8" - this makes me think the FreeBSD
ctype functions will be multibyte aware...
The FreeBSD <ctype.h> functions are certainly multibyte aware. But isalnum()
is not sufficient for testing whether 'ü' is a lower-case or upper-case letter,
because it examines a single char, and in a multibyte encoding such as UTF-8
often strlen("Ü") == 2.
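
To illustrate the byte-versus-character mismatch, here is a minimal sketch. The helper name mb_char_count is hypothetical, and it assumes the caller has already selected the locale encoding with setlocale(LC_ALL, ""). It counts characters with mblen(), while strlen() counts bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the characters (not bytes) in s, decoded
   according to the current locale's encoding.
   Returns (size_t)-1 if s contains an invalid multibyte sequence. */
size_t mb_char_count(const char *s)
{
    size_t count = 0;

    mblen(NULL, 0);                       /* reset any shift state */
    while (*s != '\0') {
        int n = mblen(s, MB_CUR_MAX);     /* bytes used by next character */
        if (n <= 0)
            return (size_t)-1;            /* invalid sequence */
        s += n;
        count++;
    }
    return count;
}
```

In a UTF-8 locale, strlen("\xC3\x9C") (the two bytes that encode "Ü") is 2, whereas mb_char_count() on the same string should report one character. isalnum() would only ever be handed one of those two bytes.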
edit: just noticed FreeBSD has ctype-like functions such as iswalnum() for
handling "wide characters", which are declared in <wctype.h>. Cool! :)
Yes, mbtowc() + iswalnum() together are a working replacement for isalnum().
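
A minimal sketch of that combination follows. The name mb_isalnum is made up for illustration, and the code assumes the program has called setlocale(LC_ALL, "") beforehand so that mbtowc() decodes in the user's locale encoding.

```c
#include <stdlib.h>
#include <wctype.h>

/* Hypothetical helper: returns nonzero if the multibyte character at the
   start of s is alphanumeric in the current locale, 0 otherwise
   (including for an empty string or an invalid sequence). */
int mb_isalnum(const char *s)
{
    wchar_t wc;
    int n;

    mbtowc(NULL, NULL, 0);            /* reset any shift state */
    n = mbtowc(&wc, s, MB_CUR_MAX);   /* decode one character */
    if (n <= 0)
        return 0;                     /* NUL or invalid sequence */
    return iswalnum((wint_t)wc) != 0;
}
```

Only a single character is converted to wchar_t here. The string itself stays a char* string.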
But I would not recommend using functions which work on wide character
*strings* (wchar_t*) - doing so causes more problems than it solves. The
preferred representations for strings continue to be char* strings,
either in locale encoding (the default) or in UTF-8 encoding (see also
the unistr/u8* functions in gnulib).
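
As a rough sketch of what working on char* strings in locale encoding can look like, the following walks a string one character at a time with mbrtowc() and never builds a wchar_t* string. The function name count_alnum is hypothetical, and the caller is assumed to have called setlocale(LC_ALL, "").

```c
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the alphanumeric characters in a char* string
   in locale encoding. Returns (size_t)-1 on an invalid or incomplete
   multibyte sequence. */
size_t count_alnum(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);  /* initial conversion state */
    while (remaining > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;        /* invalid or truncated sequence */
        if (n == 0)
            break;                    /* NUL; not expected within strlen(s) bytes */
        if (iswalnum((wint_t)wc))
            count++;
        s += n;
        remaining -= n;
    }
    return count;
}
```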
Bruno