Re: [bug-gnu-libiconv] [PATCH] OS/2 patches for libiconv
From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] [PATCH] OS/2 patches for libiconv
Date: Sun, 12 Jun 2011 12:56:15 +0200
User-agent: KMail/1.9.9
Hi,
KO Myung-Hun wrote:
> >> 0002-If-codeset-is-not-set-by-the-user-use-a-codepage-for.patch
> >
> > This one has the effect that when the user has set the environment variable
> > LC_ALL or LC_CTYPE or LANG to a value that contains no dot, then the program
> > will use the encoding from the codepage in the OS.
> >
> > This is not good, because that's not how POSIX programs are supposed to
> > behave (see
> > <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html>):
> >
> > - If LC_ALL, LC_CTYPE, or LANG are set to a non-empty value, this value
> >   holds. What localcharset.c does additionally is that it maps locale
> >   names to encodings. But in any case it is important that in the "C"
> >   locale the results don't depend on operating system settings, because
> >   users on different machines should get the same results.
> >
>
> I could find that C or POSIX locale should be assumed if charset is not
> specified by those environmental variables.
... and if the platform has no working locales. Yes. But your patch modified
the behaviour of locale_charset() when environment variables are set.
> > - If LC_ALL, LC_CTYPE, LANG are not set, _then_ the program is free to use
> > the settings from the operating system (see POSIX, above):
> > "All implementations shall define a locale as the default locale, to
> > be invoked when no environment variables are set, or set to the empty
> > string. This default locale can be the POSIX locale or any other
> > implementation-defined locale. Some implementations may provide
> > facilities for local installation administrators to set the default
> > locale, customizing it for each location."
> >
> > So, if you found that localcharset did not return the encoding you expected,
> > then either
> > - unset some environment variables, or
> > - add a mapping from locale name to encoding in the file charset.alias.
>
> Codepage can be used as well as charset.alias on OS/2.
The codepage is to be used when no environment variables are set (and
setlocale is non-functional or has not been called). charset.alias is to be
used when the user has specified a locale through environment variables.
> There is no need to refuse to use another choice obstinately.
Your patch has the effect that when the user has set the environment variable
LC_ALL or LC_CTYPE or LANG to a value that contains no dot, then the program
will use the encoding from the codepage in the OS. This makes no sense,
because charset.alias is designed to handle this case.
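
To make the intended order of lookups concrete, here is a minimal sketch in C.
It is not the code in localcharset.c: the function names, the in-memory alias
table (a stand-in for charset.alias) and its entries, and the "ASCII" fallback
for unknown locale names are all assumptions made for illustration. The OS
codepage is passed in as a string so that the sketch stays platform-neutral.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for charset.alias: maps locale names that carry no
   codeset to an encoding.  The entries are illustrative only.  */
struct alias { const char *locale; const char *charset; };
static const struct alias alias_table[] = {
  { "C",     "ASCII" },
  { "POSIX", "ASCII" },
  { "ko_KR", "EUC-KR" }   /* example entry, not copied from a real file */
};

/* Return the first non-empty value among LC_ALL, LC_CTYPE, LANG, or NULL
   if none of them is set to a non-empty value.  */
static const char *
effective_locale (const char *lc_all, const char *lc_ctype, const char *lang)
{
  if (lc_all != NULL && lc_all[0] != '\0')
    return lc_all;
  if (lc_ctype != NULL && lc_ctype[0] != '\0')
    return lc_ctype;
  if (lang != NULL && lang[0] != '\0')
    return lang;
  return NULL;
}

/* Pick a charset from the three variable values and the OS codepage.
   Simplification: an "@modifier" suffix after the codeset is not stripped.  */
const char *
choose_charset (const char *lc_all, const char *lc_ctype, const char *lang,
                const char *os_codepage)
{
  const char *locale = effective_locale (lc_all, lc_ctype, lang);
  const char *dot;
  size_t i;

  assert (os_codepage != NULL);

  /* No variable set: only now may the operating system setting be used.  */
  if (locale == NULL)
    return os_codepage;

  /* The user named a codeset explicitly, as in "de_DE.ISO-8859-1".  */
  dot = strchr (locale, '.');
  if (dot != NULL && dot[1] != '\0')
    return dot + 1;

  /* Locale name without a dot: consult the alias table, not the codepage.  */
  for (i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++)
    if (strcmp (locale, alias_table[i].locale) == 0)
      return alias_table[i].charset;

  /* Unknown locale name: this fallback is an assumption of the sketch.  */
  return "ASCII";
}

/* Convenience wrapper that reads the real environment.  */
const char *
charset_from_environment (const char *os_codepage)
{
  return choose_charset (getenv ("LC_ALL"), getenv ("LC_CTYPE"),
                         getenv ("LANG"), os_codepage);
}
```

With LANG=ko_KR and nothing else set, this sketch returns the alias table's
entry and never looks at the codepage; the codepage is returned only when all
three variables are unset or empty.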
> Even more, WIN32_NATIVE implementation does not care about the
> environmental variables at all.
You're right, that may be a bug. gnulib now has a setlocale() for native
Win32 that looks at the environment variables and does the necessary
conversions between Unix locale names and Win32 locale names. It is quite
possible that locale_charset() and nl_langinfo(CODESET) should use the
charset that matches the locale set via setlocale().
But this applies only to platforms that have a full, working set of locales.
I think OS/2 (emx+gcc) is not in this camp.
Bruno
--
In memoriam Medgar Evers <http://en.wikipedia.org/wiki/Medgar_Evers>