bug-gettext

POSIX gettext(): messages catalog lookup when LANGUAGE is not set


From: Bruno Haible
Subject: POSIX gettext(): messages catalog lookup when LANGUAGE is not set
Date: Thu, 12 May 2022 01:19:30 +0200

https://posix.rhansen.org/p/gettext_draft
Lines 335, 344

  "For portable applications, only the LANGUAGE search supports searches
   across multiple locale names."
  "For the LANGUAGE search, ... if a locale name has the format
   language[_territory][.codeset][@modifier], additional searches of locale
   names without .codeset (if present), without _territory (if present),
   and without @modifier (if present) may be performed; if .codeset is not
   present, additional searches of locale names with an added .codeset may
   be performed. For the single-locale search, the localename part is the
   name of the current locale, or the locale specified in an *_l() function
   call, for the category named by categoryname."

As explained in my mails from 2021-05-04 and 2022-01-16, it is important to
support people who live in communities which often (but not always) have
translations of their own but who can also read translations for other locales.
At the same time, it is important to allow a translator for, say, German
to produce a translation that is useful for users in Germany, Austria, and
Switzerland if no other (more specific) translation is available.

So, while the user may be working in either of the locales
  de_DE.UTF-8
  de_AT.UTF-8
  de_CH.UTF-8
they SHOULD see the translations that have been installed at
  dirname/de/LC_MESSAGES/textdomainname.mo

This is true also if the LANGUAGE environment variable has not been set.
Most operating systems set the LANG or LC_ALL environment variable for the
user, but do not set LANGUAGE.

In this situation, the current text mandates(!) that for a user in the
de_DE.UTF-8 locale
  - dirname/de/LC_MESSAGES/textdomainname.mo is always ignored, and
  - dirname/de_DE.UTF-8/LC_MESSAGES/textdomainname.mo is used - but this
    messages object file almost never exists.
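
To make the difference concrete, here is a minimal Python sketch of the two
lookups. It is only an illustration: the function names and the order in which
the less specific names are tried are my assumptions, and it is not the code of
GNU gettext or of any other implementation. With fallback=False it models the
single-locale search as drafted; with fallback=True it additionally drops
.codeset, _territory and @modifier.

```python
import os
import re

# language[_territory][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[^_.@]+)"
    r"(?:_(?P<territory>[^.@]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def candidate_locale_names(name):
    """Return `name` followed by less specific variants (hypothetical order)."""
    m = _LOCALE_RE.match(name)
    if m is None:
        return [name]
    candidates = []
    # Drop the codeset first, then the territory, then the modifier.
    for keep_modifier in (True, False):
        for keep_territory in (True, False):
            for keep_codeset in (True, False):
                cand = m["language"]
                if keep_territory and m["territory"]:
                    cand += "_" + m["territory"]
                if keep_codeset and m["codeset"]:
                    cand += "." + m["codeset"]
                if keep_modifier and m["modifier"]:
                    cand += "@" + m["modifier"]
                if cand not in candidates:  # absent parts yield duplicates
                    candidates.append(cand)
    return candidates


def find_catalog(dirname, locale_name, domain, fallback=True):
    """Return the first existing messages object file, or None."""
    names = candidate_locale_names(locale_name) if fallback else [locale_name]
    for name in names:
        path = os.path.join(dirname, name, "LC_MESSAGES", domain + ".mo")
        if os.path.exists(path):
            return path
    return None
```

With only dirname/de/LC_MESSAGES/textdomainname.mo installed, a call for
de_DE.UTF-8 with fallback=False finds nothing, while the same call with
fallback=True finds the file under de.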

This is NOT how GNU gettext behaves. If POSIX standardizes it like this,
GNU libc and GNU gettext will have the choice between
  (a) looking in different (and fewer) directories than they do today,
      causing major i18n dysfunctionality for users, until the users
      have set up lots of symbolic links between directories or set
      LANGUAGE to a (redundant) value, or
  (b) violating POSIX in this point.

I will vote for (b).

If the above text were adopted, it would have the following consequences:

  1) Users will have to set LANGUAGE.

       LANG=de_DE.UTF-8

     will not be sufficient; instead the user will have to set

       LANG=de_DE.UTF-8
       LANGUAGE=de

For those users who don't do this:

  2) Many symbolic links are needed in /usr/share/locale/. Solaris 11.4
     is a system that implements gettext() as described in the above text,
     and it has the links shown below [1].

  3) Users who want to create a new locale (e.g. for English in Australia)
     will have to create a symlink
     /usr/share/locale/en_AU -> /usr/share/locale/en
     and so on for each custom locale.

  4) Users who install packages in non-privileged directories (for GNU
     programs, that's the --prefix=PREFIX option) will have to create the
     same number of symbolic links in their PREFIX/share/locale/ directory.
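
To give a feeling for the bookkeeping in 2) to 4), the following Python sketch
computes which symbolic links would be needed under an exact-match-only lookup.
The function names are hypothetical and the fallback order is an assumption; it
merely lists, for each locale name without a directory of its own, the installed
directory that a fallback search would have reached.

```python
def fallbacks(name):
    """Return `name` followed by less specific variants (assumed order)."""
    base, _, modifier = name.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    out = []
    for mod in ([modifier, ""] if modifier else [""]):
        for terr in ([territory, ""] if territory else [""]):
            for cs in ([codeset, ""] if codeset else [""]):
                cand = language
                if terr:
                    cand += "_" + terr
                if cs:
                    cand += "." + cs
                if mod:
                    cand += "@" + mod
                out.append(cand)
    return out


def symlinks_needed(installed_dirs, locale_names):
    """Map each locale name lacking a directory to the directory to link to."""
    links = {}
    for name in locale_names:
        if name in installed_dirs:
            continue  # a real directory exists, no link needed
        for cand in fallbacks(name)[1:]:
            if cand in installed_dirs:
                links[name] = cand
                break
    return links


if __name__ == "__main__":
    installed = {"de", "pt", "pt_BR"}
    wanted = ["de_DE", "de_DE.ISO8859-1", "de_DE.UTF-8", "de.UTF-8",
              "pt_BR.UTF-8"]
    for link, target in symlinks_needed(installed, wanted).items():
        print(link, "->", target)
```

Every locale name a user may select needs its own entry, and the whole set has
to be repeated for every PREFIX/share/locale/ directory.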

This is BAD, BAD, BAD.

Suggestion:
In line 344, make the
   "if a locale name has the format language[_territory][.codeset][@modifier],
    additional searches of locale names without .codeset (if present), without
    _territory (if present), and without @modifier (if present) may be
    performed; if .codeset is not present, additional searches of locale
    names with an added .codeset may be performed."
text apply also to the single-locale case.
In line 335, remove the sentence "only the LANGUAGE search supports searches
across multiple locale names."

Bruno

[1]
$ ls -l /usr/share/locale
total 102
drwxr-xr-x   3 root     other          3 Oct 13  2018 C
drwxr-xr-x   3 root     other          4 Oct 13  2018 de
lrwxrwxrwx   1 root     root           2 Oct 13  2018 de_DE -> de
lrwxrwxrwx   1 root     root           2 Oct 13  2018 de_DE.ISO8859-1 -> de
lrwxrwxrwx   1 root     root           2 Oct 13  2018 de_DE.ISO8859-15 -> de
lrwxrwxrwx   1 root     root           2 Oct 13  2018 de_DE.UTF-8 -> de
lrwxrwxrwx   1 root     root           2 Oct 13  2018 de.ISO8859-15 -> de
drwxr-xr-x   3 root     other          3 Oct 13  2018 de.us-ascii
lrwxrwxrwx   1 root     root           2 Oct 13  2018 de.UTF-8 -> de
drwxr-xr-x   3 root     other          3 Oct 13  2018 en
drwxr-xr-x   3 root     other          3 Oct 13  2018 en_US
drwxr-xr-x   3 root     other          3 Oct 13  2018 en@boldquot
drwxr-xr-x   3 root     other          3 Oct 13  2018 en@quot
drwxr-xr-x   3 root     other          3 Oct 13  2018 en@shaw
drwxr-xr-x   3 root     other          4 Oct 13  2018 es
drwxr-xr-x   3 root     other          3 Oct 13  2018 es_ES
lrwxrwxrwx   1 root     root           2 Oct 13  2018 es_ES.ISO8859-1 -> es
lrwxrwxrwx   1 root     root           2 Oct 13  2018 es_ES.ISO8859-15 -> es
lrwxrwxrwx   1 root     root           2 Oct 13  2018 es_ES.UTF-8 -> es
lrwxrwxrwx   1 root     root           2 Oct 13  2018 es.ISO8859-15 -> es
lrwxrwxrwx   1 root     root           2 Oct 13  2018 es.UTF-8 -> es
drwxr-xr-x   3 root     other          4 Oct 13  2018 fr
lrwxrwxrwx   1 root     root           2 Oct 13  2018 fr_FR -> fr
lrwxrwxrwx   1 root     root           2 Oct 13  2018 fr_FR.ISO8859-1 -> fr
lrwxrwxrwx   1 root     root           2 Oct 13  2018 fr_FR.ISO8859-15 -> fr
lrwxrwxrwx   1 root     root           2 Oct 13  2018 fr_FR.UTF-8 -> fr
lrwxrwxrwx   1 root     root           2 Oct 13  2018 fr.ISO8859-15 -> fr
lrwxrwxrwx   1 root     root           2 Oct 13  2018 fr.UTF-8 -> fr
drwxr-xr-x   3 root     other          4 Oct 13  2018 it
lrwxrwxrwx   1 root     root           2 Oct 13  2018 it_IT -> it
lrwxrwxrwx   1 root     root           2 Oct 13  2018 it_IT.ISO8859-1 -> it
lrwxrwxrwx   1 root     root           2 Oct 13  2018 it_IT.ISO8859-15 -> it
lrwxrwxrwx   1 root     root           2 Oct 13  2018 it_IT.UTF-8 -> it
lrwxrwxrwx   1 root     root           2 Oct 13  2018 it.ISO8859-15 -> it
lrwxrwxrwx   1 root     root           2 Oct 13  2018 it.UTF-8 -> it
drwxr-xr-x   3 root     other          4 Oct 13  2018 ja
lrwxrwxrwx   1 root     root           2 Oct 13  2018 ja_JP.eucJP -> ja
lrwxrwxrwx   1 root     root           2 Oct 13  2018 ja_JP.PCK -> ja
lrwxrwxrwx   1 root     root           2 Oct 13  2018 ja_JP.UTF-8 -> ja
drwxr-xr-x   3 root     other          4 Oct 13  2018 ko
lrwxrwxrwx   1 root     root           2 Oct 13  2018 ko_KR.EUC -> ko
lrwxrwxrwx   1 root     root           2 Oct 13  2018 ko_KR.UTF-8 -> ko
lrwxrwxrwx   1 root     root           2 Oct 13  2018 ko.UTF-8 -> ko
drwxr-xr-x   3 root     other          4 Oct 13  2018 pt
drwxr-xr-x   3 root     other          4 Oct 13  2018 pt_BR
lrwxrwxrwx   1 root     root           5 Oct 13  2018 pt_BR.ISO8859-1 -> pt_BR
drwxr-xr-x   3 root     other          3 Oct 13  2018 pt_BR.us-ascii
lrwxrwxrwx   1 root     root           5 Oct 13  2018 pt_BR.UTF-8 -> pt_BR
lrwxrwxrwx   1 root     root           2 Oct 13  2018 pt.ISO8859-15 -> pt
drwxr-xr-x   3 root     other          3 Oct 13  2018 pt.us-ascii
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh -> zh_CN
drwxr-xr-x   3 root     other          4 Oct 13  2018 zh_CN
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh_CN.EUC -> zh_CN
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh_CN.GB18030 -> zh_CN
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh_CN.GBK -> zh_CN
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh_CN.UTF-8 -> zh_CN
drwxr-xr-x   3 root     other          4 Oct 13  2018 zh_TW
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh_TW.BIG5 -> zh_TW
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh_TW.EUC -> zh_TW
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh_TW.UTF-8 -> zh_TW
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh.GBK -> zh_CN
lrwxrwxrwx   1 root     root           5 Oct 13  2018 zh.UTF-8 -> zh_CN





