[bug-gettext] fewer open() calls done by gettext()


From: Bruno Haible
Subject: [bug-gettext] fewer open() calls done by gettext()
Date: Tue, 17 Jan 2012 00:17:26 +0100
User-agent: KMail/4.7.4 (Linux/3.1.0-1.2-desktop; KDE/4.7.4; x86_64; ; )

Hi Ulrich,

Since the beginning, gettext()'s lookup of message catalogs has
searched the paths
  $LOCALEDIR/$ll_$CC.$CHARSET/LC_MESSAGES/$domain.mo
  $LOCALEDIR/$ll_$CC/LC_MESSAGES/$domain.mo
  $LOCALEDIR/$ll.$CHARSET/LC_MESSAGES/$domain.mo
  $LOCALEDIR/$ll/LC_MESSAGES/$domain.mo
if the locale is specified as $ll_$CC.$CHARSET.
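
To make the $ll, $CC and $CHARSET placeholders concrete, here is a minimal sketch of how such a locale name could be split into its parts. It is only an illustration: struct locale_parts and split_locale() are hypothetical names invented for this example, not the code in intl/l10nflist.c, and any @modifier is simply dropped.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type for this sketch; not part of glibc or gettext.  */
struct locale_parts
{
  char language[16];   /* $ll, e.g. "fr" */
  char territory[16];  /* $CC, e.g. "FR"; empty if absent */
  char charset[32];    /* $CHARSET, e.g. "UTF-8"; empty if absent */
};

/* Split NAME of the form ll[_CC][.CHARSET][@modifier] into its parts.
   The modifier is dropped to keep the sketch short.  */
void
split_locale (const char *name, struct locale_parts *p)
{
  size_t end = strcspn (name, "@");   /* stop before any modifier */
  size_t dot = strcspn (name, ".");   /* start of ".CHARSET", if any */
  size_t us = strcspn (name, "_");    /* start of "_CC", if any */

  if (dot > end)
    dot = end;
  if (us > dot)
    us = dot;

  snprintf (p->language, sizeof p->language, "%.*s", (int) us, name);
  snprintf (p->territory, sizeof p->territory, "%.*s",
            us < dot ? (int) (dot - us - 1) : 0,
            us < dot ? name + us + 1 : "");
  snprintf (p->charset, sizeof p->charset, "%.*s",
            dot < end ? (int) (end - dot - 1) : 0,
            dot < end ? name + dot + 1 : "");
}
```

Calling split_locale ("fr_FR.UTF-8", &p) fills p with the language "fr", the territory "FR" and the charset "UTF-8"; for a bare "fr" the territory and charset stay empty.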

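In a typical program (attached below), this leads to 6 open() calls,
because each path containing $CHARSET is also tried with the charset in
normalized form (utf8 for UTF-8). The .mo file is usually found only
at the last of these 6 calls: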

$ strace ./prog 2>&1 | grep ^open | grep prog.mo
open("/tmp/./fr_FR.UTF-8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr_FR.utf8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr_FR/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr.UTF-8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr.utf8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

I suggest reducing this to 2 calls:

$ strace ./prog 2>&1 | grep ^open | grep prog.mo
open("/tmp/./fr_FR/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
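
The difference between the current and the suggested lookup can be sketched as a function that lists the candidate file names in the order they are tried. This is a rough illustration, not the code in intl/l10nflist.c: list_candidates(), normalize_charset(), PATH_LEN and MAX_CANDIDATES are hypothetical names, the normalization is a simplification chosen to match the utf8 spelling in the strace output, and the sketch assumes that both the territory and the charset are present.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define PATH_LEN 256
#define MAX_CANDIDATES 6

/* Simplified normalization (an assumption of this sketch): keep only
   alphanumeric characters, lower-cased, so "UTF-8" becomes "utf8".  */
void
normalize_charset (const char *in, char *out, size_t outsize)
{
  size_t j = 0;

  if (outsize == 0)
    return;
  for (size_t i = 0; in[i] != '\0' && j + 1 < outsize; i++)
    if (isalnum ((unsigned char) in[i]))
      out[j++] = (char) tolower ((unsigned char) in[i]);
  out[j] = '\0';
}

/* Fill OUT with the candidate .mo paths in lookup order and return how
   many there are.  OUT must have room for MAX_CANDIDATES entries.
   USE_CHARSET != 0 mimics the current behaviour (charset variants are
   tried first); USE_CHARSET == 0 mimics the suggested behaviour.  */
size_t
list_candidates (const char *dir, const char *ll, const char *cc,
                 const char *charset, const char *domain,
                 int use_charset, char out[][PATH_LEN])
{
  char norm[32];
  char base[2][64];          /* "ll_CC", then "ll" */
  const char *suffix[3];     /* charset spellings to append, "" = none */
  size_t nsuffix = 0;
  size_t n = 0;

  normalize_charset (charset, norm, sizeof norm);
  snprintf (base[0], sizeof base[0], "%s_%s", ll, cc);
  snprintf (base[1], sizeof base[1], "%s", ll);

  if (use_charset)
    {
      suffix[nsuffix++] = charset;
      if (strcmp (norm, charset) != 0)
        suffix[nsuffix++] = norm;
    }
  suffix[nsuffix++] = "";

  for (size_t b = 0; b < 2; b++)
    for (size_t s = 0; s < nsuffix; s++)
      {
        snprintf (out[n], PATH_LEN, "%s/%s%s%s/LC_MESSAGES/%s.mo",
                  dir, base[b], suffix[s][0] != '\0' ? "." : "",
                  suffix[s], domain);
        n++;
      }
  return n;
}
```

With dir "/tmp/.", ll "fr", cc "FR", charset "UTF-8" and domain "prog", the function yields the six names of the first trace when use_charset is nonzero, and only the fr_FR and fr names of the second trace when it is zero.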

Rationale:

The use case for storing different .mo files in
  fr/LC_MESSAGES/prog.mo and fr.UTF-8/LC_MESSAGES/prog.mo
or
  fr/LC_MESSAGES/prog.mo and fr.ISO-8859-1/LC_MESSAGES/prog.mo
or
  fr.UTF-8/LC_MESSAGES/prog.mo and fr.ISO-8859-1/LC_MESSAGES/prog.mo
is that translators might want to use different kinds of characters
(quotation marks, for example), i.e. have one PO file for the UTF-8
locale and a different PO file for the more restricted character set.
Another is that Japanese translators did not trust the conversion between
JISX character sets and Unicode and therefore wanted to maintain a
separate PO file for EUC-JP.

But
  1. Translators never did this.
  2. In the future, translators will need it even less than in the past.
     Nowadays most PO files (even Japanese ones) are submitted in UTF-8
     encoding, and most users are in UTF-8 locales. It will therefore
     no longer make sense to have a PO file specialized for a
     non-Unicode locale charset.

Do you think this optimization is worth doing?

If this is OK with you, I can prepare the patch to intl/l10nflist.c
(taking care, of course, not to modify the behaviour of locale/findlocale.c).

Bruno


How to reproduce:
$ gcc -Wall prog.c -o prog
$ strace ./prog 2>&1 | grep ^open | grep prog.mo

============================== prog.c ================================
#include <libintl.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

int main ()
{
  int n = 2;

  setenv ("LC_ALL", "fr_FR.UTF-8", 1);
  if (setlocale (LC_ALL, "") == NULL)
    /* Couldn't set locale.  */
    exit (77);

  textdomain ("prog");
  bindtextdomain ("prog", ".");

  printf (gettext ("'Your command, please?', asked the waiter."));
  printf ("\n");

  printf (ngettext ("a piece of cake", "%d pieces of cake", n), n);
  printf ("\n");

  printf (gettext ("%s is replaced by %s."), "FF", "EUR");
  printf ("\n");

  exit (0);
}
======================================================================



