qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] [PATCH v2] vl: set LC_CTYPE early in main() for all code


From: Daniel P . Berrangé
Subject: [Qemu-devel] [PATCH v2] vl: set LC_CTYPE early in main() for all code
Date: Mon, 15 Apr 2019 15:15:47 +0100

Localization is not a feature whose impact is limited to the UI
frontends. Other parts of QEMU rely in localization. In particular the
USB MTP driver needs to be able to convert filenames from the locale
specified character set into UTF-16 / UCS-2 encoded wide characters.
setlocale() is only set from two of the UI frontends though, and worse,
there is inconsistent behaviour with GTK setting LC_CTYPE to C.UTF-8,
while ncurses honours whatever is set in the user's environment.

This causes the USP MTP driver to behave differently depending on which
UI frontend is activated.

Furthermore, the curses settings are dangerous because LC_CTYPE will affect
the is{upper,lower,alnum} functions which much QEMU code assumes to have
C locale sorting behaviour. This also breaks QMP if the env requests a
non-UTF-8 locale, since QMP is defined to use UTF-8 encoding for JSON.
This problematic curses code was introduced in:

  commit 2f8b7cd587558944532f587abb5203ce54badba9
  Author: Samuel Thibault <address@hidden>
  Date:   Mon Mar 11 14:51:27 2019 +0100

    curses: add option to specify VGA font encoding

This patch moves the GTK frontend setlocale() handling into the main()
method. This ensures QMP and other QEMU code has a predictable C.UTF-8.

Eventually QEMU should set LC_ALL, honouring the full user environment,
but this needs various cleanups in QEMU code first. Hardcoding LC_CTYPE
to C.UTF-8 is a partial regression vs the above curses commit, since it
will break the curses wide character handling for non-UTF-8 locales but
this is unavoidable until QEMU is cleaned up to cope with non-UTF-8
locales fully.

Setting of LC_MESSAGES is left in the GTK code since only the GTK
frontend is using translation of strings. This lets us avoid the
platform portability problem where LC_MESSAGES is not provided by
locale.h on MinGW. GTK pulls it in indirectly from libintl.h via
gi18n.h header, but we don't want to pull that into the global
QEMU namespace.

Signed-off-by: Daniel P. Berrangé <address@hidden>
---

Changed in v2:

 - Leave LC_MESSAGES setting in gtk code to avoid platform
   portability problems.

 ui/curses.c |  2 --
 ui/gtk.c    | 32 +++++++++++---------------------
 vl.c        | 40 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+), 23 deletions(-)

diff --git a/ui/curses.c b/ui/curses.c
index cc6d6da684..403cd57913 100644
--- a/ui/curses.c
+++ b/ui/curses.c
@@ -27,7 +27,6 @@
 #include <sys/ioctl.h>
 #include <termios.h>
 #endif
-#include <locale.h>
 #include <wchar.h>
 #include <langinfo.h>
 #include <iconv.h>
@@ -716,7 +715,6 @@ static void curses_display_init(DisplayState *ds, 
DisplayOptions *opts)
     }
 #endif
 
-    setlocale(LC_CTYPE, "");
     if (opts->u.curses.charset) {
         font_charset = opts->u.curses.charset;
     }
diff --git a/ui/gtk.c b/ui/gtk.c
index e96e15435a..e6a41c79ab 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -2208,12 +2208,12 @@ static void gtk_display_init(DisplayState *ds, 
DisplayOptions *opts)
 
     s->free_scale = FALSE;
 
-    /* Mostly LC_MESSAGES only. See early_gtk_display_init() for details. For
-     * LC_CTYPE, we need to make sure that non-ASCII characters are considered
-     * printable, but without changing any of the character classes to make
-     * sure that we don't accidentally break implicit assumptions.  */
+    /*
+     * See comment in main() for why it has only setup LC_CTYPE,
+     * as opposed to LC_ALL. Given that we need to enable
+     * LC_MESSAGES too for menu translations.
+     */
     setlocale(LC_MESSAGES, "");
-    setlocale(LC_CTYPE, "C.UTF-8");
     bindtextdomain("qemu", CONFIG_QEMU_LOCALEDIR);
     textdomain("qemu");
 
@@ -2262,22 +2262,12 @@ static void gtk_display_init(DisplayState *ds, 
DisplayOptions *opts)
 
 static void early_gtk_display_init(DisplayOptions *opts)
 {
-    /* The QEMU code relies on the assumption that it's always run in
-     * the C locale. Therefore it is not prepared to deal with
-     * operations that produce different results depending on the
-     * locale, such as printf's formatting of decimal numbers, and
-     * possibly others.
-     *
-     * Since GTK+ calls setlocale() by default -importing the locale
-     * settings from the environment- we must prevent it from doing so
-     * using gtk_disable_setlocale().
-     *
-     * QEMU's GTK+ UI, however, _does_ have translations for some of
-     * the menu items. As a trade-off between a functionally correct
-     * QEMU and a fully internationalized UI we support importing
-     * LC_MESSAGES from the environment (see the setlocale() call
-     * earlier in this file). This allows us to display translated
-     * messages leaving everything else untouched.
+    /*
+     * GTK calls setlocale() by default, importing the locale
+     * settings from the environment. QEMU's main() method will
+     * have set LC_MESSAGES and LC_CTYPE to allow GTK to display
+     * translated messages, including use of wide characters. We
+     * must not allow GTK to change other aspects of the locale.
      */
     gtk_disable_setlocale();
     gtkinit = gtk_init_check(NULL, NULL);
diff --git a/vl.c b/vl.c
index c696ad2a13..87a55cddb3 100644
--- a/vl.c
+++ b/vl.c
@@ -49,6 +49,7 @@ int main(int argc, char **argv)
 #define main qemu_main
 #endif /* CONFIG_COCOA */
 
+#include <locale.h>
 
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
@@ -3022,6 +3023,45 @@ int main(int argc, char **argv, char **envp)
     char *dir, **dirs;
     BlockdevOptionsQueue bdo_queue = QSIMPLEQ_HEAD_INITIALIZER(bdo_queue);
 
+    /*
+     * Ideally we would set LC_ALL, but QEMU currently isn't able to cope
+     * with arbitrary localization settings. In particular there are two
+     * known problems
+     *
+     *   - The QMP monitor needs to use the C locale rules for numeric
+     *     formatting. This would need a double/int -> string formatter
+     *     that is locale independant.
+     *
+     *   - The QMP monitor needs to encode all data as UTF-8. This needs
+     *     to be updated to use iconv(3) to explicitly convert the current
+     *     locale's charset into utf-8
+     *
+     *   - Lots of codes uses is{upper,lower,alnum,...} functions, expecting
+     *     C locale sorting behaviour. Most QEMU usage should likely be
+     *     changed to g_ascii_is{upper,lower,alnum...} to match code
+     *     assumptions, without being broken by locale settnigs.
+     *
+     * We do still have two requirements
+     *
+     *   - Ability to correct display translated text according to the
+     *     user's locale
+     *
+     *   - Ability to handle multibyte characters, ideally according to
+     *     user's locale specified character set. This affects ability
+     *     of usb-mtp to correctly convert filenames to UCS16 and curses
+     *     & GTK frontends wide character display.
+     *
+     * The second requirement would need LC_CTYPE to be honoured, but
+     * this conflicts with the 2nd & 3rd problems listed earlier. For
+     * now we make a tradeoff, trying to set an explicit UTF-8 localee
+     *
+     * Note we can't set LC_MESSAGES here, since mingw doesn't define
+     * this constant in locale.h Fortunately we only need it for the
+     * GTK frontend and that uses gi18n.h which pulls in a definition
+     * of LC_MESSAGES.
+     */
+    setlocale(LC_CTYPE, "C.UTF-8");
+
     module_call_init(MODULE_INIT_TRACE);
 
     qemu_init_cpu_list();
-- 
2.20.1




reply via email to

[Prev in Thread] Current Thread [Next in Thread]