bug-gettext

POSIX msgfmt and universal-character-name escape sequences


From: Bruno Haible
Subject: POSIX msgfmt and universal-character-name escape sequences
Date: Thu, 23 Jun 2022 08:01:27 +0200

https://posix.rhansen.org/p/gettext_draft
Line 1031

"except that universal-character-name escape sequences need not be supported."

Neither GNU msgfmt nor Solaris msgfmt treats universal-character-name
escape sequences specially. If an msgstr contains e.g. "\\u20AC", the
resulting string in the .mo file is
{ '\\', 'u', '2', '0', 'A', 'C', '\0' }.
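
To illustrate the behaviour described above, here is a minimal Python sketch
of an unescaper that knows only the simple C escapes and gives \u no special
meaning. It is not the code of GNU or Solaris msgfmt. The names unescape_po
and SIMPLE_ESCAPES are hypothetical, octal and hex escapes are left out, and
raising an error on an unrecognized escape is an assumption of the sketch.

```python
# Hypothetical table of the simple C escapes this sketch understands.
SIMPLE_ESCAPES = {
    "\\": "\\", '"': '"', "n": "\n", "t": "\t", "r": "\r",
    "a": "\a", "b": "\b", "f": "\f", "v": "\v",
}


def unescape_po(text: str) -> str:
    """Decode the escapes in the body of one PO string literal (sketch)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(text):
            raise ValueError("dangling backslash at end of string")
        nxt = text[i + 1]
        if nxt not in SIMPLE_ESCAPES:
            # Assumption: anything else, including \u, is rejected here.
            raise ValueError(f"unsupported escape: \\{nxt}")
        out.append(SIMPLE_ESCAPES[nxt])
        i += 2
    return "".join(out)


# The msgstr body as written in the PO file: \ \ u 2 0 A C
po_literal = r"\\u20AC"
decoded = unescape_po(po_literal)

# Strings in the .mo file are NUL-terminated.
mo_bytes = decoded.encode("utf-8") + b"\0"
print(list(decoded))
```

The escaped backslash becomes a single backslash and the characters
u, 2, 0, A, C follow unchanged, so no euro sign is produced.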

Issue: Leaving it undefined whether \u escape sequences are recognized can
lead to mutual incompatibility of msgfmt implementations: Implementations
would differ in their interpretation of the dot-po file.

There is no good reason for leaving it undefined: There is already a
mechanism for specifying an encoding (charset=... in the header), and the
UTF-8 encoding has been in widespread use for more than 10 years.
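
As a small illustration of the charset mechanism, the Python sketch below
takes a sample header line and encodes a euro sign written directly in the
msgstr with the declared charset. The header value and the way the charset
is extracted are assumptions made for the example, not a description of how
any msgfmt parses the header.

```python
# Sample header line, assumed for illustration.
header = "Content-Type: text/plain; charset=UTF-8\n"

# Naive extraction of the charset name (sketch only).
charset = header.split("charset=")[1].strip()

# The euro sign, as a translator would type it directly into the msgstr.
# (The \u escape here is Python's own, used only to write this example.)
euro = "\u20ac"

encoded = euro.encode(charset)
print(charset, encoded)
```

With UTF-8 declared in the header, the character is stored as its three-byte
UTF-8 encoding, so no escape sequence is needed to express it.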

Suggestion: Change
"except that universal-character-name escape sequences need not be supported."
to
"except that universal-character-name escape sequences are not supported."





