bug-gettext

Re: POSIX bind_textdomain_codeset(): some invalid codeset arguments


From: Harald van Dijk
Subject: Re: POSIX bind_textdomain_codeset(): some invalid codeset arguments
Date: Thu, 12 May 2022 19:55:39 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Thunderbird/100.0

On 12/05/2022 18:19, Steffen Nurpmeso via austin-group-l at The Open Group wrote:
> Bruno Haible wrote in
>   <4298913.vrqWZg68TM@omega>:
>   |Steffen Nurpmeso wrote:
>   |>  ...
>   |>| [.] "UTF-7"."
>   |>
>   |> That is overshoot.
>   |
>   |No. UTF-7 is invalid here because it produces output that is not NUL
>   |terminated. See:
>   |
>   |$ printf 'ab\0' | iconv -t UTF-7 | od -t c
>   |0000000   a   b   +   A   A   A   -
>   |0000007
>   |
>   |strlen() on such a return value makes invalid memory accesses.
>   |You can convince yourself by running
>   |$ OUTPUT_CHARSET=UTF-7 valgrind ls --help
>
> This is then surely bogus?  UTF-7 is a normal single byte
> character set and is to be terminated like anything else.  Nothing
> in RFC 2152 nor RFC 3501 if you want makes me think something
> else.

Rules 1 and 3 of RFC 2152 allow only the characters they list to be written directly in their ASCII form. All other characters, including U+0000, must be encoded using rule 2, so the NUL in the example above comes out as "+AAA-" rather than as a zero byte. GNU iconv is doing what the RFC specifies here.
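
To make the rule 2 behaviour concrete, here is a minimal Python sketch of a UTF-7 encoder. It is an illustration only, not GNU iconv's implementation: the names DIRECT and utf7_encode are hypothetical, the optional Set O characters are always base64-encoded, and every shifted run is closed with an explicit "-". Under those assumptions it reproduces the bytes shown in the od output quoted above.

```python
import base64

# Characters an encoder may emit directly as ASCII: Set D (RFC 2152 rule 1)
# plus space, tab, CR and LF (rule 3). U+0000 is in neither group.
# Assumption of this sketch: the optional Set O is never emitted directly.
DIRECT = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
    "'(),-./:? \t\r\n"
)


def utf7_encode(text: str) -> bytes:
    """Hypothetical helper: encode text as UTF-7 following rules 1 to 3."""
    out = bytearray()
    run = ""  # characters waiting to be base64-encoded under rule 2

    def flush() -> None:
        nonlocal run
        if run:
            # Rule 2: '+', then base64 of the UTF-16BE code units without
            # '=' padding, then '-' to end the shifted sequence.
            b64 = base64.b64encode(run.encode("utf-16-be")).rstrip(b"=")
            out.extend(b"+" + b64 + b"-")
            run = ""

    for ch in text:
        if ch in DIRECT:
            flush()
            out.extend(ch.encode("ascii"))
        elif ch == "+":
            flush()
            out.extend(b"+-")  # special case for a literal '+'
        else:
            run += ch
    flush()
    return bytes(out)


encoded = utf7_encode("ab\0")
print(encoded)           # b'ab+AAA-'
print(b"\0" in encoded)  # False: the output contains no zero byte
```

Because the encoded NUL is three ASCII letters between "+" and "-", a caller that expects the converted string to end in a zero byte will not find one in the converter's output.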

Cheers,
Harald van Dijk


