bug-gettext

Re: POSIX bind_textdomain_codeset(): some invalid codeset arguments


From: Steffen Nurpmeso
Subject: Re: POSIX bind_textdomain_codeset(): some invalid codeset arguments
Date: Fri, 13 May 2022 00:10:42 +0200
User-agent: s-nail v14.9.24-239-gac783eda4b

Harald van Dijk wrote in
 <bd336669-960b-1f5f-fffc-30905d4c8e82@gigawatt.nl>:
 |On 12/05/2022 18:19, Steffen Nurpmeso via austin-group-l at The Open 
 |Group wrote:
 |> Bruno Haible wrote in
 |>   <4298913.vrqWZg68TM@omega>:
 |>|Steffen Nurpmeso wrote:
 |>|>  ...
 |>|>| [.] "UTF-7"."
 |>|>
 |>|> That is overshoot.
 |>|
 |>|No. UTF-7 is invalid here because it produces output that is not NUL
 |>|terminated. See:
 |>|
 |>|$ printf 'ab\0' | iconv -t UTF-7 | od -t c
 |>|0000000   a   b   +   A   A   A   -
 |>|0000007
 |>|
 |>|strlen() on such a return value makes invalid memory accesses.
 |>|You can convince yourself by running
 |>|$ OUTPUT_CHARSET=UTF-7 valgrind ls --help
 |> 
 |> This is then surely bogus?  UTF-7 is a normal single byte
 |> character set and is to be terminated like anything else.  Nothing
 |> in RFC 2152 nor RFC 3501 if you want makes me think something
 |> else.
 |
 |RFC 2152's rules 1 and 3 only allow specifying the listed characters as 
 |their ASCII form. All other characters, including U+0000, must be 
 |encoded using rule 2. GNU iconv is doing what the RFC specifies here.
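
For reference, the encoding behaviour under discussion can be reproduced
without GNU iconv. The following is a minimal sketch that assumes Python's
built-in "utf-7" codec follows the same RFC 2152 rules; the helper name
to_utf7 and the variable names are made up for illustration. It encodes the
same three characters as the printf example quoted above and checks whether
a raw NUL byte survives in the output.

```python
# Sketch only: this uses Python's built-in "utf-7" codec, not GNU iconv.
# to_utf7 is a hypothetical helper name chosen for this example.
def to_utf7(text: str) -> bytes:
    """Encode text as UTF-7 using the standard library codec."""
    return text.encode("utf-7")


sample = "ab\0"             # same input as: printf 'ab\0'
encoded = to_utf7(sample)

print(encoded)              # b'ab+AAA-'
print(b"\0" in encoded)     # False: no raw NUL byte in the encoded form

# Decoding restores the original string, including U+0000.
round_trip = encoded.decode("utf-7")
print(round_trip == sample)
```

If that assumption holds, U+0000 comes out as the shifted sequence "+AAA-",
so a caller that expects the converted string to end in a zero byte has to
append the terminator itself.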

No really, please.  And please do not strip important content,
I am neither Chinese nor Russian, and especially not one of the
other 7 billion who do not count.
(I said "surely bogus" because I alone see the shiny light of having
found give-me-five GNU iconv errors.  Or even beyond that.)

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


