Re: POSIX bind_textdomain_codeset(): some invalid codeset arguments

bug-gettext



From:	Harald van Dijk
Subject:	Re: POSIX bind_textdomain_codeset(): some invalid codeset arguments
Date:	Fri, 13 May 2022 09:05:25 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Thunderbird/100.0

On 12/05/2022 23:10, Steffen Nurpmeso wrote:

Harald van Dijk wrote in
  <bd336669-960b-1f5f-fffc-30905d4c8e82@gigawatt.nl>:
  |On 12/05/2022 18:19, Steffen Nurpmeso via austin-group-l at The Open
  |Group wrote:
  |> Bruno Haible wrote in
  |>   <4298913.vrqWZg68TM@omega>:
  |>|Steffen Nurpmeso wrote:
  |>|>  ...
  |>|>| [.] "UTF-7"."
  |>|>
  |>|> That is overshoot.
  |>|
  |>|No. UTF-7 is invalid here because it produces output that is not NUL
  |>|terminated. See:
  |>|
  |>|$ printf 'ab\0' | iconv -t UTF-7 | od -t c
  |>|0000000   a   b   +   A   A   A   -
  |>|0000007
  |>|
  |>|strlen() on such a return value makes invalid memory accesses.
  |>|You can convince yourself by running
  |>|$ OUTPUT_CHARSET=UTF-7 valgrind ls --help
  |>
  |> This is then surely bogus?  UTF-7 is a normal single byte
  |> character set and is to be terminated like anything else.  Nothing
  |> in RFC 2152 nor RFC 3501 if you want makes me think something
  |> else.
  |
  |RFC 2152's rules 1 and 3 only allow specifying the listed characters as
  |their ASCII form. All other characters, including U+0000, must be
  |encoded using rule 2. GNU iconv is doing what the RFC specifies here.
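To illustrate the rule 2 point, here is a minimal sketch. It uses Python's built-in "utf-7" codec as a stand-in for GNU iconv, which is an assumption made for convenience; the names `text`, `encoded` and `manual_nul` are purely illustrative. It encodes the same 'ab\0' input and also builds the rule 2 form of U+0000 by hand (UTF-16BE bytes, base64 without padding, wrapped in '+' and '-').

```python
# Sketch only: Python's "utf-7" codec is used here as a stand-in for
# GNU iconv (an assumption); it is not the code path iconv itself runs.
import base64

text = "ab\0"
encoded = text.encode("utf-7")

# RFC 2152 rule 2 applied by hand to U+0000: take the UTF-16BE bytes,
# base64-encode them, drop the '=' padding, and wrap in '+' ... '-'.
manual_nul = b"+" + base64.b64encode("\0".encode("utf-16-be")).rstrip(b"=") + b"-"

print(encoded)             # the full encoded byte string
print(manual_nul)          # the hand-built encoding of U+0000
print(b"\0" in encoded)    # whether any literal NUL byte is present
```

If the codec follows the RFC in the same way, the encoded bytes contain no literal NUL byte, which is why a caller that expects a NUL-terminated C string cannot rely on one being there.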

No really, please.  And please do not strip important content,

I didn't think I did. You didn't read the RFC properly, I replied to
show where and how the RFC specifies exactly what GNU iconv does, the
rest of your message looks like it's based on the false assumption that
the RFC specifies something other than what it does, which becomes
irrelevant when that assumption is corrected. Looking in more detail,
there is one thing I should have responded to. Included here.

UTF-7.  Heck, how about that, for example:

  LC_ALL=C printf 'ab\0' |  iconv -f iso-8859-1 -t utf-16 | od -t c
  0000000  \0  \0   a  \0   b  \0  \0  \0

Two leading NULs?

This is not what GNU iconv prints at all, at least not on my system,
which just uses the GNU version unmodified. Rather, it prints


0000000 377 376   a  \0   b  \0  \0  \0
0000010

That is, it includes a BOM, just like it showed in your SunOS output.
Both the GNU iconv that is shipped as part of GNU libc 2.35, and the GNU
iconv that is shipped as part of GNU libiconv 1.16, print this. Those
are the current releases. If you are testing an older release, or a
modified version, that is important information missing from your
message. If you are seeing the leading null bytes in a current version,
you may want to report this, including steps on how to get a GNU iconv
that behaves this way.
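For comparison, a minimal sketch of the same BOM behaviour. It uses Python's built-in codecs rather than GNU iconv, which is an assumption made so the example is self-contained; the names `with_bom` and `le_only` are illustrative. The generic "utf-16" name prepends a byte order mark in the platform's native byte order, while an explicit "utf-16-le" does not.

```python
# Sketch only: Python's codecs stand in for GNU iconv here (an assumption).
import codecs

text = "ab\0"

# Generic encoding name: a BOM in native byte order is prepended.
with_bom = text.encode("utf-16")

# Explicit byte order: no BOM is written.
le_only = text.encode("utf-16-le")

print(with_bom[:2] == codecs.BOM_UTF16)  # does the output start with the native BOM?
print(with_bom)
print(le_only)
```

On a little-endian machine the first two bytes are 0xFF 0xFE, which `od -t c` displays as the octal values 377 376 seen in the output above.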

i am neither Chinese nor Russian, and especially not one of the
other 7 billion that do not count.
(I said surely bogus because i alone see the shiny light of having
found give-me-five GNU iconv errors.  Or even beyond that.)

This makes absolutely zero sense. I am including it only to pre-empt you
again claiming I am stripping important content.


Cheers,
Harald van Dijk
