Re: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS
From: Bruno Haible
Subject: Re: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS
Date: Sun, 22 Jul 2018 23:40:35 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-130-generic; KDE/5.18.0; x86_64; ; )
Pádraig Brady wrote:
> > This patch is correct (because the characters that you test for in c_iscntrl
> > are 0x00..0x1F, 0x7F, which don't occur as second or later byte in a multibyte
> > character in the EUC-JP, EUC-KR, GB2312, EUC-TW, GB18030, SJIS encodings).
>
> ... It might be worth mentioning this subtle point in the c_iscntrl() docs?
> "Note this identifies all single byte control chars even in multibyte
> encodings".
Only in the multibyte encodings that are currently in use. We never know what
kinds of features or misfeatures new multibyte encodings will come up with:
Before GB18030 was introduced, it was a common feature of all multibyte encodings
(including SJIS) that ASCII characters in the range 0x00..0x3F never occur as
second or later byte in a multibyte character. Well, GB18030 broke this assumption.
So, it is dangerous to rely on this property. Therefore I wouldn't like to
document it in the c_iscntrl() documentation.
Bruno
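
To make the byte ranges in this message concrete, here is a minimal sketch in C. It is not gnulib code: ascii_iscntrl is a hypothetical stand-in that only mirrors the range quoted above (0x00..0x1F, 0x7F), and the helper names and sample byte sequences are assumptions chosen for illustration. The GB18030 sample is a four-byte sequence whose second and fourth bytes are ASCII digits; the SJIS sample is a two-byte sequence whose second byte is 0x5C.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for c_iscntrl: true for 0x00..0x1F and 0x7F,
   the range named in the message above.  */
bool
ascii_iscntrl (unsigned char c)
{
  return c <= 0x1F || c == 0x7F;
}

/* Return true if any byte of BUF after the first lies in LO..HI.  */
bool
trail_byte_in_range (const unsigned char *buf, size_t len,
                     unsigned char lo, unsigned char hi)
{
  for (size_t i = 1; i < len; i++)
    if (lo <= buf[i] && buf[i] <= hi)
      return true;
  return false;
}

/* Return true if any byte of BUF after the first is an ASCII control
   byte according to ascii_iscntrl.  */
bool
trail_byte_is_cntrl (const unsigned char *buf, size_t len)
{
  for (size_t i = 1; i < len; i++)
    if (ascii_iscntrl (buf[i]))
      return true;
  return false;
}

/* A four-byte GB18030 sequence: bytes 2 and 4 are in 0x30..0x39.  */
const unsigned char gb18030_sample[] = { 0x81, 0x30, 0x81, 0x30 };

/* A two-byte SJIS sequence: the trailing byte 0x5C is in the ASCII
   range but above 0x3F.  */
const unsigned char sjis_sample[] = { 0x83, 0x5C };
```

With these samples, trail_byte_in_range (gb18030_sample, 4, 0x00, 0x3F) is true while the same call on sjis_sample with length 2 is false, which is the broken assumption described above. trail_byte_is_cntrl is false for both, which is why the control-character test still happens to work for the encodings listed; the sketch checks only these two samples and proves nothing about future encodings.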