bug-gnulib

Re: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS


From: Chih-Hsuan Yen
Subject: Re: bug#32236: df header corrupted with LANG=zh_TW.UTF-8 on macOS
Date: Mon, 23 Jul 2018 00:09:45 +0800

2018-07-22 23:12 GMT+08:00 Paul Eggert <address@hidden>:
> Pádraig Brady wrote:
>>
>> I've also attached an alternative patch for df (in your name).
>
>
> That still has problems, since it can generate improperly-encoded strings in
> UTF-8 locales (if the inputs are improperly encoded), and can replace parts
> of multibyte characters with '?' in non-UTF-8 locales. Please try the
> attached patch instead, which attempts to address these issues. This is more
> along the lines that Bruno suggested, except it doesn't use mbsiter as I
> figured it was simpler overall just to use mbrtowc directly for this one
> thing.
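
For readers following along, here is a minimal sketch of the kind of loop
described above: decode with mbrtowc and emit '?' for anything that cannot
be shown safely. It is not the code from the attached patch. The name
sanitize_for_display is hypothetical, and treating iswcntrl characters as
the ones to replace is an assumption.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Copy SRC to a newly allocated string, replacing each undecodable byte
   and each control character with a single '?'.  Decoding follows the
   LC_CTYPE locale in effect, so the caller is expected to have called
   setlocale beforehand.  Returns NULL on allocation failure.
   Hypothetical helper for illustration, not the function from the patch.  */
char *
sanitize_for_display (const char *src)
{
  size_t len = strlen (src);
  char *dst = malloc (len + 1);   /* output is never longer than input */
  if (!dst)
    return NULL;

  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t i = 0, o = 0;

  while (i < len)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, src + i, len - i, &state);

      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or truncated sequence: emit one '?', skip one byte,
             and reset the state so decoding can resynchronize.  */
          dst[o++] = '?';
          i++;
          memset (&state, 0, sizeof state);
        }
      else
        {
          if (n == 0)   /* defensive: no NUL is expected before LEN */
            n = 1;
          if (iswcntrl ((wint_t) wc))
            dst[o++] = '?';           /* whole character becomes one '?' */
          else
            {
              memcpy (dst + o, src + i, n);   /* copy the character intact */
              o += n;
            }
          i += n;
        }
    }

  dst[o] = '\0';
  return dst;
}
```

Because a multibyte character is either copied whole or replaced whole, the
output never contains part of a character. Whether a given character such
as U+2028 counts as a control character depends on the platform's locale
data.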

Here's the result of df:

$ df
檔案系統        容量  已用  可用 已用 掛載點
/dev/disk1s1    234G  137G   95G  60% /
/dev/disk1s4    234G  2.1G   95G   3% /private/var/vm
chyen.cc:        25G   12G   12G  51% /private/tmp/abc def ghi

$ df | xxd
00000000: e6aa 94e6 a188 e7b3 bbe7 b5b1 2020 2020  ............
00000010: 2020 2020 e5ae b9e9 878f 2020 e5b7 b2e7      ......  ....
00000020: 94a8 2020 e58f afe7 94a8 20e5 b7b2 e794  ..  ...... .....
00000030: a820 e68e 9be8 bc89 e9bb 9e0a 2f64 6576  . ........../dev
00000040: 2f64 6973 6b31 7331 2020 2020 3233 3447  /disk1s1    234G
00000050: 2020 3133 3747 2020 2039 3547 2020 3630    137G   95G  60
00000060: 2520 2f0a 2f64 6576 2f64 6973 6b31 7334  % /./dev/disk1s4
00000070: 2020 2020 3233 3447 2020 322e 3147 2020      234G  2.1G
00000080: 2039 3547 2020 2033 2520 2f70 7269 7661   95G   3% /priva
00000090: 7465 2f76 6172 2f76 6d0a 6368 7965 6e2e  te/var/vm.chyen.
000000a0: 6363 3a20 2020 2020 2020 2032 3547 2020  cc:        25G
000000b0: 2031 3247 2020 2031 3247 2020 3531 2520   12G   12G  51%
000000c0: 2f70 7269 7661 7465 2f74 6d70 2f61 6263  /private/tmp/abc
000000d0: e280 a864 6566 e280 a967 6869 0a         ...def...ghi.

The Chinese header names are correct, and U+2028 and U+2029 in the last
mount point are written as-is (bytes e2 80 a8 and e2 80 a9 at offset
0xd0 in the dump). All tested with LANG=zh_TW.UTF-8 LC_COLLATE=C
LC_CTYPE=zh_TW.UTF-8.
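
The bytes in the dump can be checked by hand against the code points. A
small sketch of decoding a three-byte UTF-8 sequence follows; the helper
name decode_utf8_3 is hypothetical, and it does not reject overlong forms
or surrogates.

```c
/* Decode one three-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx).
   Returns the code point, or -1 if the bytes do not have that form.
   Hypothetical helper for illustration only.  */
long
decode_utf8_3 (const unsigned char *p)
{
  if ((p[0] & 0xf0) != 0xe0
      || (p[1] & 0xc0) != 0x80
      || (p[2] & 0xc0) != 0x80)
    return -1;

  /* 4 payload bits from the lead byte, 6 from each continuation byte.  */
  return ((long) (p[0] & 0x0f) << 12)
         | ((long) (p[1] & 0x3f) << 6)
         | (long) (p[2] & 0x3f);
}
```

Applied to e2 80 a8 this gives 0x2028, and to e2 80 a9 it gives 0x2029,
matching the two separators in the mount point name. The first three bytes
of the header, e6 aa 94, give 0x6a94.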


