Re: LC_COLLATE in the C locale
From: Bruno Haible
Subject: Re: LC_COLLATE in the C locale
Date: Wed, 18 Dec 2019 11:29:46 +0100
User-agent: KMail/5.1.3 (Linux/4.4.0-166-generic; KDE/5.18.0; x86_64; ; )

Hi Paul,
> I do have a qualm in that coreutils (and I assume others) interpret
> !hard_locale (LC_COLLATE) as meaning that the locale is unibyte and uses
> native byte comparison.
Isn't this warranted by section "LC_COLLATE Category in the POSIX Locale" in
<https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html> ?
> As I recall on some platforms (macOS maybe?), the C locale uses
> UTF-8 so this interpretation isn't correct.
UTF-8 has the nice property that byte-by-byte comparison and
code-point-by-code-point comparison are equivalent. If the encoding were not
UTF-8 but, e.g., GB18030, I would agree that there is a problem. But there is
no C locale with GB18030 encoding on any platform.
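
To illustrate the UTF-8 property, here is a minimal sketch in C. It is not
taken from coreutils or gnulib; the helper names (utf8_decode, cmp_codepoints,
cmp_bytes, check_same_order) are made up for this example, and the decoder
assumes well-formed UTF-8 input without validating it. It compares two strings
once with strcmp (which compares bytes as unsigned char) and once code point
by code point, so that the two orderings can be checked against each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one UTF-8 sequence starting at s, store the code point in *cp,
   and return the number of bytes consumed.
   Assumption: the input is well-formed UTF-8; no validation is done.  */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    if (s[0] < 0x80) {                       /* 1 byte: U+0000..U+007F */
        *cp = s[0];
        return 1;
    }
    if (s[0] < 0xE0) {                       /* 2 bytes: U+0080..U+07FF */
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
        return 2;
    }
    if (s[0] < 0xF0) {                       /* 3 bytes: U+0800..U+FFFF */
        *cp = ((uint32_t)(s[0] & 0x0F) << 12)
              | ((uint32_t)(s[1] & 0x3F) << 6)
              | (uint32_t)(s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(s[0] & 0x07) << 18)    /* 4 bytes: U+10000..U+10FFFF */
          | ((uint32_t)(s[1] & 0x3F) << 12)
          | ((uint32_t)(s[2] & 0x3F) << 6)
          | (uint32_t)(s[3] & 0x3F);
    return 4;
}

/* Compare two UTF-8 strings code point by code point.
   Returns -1, 0, or 1.  */
static int cmp_codepoints(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    while (*p != 0 && *q != 0) {
        uint32_t ca, cb;
        p += utf8_decode(p, &ca);
        q += utf8_decode(q, &cb);
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    /* One string is a prefix of the other, or both ended.  */
    return (*p != 0) - (*q != 0);
}

/* Compare two strings byte by byte, normalized to -1, 0, or 1.  */
static int cmp_bytes(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* Abort if the two orderings disagree for this pair of strings.  */
static void check_same_order(const char *a, const char *b)
{
    assert(cmp_bytes(a, b) == cmp_codepoints(a, b));
}
```

For example, check_same_order("\xEF\xBF\xBD", "\xF0\x9F\x98\x80") compares
U+FFFD with U+1F600 and passes, since the lead byte 0xEF sorts before 0xF0
just as the code points do.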
Bruno