bug#22838: New 'Binary file' detection considered harmful

From: Hans Pelleboer
Subject: bug#22838: New 'Binary file' detection considered harmful
Date: Tue, 1 Mar 2016 05:01:55 +0100
On 03/01/2016 12:55 AM, Eric Blake wrote:
> I _think_ the Austin Group is leaning towards requiring the "C" locale to always be a unibyte locale with all 256 bytes as valid characters, so neither strict 7-bit ASCII nor UTF-8 would be usable as the "C" locale; but for that to happen, POSIX would also need to allow a way to get a UTF-8 locale easily accessible and
You do realize that this leaves all _non-US_ users, who rely on diacritics, or even on entirely different character sets, for their language, completely out in the cold.

> describe how it differs from the "C" locale under such a ruling. But it's still all conjecture on what the final results will be - even in the standards committee, gracefully documenting how locale corner cases must behave vs. leaving implementations some latitude is tricky business; and any such change is at least 3 or 4 years down the road before it could be standardized in Issue 8 (right now, the focus is on Technical Corrigendum 2 for Issue 7).
Back in _1987_, an IT professor in Leiden was already appointed specifically to streamline the competing character sets that were later merged into Unicode. Given the current state of affairs, nearly thirty years down the road, I do not share your optimism that this issue will be resolved in the next couple of years.

Hans Pelleboer

