bug-coreutils

bug#24924: GNU pr only working with singlebyte 1-width characters


From: Stephane Chazelas
Subject: bug#24924: GNU pr only working with singlebyte 1-width characters
Date: Thu, 1 Dec 2016 06:32:22 +0000
User-agent: Mutt/1.5.21 (2010-09-15)

2016-11-30 18:37:05 -0800, Paul Eggert:
> On 11/30/2016 03:30 AM, Stephane Chazelas wrote:
> >That can also be seen as a POSIX conformance bug
> 
> Not really, as POSIX does not require support for UTF-8 (except in
> the pax utility, which is not part of coreutils).
[...]

POSIX does not require support for any charset. It only
specifies one locale (C/POSIX), and doesn't specify the charset in
that locale other than that it should be a single-byte charset that
covers the portable character set. Examples of such charsets are
ASCII, iso8859-x or EBCDIC. In practice, that tends to be ASCII
(except for some rare EBCDIC-based IBM systems) as tha

But it does support a localisation API and allows systems to
support other locales with other charsets. That API does support
multi-byte encodings, including stateful ones (though how they
are /defined/ is implementation defined for locking-shift ones and
in practice those are unworkable so I'd expect those would
eventually be removed from the standard). It doesn't require
compliant systems to have locales with multi-byte character sets,
but if they do have them (if they show up in the output of locale -a),
then they have to be supported throughout (as specified, for all
the utilities for instance).
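
To illustrate what "supported throughout" implies for a
column-oriented utility, here is a minimal Python sketch (not how pr
is implemented) that compares the three quantities such a tool has to
keep apart: bytes, characters and display columns. The helper names
display_width() and measure() are hypothetical, and the width rule
(2 columns for East Asian wide/fullwidth characters, 0 for combining
marks, 1 otherwise) is only an assumed approximation of what
wcwidth() would report in a UTF-8 locale.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns `text` occupies.

    Assumption: wide/fullwidth characters take 2 columns, combining
    marks take 0, everything else takes 1. A real implementation
    would ask the locale (e.g. wcwidth() in C).
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks overlay the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def measure(text, encoding="utf-8"):
    """Return (bytes, characters, columns) for `text` in `encoding`."""
    return len(text.encode(encoding)), len(text), display_width(text)


if __name__ == "__main__":
    # ASCII, a precomposed accented letter, two CJK characters,
    # and a base letter followed by a combining acute accent.
    for sample in ["abc", "\u00e9", "\u65e5\u672c", "e\u0301"]:
        nbytes, nchars, ncols = measure(sample)
        print(f"{sample!r}: {nbytes} bytes, {nchars} chars, {ncols} columns")
```

The three numbers only agree for the ASCII sample; a tool that
pads or truncates columns by counting bytes gets the other cases
wrong.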

Basically, on systems that have locales with multi-byte
encodings --UTF-8 or other-- (most Unix-like ones including GNU
systems like Debian), GNU pr (and many other GNU utilities) is
not POSIX compliant.

See
http://pubs.opengroup.org/onlinepubs/9699919799.2016edition/basedefs/V1_chap06.html

for details.

-- 
Stephane




