bug-gawk

Re: [bug-gawk] [External] Re: Invalid Characters Causing Problems in awk


From: Eli Zaretskii
Subject: Re: [bug-gawk] [External] Re: Invalid Characters Causing Problems in awk 4.0.2
Date: Fri, 24 Aug 2018 12:33:24 +0300

> From: Wolfgang Laun <address@hidden>
> Date: Fri, 24 Aug 2018 08:28:07 +0200
> Cc: "address@hidden" <address@hidden>
> 
> File diacrit.txt contains all the 20 non-ASCII characters you need for
> Spanish in one line (including \n) with UTF-8 encoding:
> 
> ¡¿ªºÁáÉéÍíÑñÓóÚúÜüÇç
> 
> $ wc -c diacrit.txt 
> 41 diacrit.txt
> $ wc -m diacrit.txt 
> 21 diacrit.txt

Of course, this only works if the file is encoded in the same encoding
as specified by the current locale.  Because 'wc' doesn't detect the
encoding, it assumes the locale's codeset.

E.g., try the same in a locale whose codeset is ISO 8859-1, while the
file is still UTF-8 encoded.
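
To illustrate the point without depending on which locales happen to be
installed, here is a minimal Python sketch. It is not how 'wc' is
implemented; it only assumes that 'wc -m' interprets the bytes according
to the locale's codeset, and it imitates that by decoding the same bytes
as UTF-8 and as ISO 8859-1 (Python's "latin-1" codec). The variable names
are made up for the illustration.

```python
import unicodedata

# The 20 characters from the quoted message plus the trailing newline.
# NFC normalization guards against the literal having been pasted in
# decomposed form (base letter + combining accent).
text = unicodedata.normalize("NFC", "¡¿ªºÁáÉéÍíÑñÓóÚúÜüÇç\n")

# The file's contents on disk, assuming it is UTF-8 encoded.
data = text.encode("utf-8")

# Bytes in the file: what `wc -c` reports, independent of locale.
byte_count = len(data)

# Characters when the bytes are interpreted as UTF-8
# (stand-in for `wc -m` in a UTF-8 locale).
chars_as_utf8 = len(data.decode("utf-8"))

# Characters when the same bytes are interpreted as ISO 8859-1,
# where every byte is one character
# (stand-in for `wc -m` in an ISO 8859-1 locale).
chars_as_latin1 = len(data.decode("latin-1"))

print(byte_count, chars_as_utf8, chars_as_latin1)  # prints: 41 21 41
```

Each of the 20 characters takes two bytes in UTF-8, so the byte count is
41 with the newline. Interpreted as ISO 8859-1, those same 41 bytes count
as 41 characters, so the character count no longer matches the 21 seen in
a UTF-8 locale.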


