bug-gawk

Re: [bug-gawk] Weird printf '%c' behaviour under non-C locale


From: Aharon Robbins
Subject: Re: [bug-gawk] Weird printf '%c' behaviour under non-C locale
Date: Fri, 27 Apr 2012 11:23:53 +0300
User-agent: Heirloom mailx 12.4 7/29/08

Hi.

> Date: Mon, 23 Apr 2012 14:09:27 +0200
> To: address@hidden
> From: Jeroen Schot <address@hidden>
> Subject: [bug-gawk] Weird printf '%c' behaviour under non-C locale
>
> Hello,
>
> A Debian user reported a bug [1] about gawk's printf '%c' behaviour,
> which has changed in 4.0. It is no longer possible to write single
> bytes when using a multibyte locale.
>
> From what I understand the new behaviour is in line with POSIX, but I
> believe there should still be a way to write single bytes (other than
> changing the locale of the entire program).
>
> I am interested in your opinion on this.
>
> [1]: http://bugs.debian.org/669714
>
> Kind regards,
>
> Jeroen Schot

Thanks for the report. I reviewed the test case you pointed to. I would
not call it a bug, per se, but it is a corner case. ("Damned if you do
and damned if you don't.")

As we discussed offline, I think the -b option is the correct answer
here. I have updated the doc to reflect that -b also affects output.
I tested it: if the poster changes his script to

        #! /usr/bin/gawk -bf
        ....

then the correct result is produced, even in a UTF-8 locale.
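
A minimal sketch for checking the effect of -b by hand, assuming gawk 4.0
or later is installed as "gawk". The helper name c_bytes and the code 233
are illustrative only and are not taken from the original report.

```shell
#! /bin/sh
# Hypothetical helper: show, as hex, the bytes that gawk's printf "%c"
# writes for the number 233. Any arguments (such as -b) are passed
# through to gawk ahead of the program text.
c_bytes() {
    gawk "$@" 'BEGIN { printf "%c", 233 }' | od -An -tx1 | tr -d ' \n'
}

# Usage (run in a UTF-8 locale to see the difference):
#   c_bytes       # multibyte encoding of the character
#   c_bytes -b    # the single byte
```

In a UTF-8 locale, "c_bytes" should show the two-byte UTF-8 encoding (c3a9),
while "c_bytes -b" should show the single byte e9.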

Another option, portable across awks and other Unix systems, would be
to rework the script as a shell script:

        #! /bin/sh
        LC_ALL=C
        export LC_ALL   # override all inherited locale settings
        awk '....' "$@"

Hmmm. Might need a little more shell scripting to get the -v options before
the program, but you get the idea.
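
One possible way to do that extra scripting, as a sketch only. The function
name run_awk is hypothetical, and it handles only the separated form
"-v var=value" at the start of the argument list; the attached form
"-vvar=value" and other awk options are not handled.

```shell
#! /bin/sh
LC_ALL=C
export LC_ALL   # override all inherited locale settings

# Hypothetical helper: run_awk PROGRAM [-v var=value]... [file]...
# Rebuilds the argument list so that leading "-v var=value" pairs come
# before the program text and everything else comes after it.
run_awk() {
    prog=$1
    shift
    n=$#            # number of original arguments still to process
    seen_prog=no
    while [ "$n" -gt 0 ]; do
        if [ "$seen_prog" = no ] && [ "$1" = -v ] && [ "$n" -ge 2 ]; then
            # Move the "-v var=value" pair to the end, unchanged.
            set -- "$@" "$1" "$2"
            shift 2
            n=$((n - 2))
            continue
        fi
        if [ "$seen_prog" = no ]; then
            # First non-option argument: the program goes before it.
            set -- "$@" "$prog"
            seen_prog=yes
        fi
        set -- "$@" "$1"
        shift
        n=$((n - 1))
    done
    # No file arguments at all: the program goes last.
    [ "$seen_prog" = yes ] || set -- "$@" "$prog"
    awk "$@"
}

# Usage, as the last line of the wrapper script:
#   run_awk '....' "$@"
```

The loop rotates each original argument to the end of the list exactly once,
which avoids needing arrays and keeps the script within POSIX sh.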

Thanks!

Arnold
