From: Eric Blake
Subject: Re: sed issue? [was: Subject: [GNU Autoconf 2.67] testsuite: 233 failed]
Date: Tue, 21 Sep 2010 15:50:27 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100907 Fedora/3.1.3-1.fc13 Mnenhy/0.8.3 Thunderbird/3.1.3
On 09/21/2010 01:49 AM, Paolo Bonzini wrote:
> On 09/21/2010 02:37 AM, Eric Blake wrote:
>> Maybe the sed script in file.sed is non-portable? It's certainly more complex than the normal run-of-the-mill sed script. Or maybe it is that the regex '.' has problems matching non-characters, and the definition of the various locales determine whether 8-bit bytes are characters or not. Is there any portable way to guarantee a single-byte locale where '.' matches all possible 8-bit bytes?
>>
>> More testing shows that 'LC_ALL=en_US.ISO8859-1 sed' on Darwin gives the desired results, so the problem is definitely a matter of whether the C locale treats all 256 byte values as potential matches to '.'.
>
> I think that's a (pretty serious) Darwin bug.
The bug is limited to GNU sed, which happened to be first in PATH on the machine where I reproduced the problem (and I'm guessing that the same thing happened to rochan):
$ printf '\200\n' | LC_ALL=C /usr/bin/sed -n /./p | wc -l
1
$ printf '\200\n' | LC_ALL=C sed -n /./p | wc -l
0
$ which sed
/usr/local/bin/sed
$ sed --version | head -n1
GNU sed version 4.2

It's nice that the system sed is immune, and I wonder what GNU sed is getting tripped up on? Maybe the autoconf fix is a matter of doing a best-tool search for a sed that handles 8-bit bytes, which would reject this broken GNU sed build in favor of the system sed, even with its other limitations?
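A minimal sketch of what such a best-tool search might look like in plain shell, reusing the one-line test from the transcript above. The function names `sed_handles_8bit` and `find_8bit_sed` and the extra candidate name `gsed` are assumptions made for illustration; this is not what autoconf does, only one possible shape for the check.

```shell
# Hypothetical helper: succeed if the sed given as $1 lets '.' match the
# byte 0x80 in the C locale (the same check that was run by hand above).
sed_handles_8bit () {
  count=`printf '\200\n' | LC_ALL=C "$1" -n '/./p' 2>/dev/null | wc -l | tr -d ' '`
  test "$count" = 1
}

# Hypothetical search: print the first candidate on PATH that passes the
# check, and fail if none does.  The candidate names are an assumption.
find_8bit_sed () {
  save_IFS=$IFS
  IFS=:
  for dir in $PATH; do
    IFS=$save_IFS
    test -n "$dir" || dir=.          # an empty PATH entry means the current directory
    for name in sed gsed; do
      if test -x "$dir/$name" && sed_handles_8bit "$dir/$name"; then
        echo "$dir/$name"
        return 0
      fi
    done
  done
  IFS=$save_IFS
  return 1
}
```

On a machine like the one above, `find_8bit_sed` would be expected to skip /usr/local/bin/sed and keep walking PATH until it reaches a sed that prints the line, though that depends on the PATH order and on which seds are installed.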
--
Eric Blake   address@hidden   +1-801-349-2682
Libvirt virtualization library http://libvirt.org