bug-sed

bug#34316: sed misbehavior on BRE's


From: Lange, Markus
Subject: bug#34316: sed misbehavior on BRE's
Date: Wed, 6 Feb 2019 07:54:59 +0000

Hi,

thanks for your response.
Even though the sed version is quite old, processing PICA-formatted
files (a format common in library-oriented environments) is probably
uncommon, so this could be a problem not seen elsewhere.

Running the original command

  sed -n 's/^.*004K...\([0-9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*$/\2 \1\3/p'

with LC_ALL=C on the new system resolves the problem.

On the old system, LC_* and LANG are not set at all (which should
default to C, if I'm not mistaken). On the new machine, LANG and
LC_CTYPE are set to en_US.UTF-8, which I wrongly assumed would behave
like C.
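
A minimal sketch of the likely mechanism, not taken from the PICA data:
the test line 'a\377b' and the replacement text "matched" are made up
for illustration, and the second command assumes the en_US.UTF-8 locale
is installed. In the C locale every byte counts as one character, so
'.' matches the stray byte 0377. Under a UTF-8 locale that byte is an
invalid sequence, and GNU sed's '.' may refuse to match it.

```shell
# One line with a byte (octal 377) that is not valid UTF-8,
# placed between 'a' and 'b'. Illustrative input only.
# C locale: each byte is a character, so '.' matches the stray byte.
out=$(printf 'a\377b\n' | LC_ALL=C sed -n 's/^a.b$/matched/p')
echo "C locale:     ${out:-no match}"

# UTF-8 locale (assumed to be installed): the byte is an invalid
# multibyte sequence, so '.' may not match it and nothing is printed.
out_utf8=$(printf 'a\377b\n' | LC_ALL=en_US.UTF-8 sed -n 's/^a.b$/matched/p')
echo "UTF-8 locale: ${out_utf8:-no match}"
```

Setting LC_ALL=C only on the sed invocation, as above, keeps the rest
of the environment in its UTF-8 locale.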

I will check soon whether the behavior still occurs with the current
sed version and report back afterwards.

Thank you for your help and best regards
Markus Lange
-- 
***Lesen. Hören. Wissen. Deutsche Nationalbibliothek***

Deutsche Nationalbibliothek               
Fachbereich IT, Informationsinfrastruktur
Adickesallee 1
60322 Frankfurt am Main
Tel: +49 69 1525 -1786
mailto:address@hidden
http://www.dnb.de

On Tue, 2019-02-05 at 16:12 -0700, Assaf Gordon wrote:
> tags 34316 moreinfo
> stop
> 
> Hello,
> 
> On 2019-02-04 6:42 a.m., Lange, Markus wrote:
> > I'm currently migrating processes from an old SuSE 9 Linux to an
> > new
> > CentOS 7 Linux and observed some unexpected behavior changes on
> > sed.
> 
> [...]
> > old:~ # sed --version
> > GNU sed version 4.0.6
> 
> [...]
> > new:~ # sed --version
> > sed (GNU sed) 4.2.2
> 
> Please note that sed 4.2.2 is also very old (7 years old).
> The latest sed is version 4.7, released in December 2018.
> 
> There's a limited amount of support we can offer for sed-4.2.2.
> 
> 
> Before digging further, I notice that the file you're dealing with
> has non-ascii characters in it, evident by some of the example text
> you pasted (and also in the attached file):
> 
> > 9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*/\2 \1\3/p'
> > Fehlerpica.dat
> > 138742c156c1445f8bdc3a7845548c00 9783507435339020F
> > a19.04.03�208@
> > a30-01-19bc
> > 18290030a02544e6a451538b0e44f9e2 9783507435377020F
> > a19.04.03�208@
> > a30-01-19bc
> > 4c7ff6d790b34470852434f5ee41200b 9783034312189020F
> > a12.12.11�208@
> > a30-01-19bc
> 
> And such characters can cause unexpected results, depending on the
> active locale.
> 
> Can you please re-run the tests on the new machine with the same
> locale as the old machine, and again with LC_ALL=C (forcing C/POSIX
> locale), to ensure that locale and invalid characters are not the
> problem ?
> 
> Also, even if you're 'stuck' with sed-4.2.2, can you try with
> sed-4.7 (perhaps compiled from source code), to see if this is an
> existing problem, or perhaps it was resolved in the meantime?
> 
> 
> regards,
>   - assaf
> 
> 
