bug#23234: unexpected results with charset handling in GNU grep 2.23

From: Bjoern Jacke
Subject: bug#23234: unexpected results with charset handling in GNU grep 2.23
Date: Thu, 7 Apr 2016 01:04:04 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 07.04.2016 00:33, Eric Blake wrote:
> That behavior complies with POSIX requirements.

Can you give a quote here? One thing that is not POSIX-compliant is
that the diagnostic message is written to stdout.
http://pubs.opengroup.org/onlinepubs/9699919799/ says:

    Determine the locale that should be used to affect the format and
    contents of diagnostic messages written to standard error.

which implies that diagnostic messages should be written to standard
error.

> Again, a script SHOULD
> NOT be grepping binary files (POSIX only defines grep on text files)
> without knowing the ramifications.  Meanwhile, 'grep -a' guarantees you
> won't get the "Binary file" message.

If you consider grepping text files with mixed encodings an invalid use
of grep, then you should not return 0 and/or print "Binary file
(standard input) matches" on stdout. That makes the output of GNU grep
look like a valid match.

You say "grep -a" is the answer for all the users who want to grep log
files (which tend to contain mixed encodings). Sure, -a is a workaround
that makes GNU grep work as before. Realistically, 99.99% of users will
not know that, though, because this is, I guess, the first grep version
ever that requires it. Also, -a is a GNU-only option, so portable
scripts will not be able to use it.
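
To make "mixed encodings" concrete, here is a minimal Python sketch that
reports which lines of a log are not valid UTF-8. It only approximates
the kind of input that is affected in a UTF-8 locale; it is not how grep
itself decides, and the function name and the sample log are made up for
illustration.

```python
def invalid_utf8_lines(data: bytes):
    """Return the 1-based numbers of lines whose bytes are not valid UTF-8."""
    bad = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(lineno)
    return bad


# Hypothetical log: line 1 is UTF-8, line 2 is Latin-1 (0xE9 is "é" there).
log = "caf\u00e9 ok\n".encode("utf-8") + b"caf\xe9 error\n"
print(invalid_utf8_lines(log))  # prints [2]
```

A single line like the second one is enough to make the file as a whole
no longer clean UTF-8 text, which is the typical situation with log files
written by several programs.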

I guess you are aware that you will break a lot of existing scripts by
treating mixed-encoding input files as binary, the way GNU grep >= 2.23
does now?

