bug#30326: grep not searching through a text file (thinking it binary)

From: Paul Jackson
Subject: bug#30326: grep not searching through a text file (thinking it binary)
Date: Mon, 05 Feb 2018 15:27:53 -0600

Paul Eggert wrote, in response to my suggestion to filter grep output,
not input, for "binary junk":

> We've done that already, if memory serves.

I don't think so :).

The installed grep on the system I'm typing on right now is "grep (GNU
grep) 3.0". I've not checked closely, but I believe that should be a fairly
recent grep.

I created a large file ("/tmp/pjbb") by concatenating:
1) a big plain ASCII file of C source code,
2) a small ELF executable, and
3) another big plain ASCII file of C source code.

Then I grep'd this big file for the string "address@hidden", which
appeared twice in the first file of C source code, and once
again in the second file of C source code.

Here's what I see:

$ grep --version | head -1
grep (GNU grep) 3.0

$ grep address@hidden /tmp/pjbb
* address@hidden
* address@hidden
Binary file /tmp/pjbb matches

$ grep -a address@hidden /tmp/pjbb
* address@hidden
* address@hidden
* address@hidden

By default, grep prints the first two "address@hidden" matches,
then, on first encountering the ELF binary, gives up and reports
only "Binary file /tmp/pjbb matches" in place of the third.

Using "grep -a" to ask grep to persist, it sees all
three "address@hidden" strings.
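
The difference between the two runs above can be illustrated with a toy
model in Python. This is only a sketch of the observed output, not GNU
grep's actual algorithm: the function name grep_lines, the text_mode
flag (standing in for "-a"), and the rule that a NUL byte is what marks
the data as binary are all assumptions made for illustration.

```python
def grep_lines(data: bytes, pattern: bytes, text_mode: bool = False,
               name: str = "input") -> list[str]:
    """Toy model of the output shown above; not grep's real logic."""
    out = []
    seen_binary = False
    for line in data.split(b"\n"):
        # Assumption: a NUL byte is what flags the data as binary.
        if b"\x00" in line:
            seen_binary = True
        if pattern in line:
            if seen_binary and not text_mode:
                # Replace any match after binary data with one notice
                # and stop, as in the default run above.
                out.append(f"Binary file {name} matches")
                break
            out.append(line.decode("ascii", errors="replace"))
    return out
```

With input laid out like /tmp/pjbb (text, binary, text), the default
call returns the first two matching lines followed by the "Binary file"
notice, while text_mode=True returns all three matching lines.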


My ancient home-brew hack, which provides ASCII-trimmed output
when scanning binary files for ASCII strings, contains custom
code to buffer the already-scanned input so that it can scan
backwards once it finds a match.

The usual line-oriented buffering doesn't work so well when
the input file might have no, or at least infrequent, line breaks.
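
The idea of trimming output to the ASCII text around a match, rather
than relying on line breaks, might be sketched in Python as follows.
This is not the home-brew tool itself: the names printable and
trimmed_matches are hypothetical, the pattern is assumed to be
printable ASCII, and the whole input is held in memory as one bytes
object instead of being buffered incrementally.

```python
def printable(b: int) -> bool:
    """True for printable ASCII bytes and tab; newline is excluded."""
    return 0x20 <= b <= 0x7E or b == 0x09


def trimmed_matches(data: bytes, pattern: bytes) -> list[str]:
    """Return the printable-ASCII run surrounding each match."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    results = []
    pos = data.find(pattern)
    while pos != -1:
        # Scan backwards from the match to the start of the ASCII run.
        start = pos
        while start > 0 and printable(data[start - 1]):
            start -= 1
        # Scan forwards from the end of the match to the end of the run.
        end = pos + len(pattern)
        while end < len(data) and printable(data[end]):
            end += 1
        results.append(data[start:end].decode("ascii"))
        # Resume after this run so each run is reported once.
        pos = data.find(pattern, end)
    return results
```

Because the run boundaries are any non-printable byte, not just
newlines, the output stays bounded even when the input has few or no
line breaks.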

                Paul Jackson
