Re: grep: 'binary files' where matches are text
From: eavis
Subject: Re: grep: 'binary files' where matches are text
Date: Fri, 7 Feb 2003 10:33:44 +0000
Stepan Kasal <address@hidden> wrote:
>>what really matters is
>>whether the _matches_ are binary, not the input files.
>
>I cannot fully agree. If an executable contains string
>
> "a few\nlines about penguins\nare here."
>
>and the grep finds the word ``penguins'' and happily prints
>
> lines about penguins
>
>the user could then be surprised when he opens the file in his pico
>editor.
If the user's editor is jumping straight to the line number found by grep,
there shouldn't be a problem. If the user is just loading the file, then
yes he could be surprised to see binary data. But why? Only because of
an assumption that grep's matches will be only in text files. There is no
particular reason, IMHO, why that assumption should be correct or why grep
should hold to it. After all, traditional Unix grep would happily scan
both text and binary files. So any users surprised by loading binary
files are surprised only because they have gotten used to a new GNU
behaviour, one which is not necessarily the best way.
My feeling is that grep's job is to look for and print matches in all
files specified, and it should try to stick to that. To avoid
unpleasantness on the user's terminal, grep can think twice before printing
binary garbage, but this necessary evil should interfere as little as
possible with the job of printing matches. So better to print as much as
possible, and only give the 'binary match found' message when it is really
necessary.
>OTOH, you are right that it's unpleasant when a file is treated as binary
>even though it in fact isn't.
>
>So I see no nice solution. Perhaps the
>``--binary-files=print_text_matches''
>is the best alternative.
Perhaps. We could think of more complex schemes like 'if all the matches
in a file are text, treat it as text, but if some contain binary
characters, treat it as binary'. But I don't think that would be
particularly helpful or worth the extra complexity. Printing all text
matches has the benefit of being simple to explain.
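
To make the 'print all text matches' idea concrete, here is a rough sketch in
Python rather than grep's C. It is only an illustration under assumptions of
mine: the names is_binary and grep_text_matches are made up, and 'binary' is
taken to mean 'contains a NUL byte or is not valid UTF-8', which is not
necessarily the test grep itself uses.

```python
import re


def is_binary(line: bytes) -> bool:
    """Assumed test: a line is binary if it has a NUL byte or is not valid UTF-8."""
    if b"\0" in line:
        return True
    try:
        line.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def grep_text_matches(pattern: bytes, data: bytes, name: str = "(input)") -> list:
    """Return every matching line that is text; matching lines that are
    binary are suppressed and reported once with a summary message."""
    regex = re.compile(pattern)
    output = []
    saw_binary_match = False
    for line in data.split(b"\n"):
        if not regex.search(line):
            continue  # binary data that does not match is simply ignored
        if is_binary(line):
            saw_binary_match = True  # do not send garbage to the terminal
        else:
            output.append(line.decode("utf-8"))
    if saw_binary_match:
        # Only fall back to the summary message when it is really necessary.
        output.append(f"Binary file {name} matches")
    return output
```

With the penguin example, an executable containing that string would give the
line 'lines about penguins', and the summary message would appear only if some
other matching line contained binary data.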
>But I don't know when I get to it. Are you willing to donate a patch?
Yes, I will make a patch, but I cannot promise any particular timescale.
Perhaps I will get around to it this weekend. My patch will not make the
new behaviour the default; I can only suggest that and leave the decision
to you.
--
Ed Avis <address@hidden>