Re: Grep --include does not work
Wed, 13 Jun 2007 03:06:25 -0600
> Each grep.exe as you are aware contains the word 'help'. I have also
> deliberately modified ./GnuWin32Grep/TestGrep.txt to have a line
> containing the word help. All other text files do not have this.
Seems like a good test case.
> When I run grep in GnuWin32 v2.5.1a without the --include but with the
> '.' as you recommended like this:
> grep -R -P help .
I did not recommend that. I recommended this:
grep -R --include="*.txt" "include" .
> I have got this:
> [...all of the binaries match...]
That is as expected. Right?
> Now if I run it with
> grep -R --include="*.txt" -P help .
> I have got this:
> ./GnuWin32Grep/TestGrep.txt: Hello with Tab+space help Perl
That is also as expected, right? You said that you modified that file
to contain the word help. It matches. None of the other files match
the '*.txt' pattern and so were not considered as part of the grep.
So far all looks as expected. Agreed?
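A minimal sketch of the two runs above, assuming a POSIX shell and a grep that supports --include (GNU grep does). The directory and file names are invented for illustration, and -P is left out because it is not needed to match a plain word.

```shell
# Build a throwaway tree: one .txt file and one non-.txt file, both containing "help".
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'a line with help in it\n' > "$demo/sub/notes.txt"
printf 'help is also here\n' > "$demo/sub/data.bin"
cd "$demo" || exit 1

# Without --include, every file under "." is examined.
all_matches=$(grep -R -l help . | sort)

# With --include, only files whose names match the pattern are examined.
txt_matches=$(grep -R -l --include="*.txt" help .)

printf 'all files:\n%s\n' "$all_matches"
printf 'with --include:\n%s\n' "$txt_matches"
```

Only the second list is restricted to the .txt file, which is the behavior the --help text describes.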
> I am not disputing the fact that with or without --include a much
> larger dataset (file list) is generated. The fact is that all I care
> in the output are those in the text file. Moreover, the --help message
> for --include says this:
> --include=PATTERN files that match PATTERN will be examined
> So my reading tells me that it will generate a long list of files but
> only those that matches this pattern (in my case *.txt) will be
> examined. So without --include, everything will be examined and the
> result seems to agree with this interpretation.
> If I replace the end '.' with '*' I get the same result.
Huh? This is not what you reported previously. Previously you said
that grep ignored the --include option and searched all files in the
directory tree and printed all matches. Note that with '*' the result
depends upon which files are in the current directory, because the '*'
is expanded into those names when the command is run. This is probably
causing confusion.
> So I am confused. Ideally, I should be able to specify this:
> grep -R -P help *.txt
Nope. That won't do what you are wanting it to do. "Ideally" has
different ideals depending upon what operating model is desired. The
above is not ideal for a Unix-like operating model. If it is a MS
native program with a MS native paradigm then sure. But on a
Unix-like system it is the shell's job to expand wildcards like
'*.txt' (aka file globs). Since grep is a Unix program I expect it to
behave Unix-like. Something different could behave MS-like but then
it would be something different.
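To illustrate why the '*.txt' form does not recurse on a Unix-like system, here is a small sketch. It assumes a POSIX shell and a grep with -R and --include; the file names are made up.

```shell
# One .txt file at the top level and one in a subdirectory.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'help at the top\n' > "$demo/top.txt"
printf 'help further down\n' > "$demo/sub/deep.txt"
cd "$demo" || exit 1

# The shell expands *.txt before grep starts, so grep receives only "top.txt".
# -R has nothing to descend into because top.txt is a regular file.
glob_result=$(grep -R -l help *.txt)

# Letting grep walk the tree and filter by name finds both files.
include_result=$(grep -R -l --include="*.txt" help . | sort)

printf 'glob form:\n%s\n' "$glob_result"
printf '--include form:\n%s\n' "$include_result"
```

The glob form never sees sub/deep.txt, because grep is only handed the names that matched in the current directory.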
> To a Windows program, this is the most logical specification. But it
> does not work as it produces no results.
Grep was developed on the Unix system in 1973 and behaves as expected
on a Unix system. MS-Windows had not been invented yet. Grep is not
doing badly for a 34-year-old paradigm!
> Do you have a good tutorial site devoted for recursive search?
The find manual is a good place to start. The find command is the
tool used to "find files". On a GNU system the documentation will be
This is actually a common FAQ that also shows up for other commands.
Here is one that talks about 'rm' but applies to your question as
well. (Full disclosure: I wrote that FAQ entry.)
In summary the '*.txt' is expanded by the shell into a list of
matching files. If no files match then a literal *.txt is passed to
the command. But all it takes is one .txt file in the directory and
then the command shell will replace the *.txt with the file names of
all matching files.
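The expansion described above can be observed directly. This is a sketch assuming a POSIX sh or bash with default options (bash's nullglob/failglob, or zsh, would behave differently); show_args is a hypothetical helper defined here only to print what a command receives.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Hypothetical helper: print the argument count followed by each argument.
show_args() {
    printf '%d:' "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

# No .txt files exist yet, so the pattern is passed through unchanged.
before=$(show_args *.txt)

touch a.txt b.txt

# Now the shell replaces the pattern with the matching file names.
after=$(show_args *.txt)

# Quoting suppresses expansion, so the command sees the literal pattern.
quoted=$(show_args '*.txt')

printf '%s\n%s\n%s\n' "$before" "$after" "$quoted"
```

This is why --include="*.txt" is quoted: the pattern has to reach grep intact rather than be replaced by the shell.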
The separation of file wildcard matching into two parts, the shell and
the application program, means that all programs get the same wildcard
matching. It is outside of any application. It is part of the
programmable command shell. All applications then behave uniformly.
On MS the command line shell is not the same command line shell as I
am describing. File globs (e.g. '*') are not natively expanded by the
command.com shell (nor by cmd.exe). There are limitations on the
length of the arguments passed to the application. Native MS commands
behave differently. (More like CP/M!) The commands there must do the
file wildcard expansion themselves. Therefore ports of GNU command
line applications to MS must adapt in some way.
I don't know the methods that GnuWin32 uses but I assume they have a
library that is linked with the application that simulates the Unix
behavior. GNU grep does not natively expand file globs and so that
must be added to port this to MS. This mapping of functionality can
be pretty good but it is still a mapping and there are always corner
cases where things don't work the same as on a GNU/Unix system.
> I have even used the Windows XP "for /R" command and then get grep to
> process file by file in the for command like this:
> for /R . %A in (*.txt) do type %A |grep -P help
Sure. On a Unix or GNU system this is very similar to the following.
The 'find' command is used to find files.
find . -name '*.txt' -print0 | xargs -r0 grep help
Here I am using zero-terminated strings (the -print0 and -0 options)
so that arbitrary filenames (with spaces, newlines, etc.) are handled.
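A small sketch of that pipeline run against names containing spaces, assuming GNU findutils (for -print0, -0 and -r) and made-up file names. The second form uses find's own -exec ... {} + batching, which is one alternative when xargs -0 is not available.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/dir with space"
printf 'help me\n' > "$demo/dir with space/read me.txt"
printf 'help me\n' > "$demo/other.log"
cd "$demo" || exit 1

# NUL-terminated names survive spaces and newlines in paths.
# -r tells xargs not to run grep at all when find produces no names.
via_xargs=$(find . -name '*.txt' -print0 | xargs -r0 grep -l help)

# Alternative: let find batch the file names onto grep's command line itself.
via_exec=$(find . -name '*.txt' -exec grep -l help {} +)

printf 'xargs: %s\n' "$via_xargs"
printf 'exec:  %s\n' "$via_exec"
```

Both forms report only the .txt file, and other.log is never handed to grep.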