bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] grep: -r no follows symlinks


From: Jim Meyering
Subject: Re: [PATCH] grep: -r no follows symlinks
Date: Sat, 10 Mar 2012 23:19:35 +0100

Paul Eggert wrote:
> Here's a proposed change to GNU grep so that 'grep -r'
> no longer follows symlinks, and by default does not
> read devices, which is more convenient in the typical
> use case.  Symlinks and devices on the command line are
> still dereferenced; the change affects only symlinks and
> devices encountered recursively.  Users who want dereferencing
> for all files can use -R, which retains its old meaning.
>
> I considered adding --dereference, --no-dereference,
> --dereference-command-line, etc, but that was a lot of extra complexity
> (particularly in the documentation) for little payoff --
> anybody who wants something that complicated can use 'find',
> as the whole point of 'grep -r' is that it should be simple.
>
> As a side effect this patch changes grep to use fts,
> which should make it more robust with large directory
> hierarchies.  This is why the patch subtracts more lines
> than it adds.
>
> This assumes the latest gnulib, so the first patch
> syncs to gnulib, and the second one does the real work.

Thanks for working on this.
Just a couple of nits, so far.
I ran out of time before I got into the meat of it,
not that I expect to find anything...

> Subject: [PATCH 1/2] build: update gnulib submodule to latest
>
> ---
>  gnulib |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/gnulib b/gnulib
> index eb21377..f9d2fe2 160000
> --- a/gnulib
> +++ b/gnulib
> @@ -1 +1 @@
> -Subproject commit eb213779301aa663ab84ac947e8e181e9ad554d0
> +Subproject commit f9d2fe251f3a104df656ab6ffc64821893ab9003
> --
> 1.7.6.5
>
>
>>From 7a97e788391b13ebf6242ff74d7a19a1944d56c1 Mon Sep 17 00:00:00 2001
> From: Paul Eggert <address@hidden>
> Date: Sat, 10 Mar 2012 11:29:32 -0800
> Subject: [PATCH 2/2] grep: -r no follows symlinks
>
> Change -r to follow only command-line symlinks, and by default to
> reads only devices named on the command line.  This is a simple

s/reads/read/

> way to get a more-useful behavior when searching random
> directories; the idea is to use 'find' if you want something fancy.
> -R acts as before and gets a new alias --dereference-recursive.
> The code now uses fts internally, so it should be more robust and
> faster with large hierarchies.
> * .gitignore: Remove lib/savedir.c, lib/savedir.h.
> * tests/symlink: New file
> * Makefile.boot (LIB_OBJS_core): Remove isdir.o, savedir.o.
> Perhaps other changes are needed too, but I'm not sure what
> this makefile is for.
> * NEWS: Document changes.
> * doc/grep.texi (File and Directory Selection): Likewise.
> * bootstrap.conf (gnulib_modules): Remove dirent, dirname, isdir, open.
> Add fstatat, fts, openat-safer.
> * lib/Makefile.am (libgreputils_a_SOURCES): Remove savedir.c, savedir.h.
> * lib/savedir.c, lib/savedir.h: Remove.
> * po/POTFILES.in: Add lib/openat-die.c.
> * src/main.c: Include fcntl-safer.h, fts_.h.  Don't include
> isdir.h, savedir.h.
> (struct stats, stats_base): Remove.
> (long_options, usage, main): Add --dereference-recursive and
> implement -r vs -R.
> (filename_prefix_len, fts_options): New static vars.
> (basic_fts_options, READ_COMMAND_LINE_DEVICES): New constants.
> (devices): Now defaults to READ_COMMAND_LINE_DEVICES.
> (reset, grep): Now takes just struct stat rather than file name and
> struct stats.  All callers changed.
> (fillbuf): Now takes struct stat reather than struct stats.
> All callers changed.
> (grep): Don't worry about recursing too deeply; fts and grepdesc
> handle this now.
> (grepdirent, grepdesc, grep_command_line_args): New functions.
> (grepfile): New args DIRDESC, FOLLOW, COMMAND_LINE.  Remove struct stats
> arg.  All callers changed.  Use openat_safer rather than open.
> Use desc == STDIN_FILENO to tell whether we're reading "-".
> Don't worry about EINTR when closing -- not possible, since we're
> not catching signals.
> * tests/Makefile.am (TESTS): Add symlink.
> * tests/symlink: New file.
> ---
>  .gitignore        |    2 -
>  Makefile.boot     |    2 -
>  NEWS              |   11 ++
>  bootstrap.conf    |    7 +-
>  doc/grep.texi     |   26 +++-
>  lib/Makefile.am   |    2 +-
>  lib/savedir.c     |  163 ----------------------
>  lib/savedir.h     |   11 --
>  po/POTFILES.in    |    1 +
>  src/main.c        |  400 +++++++++++++++++++++++++++-------------------------
>  tests/Makefile.am |    1 +
>  tests/symlink     |   65 +++++++++
>  12 files changed, 312 insertions(+), 379 deletions(-)
>  delete mode 100644 lib/savedir.c
>  delete mode 100644 lib/savedir.h
>  create mode 100755 tests/symlink
>
> diff --git a/.gitignore b/.gitignore
> index 35f5d10..0b195d9 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -64,5 +64,3 @@ TAGS
>  !/lib/colorize-posix.c
>  !/lib/colorize-w32.c
>  !/lib/colorize.h
> -!/lib/savedir.c
> -!/lib/savedir.h
> diff --git a/Makefile.boot b/Makefile.boot
> index 043429b..4414110 100644
> --- a/Makefile.boot
> +++ b/Makefile.boot
> @@ -41,10 +41,8 @@ LIB_OBJS_core =  \
>        $(libdir)/error.$(OBJEXT) \
>        $(libdir)/exclude.$(OBJEXT) \
>        $(libdir)/hard-locale.$(OBJEXT) \
> -      $(libdir)/isdir.$(OBJEXT) \
>        $(libdir)/quotearg.$(OBJEXT) \
>        $(libdir)/regex.$(OBJEXT) \
> -      $(libdir)/savedir.$(OBJEXT) \
>        $(libdir)/strtoumax.$(OBJEXT) \
>        $(libdir)/xmalloc.$(OBJEXT) \
>        $(libdir)/xstrtol.$(OBJEXT) \
> diff --git a/NEWS b/NEWS
> index d0a63d5..3a752b1 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -2,6 +2,17 @@ GNU grep NEWS                                    -*- outline 
> -*-
>
>  * Noteworthy changes in release ?.? (????-??-??) [?]
>
> +** New features
> +
> +  The -R option now has a long-option alias --dereference-recursive.
> +
> +** Changes in behavior
> +
> +  The -r (--recursive) option now follows only command-line symlinks.
> +  Also, by default -r now reads only devices named on the command
> +  line; this can be overridden with --devices.  -R acts as before, so
> +  use -R if you prefer the old behavior of following all symlinks and
> +  defaulting to reading all devices.
>
>  * Noteworthy changes in release 2.11 (2012-03-02) [stable]
>
> diff --git a/bootstrap.conf b/bootstrap.conf
...
> diff --git a/doc/grep.texi b/doc/grep.texi
...
> diff --git a/lib/Makefile.am b/lib/Makefile.am
...
> diff --git a/lib/savedir.c b/lib/savedir.c
...
> diff --git a/lib/savedir.h b/lib/savedir.h
...
> diff --git a/po/POTFILES.in b/po/POTFILES.in
...
> diff --git a/src/main.c b/src/main.c
...
> @@ -369,6 +360,7 @@ unsigned char eolbyte;
>  /* For error messages. */
>  /* The input file name, or (if standard input) "-" or a --label argument.  */
>  static char const *filename;
> +static int filename_prefix_len;

I'd use an unsigned type, since it will never be negative.

>  static int errseen;
>  static int write_error_seen;
>
> @@ -392,14 +384,19 @@ ARGMATCH_VERIFY (directories_args, directories_types);
>
>  static enum directories_type directories = READ_DIRECTORIES;
>
> +enum { basic_fts_options = FTS_CWDFD | FTS_NOSTAT | FTS_TIGHT_CYCLE_CHECK };
> +static int fts_options = basic_fts_options | FTS_COMFOLLOW | FTS_PHYSICAL;
> +
>  /* How to handle devices. */
>  static enum
>    {
> +    READ_COMMAND_LINE_DEVICES,
>      READ_DEVICES,
>      SKIP_DEVICES
> -  } devices = READ_DEVICES;
> +  } devices = READ_COMMAND_LINE_DEVICES;
>
> -static int grepdir (char const *, struct stats const *);
> +static int grepfile (int, char const *, int, int);
> +static int grepdesc (int, int);

I see legacy types like the latter "int" that should be bool,
but that can obviously wait.

>  #if defined HAVE_DOS_FILE_CONTENTS
>  static inline int undossify_input (char *, size_t);
>  #endif
> @@ -473,7 +470,7 @@ static off_t after_last_match;    /* Pointer after last 
> matching line that
>  /* Reset the buffer for a new file, returning zero if we should skip it.
>     Initialize on the first time through. */
>  static int
> -reset (int fd, char const *file, struct stats *stats)
> +reset (int fd, struct stat const *st)

Nice simplification.

I'll have to stop here for now.
Haven't even tested yet.

Thanks again.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]