bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] grep: -r no follows symlinks


From: Paul Eggert
Subject: [PATCH] grep: -r no follows symlinks
Date: Sat, 10 Mar 2012 12:47:49 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.1) Gecko/20120209 Thunderbird/10.0.1

Here's a proposed change to GNU grep so that 'grep -r'
no longer follows symlinks, and by default does not
read devices, which is more convenient in the typical
use case.  Symlinks and devices on the command line are
still dereferenced; the change affects only symlinks and
devices encountered recursively.  Users who want dereferencing
for all files can use -R, which retains its old meaning.

I considered adding --dereference, --no-dereference,
--dereference-command-line, etc, but that was a lot of extra complexity
(particularly in the documentation) for little payoff --
anybody who wants something that complicated can use 'find',
as the whole point of 'grep -r' is that it should be simple.

As a side effect this patch changes grep to use fts,
which should make it more robust with large directory
hierarchies.  This is why the patch subtracts more lines
than it adds.

This assumes the latest gnulib, so the first patch
syncs to gnulib, and the second one does the real work.

>From b2abf4baf9bb6c6c091d0ac13a6c27fadecf5794 Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Sat, 10 Mar 2012 12:23:03 -0800
Subject: [PATCH 1/2] build: update gnulib submodule to latest

---
 gnulib |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/gnulib b/gnulib
index eb21377..f9d2fe2 160000
--- a/gnulib
+++ b/gnulib
@@ -1 +1 @@
-Subproject commit eb213779301aa663ab84ac947e8e181e9ad554d0
+Subproject commit f9d2fe251f3a104df656ab6ffc64821893ab9003
-- 
1.7.6.5


>From 7a97e788391b13ebf6242ff74d7a19a1944d56c1 Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Sat, 10 Mar 2012 11:29:32 -0800
Subject: [PATCH 2/2] grep: -r no follows symlinks

Change -r to follow only command-line symlinks, and by default to
reads only devices named on the command line.  This is a simple
way to get a more-useful behavior when searching random
directories; the idea is to use 'find' if you want something fancy.
-R acts as before and gets a new alias --dereference-recursive.
The code now uses fts internally, so it should be more robust and
faster with large hierarchies.
* .gitignore: Remove lib/savedir.c, lib/savedir.h.
* tests/symlink: New file
* Makefile.boot (LIB_OBJS_core): Remove isdir.o, savedir.o.
Perhaps other changes are needed too, but I'm not sure what
this makefile is for.
* NEWS: Document changes.
* doc/grep.texi (File and Directory Selection): Likewise.
* bootstrap.conf (gnulib_modules): Remove dirent, dirname, isdir, open.
Add fstatat, fts, openat-safer.
* lib/Makefile.am (libgreputils_a_SOURCES): Remove savedir.c, savedir.h.
* lib/savedir.c, lib/savedir.h: Remove.
* po/POTFILES.in: Add lib/openat-die.c.
* src/main.c: Include fcntl-safer.h, fts_.h.  Don't include
isdir.h, savedir.h.
(struct stats, stats_base): Remove.
(long_options, usage, main): Add --dereference-recursive and
implement -r vs -R.
(filename_prefix_len, fts_options): New static vars.
(basic_fts_options, READ_COMMAND_LINE_DEVICES): New constants.
(devices): Now defaults to READ_COMMAND_LINE_DEVICES.
(reset, grep): Now takes just struct stat rather than file name and
struct stats.  All callers changed.
(fillbuf): Now takes struct stat reather than struct stats.
All callers changed.
(grep): Don't worry about recursing too deeply; fts and grepdesc
handle this now.
(grepdirent, grepdesc, grep_command_line_args): New functions.
(grepfile): New args DIRDESC, FOLLOW, COMMAND_LINE.  Remove struct stats
arg.  All callers changed.  Use openat_safer rather than open.
Use desc == STDIN_FILENO to tell whether we're reading "-".
Don't worry about EINTR when closing -- not possible, since we're
not catching signals.
* tests/Makefile.am (TESTS): Add symlink.
* tests/symlink: New file.
---
 .gitignore        |    2 -
 Makefile.boot     |    2 -
 NEWS              |   11 ++
 bootstrap.conf    |    7 +-
 doc/grep.texi     |   26 +++-
 lib/Makefile.am   |    2 +-
 lib/savedir.c     |  163 ----------------------
 lib/savedir.h     |   11 --
 po/POTFILES.in    |    1 +
 src/main.c        |  400 +++++++++++++++++++++++++++-------------------------
 tests/Makefile.am |    1 +
 tests/symlink     |   65 +++++++++
 12 files changed, 312 insertions(+), 379 deletions(-)
 delete mode 100644 lib/savedir.c
 delete mode 100644 lib/savedir.h
 create mode 100755 tests/symlink

diff --git a/.gitignore b/.gitignore
index 35f5d10..0b195d9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -64,5 +64,3 @@ TAGS
 !/lib/colorize-posix.c
 !/lib/colorize-w32.c
 !/lib/colorize.h
-!/lib/savedir.c
-!/lib/savedir.h
diff --git a/Makefile.boot b/Makefile.boot
index 043429b..4414110 100644
--- a/Makefile.boot
+++ b/Makefile.boot
@@ -41,10 +41,8 @@ LIB_OBJS_core =  \
       $(libdir)/error.$(OBJEXT) \
       $(libdir)/exclude.$(OBJEXT) \
       $(libdir)/hard-locale.$(OBJEXT) \
-      $(libdir)/isdir.$(OBJEXT) \
       $(libdir)/quotearg.$(OBJEXT) \
       $(libdir)/regex.$(OBJEXT) \
-      $(libdir)/savedir.$(OBJEXT) \
       $(libdir)/strtoumax.$(OBJEXT) \
       $(libdir)/xmalloc.$(OBJEXT) \
       $(libdir)/xstrtol.$(OBJEXT) \
diff --git a/NEWS b/NEWS
index d0a63d5..3a752b1 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,17 @@ GNU grep NEWS                                    -*- outline 
-*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** New features
+
+  The -R option now has a long-option alias --dereference-recursive.
+
+** Changes in behavior
+
+  The -r (--recursive) option now follows only command-line symlinks.
+  Also, by default -r now reads only devices named on the command
+  line; this can be overridden with --devices.  -R acts as before, so
+  use -R if you prefer the old behavior of following all symlinks and
+  defaulting to reading all devices.
 
 * Noteworthy changes in release 2.11 (2012-03-02) [stable]
 
diff --git a/bootstrap.conf b/bootstrap.conf
index 45bb33d..7a71015 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -24,13 +24,13 @@ binary-io
 btowc
 c-ctype
 closeout
-dirent
-dirname
 do-release-commit-and-tag
 error
 exclude
 fcntl-h
 fnmatch
+fstatat
+fts
 getopt-gnu
 getpagesize
 gettext-h
@@ -44,7 +44,6 @@ intprops
 inttypes
 isatty
 isblank
-isdir
 iswctype
 largefile
 lseek
@@ -58,7 +57,7 @@ memchr
 mempcpy
 minmax
 obstack
-open
+openat-safer
 progname
 propername
 quote
diff --git a/doc/grep.texi b/doc/grep.texi
index c014d8f..1840e21 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -606,16 +606,21 @@ if the terminal driver interprets some of it as commands.
 @opindex --devices
 @cindex device search
 If an input file is a device, FIFO, or socket, use @var{action} to process it.
-By default, @var{action} is @samp{read},
-which means that devices are read just as if they were ordinary files.
+If @var{action} is @samp{read},
+all devices are read just as if they were ordinary files.
 If @var{action} is @samp{skip},
 devices, FIFOs, and sockets are silently skipped.
+By default, devices are read if they are on the command line or if the
address@hidden (@option{--dereference-recursive}) option is used, and are
+skipped if they are encountered recursively and the @option{-r}
+(@option{--recursive}) option is used.
 
 @item -d @var{action}
 @itemx address@hidden
 @opindex -d
 @opindex --directories
 @cindex directory search
address@hidden symbolic links
 If an input file is a directory, use @var{action} to process it.
 By default, @var{action} is @samp{read},
 which means that directories are read just as if they were ordinary files
@@ -624,7 +629,8 @@ and will cause @command{grep}
 to print error messages for every directory or silently skip them).
 If @var{action} is @samp{skip}, directories are silently skipped.
 If @var{action} is @samp{recurse},
address@hidden reads all files under each directory, recursively;
address@hidden reads all files under each directory, recursively,
+following command-line symbolic links and skipping other symlinks;
 this is equivalent to the @option{-r} option.
 
 @item address@hidden
@@ -663,16 +669,28 @@ Search only files whose base name matches @var{glob}
 (using wildcard matching as described under @option{--exclude}).
 
 @item -r
address@hidden -R
 @itemx --recursive
 @opindex -r
 @opindex --recursive
 @cindex recursive search
 @cindex searching directory trees
address@hidden symbolic links
 For each directory operand,
 read and process all files in that directory, recursively.
+Follow symbolic links on the command line, but skip symlinks
+that are encountered recursively.
 This is the same as the @samp{--directories=recurse} option.
 
address@hidden -R
address@hidden --dereference-recursive
address@hidden -R
address@hidden --dereference-recursive
address@hidden recursive search
address@hidden searching directory trees
address@hidden symbolic links
+For each directory operand, read and process all files in that
+directory, recursively, following all symbolic links.
+
 @end table
 
 @node Other Options
diff --git a/lib/Makefile.am b/lib/Makefile.am
index 04ae51e..527c6f5 100644
--- a/lib/Makefile.am
+++ b/lib/Makefile.am
@@ -32,7 +32,7 @@ INCLUDES = -I.. -I$(srcdir)
 AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS)
 
 libgreputils_a_SOURCES += \
-  colorize.c colorize.h savedir.c savedir.h
+  colorize.c colorize.h
 
 EXTRA_DIST += colorize-posix.c colorize-w32.c
 
diff --git a/lib/savedir.c b/lib/savedir.c
deleted file mode 100644
index 3f93c16..0000000
--- a/lib/savedir.c
+++ /dev/null
@@ -1,163 +0,0 @@
-/* savedir.c -- save the list of files in a directory in a string
-   Copyright (C) 1990, 1997-2001, 2009-2012 Free Software Foundation, Inc.
-
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software Foundation,
-   Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA.  */
-
-/* Written by David MacKenzie <address@hidden>. */
-
-#include <config.h>
-
-#include <sys/types.h>
-#include <unistd.h>
-#include <dirent.h>
-
-#ifdef CLOSEDIR_VOID
-/* Fake a return value. */
-# define CLOSEDIR(d) (closedir (d), 0)
-#else
-# define CLOSEDIR(d) closedir (d)
-#endif
-
-#include <stdlib.h>
-#include <string.h>
-#include <fnmatch.h>
-#include "savedir.h"
-#include "xalloc.h"
-
-static char *path;
-static size_t pathlen;
-
-extern int isdir (const char *name);
-
-static int
-isdir1 (const char *dir, const char *file)
-{
-  size_t dirlen = strlen (dir);
-  size_t filelen = strlen (file);
-
-  while (dirlen && dir[dirlen - 1] == '/')
-    dirlen--;
-
-  if ((dirlen + filelen + 2) > pathlen)
-    {
-      pathlen *= 2;
-      if ((dirlen + filelen + 2) > pathlen)
-        pathlen = dirlen + filelen + 2;
-
-      path = xrealloc (path, pathlen);
-    }
-
-  memcpy (path, dir, dirlen);
-  path[dirlen] = '/';
-  strcpy (path + dirlen + 1, file);
-  return isdir (path);
-}
-
-/* Return a freshly allocated string containing the filenames
-   in directory DIR, separated by '\0' characters;
-   the end is marked by two '\0' characters in a row.
-   NAME_SIZE is the number of bytes to initially allocate
-   for the string; it will be enlarged as needed.
-   Return NULL if DIR cannot be opened or if out of memory. */
-char *
-savedir (const char *dir, off_t name_size, struct exclude *included_patterns,
-         struct exclude *excluded_patterns, struct exclude 
*excluded_directory_patterns )
-{
-  DIR *dirp;
-  struct dirent *dp;
-  char *name_space;
-  char *namep;
-
-  dirp = opendir (dir);
-  if (dirp == NULL)
-    return NULL;
-
-  /* Be sure name_size is at least `1' so there's room for
-     the final NUL byte.  */
-  if (name_size <= 0)
-    name_size = 1;
-
-  name_space = (char *) malloc (name_size);
-  if (name_space == NULL)
-    {
-      closedir (dirp);
-      return NULL;
-    }
-  namep = name_space;
-
-  while ((dp = readdir (dirp)) != NULL)
-    {
-      /* Skip "." and ".." (some NFS file systems' directories lack them). */
-      if (dp->d_name[0] != '.'
-          || (dp->d_name[1] != '\0'
-              && (dp->d_name[1] != '.' || dp->d_name[2] != '\0')))
-        {
-          size_t namlen = strlen (dp->d_name);
-          size_t size_needed = (namep - name_space) + namlen + 2;
-
-          if ((included_patterns || excluded_patterns)
-              && !isdir1 (dir, dp->d_name))
-            {
-              if (included_patterns
-                  && excluded_file_name (included_patterns, dp->d_name))
-                continue;
-              if (excluded_patterns
-                  && excluded_file_name (excluded_patterns, dp->d_name))
-                continue;
-            }
-
-          if ( excluded_directory_patterns
-              && isdir1 (dir, dp->d_name) )
-            {
-              if (excluded_directory_patterns
-                  && excluded_file_name (excluded_directory_patterns, 
dp->d_name))
-                continue;
-            }
-
-          if (size_needed > name_size)
-            {
-              char *new_name_space;
-
-              while (size_needed > name_size)
-                name_size += 1024;
-
-              new_name_space = realloc (name_space, name_size);
-              if (new_name_space == NULL)
-                {
-                  closedir (dirp);
-                  goto fail;
-                }
-              namep = new_name_space + (namep - name_space);
-              name_space = new_name_space;
-            }
-          strcpy (namep, dp->d_name);
-          namep += namlen + 1;
-        }
-    }
-  *namep = '\0';
-  if (CLOSEDIR (dirp))
-    {
-     fail:
-      free (name_space);
-      name_space = NULL;
-    }
-  if (path)
-    {
-      free (path);
-      path = NULL;
-      pathlen = 0;
-    }
-  return name_space;
-}
diff --git a/lib/savedir.h b/lib/savedir.h
deleted file mode 100644
index 00cb1a9..0000000
--- a/lib/savedir.h
+++ /dev/null
@@ -1,11 +0,0 @@
-#if !defined SAVEDIR_H_
-# define SAVEDIR_H_
-
-#include "exclude.h"
-
-extern char *
-savedir (const char *dir, off_t name_size,
-         struct exclude *, struct exclude *,
-         struct exclude *);
-
-#endif
diff --git a/po/POTFILES.in b/po/POTFILES.in
index b33c126..65f4f57 100644
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@@ -20,6 +20,7 @@ lib/closeout.c
 lib/error.c
 lib/getopt.c
 lib/obstack.c
+lib/openat-die.c
 lib/quotearg.c
 lib/regcomp.c
 lib/version-etc.c
diff --git a/src/main.c b/src/main.c
index 2f6c761..5364cd1 100644
--- a/src/main.c
+++ b/src/main.c
@@ -36,14 +36,14 @@
 #include "error.h"
 #include "exclude.h"
 #include "exitfail.h"
+#include "fcntl-safer.h"
+#include "fts_.h"
 #include "getopt.h"
 #include "grep.h"
 #include "intprops.h"
-#include "isdir.h"
 #include "progname.h"
 #include "propername.h"
 #include "quote.h"
-#include "savedir.h"
 #include "version-etc.h"
 #include "xalloc.h"
 #include "xstrtol.h"
@@ -63,15 +63,6 @@
    avoiding a potential (racy) infinite loop.  */
 static struct stat out_stat;
 
-struct stats
-{
-  struct stats const *parent;
-  struct stat stat;
-};
-
-/* base of chain of stat buffers, used to detect directory loops */
-static struct stats stats_base;
-
 /* if non-zero, display usage information and exit */
 static int show_help;
 
@@ -347,7 +338,7 @@ static struct option const long_options[] =
   {"only-matching", no_argument, NULL, 'o'},
   {"quiet", no_argument, NULL, 'q'},
   {"recursive", no_argument, NULL, 'r'},
-  {"recursive", no_argument, NULL, 'R'},
+  {"dereference-recursive", no_argument, NULL, 'R'},
   {"regexp", required_argument, NULL, 'e'},
   {"invert-match", no_argument, NULL, 'v'},
   {"silent", no_argument, NULL, 'q'},
@@ -369,6 +360,7 @@ unsigned char eolbyte;
 /* For error messages. */
 /* The input file name, or (if standard input) "-" or a --label argument.  */
 static char const *filename;
+static int filename_prefix_len;
 static int errseen;
 static int write_error_seen;
 
@@ -392,14 +384,19 @@ ARGMATCH_VERIFY (directories_args, directories_types);
 
 static enum directories_type directories = READ_DIRECTORIES;
 
+enum { basic_fts_options = FTS_CWDFD | FTS_NOSTAT | FTS_TIGHT_CYCLE_CHECK };
+static int fts_options = basic_fts_options | FTS_COMFOLLOW | FTS_PHYSICAL;
+
 /* How to handle devices. */
 static enum
   {
+    READ_COMMAND_LINE_DEVICES,
     READ_DEVICES,
     SKIP_DEVICES
-  } devices = READ_DEVICES;
+  } devices = READ_COMMAND_LINE_DEVICES;
 
-static int grepdir (char const *, struct stats const *);
+static int grepfile (int, char const *, int, int);
+static int grepdesc (int, int);
 #if defined HAVE_DOS_FILE_CONTENTS
 static inline int undossify_input (char *, size_t);
 #endif
@@ -473,7 +470,7 @@ static off_t after_last_match;      /* Pointer after last 
matching line that
 /* Reset the buffer for a new file, returning zero if we should skip it.
    Initialize on the first time through. */
 static int
-reset (int fd, char const *file, struct stats *stats)
+reset (int fd, struct stat const *st)
 {
   if (! pagesize)
     {
@@ -488,9 +485,9 @@ reset (int fd, char const *file, struct stats *stats)
   bufbeg[-1] = eolbyte;
   bufdesc = fd;
 
-  if (S_ISREG (stats->stat.st_mode))
+  if (S_ISREG (st->st_mode))
     {
-      if (file)
+      if (fd != STDIN_FILENO)
         bufoffset = 0;
       else
         {
@@ -510,7 +507,7 @@ reset (int fd, char const *file, struct stats *stats)
    to the beginning of the buffer contents, and 'buflim'
    points just after the end.  Return zero if there's an error.  */
 static int
-fillbuf (size_t save, struct stats const *stats)
+fillbuf (size_t save, struct stat const *st)
 {
   size_t fillsize = 0;
   int cc = 1;
@@ -543,9 +540,9 @@ fillbuf (size_t save, struct stats const *stats)
          is large.  However, do not use the original file size as a
          heuristic if we've already read past the file end, as most
          likely the file is growing.  */
-      if (S_ISREG (stats->stat.st_mode))
+      if (S_ISREG (st->st_mode))
         {
-          off_t to_be_read = stats->stat.st_size - bufoffset;
+          off_t to_be_read = st->st_size - bufoffset;
           off_t maxsize_off = save + to_be_read;
           if (0 <= to_be_read && to_be_read <= maxsize_off
               && maxsize_off == (size_t) maxsize_off
@@ -1085,7 +1082,7 @@ grepbuf (char const *beg, char const *lim)
    but if the file is a directory and we search it recursively, then
    return -2 if there was a match, and -1 otherwise.  */
 static intmax_t
-grep (int fd, char const *file, struct stats *stats)
+grep (int fd, struct stat const *st)
 {
   intmax_t nlines, i;
   int not_text;
@@ -1095,19 +1092,9 @@ grep (int fd, char const *file, struct stats *stats)
   char *lim;
   char eol = eolbyte;
 
-  if (!reset (fd, file, stats))
+  if (! reset (fd, st))
     return 0;
 
-  if (file && directories == RECURSE_DIRECTORIES
-      && S_ISDIR (stats->stat.st_mode))
-    {
-      /* Close fd now, so that we don't open a lot of file descriptors
-         when we recurse deeply.  */
-      if (close (fd) != 0)
-        suppressible_error (file, errno);
-      return grepdir (file, stats) - 2;
-    }
-
   totalcc = 0;
   lastout = 0;
   totalnl = 0;
@@ -1119,7 +1106,7 @@ grep (int fd, char const *file, struct stats *stats)
   residue = 0;
   save = 0;
 
-  if (! fillbuf (save, stats))
+  if (! fillbuf (save, st))
     {
       suppressible_error (filename, errno);
       return 0;
@@ -1190,7 +1177,7 @@ grep (int fd, char const *file, struct stats *stats)
         totalcc = add_count (totalcc, buflim - bufbeg - save);
       if (out_line)
         nlscan (beg);
-      if (! fillbuf (save, stats))
+      if (! fillbuf (save, st))
         {
           suppressible_error (filename, errno);
           goto finish_grep;
@@ -1214,53 +1201,183 @@ grep (int fd, char const *file, struct stats *stats)
 }
 
 static int
-grepfile (char const *file, struct stats *stats)
+grepdirent (FTS *fts, FTSENT *ent)
 {
-  int desc;
-  intmax_t count;
-  int status;
+  int follow, dirdesc;
+  int command_line = ent->fts_level == FTS_ROOTLEVEL;
+  struct stat *st = ent->fts_statp;
 
-  filename = (file ? file : label ? label : _("(standard input)"));
+  if (ent->fts_info == FTS_DP)
+    {
+      if (directories == RECURSE_DIRECTORIES && command_line)
+        out_file &= ~ (2 * !no_filenames);
+      return 1;
+    }
 
-  if (! file)
-    desc = STDIN_FILENO;
-  else if (devices == SKIP_DEVICES)
+  if ((ent->fts_info == FTS_D || ent->fts_info == FTS_DC
+       || ent->fts_info == FTS_DNR)
+      ? (directories == SKIP_DIRECTORIES
+         || (excluded_directory_patterns
+             && excluded_file_name (excluded_directory_patterns,
+                                    ent->fts_name)))
+      : ((included_patterns
+          && excluded_file_name (included_patterns, ent->fts_name))
+         || (excluded_patterns
+             && excluded_file_name (excluded_patterns, ent->fts_name))))
     {
-      /* Don't open yet, since that might have side effects on a device.  */
-      desc = -1;
+      fts_set (fts, ent, FTS_SKIP);
+      return 1;
     }
-  else
+
+  filename = ent->fts_path + filename_prefix_len;
+  follow = (fts->fts_options & FTS_LOGICAL
+            || (fts->fts_options & FTS_COMFOLLOW && command_line));
+
+  switch (ent->fts_info)
     {
-      /* When skipping directories, don't worry about directories
-         that can't be opened.  */
-      desc = open (file, O_RDONLY);
-      if (desc < 0 && directories != SKIP_DIRECTORIES)
-        {
-          suppressible_error (file, errno);
+    case FTS_D:
+      {
+        int skip_this_directory =
+          (directories == SKIP_DIRECTORIES
+           || (excluded_directory_patterns
+               && excluded_file_name (excluded_directory_patterns, filename)));
+        if (skip_this_directory || directories == READ_DIRECTORIES)
+          fts_set (fts, ent, FTS_SKIP);
+        if (skip_this_directory)
           return 1;
+        if (directories == RECURSE_DIRECTORIES)
+          {
+            out_file |= 2 * !no_filenames;
+            return 1;
+          }
+      }
+      break;
+
+    case FTS_DC:
+      if (!suppress_errors)
+        error (0, 0, _("warning: %s: %s"), filename,
+               _("recursive directory loop"));
+      return 1;
+
+    case FTS_DNR:
+    case FTS_ERR:
+    case FTS_NS:
+      suppressible_error (filename, ent->fts_errno);
+      return 1;
+
+    case FTS_DEFAULT:
+    case FTS_NSOK:
+      if (devices == SKIP_DEVICES
+          || (devices == READ_COMMAND_LINE_DEVICES && !command_line))
+        {
+          struct stat st1;
+          if (! st->st_mode)
+            {
+              /* The file type is not already known.  Get the file status
+                 before opening, since opening might have side effects
+                 on a device.  */
+              int flag = follow ? 0 : AT_SYMLINK_NOFOLLOW;
+              if (fstatat (fts->fts_cwd_fd, ent->fts_accpath, &st1, flag) != 0)
+                {
+                  suppressible_error (filename, errno);
+                  return 1;
+                }
+              st = &st1;
+            }
+          if (S_ISCHR (st->st_mode) || S_ISBLK (st->st_mode)
+              || S_ISSOCK (st->st_mode) || S_ISFIFO (st->st_mode))
+            return 1;
         }
+      break;
+
+    case FTS_F:
+    case FTS_SLNONE:
+      break;
+
+    case FTS_SL:
+    case FTS_W:
+      return 1;
+
+    default:
+      abort ();
     }
 
-  if (desc < 0
-      ? stat (file, &stats->stat) != 0
-      : fstat (desc, &stats->stat) != 0)
+  dirdesc = ((fts->fts_options & (FTS_NOCHDIR | FTS_CWDFD)) == FTS_CWDFD
+             ? fts->fts_cwd_fd
+             : AT_FDCWD);
+  return grepfile (dirdesc, ent->fts_accpath, follow, command_line);
+}
+
+static int
+grepfile (int dirdesc, char const *name, int follow, int command_line)
+{
+  int desc = openat_safer (dirdesc, name, O_RDONLY | (follow ? 0 : 
O_NOFOLLOW));
+  if (desc < 0)
     {
-      suppressible_error (filename, errno);
-      if (file)
-        close (desc);
+      if (follow || errno != ELOOP)
+        suppressible_error (filename, errno);
       return 1;
     }
+  return grepdesc (desc, command_line);
+}
 
-  if ((directories == SKIP_DIRECTORIES && S_ISDIR (stats->stat.st_mode))
-      || (devices == SKIP_DEVICES && (S_ISCHR (stats->stat.st_mode)
-                                      || S_ISBLK (stats->stat.st_mode)
-                                      || S_ISSOCK (stats->stat.st_mode)
-                                      || S_ISFIFO (stats->stat.st_mode))))
+static int
+grepdesc (int desc, int command_line)
+{
+  intmax_t count;
+  int status = 1;
+  struct stat st;
+
+  /* Get the file status, possibly for the second time.  This catches
+     a race condition if the directory entry changes after the
+     directory entry is read and before the file is opened.  For
+     example, normally DESC is a directory only at the top level, but
+     there is an exception if some other process substitutes a
+     directory for a non-directory while 'grep' is running.  */
+  if (fstat (desc, &st) != 0)
     {
-      if (file)
-        close (desc);
-      return 1;
+      suppressible_error (filename, errno);
+      goto closeout;
     }
+  if (desc != STDIN_FILENO
+      && directories == RECURSE_DIRECTORIES && S_ISDIR (st.st_mode))
+    {
+      /* Traverse the directory starting with its full name, because
+         unfortunately fts provides no way to traverse the directory
+         starting from its file descriptor.  */
+
+      FTS *fts;
+      FTSENT *ent;
+      int opts = fts_options & ~(command_line ? 0 : FTS_COMFOLLOW);
+      char *fts_arg[2];
+
+      /* Close DESC now, to conserve file descriptors if the race
+         condition occurs many times in a deep recursion.  */
+      if (close (desc) != 0)
+        suppressible_error (filename, errno);
+
+      fts_arg[0] = (char *) filename;
+      fts_arg[1] = NULL;
+      fts = fts_open (fts_arg, opts, NULL);
+
+      if (!fts)
+        xalloc_die ();
+      while ((ent = fts_read (fts)))
+        status &= grepdirent (fts, ent);
+      if (errno)
+        suppressible_error (filename, errno);
+      if (fts_close (fts) != 0)
+        suppressible_error (filename, errno);
+      return status;
+    }
+  if ((directories == SKIP_DIRECTORIES && S_ISDIR (st.st_mode))
+      || ((devices == SKIP_DEVICES
+           || (devices == READ_COMMAND_LINE_DEVICES && !command_line))
+          && (S_ISCHR (st.st_mode)
+              || S_ISBLK (st.st_mode)
+              || S_ISSOCK (st.st_mode)
+              || S_ISFIFO (st.st_mode))))
+    goto closeout;
 
   /* If there is a regular file on stdout and the current file refers
      to the same i-node, we have to report the problem and skip it.
@@ -1282,24 +1399,12 @@ grepfile (char const *file, struct stats *stats)
      condition that could result in "alternate" output.  */
   if (!out_quiet && list_files == 0 && 1 < max_count
       && S_ISREG (out_stat.st_mode) && out_stat.st_ino
-      && SAME_INODE (stats->stat, out_stat))
+      && SAME_INODE (st, out_stat))
     {
       if (! suppress_errors)
         error (0, 0, _("input file %s is also the output"), quote (filename));
       errseen = 1;
-      if (file)
-        close (desc);
-      return 1;
-    }
-
-  if (desc < 0)
-    {
-      desc = open (file, O_RDONLY);
-      if (desc < 0)
-        {
-          suppressible_error (file, errno);
-          return 1;
-        }
+      goto closeout;
     }
 
 #if defined SET_BINARY
@@ -1309,7 +1414,7 @@ grepfile (char const *file, struct stats *stats)
     SET_BINARY (desc);
 #endif
 
-  count = grep (desc, file, stats);
+  count = grep (desc, &st);
   if (count < 0)
     status = count + 2;
   else
@@ -1334,103 +1439,35 @@ grepfile (char const *file, struct stats *stats)
           fputc ('\n' & filename_mask, stdout);
         }
 
-      if (! file)
+      if (desc == STDIN_FILENO)
         {
           off_t required_offset = outleft ? bufoffset : after_last_match;
           if (required_offset != bufoffset
               && lseek (desc, required_offset, SEEK_SET) < 0
-              && S_ISREG (stats->stat.st_mode))
+              && S_ISREG (st.st_mode))
             suppressible_error (filename, errno);
         }
-      else
-        while (close (desc) != 0)
-          if (errno != EINTR)
-            {
-              suppressible_error (file, errno);
-              break;
-            }
     }
 
+ closeout:
+  if (desc != STDIN_FILENO && close (desc) != 0)
+    suppressible_error (filename, errno);
   return status;
 }
 
 static int
-grepdir (char const *dir, struct stats const *stats)
+grep_command_line_arg (char const *arg)
 {
-  char const *dir_or_dot = (dir ? dir : ".");
-  struct stats const *ancestor;
-  char *name_space;
-  int status = 1;
-  if (excluded_directory_patterns
-      && excluded_file_name (excluded_directory_patterns, dir))
-    return 1;
-
-  /* Mingw32 does not support st_ino.  No known working hosts use zero
-     for st_ino, so assume that the Mingw32 bug applies if it's zero.  */
-  if (stats->stat.st_ino)
+  if (STREQ (arg, "-"))
     {
-      for (ancestor = stats; (ancestor = ancestor->parent) != 0;  )
-        {
-          if (ancestor->stat.st_ino == stats->stat.st_ino
-              && ancestor->stat.st_dev == stats->stat.st_dev)
-            {
-              if (!suppress_errors)
-                error (0, 0, _("warning: %s: %s"), dir,
-                       _("recursive directory loop"));
-              errseen = 1;
-              return 1;
-            }
-        }
-    }
-
-  name_space = savedir (dir_or_dot, stats->stat.st_size, included_patterns,
-                        excluded_patterns, excluded_directory_patterns);
-
-  if (! name_space)
-    {
-      if (errno)
-        suppressible_error (dir_or_dot, errno);
-      else
-        xalloc_die ();
+      filename = label ? label : _("(standard input)");
+      return grepdesc (STDIN_FILENO, 1);
     }
   else
     {
-      size_t dirlen = 0;
-      int needs_slash = 0;
-      char *file_space = NULL;
-      char const *namep = name_space;
-      struct stats child;
-      if (dir)
-        {
-          dirlen = strlen (dir);
-          needs_slash = ! (dirlen == FILE_SYSTEM_PREFIX_LEN (dir)
-                           || ISSLASH (dir[dirlen - 1]));
-        }
-      child.parent = stats;
-      out_file += !no_filenames;
-      while (*namep)
-        {
-          size_t namelen = strlen (namep);
-          char const *file;
-          if (! dir)
-            file = namep;
-          else
-            {
-              file_space = xrealloc (file_space, dirlen + 1 + namelen + 1);
-              strcpy (file_space, dir);
-              file_space[dirlen] = '/';
-              strcpy (file_space + dirlen + needs_slash, namep);
-              file = file_space;
-            }
-          namep += namelen + 1;
-          status &= grepfile (file, &child);
-        }
-      out_file -= !no_filenames;
-      free (file_space);
-      free (name_space);
+      filename = arg;
+      return grepfile (AT_FDCWD, arg, 1, 1);
     }
-
-  return status;
 }
 
 _Noreturn void usage (int);
@@ -1500,7 +1537,8 @@ Output control:\n\
                             ACTION is `read', `recurse', or `skip'\n\
   -D, --devices=ACTION      how to handle devices, FIFOs and sockets;\n\
                             ACTION is `read' or `skip'\n\
-  -R, -r, --recursive       equivalent to --directories=recurse\n\
+  -r, --recursive           like --directories=recurse\n\
+  -R, --dereference-recursive  likewise, but follow all symlinks\n\
 "));
       printf (_("\
       --include=FILE_PATTERN  search only files that match FILE_PATTERN\n\
@@ -1981,6 +2019,8 @@ main (int argc, char **argv)
         break;
 
       case 'R':
+        fts_options = basic_fts_options | FTS_LOGICAL;
+        /* Fall through.  */
       case 'r':
         directories = RECURSE_DIRECTORIES;
         last_recursive = prev_optind;
@@ -2177,48 +2217,24 @@ main (int argc, char **argv)
   if (max_count == 0)
     exit (EXIT_FAILURE);
 
+  if (fts_options & FTS_LOGICAL && devices == READ_COMMAND_LINE_DEVICES)
+    devices = READ_DEVICES;
+
   if (optind < argc)
     {
       status = 1;
       do
-        {
-          char *file = argv[optind];
-          if (!STREQ (file, "-")
-              && (included_patterns || excluded_patterns
-                  || excluded_directory_patterns))
-            {
-              if (isdir (file))
-                {
-                  if (excluded_directory_patterns
-                      && excluded_file_name (excluded_directory_patterns,
-                                             file))
-                    continue;
-                }
-              else
-                {
-                  if (included_patterns
-                      && excluded_file_name (included_patterns, file))
-                    continue;
-                  if (excluded_patterns
-                      && excluded_file_name (excluded_patterns, file))
-                    continue;
-                }
-            }
-          status &= grepfile (STREQ (file, "-") ? (char *) NULL : file,
-                              &stats_base);
-        }
+        status &= grep_command_line_arg (argv[optind]);
       while (++optind < argc);
     }
   else if (directories == RECURSE_DIRECTORIES && prepended < last_recursive)
     {
-      status = 1;
-      if (stat (".", &stats_base.stat) == 0)
-        status = grepdir (NULL, &stats_base);
-      else
-        suppressible_error (".", errno);
+      /* Grep through ".", omitting leading "./" from diagnostics.  */
+      filename_prefix_len = 2;
+      status = grep_command_line_arg (".");
     }
   else
-    status = grepfile ((char *) NULL, &stats_base);
+    status = grep_command_line_arg ("-");
 
   /* We register via atexit() to test stdout.  */
   exit (errseen ? EXIT_TROUBLE : status);
diff --git a/tests/Makefile.am b/tests/Makefile.am
index c2cd2f7..13061fe 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -82,6 +82,7 @@ TESTS =                                               \
   spencer1                                     \
   spencer1-locale                              \
   status                                       \
+  symlink                                      \
   turkish-I                                    \
   warn-char-classes                            \
   word-delim-multibyte                         \
diff --git a/tests/symlink b/tests/symlink
new file mode 100755
index 0000000..012d8f8
--- /dev/null
+++ b/tests/symlink
@@ -0,0 +1,65 @@
+#!/bin/sh
+# Check that "grep -r" does the right thing with symbolic links.
+
+# Copyright (C) 2012 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# written by Paul Eggert
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+mkdir dir || framework_failure_
+echo a > dir/a || framework_failure_
+echo b > dir/b || framework_failure_
+ln -s a dir/c || framework_failure_
+ln -s . dir/d || framework_failure_
+ln -s dangling dir/e || framework_failure_
+
+touch out || framework_failure_
+
+for recursion in '' -r -R
+do
+  for files in '' '*'
+  do
+    case $recursion,$files in
+      -R,* | *,'*') expected_status=2 ;;
+      *) expected_status=0 ;;
+    esac
+
+    (cd dir && grep $recursion '^' $files <a ) >grepout
+    test $? -eq $expected_status || fail=1
+
+    case $recursion,$files in
+      ,)
+         exp='a\n' ;;
+      ,'*' | -R,)
+         exp='a:a\nb:b\nc:a\n' ;;
+      -r,)
+         exp='a:a\nb:b\n' ;;
+      -r,'*')
+         exp='a:a\nb:b\nc:a\nd/a:a\nd/b:b\n' ;;
+      -R,'*')
+         exp='a:a\nb:b\nc:a\nd/a:a\nd/b:b\nd/c:a\n' ;;
+      *)
+         framework_failure_ ;;
+    esac
+
+    printf "$exp" >exp || framework_failure_
+    LC_ALL=C sort grepout >out || fail=1
+    compare exp out || fail=1
+  done
+done
+
+Exit $fail
-- 
1.7.6.5




reply via email to

[Prev in Thread] Current Thread [Next in Thread]