bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] grep: -r no follows symlinks


From: Paul Eggert
Subject: Re: [PATCH] grep: -r no follows symlinks
Date: Mon, 12 Mar 2012 14:09:00 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.1) Gecko/20120209 Thunderbird/10.0.1

Thanks for the review.  Here's a followup that takes your
suggestions into account.

Also, the recent bugfix for -r --exclude-dir means the patch
needs to be adjusted slightly.  The new patch is a tad shorter, yay!
As before, this patch assumes a sync to the latest gnulib.

>From e6cee987efe899ad1a88c9af0ffc62011b99a006 Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Sat, 10 Mar 2012 11:29:32 -0800
Subject: [PATCH] grep: -r no longer follows symlinks; use fts

Change -r to follow only command-line symlinks, and by default to
read only devices named on the command line.  This is a simple
way to get a more-useful behavior when searching random
directories; the idea is to use 'find' if you want something fancy.
-R acts as before and gets a new alias --dereference-recursive.
The code now uses fts internally, so it is more robust and
faster with large hierarchies.
* .gitignore: Remove lib/savedir.c, lib/savedir.h.
* tests/symlink: New file
* Makefile.boot (LIB_OBJS_core): Remove isdir.o, savedir.o.
Perhaps other changes are needed too, but I'm not sure what
this makefile is for.
* NEWS: Document changes.
* doc/grep.texi (File and Directory Selection): Likewise.
* bootstrap.conf (gnulib_modules): Remove dirent, dirname, isdir, open.
Add fstatat, fts, openat-safer.
* lib/Makefile.am (libgreputils_a_SOURCES): Remove savedir.c, savedir.h.
* lib/savedir.c, lib/savedir.h: Remove.
* po/POTFILES.in: Add lib/openat-die.c.
* src/main.c: Include fcntl-safer.h, fts_.h.  Don't include
isdir.h, savedir.h.
(struct stats, stats_base): Remove.
(long_options, usage, main): Add --dereference-recursive and
implement -r vs -R.
(filename_prefix_len, fts_options): New static vars.
(basic_fts_options, READ_COMMAND_LINE_DEVICES): New constants.
(devices): Now defaults to READ_COMMAND_LINE_DEVICES.
(reset, grep): Now takes just struct stat rather than file name and
struct stats.  All callers changed.
(fillbuf): Now takes struct stat reather than struct stats.
All callers changed.
(grep): Don't worry about recursing too deeply; fts and grepdesc
handle this now.
(grepdirent, grepdesc, grep_command_line_args): New functions.
(grepfile): New args DIRDESC, FOLLOW, COMMAND_LINE.  Remove struct stats
arg.  All callers changed.  Use openat_safer rather than open.
Use desc == STDIN_FILENO to tell whether we're reading "-".
Don't worry about EINTR when closing -- not possible, since we're
not catching signals.
* tests/Makefile.am (TESTS): Add symlink.
* tests/symlink: New file.
---
 .gitignore        |    2 -
 Makefile.boot     |    2 -
 NEWS              |   16 +++
 bootstrap.conf    |    7 +-
 doc/grep.texi     |   26 +++-
 lib/Makefile.am   |    2 +-
 lib/savedir.c     |  163 ----------------------
 lib/savedir.h     |   11 --
 po/POTFILES.in    |    1 +
 src/main.c        |  390 +++++++++++++++++++++++++++--------------------------
 tests/Makefile.am |    1 +
 tests/symlink     |   65 +++++++++
 12 files changed, 308 insertions(+), 378 deletions(-)
 delete mode 100644 lib/savedir.c
 delete mode 100644 lib/savedir.h
 create mode 100755 tests/symlink

diff --git a/.gitignore b/.gitignore
index 35f5d10..0b195d9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -64,5 +64,3 @@ TAGS
 !/lib/colorize-posix.c
 !/lib/colorize-w32.c
 !/lib/colorize.h
-!/lib/savedir.c
-!/lib/savedir.h
diff --git a/Makefile.boot b/Makefile.boot
index 043429b..4414110 100644
--- a/Makefile.boot
+++ b/Makefile.boot
@@ -41,10 +41,8 @@ LIB_OBJS_core =  \
       $(libdir)/error.$(OBJEXT) \
       $(libdir)/exclude.$(OBJEXT) \
       $(libdir)/hard-locale.$(OBJEXT) \
-      $(libdir)/isdir.$(OBJEXT) \
       $(libdir)/quotearg.$(OBJEXT) \
       $(libdir)/regex.$(OBJEXT) \
-      $(libdir)/savedir.$(OBJEXT) \
       $(libdir)/strtoumax.$(OBJEXT) \
       $(libdir)/xmalloc.$(OBJEXT) \
       $(libdir)/xstrtol.$(OBJEXT) \
diff --git a/NEWS b/NEWS
index d4d70f5..6e413d8 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,22 @@ GNU grep NEWS                                    -*- outline 
-*-
    grep no longer segfaults with -r --exclude-dir and no file operand.
    I.e., ":|grep -r --exclude-dir=D PAT" would segfault.
 
+   Recursive grep now uses fts for directory traversal, so it can
+   handle much-larger directories without reporting things like "File
+   name too long", and it can run much faster when dealing with large
+   directory hierarchies.
+
+** New features
+
+  The -R option now has a long-option alias --dereference-recursive.
+
+** Changes in behavior
+
+  The -r (--recursive) option now follows only command-line symlinks.
+  Also, by default -r now reads only devices named on the command
+  line; this can be overridden with --devices.  -R acts as before, so
+  use -R if you prefer the old behavior of following all symlinks and
+  defaulting to reading all devices.
 
 * Noteworthy changes in release 2.11 (2012-03-02) [stable]
 
diff --git a/bootstrap.conf b/bootstrap.conf
index 45bb33d..7a71015 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -24,13 +24,13 @@ binary-io
 btowc
 c-ctype
 closeout
-dirent
-dirname
 do-release-commit-and-tag
 error
 exclude
 fcntl-h
 fnmatch
+fstatat
+fts
 getopt-gnu
 getpagesize
 gettext-h
@@ -44,7 +44,6 @@ intprops
 inttypes
 isatty
 isblank
-isdir
 iswctype
 largefile
 lseek
@@ -58,7 +57,7 @@ memchr
 mempcpy
 minmax
 obstack
-open
+openat-safer
 progname
 propername
 quote
diff --git a/doc/grep.texi b/doc/grep.texi
index c014d8f..1840e21 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -606,16 +606,21 @@ if the terminal driver interprets some of it as commands.
 @opindex --devices
 @cindex device search
 If an input file is a device, FIFO, or socket, use @var{action} to process it.
-By default, @var{action} is @samp{read},
-which means that devices are read just as if they were ordinary files.
+If @var{action} is @samp{read},
+all devices are read just as if they were ordinary files.
 If @var{action} is @samp{skip},
 devices, FIFOs, and sockets are silently skipped.
+By default, devices are read if they are on the command line or if the
address@hidden (@option{--dereference-recursive}) option is used, and are
+skipped if they are encountered recursively and the @option{-r}
+(@option{--recursive}) option is used.
 
 @item -d @var{action}
 @itemx address@hidden
 @opindex -d
 @opindex --directories
 @cindex directory search
address@hidden symbolic links
 If an input file is a directory, use @var{action} to process it.
 By default, @var{action} is @samp{read},
 which means that directories are read just as if they were ordinary files
@@ -624,7 +629,8 @@ and will cause @command{grep}
 to print error messages for every directory or silently skip them).
 If @var{action} is @samp{skip}, directories are silently skipped.
 If @var{action} is @samp{recurse},
address@hidden reads all files under each directory, recursively;
address@hidden reads all files under each directory, recursively,
+following command-line symbolic links and skipping other symlinks;
 this is equivalent to the @option{-r} option.
 
 @item address@hidden
@@ -663,16 +669,28 @@ Search only files whose base name matches @var{glob}
 (using wildcard matching as described under @option{--exclude}).
 
 @item -r
address@hidden -R
 @itemx --recursive
 @opindex -r
 @opindex --recursive
 @cindex recursive search
 @cindex searching directory trees
address@hidden symbolic links
 For each directory operand,
 read and process all files in that directory, recursively.
+Follow symbolic links on the command line, but skip symlinks
+that are encountered recursively.
 This is the same as the @samp{--directories=recurse} option.
 
address@hidden -R
address@hidden --dereference-recursive
address@hidden -R
address@hidden --dereference-recursive
address@hidden recursive search
address@hidden searching directory trees
address@hidden symbolic links
+For each directory operand, read and process all files in that
+directory, recursively, following all symbolic links.
+
 @end table
 
 @node Other Options
diff --git a/lib/Makefile.am b/lib/Makefile.am
index 04ae51e..527c6f5 100644
--- a/lib/Makefile.am
+++ b/lib/Makefile.am
@@ -32,7 +32,7 @@ INCLUDES = -I.. -I$(srcdir)
 AM_CFLAGS += $(GNULIB_WARN_CFLAGS) $(WERROR_CFLAGS)
 
 libgreputils_a_SOURCES += \
-  colorize.c colorize.h savedir.c savedir.h
+  colorize.c colorize.h
 
 EXTRA_DIST += colorize-posix.c colorize-w32.c
 
diff --git a/lib/savedir.c b/lib/savedir.c
deleted file mode 100644
index 3f93c16..0000000
--- a/lib/savedir.c
+++ /dev/null
@@ -1,163 +0,0 @@
-/* savedir.c -- save the list of files in a directory in a string
-   Copyright (C) 1990, 1997-2001, 2009-2012 Free Software Foundation, Inc.
-
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software Foundation,
-   Inc., 51 Franklin Street - Fifth Floor, Boston, MA 02110-1301, USA.  */
-
-/* Written by David MacKenzie <address@hidden>. */
-
-#include <config.h>
-
-#include <sys/types.h>
-#include <unistd.h>
-#include <dirent.h>
-
-#ifdef CLOSEDIR_VOID
-/* Fake a return value. */
-# define CLOSEDIR(d) (closedir (d), 0)
-#else
-# define CLOSEDIR(d) closedir (d)
-#endif
-
-#include <stdlib.h>
-#include <string.h>
-#include <fnmatch.h>
-#include "savedir.h"
-#include "xalloc.h"
-
-static char *path;
-static size_t pathlen;
-
-extern int isdir (const char *name);
-
-static int
-isdir1 (const char *dir, const char *file)
-{
-  size_t dirlen = strlen (dir);
-  size_t filelen = strlen (file);
-
-  while (dirlen && dir[dirlen - 1] == '/')
-    dirlen--;
-
-  if ((dirlen + filelen + 2) > pathlen)
-    {
-      pathlen *= 2;
-      if ((dirlen + filelen + 2) > pathlen)
-        pathlen = dirlen + filelen + 2;
-
-      path = xrealloc (path, pathlen);
-    }
-
-  memcpy (path, dir, dirlen);
-  path[dirlen] = '/';
-  strcpy (path + dirlen + 1, file);
-  return isdir (path);
-}
-
-/* Return a freshly allocated string containing the filenames
-   in directory DIR, separated by '\0' characters;
-   the end is marked by two '\0' characters in a row.
-   NAME_SIZE is the number of bytes to initially allocate
-   for the string; it will be enlarged as needed.
-   Return NULL if DIR cannot be opened or if out of memory. */
-char *
-savedir (const char *dir, off_t name_size, struct exclude *included_patterns,
-         struct exclude *excluded_patterns, struct exclude 
*excluded_directory_patterns )
-{
-  DIR *dirp;
-  struct dirent *dp;
-  char *name_space;
-  char *namep;
-
-  dirp = opendir (dir);
-  if (dirp == NULL)
-    return NULL;
-
-  /* Be sure name_size is at least `1' so there's room for
-     the final NUL byte.  */
-  if (name_size <= 0)
-    name_size = 1;
-
-  name_space = (char *) malloc (name_size);
-  if (name_space == NULL)
-    {
-      closedir (dirp);
-      return NULL;
-    }
-  namep = name_space;
-
-  while ((dp = readdir (dirp)) != NULL)
-    {
-      /* Skip "." and ".." (some NFS file systems' directories lack them). */
-      if (dp->d_name[0] != '.'
-          || (dp->d_name[1] != '\0'
-              && (dp->d_name[1] != '.' || dp->d_name[2] != '\0')))
-        {
-          size_t namlen = strlen (dp->d_name);
-          size_t size_needed = (namep - name_space) + namlen + 2;
-
-          if ((included_patterns || excluded_patterns)
-              && !isdir1 (dir, dp->d_name))
-            {
-              if (included_patterns
-                  && excluded_file_name (included_patterns, dp->d_name))
-                continue;
-              if (excluded_patterns
-                  && excluded_file_name (excluded_patterns, dp->d_name))
-                continue;
-            }
-
-          if ( excluded_directory_patterns
-              && isdir1 (dir, dp->d_name) )
-            {
-              if (excluded_directory_patterns
-                  && excluded_file_name (excluded_directory_patterns, 
dp->d_name))
-                continue;
-            }
-
-          if (size_needed > name_size)
-            {
-              char *new_name_space;
-
-              while (size_needed > name_size)
-                name_size += 1024;
-
-              new_name_space = realloc (name_space, name_size);
-              if (new_name_space == NULL)
-                {
-                  closedir (dirp);
-                  goto fail;
-                }
-              namep = new_name_space + (namep - name_space);
-              name_space = new_name_space;
-            }
-          strcpy (namep, dp->d_name);
-          namep += namlen + 1;
-        }
-    }
-  *namep = '\0';
-  if (CLOSEDIR (dirp))
-    {
-     fail:
-      free (name_space);
-      name_space = NULL;
-    }
-  if (path)
-    {
-      free (path);
-      path = NULL;
-      pathlen = 0;
-    }
-  return name_space;
-}
diff --git a/lib/savedir.h b/lib/savedir.h
deleted file mode 100644
index 00cb1a9..0000000
--- a/lib/savedir.h
+++ /dev/null
@@ -1,11 +0,0 @@
-#if !defined SAVEDIR_H_
-# define SAVEDIR_H_
-
-#include "exclude.h"
-
-extern char *
-savedir (const char *dir, off_t name_size,
-         struct exclude *, struct exclude *,
-         struct exclude *);
-
-#endif
diff --git a/po/POTFILES.in b/po/POTFILES.in
index b33c126..65f4f57 100644
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@@ -20,6 +20,7 @@ lib/closeout.c
 lib/error.c
 lib/getopt.c
 lib/obstack.c
+lib/openat-die.c
 lib/quotearg.c
 lib/regcomp.c
 lib/version-etc.c
diff --git a/src/main.c b/src/main.c
index f4f1235..5bd150d 100644
--- a/src/main.c
+++ b/src/main.c
@@ -36,14 +36,14 @@
 #include "error.h"
 #include "exclude.h"
 #include "exitfail.h"
+#include "fcntl-safer.h"
+#include "fts_.h"
 #include "getopt.h"
 #include "grep.h"
 #include "intprops.h"
-#include "isdir.h"
 #include "progname.h"
 #include "propername.h"
 #include "quote.h"
-#include "savedir.h"
 #include "version-etc.h"
 #include "xalloc.h"
 #include "xstrtol.h"
@@ -63,15 +63,6 @@
    avoiding a potential (racy) infinite loop.  */
 static struct stat out_stat;
 
-struct stats
-{
-  struct stats const *parent;
-  struct stat stat;
-};
-
-/* base of chain of stat buffers, used to detect directory loops */
-static struct stats stats_base;
-
 /* if non-zero, display usage information and exit */
 static int show_help;
 
@@ -347,7 +338,7 @@ static struct option const long_options[] =
   {"only-matching", no_argument, NULL, 'o'},
   {"quiet", no_argument, NULL, 'q'},
   {"recursive", no_argument, NULL, 'r'},
-  {"recursive", no_argument, NULL, 'R'},
+  {"dereference-recursive", no_argument, NULL, 'R'},
   {"regexp", required_argument, NULL, 'e'},
   {"invert-match", no_argument, NULL, 'v'},
   {"silent", no_argument, NULL, 'q'},
@@ -369,6 +360,7 @@ unsigned char eolbyte;
 /* For error messages. */
 /* The input file name, or (if standard input) "-" or a --label argument.  */
 static char const *filename;
+static size_t filename_prefix_len;
 static int errseen;
 static int write_error_seen;
 
@@ -392,14 +384,19 @@ ARGMATCH_VERIFY (directories_args, directories_types);
 
 static enum directories_type directories = READ_DIRECTORIES;
 
+enum { basic_fts_options = FTS_CWDFD | FTS_NOSTAT | FTS_TIGHT_CYCLE_CHECK };
+static int fts_options = basic_fts_options | FTS_COMFOLLOW | FTS_PHYSICAL;
+
 /* How to handle devices. */
 static enum
   {
+    READ_COMMAND_LINE_DEVICES,
     READ_DEVICES,
     SKIP_DEVICES
-  } devices = READ_DEVICES;
+  } devices = READ_COMMAND_LINE_DEVICES;
 
-static int grepdir (char const *, struct stats const *);
+static int grepfile (int, char const *, int, int);
+static int grepdesc (int, int);
 #if defined HAVE_DOS_FILE_CONTENTS
 static inline int undossify_input (char *, size_t);
 #endif
@@ -473,7 +470,7 @@ static off_t after_last_match;      /* Pointer after last 
matching line that
 /* Reset the buffer for a new file, returning zero if we should skip it.
    Initialize on the first time through. */
 static int
-reset (int fd, char const *file, struct stats *stats)
+reset (int fd, struct stat const *st)
 {
   if (! pagesize)
     {
@@ -488,9 +485,9 @@ reset (int fd, char const *file, struct stats *stats)
   bufbeg[-1] = eolbyte;
   bufdesc = fd;
 
-  if (S_ISREG (stats->stat.st_mode))
+  if (S_ISREG (st->st_mode))
     {
-      if (file)
+      if (fd != STDIN_FILENO)
         bufoffset = 0;
       else
         {
@@ -510,7 +507,7 @@ reset (int fd, char const *file, struct stats *stats)
    to the beginning of the buffer contents, and 'buflim'
    points just after the end.  Return zero if there's an error.  */
 static int
-fillbuf (size_t save, struct stats const *stats)
+fillbuf (size_t save, struct stat const *st)
 {
   size_t fillsize = 0;
   int cc = 1;
@@ -543,9 +540,9 @@ fillbuf (size_t save, struct stats const *stats)
          is large.  However, do not use the original file size as a
          heuristic if we've already read past the file end, as most
          likely the file is growing.  */
-      if (S_ISREG (stats->stat.st_mode))
+      if (S_ISREG (st->st_mode))
         {
-          off_t to_be_read = stats->stat.st_size - bufoffset;
+          off_t to_be_read = st->st_size - bufoffset;
           off_t maxsize_off = save + to_be_read;
           if (0 <= to_be_read && to_be_read <= maxsize_off
               && maxsize_off == (size_t) maxsize_off
@@ -1085,7 +1082,7 @@ grepbuf (char const *beg, char const *lim)
    but if the file is a directory and we search it recursively, then
    return -2 if there was a match, and -1 otherwise.  */
 static intmax_t
-grep (int fd, char const *file, struct stats *stats)
+grep (int fd, struct stat const *st)
 {
   intmax_t nlines, i;
   int not_text;
@@ -1095,19 +1092,9 @@ grep (int fd, char const *file, struct stats *stats)
   char *lim;
   char eol = eolbyte;
 
-  if (!reset (fd, file, stats))
+  if (! reset (fd, st))
     return 0;
 
-  if (file && directories == RECURSE_DIRECTORIES
-      && S_ISDIR (stats->stat.st_mode))
-    {
-      /* Close fd now, so that we don't open a lot of file descriptors
-         when we recurse deeply.  */
-      if (close (fd) != 0)
-        suppressible_error (file, errno);
-      return grepdir (file, stats) - 2;
-    }
-
   totalcc = 0;
   lastout = 0;
   totalnl = 0;
@@ -1119,7 +1106,7 @@ grep (int fd, char const *file, struct stats *stats)
   residue = 0;
   save = 0;
 
-  if (! fillbuf (save, stats))
+  if (! fillbuf (save, st))
     {
       suppressible_error (filename, errno);
       return 0;
@@ -1190,7 +1177,7 @@ grep (int fd, char const *file, struct stats *stats)
         totalcc = add_count (totalcc, buflim - bufbeg - save);
       if (out_line)
         nlscan (beg);
-      if (! fillbuf (save, stats))
+      if (! fillbuf (save, st))
         {
           suppressible_error (filename, errno);
           goto finish_grep;
@@ -1214,53 +1201,175 @@ grep (int fd, char const *file, struct stats *stats)
 }
 
 static int
-grepfile (char const *file, struct stats *stats)
+grepdirent (FTS *fts, FTSENT *ent)
 {
-  int desc;
-  intmax_t count;
-  int status;
+  int follow, dirdesc;
+  int command_line = ent->fts_level == FTS_ROOTLEVEL;
+  struct stat *st = ent->fts_statp;
 
-  filename = (file ? file : label ? label : _("(standard input)"));
+  if (ent->fts_info == FTS_DP)
+    {
+      if (directories == RECURSE_DIRECTORIES && command_line)
+        out_file &= ~ (2 * !no_filenames);
+      return 1;
+    }
 
-  if (! file)
-    desc = STDIN_FILENO;
-  else if (devices == SKIP_DEVICES)
+  if ((ent->fts_info == FTS_D || ent->fts_info == FTS_DC
+       || ent->fts_info == FTS_DNR)
+      ? (directories == SKIP_DIRECTORIES
+         || (! (command_line && filename_prefix_len != 0)
+             && excluded_directory_patterns
+             && excluded_file_name (excluded_directory_patterns,
+                                    ent->fts_name)))
+      : ((included_patterns
+          && excluded_file_name (included_patterns, ent->fts_name))
+         || (excluded_patterns
+             && excluded_file_name (excluded_patterns, ent->fts_name))))
     {
-      /* Don't open yet, since that might have side effects on a device.  */
-      desc = -1;
+      fts_set (fts, ent, FTS_SKIP);
+      return 1;
     }
-  else
+
+  filename = ent->fts_path + filename_prefix_len;
+  follow = (fts->fts_options & FTS_LOGICAL
+            || (fts->fts_options & FTS_COMFOLLOW && command_line));
+
+  switch (ent->fts_info)
     {
-      /* When skipping directories, don't worry about directories
-         that can't be opened.  */
-      desc = open (file, O_RDONLY);
-      if (desc < 0 && directories != SKIP_DIRECTORIES)
+    case FTS_D:
+      if (directories == RECURSE_DIRECTORIES)
         {
-          suppressible_error (file, errno);
+          out_file |= 2 * !no_filenames;
           return 1;
         }
+      fts_set (fts, ent, FTS_SKIP);
+      break;
+
+    case FTS_DC:
+      if (!suppress_errors)
+        error (0, 0, _("warning: %s: %s"), filename,
+               _("recursive directory loop"));
+      return 1;
+
+    case FTS_DNR:
+    case FTS_ERR:
+    case FTS_NS:
+      suppressible_error (filename, ent->fts_errno);
+      return 1;
+
+    case FTS_DEFAULT:
+    case FTS_NSOK:
+      if (devices == SKIP_DEVICES
+          || (devices == READ_COMMAND_LINE_DEVICES && !command_line))
+        {
+          struct stat st1;
+          if (! st->st_mode)
+            {
+              /* The file type is not already known.  Get the file status
+                 before opening, since opening might have side effects
+                 on a device.  */
+              int flag = follow ? 0 : AT_SYMLINK_NOFOLLOW;
+              if (fstatat (fts->fts_cwd_fd, ent->fts_accpath, &st1, flag) != 0)
+                {
+                  suppressible_error (filename, errno);
+                  return 1;
+                }
+              st = &st1;
+            }
+          if (S_ISCHR (st->st_mode) || S_ISBLK (st->st_mode)
+              || S_ISSOCK (st->st_mode) || S_ISFIFO (st->st_mode))
+            return 1;
+        }
+      break;
+
+    case FTS_F:
+    case FTS_SLNONE:
+      break;
+
+    case FTS_SL:
+    case FTS_W:
+      return 1;
+
+    default:
+      abort ();
     }
 
-  if (desc < 0
-      ? stat (file, &stats->stat) != 0
-      : fstat (desc, &stats->stat) != 0)
+  dirdesc = ((fts->fts_options & (FTS_NOCHDIR | FTS_CWDFD)) == FTS_CWDFD
+             ? fts->fts_cwd_fd
+             : AT_FDCWD);
+  return grepfile (dirdesc, ent->fts_accpath, follow, command_line);
+}
+
+static int
+grepfile (int dirdesc, char const *name, int follow, int command_line)
+{
+  int desc = openat_safer (dirdesc, name, O_RDONLY | (follow ? 0 : 
O_NOFOLLOW));
+  if (desc < 0)
     {
-      suppressible_error (filename, errno);
-      if (file)
-        close (desc);
+      if (follow || errno != ELOOP)
+        suppressible_error (filename, errno);
       return 1;
     }
+  return grepdesc (desc, command_line);
+}
 
-  if ((directories == SKIP_DIRECTORIES && S_ISDIR (stats->stat.st_mode))
-      || (devices == SKIP_DEVICES && (S_ISCHR (stats->stat.st_mode)
-                                      || S_ISBLK (stats->stat.st_mode)
-                                      || S_ISSOCK (stats->stat.st_mode)
-                                      || S_ISFIFO (stats->stat.st_mode))))
+static int
+grepdesc (int desc, int command_line)
+{
+  intmax_t count;
+  int status = 1;
+  struct stat st;
+
+  /* Get the file status, possibly for the second time.  This catches
+     a race condition if the directory entry changes after the
+     directory entry is read and before the file is opened.  For
+     example, normally DESC is a directory only at the top level, but
+     there is an exception if some other process substitutes a
+     directory for a non-directory while 'grep' is running.  */
+  if (fstat (desc, &st) != 0)
     {
-      if (file)
-        close (desc);
-      return 1;
+      suppressible_error (filename, errno);
+      goto closeout;
+    }
+  if (desc != STDIN_FILENO
+      && directories == RECURSE_DIRECTORIES && S_ISDIR (st.st_mode))
+    {
+      /* Traverse the directory starting with its full name, because
+         unfortunately fts provides no way to traverse the directory
+         starting from its file descriptor.  */
+
+      FTS *fts;
+      FTSENT *ent;
+      int opts = fts_options & ~(command_line ? 0 : FTS_COMFOLLOW);
+      char *fts_arg[2];
+
+      /* Close DESC now, to conserve file descriptors if the race
+         condition occurs many times in a deep recursion.  */
+      if (close (desc) != 0)
+        suppressible_error (filename, errno);
+
+      fts_arg[0] = (char *) filename;
+      fts_arg[1] = NULL;
+      fts = fts_open (fts_arg, opts, NULL);
+
+      if (!fts)
+        xalloc_die ();
+      while ((ent = fts_read (fts)))
+        status &= grepdirent (fts, ent);
+      if (errno)
+        suppressible_error (filename, errno);
+      if (fts_close (fts) != 0)
+        suppressible_error (filename, errno);
+      return status;
     }
+  if ((directories == SKIP_DIRECTORIES && S_ISDIR (st.st_mode))
+      || ((devices == SKIP_DEVICES
+           || (devices == READ_COMMAND_LINE_DEVICES && !command_line))
+          && (S_ISCHR (st.st_mode)
+              || S_ISBLK (st.st_mode)
+              || S_ISSOCK (st.st_mode)
+              || S_ISFIFO (st.st_mode))))
+    goto closeout;
 
   /* If there is a regular file on stdout and the current file refers
      to the same i-node, we have to report the problem and skip it.
@@ -1282,24 +1391,12 @@ grepfile (char const *file, struct stats *stats)
      condition that could result in "alternate" output.  */
   if (!out_quiet && list_files == 0 && 1 < max_count
       && S_ISREG (out_stat.st_mode) && out_stat.st_ino
-      && SAME_INODE (stats->stat, out_stat))
+      && SAME_INODE (st, out_stat))
     {
       if (! suppress_errors)
         error (0, 0, _("input file %s is also the output"), quote (filename));
       errseen = 1;
-      if (file)
-        close (desc);
-      return 1;
-    }
-
-  if (desc < 0)
-    {
-      desc = open (file, O_RDONLY);
-      if (desc < 0)
-        {
-          suppressible_error (file, errno);
-          return 1;
-        }
+      goto closeout;
     }
 
 #if defined SET_BINARY
@@ -1309,7 +1406,7 @@ grepfile (char const *file, struct stats *stats)
     SET_BINARY (desc);
 #endif
 
-  count = grep (desc, file, stats);
+  count = grep (desc, &st);
   if (count < 0)
     status = count + 2;
   else
@@ -1334,103 +1431,35 @@ grepfile (char const *file, struct stats *stats)
           fputc ('\n' & filename_mask, stdout);
         }
 
-      if (! file)
+      if (desc == STDIN_FILENO)
         {
           off_t required_offset = outleft ? bufoffset : after_last_match;
           if (required_offset != bufoffset
               && lseek (desc, required_offset, SEEK_SET) < 0
-              && S_ISREG (stats->stat.st_mode))
+              && S_ISREG (st.st_mode))
             suppressible_error (filename, errno);
         }
-      else
-        while (close (desc) != 0)
-          if (errno != EINTR)
-            {
-              suppressible_error (file, errno);
-              break;
-            }
     }
 
+ closeout:
+  if (desc != STDIN_FILENO && close (desc) != 0)
+    suppressible_error (filename, errno);
   return status;
 }
 
 static int
-grepdir (char const *dir, struct stats const *stats)
+grep_command_line_arg (char const *arg)
 {
-  char const *dir_or_dot = (dir ? dir : ".");
-  struct stats const *ancestor;
-  char *name_space;
-  int status = 1;
-  if (dir && excluded_directory_patterns
-      && excluded_file_name (excluded_directory_patterns, dir))
-    return 1;
-
-  /* Mingw32 does not support st_ino.  No known working hosts use zero
-     for st_ino, so assume that the Mingw32 bug applies if it's zero.  */
-  if (stats->stat.st_ino)
+  if (STREQ (arg, "-"))
     {
-      for (ancestor = stats; (ancestor = ancestor->parent) != 0;  )
-        {
-          if (ancestor->stat.st_ino == stats->stat.st_ino
-              && ancestor->stat.st_dev == stats->stat.st_dev)
-            {
-              if (!suppress_errors)
-                error (0, 0, _("warning: %s: %s"), dir,
-                       _("recursive directory loop"));
-              errseen = 1;
-              return 1;
-            }
-        }
-    }
-
-  name_space = savedir (dir_or_dot, stats->stat.st_size, included_patterns,
-                        excluded_patterns, excluded_directory_patterns);
-
-  if (! name_space)
-    {
-      if (errno)
-        suppressible_error (dir_or_dot, errno);
-      else
-        xalloc_die ();
+      filename = label ? label : _("(standard input)");
+      return grepdesc (STDIN_FILENO, 1);
     }
   else
     {
-      size_t dirlen = 0;
-      int needs_slash = 0;
-      char *file_space = NULL;
-      char const *namep = name_space;
-      struct stats child;
-      if (dir)
-        {
-          dirlen = strlen (dir);
-          needs_slash = ! (dirlen == FILE_SYSTEM_PREFIX_LEN (dir)
-                           || ISSLASH (dir[dirlen - 1]));
-        }
-      child.parent = stats;
-      out_file += !no_filenames;
-      while (*namep)
-        {
-          size_t namelen = strlen (namep);
-          char const *file;
-          if (! dir)
-            file = namep;
-          else
-            {
-              file_space = xrealloc (file_space, dirlen + 1 + namelen + 1);
-              strcpy (file_space, dir);
-              file_space[dirlen] = '/';
-              strcpy (file_space + dirlen + needs_slash, namep);
-              file = file_space;
-            }
-          namep += namelen + 1;
-          status &= grepfile (file, &child);
-        }
-      out_file -= !no_filenames;
-      free (file_space);
-      free (name_space);
+      filename = arg;
+      return grepfile (AT_FDCWD, arg, 1, 1);
     }
-
-  return status;
 }
 
 _Noreturn void usage (int);
@@ -1500,7 +1529,8 @@ Output control:\n\
                             ACTION is `read', `recurse', or `skip'\n\
   -D, --devices=ACTION      how to handle devices, FIFOs and sockets;\n\
                             ACTION is `read' or `skip'\n\
-  -R, -r, --recursive       equivalent to --directories=recurse\n\
+  -r, --recursive           like --directories=recurse\n\
+  -R, --dereference-recursive  likewise, but follow all symlinks\n\
 "));
       printf (_("\
       --include=FILE_PATTERN  search only files that match FILE_PATTERN\n\
@@ -1981,6 +2011,8 @@ main (int argc, char **argv)
         break;
 
       case 'R':
+        fts_options = basic_fts_options | FTS_LOGICAL;
+        /* Fall through.  */
       case 'r':
         directories = RECURSE_DIRECTORIES;
         last_recursive = prev_optind;
@@ -2177,48 +2209,24 @@ main (int argc, char **argv)
   if (max_count == 0)
     exit (EXIT_FAILURE);
 
+  if (fts_options & FTS_LOGICAL && devices == READ_COMMAND_LINE_DEVICES)
+    devices = READ_DEVICES;
+
   if (optind < argc)
     {
       status = 1;
       do
-        {
-          char *file = argv[optind];
-          if (!STREQ (file, "-")
-              && (included_patterns || excluded_patterns
-                  || excluded_directory_patterns))
-            {
-              if (isdir (file))
-                {
-                  if (excluded_directory_patterns
-                      && excluded_file_name (excluded_directory_patterns,
-                                             file))
-                    continue;
-                }
-              else
-                {
-                  if (included_patterns
-                      && excluded_file_name (included_patterns, file))
-                    continue;
-                  if (excluded_patterns
-                      && excluded_file_name (excluded_patterns, file))
-                    continue;
-                }
-            }
-          status &= grepfile (STREQ (file, "-") ? (char *) NULL : file,
-                              &stats_base);
-        }
+        status &= grep_command_line_arg (argv[optind]);
       while (++optind < argc);
     }
   else if (directories == RECURSE_DIRECTORIES && prepended < last_recursive)
     {
-      status = 1;
-      if (stat (".", &stats_base.stat) == 0)
-        status = grepdir (NULL, &stats_base);
-      else
-        suppressible_error (".", errno);
+      /* Grep through ".", omitting leading "./" from diagnostics.  */
+      filename_prefix_len = 2;
+      status = grep_command_line_arg (".");
     }
   else
-    status = grepfile ((char *) NULL, &stats_base);
+    status = grep_command_line_arg ("-");
 
   /* We register via atexit() to test stdout.  */
   exit (errseen ? EXIT_TROUBLE : status);
diff --git a/tests/Makefile.am b/tests/Makefile.am
index c2cd2f7..13061fe 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -82,6 +82,7 @@ TESTS =                                               \
   spencer1                                     \
   spencer1-locale                              \
   status                                       \
+  symlink                                      \
   turkish-I                                    \
   warn-char-classes                            \
   word-delim-multibyte                         \
diff --git a/tests/symlink b/tests/symlink
new file mode 100755
index 0000000..012d8f8
--- /dev/null
+++ b/tests/symlink
@@ -0,0 +1,65 @@
+#!/bin/sh
+# Check that "grep -r" does the right thing with symbolic links.
+
+# Copyright (C) 2012 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# written by Paul Eggert
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+mkdir dir || framework_failure_
+echo a > dir/a || framework_failure_
+echo b > dir/b || framework_failure_
+ln -s a dir/c || framework_failure_
+ln -s . dir/d || framework_failure_
+ln -s dangling dir/e || framework_failure_
+
+touch out || framework_failure_
+
+for recursion in '' -r -R
+do
+  for files in '' '*'
+  do
+    case $recursion,$files in
+      -R,* | *,'*') expected_status=2 ;;
+      *) expected_status=0 ;;
+    esac
+
+    (cd dir && grep $recursion '^' $files <a ) >grepout
+    test $? -eq $expected_status || fail=1
+
+    case $recursion,$files in
+      ,)
+         exp='a\n' ;;
+      ,'*' | -R,)
+         exp='a:a\nb:b\nc:a\n' ;;
+      -r,)
+         exp='a:a\nb:b\n' ;;
+      -r,'*')
+         exp='a:a\nb:b\nc:a\nd/a:a\nd/b:b\n' ;;
+      -R,'*')
+         exp='a:a\nb:b\nc:a\nd/a:a\nd/b:b\nd/c:a\n' ;;
+      *)
+         framework_failure_ ;;
+    esac
+
+    printf "$exp" >exp || framework_failure_
+    LC_ALL=C sort grepout >out || fail=1
+    compare exp out || fail=1
+  done
+done
+
+Exit $fail
-- 
1.7.6.5




reply via email to

[Prev in Thread] Current Thread [Next in Thread]