bug-coreutils

Re: [bug #25538] excluded files are still stat()ed


From: Jim Meyering
Subject: Re: [bug #25538] excluded files are still stat()ed
Date: Wed, 11 Feb 2009 12:24:34 +0100

Kevin Pulo <address@hidden> wrote:
> URL:
>   <http://savannah.gnu.org/bugs/?25538>
>                  Summary: excluded files are still stat()ed
...
> Details:
>
> My problem is that files I have excluded from du using -X or --exclude still
> have stat() run on them.  In particular, this is a problem when trying to
> exclude some fuse filesystems on Linux, eg. sshfs and encfs, which deny all
> access to other users (including root).
>
> For example:
>
> address@hidden:~# du -axk /home/kev/mnt/sf
> du: cannot access `/home/kev/mnt/sf/home': Permission denied
> 4       /home/kev/mnt/sf
> address@hidden:~# du -axk --exclude=/home/kev/mnt/sf/home /home/kev/mnt/sf
> du: cannot access `/home/kev/mnt/sf/home': Permission denied
> 4       /home/kev/mnt/sf
> address@hidden:~# echo $?
> 1
> address@hidden:~#
>
>
> The non-zero exit status is particularly troubling, since it means I cannot
> chain other commands after du using '&&' whenever it's operating on a tree
> that has these sorts of fuse fs's in it.
>
> The only workaround at the moment is to exclude the parent directory, eg.
> --exclude=~kev/mnt/sf in the example above.  This unfortunately means that
> everything else in that directory is also excluded.
>
>
> address@hidden:~# du -axk --exclude=/home/kev/mnt/sf /home/kev/mnt/sf
> address@hidden:~# echo $?
> 0
> address@hidden:~#
>
> Having looked at the code, I'm not sure how this would best be fixed.  The
> list of excluded files is only used in process_file(), which is too late.  I
> presume that the stat() is happening inside fts_read(), which populates
> ent->fts_statp with the results of the stat() call.  I suppose that extending
> fts_read() to also respect the exclusion list would be fairly invasive.
> Alternatively, the EACCES could persist during the fts_read(), but then be
> somehow "forgotten about" later for excluded files, allowing the exit status
> to return to being 0 (assuming no other genuine errors).

Thanks for the report.
You're right that accommodating this would be invasive.
However, I'm beginning to think that it's necessary.
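
To make the "too late" point concrete, here is a minimal sketch against
the standard <fts.h> interface.  It is not du's actual code:
count_unexcluded is a made-up helper, and exact-path comparison stands in
for du's exclude patterns.  It only shows that the caller first sees an
entry after fts_read has returned it, so all it can do is prune.

```c
#include <fts.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: walk ROOT and count the entries visited,
   pruning the one whose path equals EXCLUDED.
   Returns -1 if the traversal cannot be started.  */
static long
count_unexcluded (const char *root, const char *excluded)
{
  char *argv[] = { (char *) root, NULL };
  FTS *fts = fts_open (argv, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
  if (fts == NULL)
    return -1;

  long n = 0;
  FTSENT *ent;
  while ((ent = fts_read (fts)) != NULL)
    {
      /* With these flags fts has already tried to stat ENT; on failure
         fts_info is FTS_NS (or FTS_DNR/FTS_ERR) and fts_errno is set.  */
      if (strcmp (ent->fts_path, excluded) == 0)
        {
          /* The exclusion check happens only now: we can keep fts
             from descending, but not undo what it already did.  */
          fts_set (fts, ent, FTS_SKIP);
          continue;
        }
      if (ent->fts_info == FTS_DP)
        continue;               /* post-order visit, counted earlier */
      n++;
    }
  fts_close (fts);
  return n;
}
```

Calling count_unexcluded (dir, dir_slash_sub) on a tree with an
unreadable subdirectory would still let fts attempt the stat on that
subdirectory before the strcmp is ever reached.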

fts needs stat information on directories for a few reasons:
  - dev/inode, for cycle detection
  - dev to support the FTS_XDEV option
  - stat.st_mode or dirent.d_type to know when something is a directory
so obviously we can't skip it.
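
The first two items come down to comparing fields of stat results.  A
rough sketch of those checks follows; same_file, would_cycle and
crosses_device are illustrative names, not fts internals, and the real
implementation is more elaborate than a linear scan over ancestors.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Two stat results refer to the same file when both the device
   and the inode number match.  */
static bool
same_file (const struct stat *a, const struct stat *b)
{
  return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical cycle check: true if DIR is one of the N_ANCESTORS
   directories currently being traversed, so descending would loop.  */
static bool
would_cycle (const struct stat *ancestors, size_t n_ancestors,
             const struct stat *dir)
{
  for (size_t i = 0; i < n_ancestors; i++)
    if (same_file (&ancestors[i], dir))
      return true;
  return false;
}

/* FTS_XDEV-style check: true if ENTRY lives on a different device
   than the command-line directory the walk started from.  */
static bool
crosses_device (dev_t root_dev, const struct stat *entry)
{
  return entry->st_dev != root_dev;
}
```

Neither check can be made without st_dev and st_ino, which is why the
stat call on a directory cannot simply be dropped.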

However, one approach would be to extend fts by giving it an
option whereby fts_read would no longer call fts_stat before returning.
With this new option, it would be the responsibility of the caller
to supply the stat information before the next fts_read call.

Here's the relevant, just-before-return code from fts_read:

check_for_dir:
                sp->fts_cur = p;
                if (p->fts_info == FTS_NSOK)
                  {
                    if (p->fts_statp->st_size == FTS_STAT_REQUIRED)
                      p->fts_info = fts_stat(sp, p, false);
                    else
                      fts_assert (p->fts_statp->st_size
                                  == FTS_NO_STAT_REQUIRED);
                  }

                if (p->fts_info == FTS_D)
                  {
                    /* Now that P->fts_statp is guaranteed to be valid,
                       if this is a command-line directory, record its
                       device number, to be used for FTS_XDEV.  */
                    if (p->fts_level == FTS_ROOTLEVEL)
                      sp->fts_dev = p->fts_statp->st_dev;
                    Dprintf (("  entering: %s\n", p->fts_path));
                    if (! enter_dir (sp, p))
                      {
                        __set_errno (ENOMEM);
                        return NULL;
                      }
                  }
                return p;

By the way, while looking at this, I noticed
fts_stat was being called earlier than I expected (it was
probably in fts_build), so I made this change to du.c:

diff --git a/src/du.c b/src/du.c
index 860e8fe..0749097 100644
--- a/src/du.c
+++ b/src/du.c
@@ -660,7 +660,7 @@ main (int argc, char **argv)
   char *files_from = NULL;

   /* Bit flags that control how fts works.  */
-  int bit_flags = FTS_TIGHT_CYCLE_CHECK;
+  int bit_flags = FTS_TIGHT_CYCLE_CHECK | FTS_DEFER_STAT;

   /* Select one of the three FTS_ options that control if/when
      to follow a symlink.  */

I'll probably commit that and similar for the other fts clients
today.



