bug-findutils

[bug #24140] Painfully slow find(1) in list-permission-only AFS paths


From: Daniel Richard G.
Subject: [bug #24140] Painfully slow find(1) in list-permission-only AFS paths
Date: Tue, 30 Sep 2008 02:11:41 +0000
User-agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux; X11; x86_64; en_US) KHTML/3.5.7 (like Gecko)

Follow-up Comment #13, bug #24140 (project findutils):

Okay, I've done some investigation on this.

First off, as you suspected, there's not much that can be done for ftsfind. I
think the fts backend could be modified to handle AFS gracefully, but that
goes beyond the scope of this project.

For oldfind, however, I believe there is a way. I took advantage of the
d_type optimization and the "mode" argument to process_path(), to allow
calling digest_mode() only if the eval tree requires it.

Here is a patch that is not intended to be applied to CVS, but merely
illustrates a hack that works around the problem (without resorting to AFS
calls, even). The oldfind(1) program built with this works perfectly in the
problematic AFS directories. I'll explain it below:

====BEGIN PATCH====
--- find/find.c 10 Mar 2008 12:06:11 -0000      1.134
+++ find/find.c 29 Sep 2008 20:58:51 -0000
@@ -207,7 +207,7 @@
   /* If no paths are given, default to ".".  */
   for (i = end_of_leading_options; i < argc && !looks_like_expression(argv[i], true); i++)
     {
-      process_top_path (argv[i], 0);
+      process_top_path (argv[i], S_IFDIR);  /* assume path is directory */
     }

   /* If there were no path arguments, default to ".". */
@@ -1164,11 +1164,12 @@
   /* Assume it is a non-directory initially. */
   stat_buf.st_mode = 0;
   state.rel_pathname = name;
-  state.type = 0;
+  state.type = mode;
   state.have_stat = false;
   state.have_type = false;

-  if (!digest_mode(mode, pathname, name, &stat_buf, leaf))
+  if ((eval_tree->need_stat || eval_tree->need_type) &&
+      !digest_mode(mode, pathname, name, &stat_buf, leaf))
     return 0;

   if (!S_ISDIR (state.type))
====END PATCH====

I modified process_path() to skip the call to digest_mode()---and the
troublesome stat() call therein---unless a predicate requires it. This isn't a
problem when recursing directories, because process_dir() passes in
mode==S_IFDIR (thanks to xsavedir() grabbing the d_type info), and it's not a
problem for leaf nodes, since those would just have mode==0 (which may well be
enough to go on for the predicates).
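
To make the idea above concrete, here is a minimal standalone sketch, not code from findutils. The names tree_needs, mode_from_dirent() and should_digest() are hypothetical stand-ins for the eval_tree flags and the patched condition. It assumes a platform (such as glibc or the BSDs) that provides d_type and the DTTOIF macro.

```c
#define _DEFAULT_SOURCE         /* expose d_type, DT_* and DTTOIF on glibc */
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for the two eval-tree flags the patch consults. */
struct tree_needs
{
  bool need_stat;   /* some predicate needs full stat data */
  bool need_type;   /* some predicate needs only the file type */
};

/* Translate readdir()'s d_type into st_mode-style type bits.
   Returns 0 when the file system reported no type, matching the
   mode==0 convention described above.  */
static mode_t
mode_from_dirent (const struct dirent *de)
{
  assert (de != NULL);
  return de->d_type == DT_UNKNOWN ? 0 : DTTOIF (de->d_type);
}

/* Mirror of the patched condition: digest_mode() (and the stat() call
   inside it) is reached only when a predicate asks for stat or type data.  */
static bool
should_digest (const struct tree_needs *needs)
{
  assert (needs != NULL);
  return needs->need_stat || needs->need_type;
}
```

With no predicates at all (both flags false), should_digest() returns false, so a plain listing never touches stat(); the directory/non-directory decision rests entirely on the mode derived from d_type.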

It *is* a problem for the top-level call, because that passes in mode==0
unconditionally, so I hard-coded the assumption that the starting points are
directories. In real life, the starting points would have to be stat()ed
separately, as we obviously don't have d_type information for those.
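
A rough sketch of what stat()ing the starting points separately might look like follows. The helper name starting_point_mode() is hypothetical, and using lstat() is an assumption (it corresponds to not following symlinks on the command line); the real code would have to honour the symlink-following options.

```c
#define _DEFAULT_SOURCE         /* expose lstat() and S_IFMT under strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: return the file type bits of a starting point,
   or 0 if it cannot be examined.  The result could be passed as the
   "mode" argument instead of a hard-coded S_IFDIR.  */
static mode_t
starting_point_mode (const char *path)
{
  struct stat st;

  assert (path != NULL);
  if (lstat (path, &st) != 0)
    return 0;                   /* unknown; caller decides how to proceed */
  return st.st_mode & S_IFMT;   /* keep only the type bits */
}
```

This costs one stat call per command-line argument, which should be negligible next to the per-entry stat calls the patch avoids.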

I saw a lot of optimization logic, and can't be sure that the above isn't
horribly breaking some common cases, so I'm hoping you can fill me in on what
else might be going on here that I need to be aware of. I believe the main
caveat is the assumption that "d_type==DT_UNKNOWN --> NOT_A_DIRECTORY" always
holds true; if not, then we'd probably have to check for AFS so that we rely
on that assumption only when we're in actual AFS space.

There's room for improvement here, too. With this code tweak, "oldfind
/path/to/afs" works fine, but "oldfind /path/to/afs -type f" yields a string
of "Permission denied" error messages, one every ~3 seconds. That could be
handled more gracefully, but is in any event understandable. However, the same
command with "-type d" behaves the same way, when it really shouldn't: find
shouldn't have to stat entries to tell whether they are directories when
that is already known from the d_type information. (Admittedly, this is a more
ambitious change.)
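
As a sketch of how a "-type" test could be answered from d_type alone, consider the following. Everything here is hypothetical and not taken from findutils: the type_answer enum, the function name, and in particular the unknown_means_non_dir flag, which encodes the "DT_UNKNOWN means not a directory" assumption and would only be safe where that assumption is known to hold.

```c
#define _DEFAULT_SOURCE         /* expose S_IFMT under strict modes */
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical tri-state result for a type test.  */
enum type_answer { TYPE_NO, TYPE_YES, TYPE_NEEDS_STAT };

/* known_mode: type bits derived from d_type, or 0 if unknown.
   wanted: the type being tested for, e.g. S_IFDIR for "-type d".
   unknown_means_non_dir: assume an entry of unknown type is not a
   directory (the assumption discussed above).  */
static enum type_answer
match_type_without_stat (mode_t known_mode, mode_t wanted,
                         bool unknown_means_non_dir)
{
  assert ((wanted & S_IFMT) != 0);

  if ((known_mode & S_IFMT) != 0)
    return (known_mode & S_IFMT) == (wanted & S_IFMT) ? TYPE_YES : TYPE_NO;

  /* Type unknown: "-type d" can still be answered if the assumption holds.  */
  if (unknown_means_non_dir && (wanted & S_IFMT) == S_IFDIR)
    return TYPE_NO;

  return TYPE_NEEDS_STAT;       /* e.g. "-type f": file vs. symlink is unclear */
}
```

Under that assumption "-type d" never needs a stat call, while "-type f" on an entry of unknown type still does, which matches the behavior one would expect in the AFS case.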

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?24140>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




