bug-findutils

[bug #58197] "find" fails to optimize "-path /usr/foo -o -path /usr/bar"


From: Spencer Baugh
Subject: [bug #58197] "find" fails to optimize "-path /usr/foo -o -path /usr/bar" to "-regex '/usr/\(foo\|bar\)'"
Date: Wed, 19 Jul 2023 16:52:01 -0400 (EDT)

Follow-up Comment #3, bug #58197 (project findutils):

One use case is GNU Emacs, which uses find heavily, for example in M-x rgrep.
By default, Emacs often constructs find commands that look like this:

find -H . \( -path \*/SCCS/\* -o -path \*/RCS/\* -o -path \*/CVS/\* -o -path
\*/MCVS/\* -o -path \*/.src/\* -o -path \*/.svn/\* -o -path \*/.git/\* -o
-path \*/.hg/\* -o -path \*/.bzr/\* -o -path \*/_MTN/\* -o -path \*/_darcs/\*
-o -path \*/\{arch\}/\* -o -path \*/.\#\* -o -path \*.o -o -path \*\~ -o -path
\*.bin -o -path \*.lbin -o -path \*.so -o -path \*.a -o -path \*.ln -o -path
\*.blg -o -path \*.bbl -o -path \*.elc -o -path \*.lof -o -path \*.glo -o
-path \*.idx -o -path \*.lot -o -path \*.fmt -o -path \*.tfm -o -path \*.class
-o -path \*.fas -o -path \*.lib -o -path \*.mem -o -path \*.x86f -o -path
\*.sparcf -o -path \*.dfsl -o -path \*.pfsl -o -path \*.d64fsl -o -path
\*.p64fsl -o -path \*.lx64fsl -o -path \*.lx32fsl -o -path \*.dx64fsl -o -path
\*.dx32fsl -o -path \*.fx64fsl -o -path \*.fx32fsl -o -path \*.sx64fsl -o
-path \*.sx32fsl -o -path \*.wx64fsl -o -path \*.wx32fsl -o -path \*.fasl -o
-path \*.ufsl -o -path \*.fsl -o -path \*.dxl -o -path \*.lo -o -path \*.la -o
-path \*.gmo -o -path \*.mo -o -path \*.toc -o -path \*.aux -o -path \*.cp -o
-path \*.fn -o -path \*.ky -o -path \*.pg -o -path \*.tp -o -path \*.vr -o
-path \*.cps -o -path \*.fns -o -path \*.kys -o -path \*.pgs -o -path \*.tps
-o -path \*.vrs -o -path \*.pyc -o -path \*.pyo \) -prune -o  -type f 
-print0

That is, it takes a long list of directories and file extensions to ignore
(from grep-find-ignored-directories and grep-find-ignored-files) and passes
each one to find as a separate -path test.

Because find does not optimize this, evaluating the list of ignore patterns
completely dominates the cost of running find. On my system, running find over
a particularly common directory, cutting the full set of ignores down to just
one takes the execution time from:

real    0m7.807s
user    0m7.505s
sys     0m0.359s

to:

real    0m0.491s
user    0m0.221s
sys     0m0.293s

Could you reconsider optimizing this case?

If not, how should Emacs be invoking find instead?  Should we combine these
patterns into a single regex before passing them to find?  That is a somewhat
worse user experience, because we let users edit the find invocation, but it
may be worth it for the speed improvement.
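
To make the question concrete, here is a rough sketch of the two forms on a
tiny throwaway tree. It assumes GNU find, whose default -regex syntax is
Emacs-style (so \( \| \) group and alternate) and whose -regex must match the
whole path. The three patterns and the file names are made up for
illustration; this is not the command Emacs generates, and the two forms can
differ on unusual names (for example, names containing newlines).

```shell
#!/bin/sh
# Sketch only: assumes GNU find (Emacs-style -regex by default).
# The tree below is hypothetical test data, not from the bug report.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/.git/objects"
touch "$tmp/README" "$tmp/src/main.c" "$tmp/src/main.o" \
      "$tmp/src/main.c~" "$tmp/.git/objects/abc"

# Current style: one -path test per ignored pattern, joined with -o.
many=$(cd "$tmp" && find -H . \
  \( -path '*/.git/*' -o -path '*.o' -o -path '*~' \) -prune \
  -o -type f -print | LC_ALL=C sort)

# Combined style: the same three patterns folded into a single -regex.
# Glob '*' becomes '.*', and literal dots are escaped.
one=$(cd "$tmp" && find -H . \
  -regex '.*\(/\.git/.*\|\.o\|~\)' -prune \
  -o -type f -print | LC_ALL=C sort)

printf 'many -path tests:\n%s\n' "$many"
printf 'single -regex:\n%s\n' "$one"

rm -rf "$tmp"
```

On this tree both invocations should list the same files. Whether the
single-regex form recovers most of the time difference above on a real
directory is something I have not measured here.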



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?58197>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/



