bug-gnu-emacs

bug#64735: 29.0.92; find invocations are ~15x slower because of ignores


From: Eli Zaretskii
Subject: bug#64735: 29.0.92; find invocations are ~15x slower because of ignores
Date: Mon, 24 Jul 2023 16:26:27 +0300

> Date: Mon, 24 Jul 2023 15:55:13 +0300
> Cc: luangruo@yahoo.com, sbaugh@janestreet.com, yantar92@posteo.net,
>  64735@debbugs.gnu.org
> From: Dmitry Gutov <dmitry@gutov.dev>
> 
> >> 1. 'find' itself is much slower there. There is room for improvement in
> >> the port.
> > 
> > I think it's the filesystem, not the port (which I did myself in this
> > case).
> 
> But directory-files-recursively goes through the same filesystem, 
> doesn't it?

It does (more or less; see below).  But I was not trying to explain
why Find is slower than directory-files-recursively, I was trying to
explain why Find on Windows is slower than Find on GNU/Linux.

If you are asking why directory-files-recursively is so much faster on
Windows than Find, then the main factors I can think of are:

  . IPC, at least in how we implement it in Emacs on MS-Windows: a
    separate thread reads from the subprocess, and OS-level events
    signal the main thread that stuff is available for reading.
    directory-files-recursively avoids this overhead completely;
  . Find uses Posix APIs: 'stat', 'chdir', 'readdir' -- which on
    Windows are emulated by wrappers around native APIs.  Moreover,
    Find uses 'char *' for file names, so calling native APIs involves
    transparent conversion to UTF-16 and back, which is what native
    APIs accept and return.  By contrast, Emacs on Windows calls the
    native APIs directly, and converts to UTF-16 from UTF-8, which is
    faster.  (This last point also means that using Find on Windows
    has another grave disadvantage: it cannot fully support non-ASCII
    file names, only those that can be encoded by the current
    single-byte system codepage.)
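
The codepage limitation in the last point can be illustrated from
inside Emacs.  This is only a sketch: the helper name
my/encodable-in-codepage-p is hypothetical, and cp1252 is assumed here
as an example of a single-byte system codepage (the actual codepage
depends on the Windows locale).  It checks whether every character of
a file name can be represented in a given coding system, which is
roughly the condition a 'char *'-based program needs to be able to
name the file at all.

```elisp
;; -*- lexical-binding: t -*-

(defun my/encodable-in-codepage-p (file-name coding-system)
  "Return non-nil if every character of FILE-NAME fits CODING-SYSTEM.
Hypothetical helper, for illustration only."
  ;; `unencodable-char-position' returns nil when the whole range can
  ;; be encoded, otherwise the index of the first offending character.
  (null (unencodable-char-position 0 (length file-name)
                                   coding-system nil file-name)))

;; All characters exist in cp1252, so this should return t.
(my/encodable-in-codepage-p "café.txt" 'cp1252)

;; Cyrillic letters are not in cp1252, so this should return nil.
(my/encodable-in-codepage-p "файл.txt" 'cp1252)
```

A file whose name fails this check for the system codepage is one that
such a port could not be expected to handle reliably.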

> >> 2. The process output handling is worse.
> > 
> > Not sure what that means.
> 
> Emacs's ability to process the output of a process on the particular 
> platform.
> 
> You said:
> 
>    Btw, the Find command with pipe to some other program, like wc,
>    finishes much faster, like 2 to 4 times faster than when it is run
>    from find-directory-files-recursively.  That's probably the slowdown
>    due to communications with async subprocesses in action.

I see this slowdown on GNU/Linux as well.

> One thing to try is changing the -with-find implementation to use a 
> synchronous call, to compare (e.g. using 'process-file'). And repeat 
> these tests on GNU/Linux too.

This still uses pipes, albeit without the pselect stuff.
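
For anyone who wants to repeat the comparison, here is a minimal
sketch of the synchronous variant timed against
directory-files-recursively.  The names my/list-files-sync-find and
my/bench-file-listing are hypothetical, a 'find' program is assumed to
be on exec-path, and this is not the -with-find implementation
discussed in this thread.  Note also that "-type f" lists only regular
files, while directory-files-recursively returns every non-directory,
so the two lists can differ slightly.

```elisp
;; -*- lexical-binding: t -*-
(require 'benchmark)

(defun my/list-files-sync-find (dir)
  "List regular files under DIR via a synchronous `find' call.
Hypothetical helper; assumes a `find' program is on `exec-path'."
  (with-temp-buffer
    ;; `process-file' waits for the program to exit and inserts its
    ;; standard output into the current buffer (BUFFER argument is t).
    (let ((status (process-file "find" nil t nil
                                (expand-file-name dir) "-type" "f")))
      (unless (eq status 0)
        (error "find exited with status %s" status)))
    ;; One file name per line; drop the empty string after the last newline.
    (split-string (buffer-string) "\n" t)))

(defun my/bench-file-listing (dir)
  "Return an alist of elapsed seconds for two ways of listing DIR."
  (list
   (cons 'directory-files-recursively
         (car (benchmark-run 1 (directory-files-recursively dir ""))))
   (cons 'find-via-process-file
         (car (benchmark-run 1 (my/list-files-sync-find dir))))))

;; Example call; the directory is only an illustration:
;; (my/bench-file-listing "/usr/share")
```

Running each variant more than once is advisable, since the first run
also pays for filling the filesystem cache.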

> >> 3. Something particular to the project being used for the test.
> > 
> > I don't think I understand this one.
> 
> This described the possibility where the disparity between the 
> implementations' runtimes was due to something unusual in the project 
> structure, if you tested different projects between Windows and 
> GNU/Linux, making direct comparison less useful. It's the least likely 
> cause, but still sometimes a possibility.

I have on my Windows system a d:/usr/share tree that is very similar
to (albeit somewhat smaller than) a typical /usr/share tree on Posix
systems.  I tried with that as well, and the results were similar.

> > The ezwinports is the version I'm using here.  But maybe someone came
> > up with a better one: after all, I did my port many years ago (because
> > the native ports available back then were abysmally slow).
> 
> We should also look at the exact numbers. If you say that the "| wc" 
> invocation is 2-4x faster than what's reported in the benchmark, then it 
> takes about 2-4 seconds. Which is still oddly slower than your reported 
> numbers for directory-files-recursively.

Yes, so there are additional factors at work, at least with this port
of Find.




