nmh-workers

Re: [Nmh-workers] MH-W intro/help request


From: Erich Boleyn
Subject: Re: [Nmh-workers] MH-W intro/help request
Date: Wed, 03 Dec 2014 23:44:00 -0800

Ken Hornstein writes:

> >Specifically, I was testing it on a very large folder of approx 100K
> >messages.  Both the "mark ..." and a "show N" invocation take
> >more than 1/10th of a second on average, even for extremely short
> >outputs.
> 
> Okay, yeah, THAT makes sense.  Pretty much every command calls folder_read(),
> and I am 99% sure the problem there is doing a readdir() on that super huge
> directory (it's not performing a stat() on every file, though).  Obviously
> the output size isn't the problem.  Note that maybe the problem is we do
> something stupid with malloc or something else; it might be interesting
> to see how long things like "ls" take in that directory (running ls without
> stat()ing any files, of course); if it's significantly faster then maybe
> we can do better.

Running, say, "ls -1 > /dev/null" in that directory is not faster.
Overall performance is similar.
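
To take ls itself out of the comparison, a bare readdir() pass can be
timed directly.  The following is only a minimal sketch, not nmh code:
the function names count_dir_entries() and time_dir_scan() are
hypothetical, and it assumes a POSIX system with clock_gettime().  It
reads every directory entry without calling stat() on any of them.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: count the entries in a directory using
 * readdir() only, with no stat() calls.  Returns the number of
 * entries (excluding "." and ".."), or -1 if the directory cannot
 * be opened. */
long count_dir_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);
    return count;
}

/* Hypothetical helper: time one full scan of the directory.
 * Stores the entry count in *count_out (if non-NULL) and returns
 * the elapsed wall-clock time in seconds, or -1.0 on error. */
double time_dir_scan(const char *path, long *count_out)
{
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    long count = count_dir_entries(path);
    if (count < 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    if (count_out != NULL)
        *count_out = count;
    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_dir_scan() on the folder's directory a few times (so the
first, cold-cache run can be told apart from the later ones) should
show whether the directory read alone accounts for the delay.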


> We're kind of in a tough spot here.  Sequences can contain entries for
> messages that don't exist; the way that gets resolved is by reading the
> directory and removing any files from the sequence list when the folder
> data structure is built. mark(1) isn't just reading the sequence file
> and printing out the exact line; it's calling seq_print(), which is
> the same routine that the sequence routines use to output the sequence
> structure.  Getting the sequence list without actually reading the
> folder ... well, it's possible, but it would require some surgery.
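
The pruning step described above can be sketched roughly as follows.
This is an illustration only and is not how nmh represents folders or
sequences internally: prune_sequence() is a hypothetical function, and
it assumes both the sequence and the list of message numbers found in
the directory are plain sorted arrays of ints.

```c
#include <stddef.h>

/* Hypothetical sketch: drop from a sequence every message number
 * that is not present in the folder.
 *
 * seq       sorted message numbers listed in the sequence;
 *           compacted in place
 * existing  sorted message numbers actually found in the directory
 *
 * Returns the number of entries kept at the front of seq. */
size_t prune_sequence(int *seq, size_t seq_len,
                      const int *existing, size_t existing_len)
{
    size_t kept = 0;
    size_t j = 0;

    for (size_t i = 0; i < seq_len; i++) {
        /* Skip existing messages numbered below the current entry. */
        while (j < existing_len && existing[j] < seq[i])
            j++;
        /* Keep the entry only if the folder really contains it. */
        if (j < existing_len && existing[j] == seq[i])
            seq[kept++] = seq[i];
    }
    return kept;
}
```

The point of the sketch is that the "existing" list has to come from
somewhere, and in the description above it comes from reading the
whole directory.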

This seems wrong...  for example, as I make the MIME stuff work
better I'll be extracting many separate components from an email,
each with its own command invocation.  I just measured it, and for
that kind of large folder, displaying a single email could easily add
up to 1+ seconds of cumulative time, which would likely be
unacceptable.

Is there any way I can completely avoid the giant folder check?  I
can't think of why it is done time after time for simple program
invocations that, for example, refer to a specifically enumerated
message.  Obviously *asking* for a relative message ID like "last"
would need to check the directory to find which message number it
refers to, but it would be easy to do that in one step and always
refer to the number after that.
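
As a rough illustration of that idea (not existing nmh behavior), a
specifically enumerated message could be checked with a single stat()
on its own file instead of a scan of the whole directory.  The name
message_exists() is hypothetical, and the sketch assumes the usual MH
layout in which message N is stored as the file "N" inside the folder
directory.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical sketch: test whether one specific message exists in
 * a folder without reading the directory.
 *
 * Returns 1 if <folder>/<msgnum> is a regular file, 0 if it is not
 * (or cannot be examined), and -1 if the path does not fit in the
 * local buffer. */
int message_exists(const char *folder, int msgnum)
{
    char path[4096];
    int len = snprintf(path, sizeof path, "%s/%d", folder, msgnum);
    if (len < 0 || (size_t)len >= sizeof path)
        return -1;

    struct stat st;
    if (stat(path, &st) != 0)
        return 0;
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

This costs one system call per message regardless of folder size, but
it says nothing about sequences or about names like "last", which
still need the directory contents.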

[NOTE: I suspect I'm getting into the "we've talked/fought/etc. about
 this many times before" territory and it may not be worth discussing
 on the list right now ... since this issue is arguably orthogonal
 to what I'm doing]


--
    Erich Stefan Boleyn     <address@hidden>     http://www.uruk.org/
"Reality is truly stranger than fiction; Probably why fiction is so popular"


