[Nmh-workers] some indexing results

From: Paul Vixie
Subject: [Nmh-workers] some indexing results
Date: Mon, 07 Feb 2011 08:34:35 +0000

i added the rfc822 headers to my sleepycat index files, which had the
following effects.  building indexes for my ~10GByte mail store takes
~90 times longer (one hour 45 minutes vs. one minute 11 seconds).  the
indexes are ~20 times larger.  a modified version of scan, though it can
no longer add the first few words of the body to each scan line, is
about 2X to 3X faster by wall clock.  that's not good enough to justify the complexity, but
i'll work on it more before i decide whether it's the best we're going
to get.  i can probably add the first ~100 characters of the body into
the database element, to make sure scan can print the first few words of
it as now.  here are the current results.

#nsa:amd64# folder +inbox
inbox+ has 3269 messages  (1-6144); cur=6134; (others).

#nsa:amd64# repeat 3 time sh -c "scan +inbox > /tmp/vix1"
0.237u 0.728s 0:26.27 3.6%      132+1395k 0+1io 0pf+0w
0.165u 0.675s 0:35.35 2.3%      153+1598k 0+1io 0pf+0w
0.309u 0.813s 0:43.99 2.5%      140+1462k 0+1io 0pf+0w

#nsa:amd64# ktrace scan +inbox > /dev/null
#nsa:amd64# kdump -s | wc -l
#nsa:amd64# kdump -s | grep -c NAMI

#nsa:amd64# repeat 3 time sh -c "./scan +inbox > /tmp/vix2"
0.253u 0.784s 0:13.33 7.7%      134+1271k 0+1io 0pf+0w
0.311u 0.775s 0:13.76 7.8%      129+1251k 0+1io 0pf+0w
0.253u 0.642s 0:13.86 6.4%      145+1402k 0+1io 0pf+0w

#nsa:amd64# ktrace ./scan +inbox > /dev/null
#nsa:amd64# kdump -s | wc -l
#nsa:amd64# kdump -s | grep -c NAMI

so sleepycat is burning me somewhere: while i'm avoiding open(), i'm
paying too high a cost elsewhere, probably in unnecessary concurrency
control (unnecessary since i'm holding a file-level lock throughout).
more later.  if i get far enough i'll want to do a similar test with pick.

note that the software architecture of MH deeply presumes a file store:
everywhere i look i find fopen() and similar calls.  this means that to get
MH to read the headers out of a database i had to temporarily use a
nonportable freebsd library call, funopen(), that lets me attach a "FILE *"
to something that is not a file.  i won't be offering patches that work that
way; i'll have to rototill the internal interfaces to separate out the
"FILE *" logic from the rest.  that's a lot of work, and i won't do it unless
i get 10X or better performance from "scan" and "pick" in my (nonportable)
testing.

the test system uses freebsd 7 and ZFS "raidz2" if anybody cares.
