MMIO scan performance
From: xystrus
Subject: MMIO scan performance
Date: Mon, 15 Apr 2002 01:01:13 -0400
User-agent: Mutt/1.3.22.1i
I did some preliminary testing of a mailbox scanning function using
MMIO (memory-mapped I/O). I have /not/ yet done anything with file
I/O, so I have nothing to compare it to. I thought you guys might be
interested in the results anyway. The test consisted of opening the
file, locking it, scanning it, unlocking it, and closing it. For the
second and third tests, I used shared locks, so multiple processes
could read the same mail store simultaneously.
What the (minimalistic, admittedly) scan function does:
- allocate space for message information
- mmap the spool file
- identify the start of each message (headers), start of the body,
and the end of each message, and save them in an array.
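The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual sc/mt1 code: the function name scan_mbox, the use of flock() for locking, and the rule that a message starts at "From " at the beginning of a line are all assumptions made for the example.

```python
import fcntl
import mmap
import os


def scan_mbox(path, shared=True):
    """Return a list of (msg_start, body_start, msg_end) byte offsets."""
    messages = []
    with open(path, "rb") as f:
        # Assumption: whole-file advisory lock via flock(); shared for readers.
        fcntl.flock(f, fcntl.LOCK_SH if shared else fcntl.LOCK_EX)
        try:
            size = os.fstat(f.fileno()).st_size
            if size == 0:
                return messages  # an empty file cannot be mapped
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                # Assumption: a message starts at "From " at the start of a line.
                starts = [0] if m[:5] == b"From " else []
                pos = m.find(b"\nFrom ")
                while pos != -1:
                    starts.append(pos + 1)
                    pos = m.find(b"\nFrom ", pos + 1)
                for i, start in enumerate(starts):
                    end = starts[i + 1] if i + 1 < len(starts) else size
                    # Headers end at the first blank line; the body follows it.
                    blank = m.find(b"\n\n", start, end)
                    body = blank + 2 if blank != -1 else end
                    messages.append((start, body, end))
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return messages
```

Calling scan_mbox("a") would return one offset triple per message, and passing shared=False would take an exclusive lock instead.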
Here are the stats of the system I ran it on:
Dell Inspiron 5000 (laptop)
Celeron 450
128MB RAM
10 GB IDE hard drive
The data that I used to test it with:
-rw-r--r-- 1 xystrus xystrus 2.3M Apr 12 12:29 a (342 msgs)
-rw-r--r-- 1 xystrus xystrus 2.3M Apr 12 12:30 b (341 msgs)
-rw-r--r-- 1 xystrus xystrus 2.3M Apr 12 12:30 c
-rw-r--r-- 1 xystrus xystrus 2.3M Apr 12 12:30 d
-rw-r--r-- 1 xystrus xystrus 182M Apr 14 22:15 w (27,399 msgs)
-rw-r--r-- 1 xystrus xystrus 364M Apr 14 22:10 y (54,719 msgs)
-rw-r--r-- 1 xystrus xystrus 364M Apr 14 22:08 z
In case it's not clear from the above, a-d are (copies of) the same file,
and y and z are the same file. [b-d had the IMAP status message
removed.] The data used is real e-mail from one of my mailboxes. It
unfortunately didn't have any attachments in it, as I don't get a lot
of mail with attachments. I tend to frown on such things... ;-)
a-d are the original, actual mailbox, and w-z are multiple copies of
the same mailbox concatenated together. The messages range in size
from about 1k to 40k. Average message size is a little less than 7k.
TESTS & RESULTS
---------------
Test 1
------
First I used time to see how long it took to scan each of the
mailboxes:
$ /usr/bin/time ./sc a
total messages: 342
Command exited with non-zero status 22
0.06user 0.02system 0:00.46elapsed 17%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (687major+12minor)pagefaults 0swaps
$ /usr/bin/time ./sc w
total messages: 27399
Command exited with non-zero status 24
5.02user 1.26system 0:57.52elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (46766major+212minor)pagefaults 0swaps
$ /usr/bin/time ./sc y
total messages: 54719
Command exited with non-zero status 24
9.91user 2.47system 1:12.86elapsed 16%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93416major+348minor)pagefaults 0swaps
I was pretty impressed with this speed, given I was testing on a
laptop.
Next, I ran the same test with 4 instances running against the same
mailbox.
[Note that the sc program opens the file with exclusive locks; the mt1
program below opens them with shared locks. Aside from that, they're
identical.]
Test 2
------
The first one finished before I could hit return in the second window,
etc. :) The times for each of the four processes were within about a
second of each other in each of the three cases:
$ alias t="/usr/bin/time ./mt1 a"
$ t
0.07user 0.01system 0:00.08elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (673major+11minor)pagefaults 0swaps
$ alias t="/usr/bin/time ./mt1 w"
$ t
5.01user 0.57system 1:03.86elapsed 8%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (46756major+211minor)pagefaults 0swaps
$ alias t="/usr/bin/time ./mt1 y"
$ t
11.11user 0.76system 1:50.06elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93405major+346minor)pagefaults 0swaps
Still not bad at under 2 minutes to process almost 55,000 messages in
a 364MB file. On a laptop with only 128MB of RAM, no less.
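To put a rate on that, here is a small sketch that turns one line of time output into CPU utilization and throughput. The helper names are made up for this example, and the 364 MiB size is the rounded figure from the ls listing, so the results are approximate.

```python
def elapsed_seconds(text):
    """Convert GNU time's elapsed field ([h:]m:ss.ss) to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize(user, system, elapsed, size_mib, messages):
    """Derive CPU utilization and throughput from one timed run."""
    secs = elapsed_seconds(elapsed)
    return {
        "cpu_percent": 100 * (user + system) / secs,
        "mib_per_sec": size_mib / secs,
        "msgs_per_sec": messages / secs,
    }


# The laptop's mt1 run on y: 11.11user 0.76system 1:50.06elapsed
stats = summarize(11.11, 0.76, "1:50.06", 364, 54719)
for key, value in stats.items():
    print(f"{key}: {value:.1f}")
```

For that run this works out to roughly 3.3 MiB/s and a little under 500 messages per second, with CPU utilization between 10% and 11%, in line with the 10%CPU that time reports.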
Test 3
------
Here's where it got a little hairy. I then ran 3 scans on w, and 2 scans
each on y and z simultaneously. The results on this one were not
quite so nice:
$ alias t="/usr/bin/time ./mt w"
$ t
4.85user 0.41system 5:54.07elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (46767major+249minor)pagefaults 0swaps
$ alias t="/usr/bin/time ./mt y"
$ t
9.78user 1.05system 9:58.70elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93416major+384minor)pagefaults 0swaps
$ alias t="/usr/bin/time ./mt z"
$ t
10.08user 1.01system 10:01.47elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93433major+373minor)pagefaults 0swaps
The output from top below is informative:
10:19pm up 5 days, 19:47, 17 users, load average: 7.52, 4.08, 2.03
106 processes: 105 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 10.8% user, 14.7% system, 0.0% nice, 74.4% idle
Mem: 126972K av, 124716K used, 2256K free, 0K shrd, 672K buff
Swap: 136512K av, 65812K used, 70700K free 94608K cached
So what we see here is we don't want to use a laptop with an IDE drive
and 128MB of RAM as our mail server, if we have a lot of users with
large mailboxes on the same filesystem, and we're using this MMIO
scheme. :) CPU util is very low, unsurprisingly; this operation is
bound by memory and I/O. Notice, however, that page faults stay about
the same between the three tests. I would have expected them to go
up.
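One plausible reading of the constant fault counts is that each scan touches every page of the mapping about once, so the count tracks file size rather than how many other scans are running. A quick check, assuming 4 KiB pages (typical for x86 Linux, but an assumption here) and the rounded sizes from the ls listing:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def expected_pages(size_mib, page_size=PAGE_SIZE):
    """Number of pages needed to map a file of the given size in MiB."""
    return int(size_mib * 1024 * 1024) // page_size


# (file size in MiB from ls, major faults reported by time in Test 1)
observed = {"w": (182, 46766), "y": (364, 93416)}
for name, (mib, faults) in observed.items():
    pages = expected_pages(mib)
    print(f"{name}: {pages} pages in file, {faults} major faults, "
          f"ratio {faults / pages:.3f}")
```

The reported major faults are within about half a percent of the page count for both w and y, which fits one fault per page regardless of load.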
I'll be very interested to compare against file I/O to see if
it makes a difference. My suspicion has been that it will be
noticeably slower, though some have indicated that on a lower-spec
machine file I/O actually outperforms MMIO. Given that I'm
testing with mailboxes 2x the size of my laptop's RAM, I think that
should show up in my tests.
What I /did do/ though, was perform the same tests on a much better
system. Stats:
Dual Celeron 400 (with ABit BP6 mobo)
512 MB RAM
Tekram UW SCSI adaptor
7200RPM UW SCSI drive (er, might be 10,000 RPM, I can't remember)
Unsurprisingly, this system kicked my laptop's ass.
Test 1
------
$ /usr/bin/time ./sc a
total messages: 342
Command exited with non-zero status 22
0.11user 0.02system 0:00.26elapsed 48%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (685major+12minor)pagefaults 0swaps
$ /usr/bin/time ./sc w
total messages: 27399
Command exited with non-zero status 24
6.71user 0.93system 0:14.41elapsed 53%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (46762major+206minor)pagefaults 0swaps
$ /usr/bin/time ./sc y
total messages: 54799
Command exited with non-zero status 24
13.70user 2.11system 0:30.26elapsed 52%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93423major+340minor)pagefaults 0swaps
Test 2
------
[I did not bother with the first (2MB) mailbox, as it runs too
quickly to be worthwhile.]
$ alias t="/usr/bin/time ./mt1 w"
$ t
6.39user 0.50system 0:15.22elapsed 45%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (46748major+205minor)pagefaults 0swaps
$ alias t="/usr/bin/time ./mt1 y"
$ t
12.96user 1.09system 0:30.00elapsed 46%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93409major+339minor)pagefaults 0swaps
Test 3
------
$ alias t="/usr/bin/time ./mt1 w"
$ t
6.31user 0.63system 1:57.87elapsed 5%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (46748major+205minor)pagefaults 0swaps
$ alias t="/usr/bin/time ./mt1 y"
$ t
13.71user 1.56system 3:39.60elapsed 6%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93444major+348minor)pagefaults 0swaps
$ alias t="/usr/bin/time ./mt1 z"
$ t
13.09user 1.95system 3:40.21elapsed 6%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (93444major+348minor)pagefaults 0swaps
The following is output from top just before the first 3 processes
(the scans of w, the 182MB mailbox) finished.
8:18pm up 12 days, 2:35, 9 users, load average: 6.06, 3.20, 1.53
73 processes: 72 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 18.4% user, 2.3% system, 0.0% nice, 78.1% idle
CPU1 states: 18.1% user, 4.2% system, 0.0% nice, 77.1% idle
Mem: 512876K av, 509800K used, 3076K free, 0K shrd, 744K buff
Swap: 262120K av, 12092K used, 250028K free 473584K cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
19640 xystrus 16 0 98764 96M 98216 D 6.8 19.2 0:06 mt1
19638 xystrus 16 0 98764 96M 98216 D 6.2 19.2 0:07 mt1
19634 xystrus 16 0 101M 101M 100M D 6.0 20.1 0:07 mt1
19644 xystrus 17 0 98740 96M 98192 D 6.0 19.2 0:06 mt1
19642 xystrus 17 0 98740 96M 98192 D 5.3 19.2 0:07 mt1
19632 xystrus 13 0 101M 101M 100M D 5.1 20.1 0:07 mt1
19636 xystrus 13 0 101M 101M 100M D 5.1 20.1 0:06 mt1
The load average crept up much more slowly than on the laptop,
reaching the above plateau just before the first three processes
finished.
What all this tells me is, if you have an IMAP server and a whole
bunch of users with large mailboxes, you want to run it on a machine
with lots of memory and fast drives, preferably fast RAID (probably
1+0, striped and mirrored, to maximize performance gain on reads and
still provide reasonable performance on writes). CPU really isn't
much of a factor, as it spends a large chunk of time idle on both
systems.
Not that any of this really comes as a surprise... But again, I will
be hugely interested to compare to file I/O. But, especially since
I'm probably going to have to spec out and build an IMAP server soon,
it's good to have some "real" numbers.
Xy