Re: HFS Patch 15 out
From: K.G.
Subject: Re: HFS Patch 15 out
Date: Mon, 27 Sep 2004 19:14:47 +0200
On Mon, 27 Sep 2004 13:17:20 +0200 (MEST)
Szakacsits Szabolcs <address@hidden> wrote:
>
> On Fri, 24 Sep 2004, K.G. wrote:
>
> > I've rewritten the search algorithm using a cache storing extent
> > references of the FS in a sorted order, but this has not resulted in
> > the spectacular speed improvement I expected. I've found that most
> > (indeed nearly all) CPU time is spent in kernel space so I will try to
> > understand why
>
> On most of today's commodity hardware, disk IO (speed and seeks) should
> be the bottleneck. For example, if I see it right, you use a lot of syncs.
> That's definitely a big performance killer: the kernel can't merge or
> optimize disk IO, and often the resize process just blocks waiting for IO.
>
> Here are two ntfsresize timings (for relocating 3 GB) in a sync-intensive
> mode (data should be consistent even in case of a power outage) and a less
> sync-intensive mode (data might be corrupt in case of a power outage):
>
> 8.43s usr, 46.85s sys, 179.16s real, 30% CPU <-- careful sync's
> 39.16s usr, 59.76s sys, 843.82s real, 11% CPU <-- lot of sync's
>
> In this case, careful sync gave almost 5x speedup.
>
> Ntfsresize's block allocator also tries to minimize disk seeks, so the
> kernel can merge the IO write requests.
>
I eventually identified the bottleneck: it lives in linux.c, in flush_cache,
which is called by Parted's sync function. There's a loop which does a
BLKFLSBUF ioctl on every unmounted partition of the disk, and that (the
mount test plus the ioctl) takes an amount of time that is just too high
for my heavy use of syncs.
Here is a benchmark with the loop:
address@hidden:~/Code/parted-1.6.15/parted$ time ./launch
real 208m2.805s
user 0m37.430s
sys 201m29.530s
and here is one without the loop (but still with
the ioctl call on the whole device):
address@hidden:~/Code/parted-1.6.15/parted$ time ./launch
real 7m7.716s
user 0m2.030s
sys 0m10.860s
So this is a 29x real-time speedup and a 930x CPU-time speedup on
my test system :p
I would like to know what BLKFLSBUF does exactly (though I don't think it
is what takes the most time; I suspect it's the mount test, but I haven't
benchmarked them separately yet).
(And, as a more general question, does anyone know of good documentation
on ioctls in general?)
> > and find new solutions to accelerate things.
>
> Doing it efficiently _and_ safely from user space can be somewhat
> complicated because the kernel doesn't provide the needed support,
> nor does it export the needed functionality (e.g. setting write barriers).
>
> Szaka
>