Re: [Qemu-devel] [RFC PATCH v2 2/4] Add block-queue


From: Kevin Wolf
Subject: Re: [Qemu-devel] [RFC PATCH v2 2/4] Add block-queue
Date: Wed, 17 Nov 2010 14:41:22 +0100
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Fedora/3.0.10-1.fc12 Thunderbird/3.0.10

On 17.11.2010 13:43, Stefan Hajnoczi wrote:
> On Fri, Nov 5, 2010 at 6:38 PM, Kevin Wolf <address@hidden> wrote:
>> Instead of directly executing writes and fsyncs, queue them and execute them
>> asynchronously. What makes this interesting is that we can delay syncs and if
>> multiple syncs occur, we can merge them into one bdrv_flush.
> 
> The block-queue concept adds another layer to the cache=
> considerations.  We're going to complete write requests that have not
> yet been issued to the host.
> 
> This won't work for cluster filesystems where multiple VMs are
> accessing the block device, although that is a niche case.  I guess
> they can assume writes are visible after they receive a completion.

Right, but that doesn't work for anything but raw anyway. What I intend
block-queue for is just metadata writes, so nothing that raw would use.

> There must be a trade-off between holding back requests until a flush
> is forced and issuing requests as they come (what we do today).  At
> some point if you have too much data queued the flush is going to be
> painful and might result in odd timing patterns.  This feels similar
> to batching tx packets on the network.

I agree. Though as long as you use it only for metadata, the requests
are relatively small and they overwrite each other, so there's a chance
that you can drop some of them from the queue before submitting them.

What I implemented here is that the queue is flushed if it reaches some
maximum length. So this value is where you'd need to find the right
trade-off.
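
To illustrate the two effects described above (queued writes to the same
location replacing each other, and a forced flush once the queue reaches a
maximum length), here is a minimal sketch. It is a toy model, not the code
from the patch: ToyQueue, toy_queue_write(), toy_queue_flush() and the
limit TOY_QUEUE_MAX are hypothetical names, and "submitting" a write is
reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_QUEUE_MAX 4   /* hypothetical maximum queue length */
#define TOY_BUF_SIZE  8   /* hypothetical fixed size of a metadata write */

typedef struct ToyWrite {
    uint64_t offset;
    uint8_t data[TOY_BUF_SIZE];
} ToyWrite;

typedef struct ToyQueue {
    ToyWrite reqs[TOY_QUEUE_MAX];
    size_t len;          /* number of queued writes */
    unsigned submitted;  /* writes that were actually sent to the "disk" */
    unsigned flushes;    /* number of times the queue was flushed */
} ToyQueue;

/* Submit everything that is queued and empty the queue. */
static void toy_queue_flush(ToyQueue *q)
{
    q->submitted += (unsigned)q->len;
    q->flushes++;
    q->len = 0;
}

/* Queue a write. A write to an offset that is already queued replaces the
 * old data, so the older request never reaches the disk. */
static void toy_queue_write(ToyQueue *q, uint64_t offset, const uint8_t *data)
{
    for (size_t i = 0; i < q->len; i++) {
        if (q->reqs[i].offset == offset) {
            memcpy(q->reqs[i].data, data, TOY_BUF_SIZE);
            return;
        }
    }

    q->reqs[q->len].offset = offset;
    memcpy(q->reqs[q->len].data, data, TOY_BUF_SIZE);
    q->len++;

    /* The trade-off knob: flush once the queue reaches its maximum length. */
    if (q->len == TOY_QUEUE_MAX) {
        toy_queue_flush(q);
    }
}
```

A larger TOY_QUEUE_MAX gives more chances for writes to overwrite each other
but makes each flush more expensive; a smaller one behaves more like
issuing requests as they come.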

>> A typical sequence in qcow2 (simple cluster allocation) looks like this:
>>
>> 1. Update refcount table
>> 2. bdrv_flush
>> 3. Update L2 entry
>>
>> If we delay the operation and get three of these sequences queued before
>> actually executing, we end up with the following result, saving two syncs:
>>
>> 1. Update refcount table (req 1)
>> 2. Update refcount table (req 2)
>> 3. Update refcount table (req 3)
>> 4. bdrv_flush
>> 5. Update L2 entry (req 1)
>> 6. Update L2 entry (req 2)
>> 7. Update L2 entry (req 3)
> 
> How does block-queue group writes 1-3 and 5-7 together?  I thought
> reqs 1-3 will each have their own context but from a quick look at the
> code I don't think that is the case for qcow2 in patch 4.
> 
> Another way of asking is, how does block-queue know that it is safe to
> put writes 1-3 together before a bdrv_flush?  Why doesn't it also put
> writes 5-7 before the flush?
> 
> Perhaps your current qcow2 implementation isn't taking advantage of
> block-queue bdrv_flush() batching for concurrent write requests?
> 
> I'm missing something ;).

Yes, you are. ;-)

The contexts are indeed the mechanism that it uses to achieve this. Have
a look at qcow_aio_setup: patch 4/4 adds a blkqueue_init_context() call
there.
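
As a rough illustration of how per-request contexts can produce the
ordering from the example (refcount updates of all three requests, one
flush, then all L2 updates), here is a toy model. It is only a sketch
under assumptions and not the block-queue implementation: it assumes that
a context remembers which "section" of the queue its request is currently
in, that a barrier moves the context to the next section, and that
sections are separated by a single shared flush. All toy_* names and
ToySectionQueue/ToyContext are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_SECTIONS 8
#define TOY_MAX_WRITES   16

/* The queue is a list of sections; one flush separates adjacent sections. */
typedef struct ToySectionQueue {
    int writes[TOY_MAX_SECTIONS][TOY_MAX_WRITES]; /* write ids per section */
    size_t count[TOY_MAX_SECTIONS];               /* writes in each section */
    size_t num_sections;                          /* sections in use */
} ToySectionQueue;

/* One context per guest request: tracks the section the request is in. */
typedef struct ToyContext {
    size_t section;
} ToyContext;

static void toy_init_context(ToyContext *ctx)
{
    ctx->section = 0;
}

/* Queue a write in the section that the context currently points to. */
static int toy_pwrite(ToySectionQueue *q, ToyContext *ctx, int id)
{
    size_t s = ctx->section;

    if (s >= TOY_MAX_SECTIONS || q->count[s] >= TOY_MAX_WRITES) {
        return -1;
    }
    q->writes[s][q->count[s]++] = id;
    if (s + 1 > q->num_sections) {
        q->num_sections = s + 1;
    }
    return 0;
}

/* A barrier only affects the calling context: its later writes must come
 * after the next flush. Other contexts can still add to earlier sections. */
static void toy_barrier(ToyContext *ctx)
{
    ctx->section++;
}

/* Flushes needed to execute the whole queue: one between adjacent sections. */
static size_t toy_num_flushes(const ToySectionQueue *q)
{
    return q->num_sections ? q->num_sections - 1 : 0;
}
```

In this model, three requests that each do "write refcount, barrier, write
L2" with their own freshly initialised context end up with three refcount
writes in section 0, three L2 writes in section 1, and a single flush
between them. Writes 5-7 cannot move in front of the flush because each of
them was issued after a barrier in its own context.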

Kevin


