Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O


From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O
Date: Wed, 26 Jan 2011 10:08:15 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10

On 01/26/2011 09:50 AM, Kevin Wolf wrote:
> On 26.01.2011 16:40, Avi Kivity wrote:
>> On 01/22/2011 11:29 AM, Stefan Hajnoczi wrote:
>>> Converting qcow2 to use coroutines is fairly simple since most of qcow2
>>> is synchronous.  The synchronous I/O functions like bdrv_pread() now
>>> transparently work when called from a coroutine, so all the synchronous
>>> code just works.
>>>
>>> The explicitly asynchronous code is adjusted to repeatedly call
>>> qcow2_aio_read_cb() or qcow2_aio_write_cb() until the request completes.
>>> At that point the coroutine will return from its entry function and its
>>> resources are freed.
>>>
>>> The bdrv_aio_readv() and bdrv_aio_writev() user callback is now invoked
>>> from a BH.  This is necessary since the user callback code does not
>>> expect to be executed from a coroutine.
>>>
>>> This conversion is not completely correct because the safety of the
>>> synchronous code does not carry over to the coroutine version.
>>> Previously, a synchronous code path could assume that it would never be
>>> interleaved with another executing request.  This is no longer true
>>> because bdrv_pread() and bdrv_pwrite() cause the coroutine to yield and
>>> other requests can be processed during that time.
>>>
>>> The solution is to carefully introduce checks so that pending requests
>>> do not step on each other's toes.  That is left for a future patch...
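
A rough sketch of the two mechanisms Stefan describes, assuming a
coroutine API along the lines of qemu_in_coroutine(),
qemu_coroutine_self(), qemu_coroutine_yield() and
qemu_coroutine_enter(); the type names, the bdrv_aio_pread() submission
call and the QCowAIOCB fields are illustrative, not the patch's actual
code:

/* A synchronous helper can work transparently inside a coroutine by
 * submitting the request asynchronously, yielding, and letting the
 * completion callback re-enter the coroutine. */
typedef struct CoroutineIOCompletion {
    Coroutine *coroutine;   /* coroutine waiting for this request */
    int ret;                /* result filled in by the callback */
} CoroutineIOCompletion;

static void co_io_complete(void *opaque, int ret)
{
    CoroutineIOCompletion *co = opaque;

    co->ret = ret;
    qemu_coroutine_enter(co->coroutine, NULL);  /* resume the caller */
}

int bdrv_pread(BlockDriverState *bs, int64_t offset, void *buf, int count)
{
    if (qemu_in_coroutine()) {
        CoroutineIOCompletion co = {
            .coroutine = qemu_coroutine_self(),
        };

        bdrv_aio_pread(bs, offset, buf, count, co_io_complete, &co);
        qemu_coroutine_yield();     /* resumed by co_io_complete() */
        return co.ret;
    }
    /* ... original synchronous path for non-coroutine callers ... */
}

The BH-deferred completion would then look something like this: when
the request is done, a BH is scheduled, and the user callback runs from
the BH handler rather than on the coroutine stack.

static void qcow2_aio_bh(void *opaque)
{
    QCowAIOCB *acb = opaque;

    qemu_bh_delete(acb->bh);
    acb->common.cb(acb->common.opaque, acb->ret);  /* user callback */
    qemu_aio_release(acb);
}
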
>> The way I thought of doing this is:
>>
>> qcow_aio_write(...)
>> {
>>       execute_in_coroutine {
>>           co_mutex_lock(&bs->mutex);
>>           do_qcow_aio_write(...); // original qcow code
>>           co_mutex_release(&bs->mutex);

> The release has to be executed in the callback.

I think it's a bit nicer to not do a mutex, but rather to have a notion of freezing/unfreezing the block queue and instead do:

completion() {
   bdrv_unfreeze(bs);
}

coroutine {
   bdrv_freeze(bs);
   do_qcow_aio_write(completion);
}

Freeze/unfreeze is useful in a number of other places too (like snapshotting).

Regards,

Anthony Liguori
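
A minimal sketch of what freeze/unfreeze could mean at the block layer,
assuming hypothetical bs->frozen and bs->frozen_requests fields and a
hypothetical dispatch helper; while a device is frozen, newly submitted
requests are queued instead of dispatched, and unfreezing flushes the
queue:

void bdrv_freeze(BlockDriverState *bs)
{
    bs->frozen++;                   /* freezes may nest */
}

void bdrv_unfreeze(BlockDriverState *bs)
{
    assert(bs->frozen > 0);
    if (--bs->frozen > 0) {
        return;
    }
    /* Dispatch everything that arrived while we were frozen. */
    while (!QTAILQ_EMPTY(&bs->frozen_requests)) {
        BlockRequest *req = QTAILQ_FIRST(&bs->frozen_requests);

        QTAILQ_REMOVE(&bs->frozen_requests, req, next);
        bdrv_submit_request(bs, req);   /* hypothetical dispatch helper */
    }
}

The submission path would check bs->frozen and append to
bs->frozen_requests instead of calling into the driver, which is what
would make the same primitive reusable for things like snapshotting.
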

>>       }
>> }

>> (similar changes for the other callbacks)

>> if the code happens to be asynchronous (no metadata changes), we'll take
>> the mutex and release it immediately after submitting the I/O, so no
>> extra serialization happens.  If the code does issue a synchronous
>> metadata write, we'll lock out all other operations on the same block
>> device, but still allow the vcpu to execute, since all the locking
>> happens in a coroutine.

>> Essentially, a mutex becomes the dependency tracking mechanism.  A global
>> mutex means all synchronous operations are dependent.  Later, we can
>> convert the metadata cache entry dependency lists to local mutexes
>> inside the cache entry structures.
> I thought a bit about it since you mentioned it in the call yesterday
> and I think this approach makes sense. Even immediately after the
> conversion we should be in a better state than with Stefan's approach
> because I/O without metadata disk access won't be serialized.
>
> In the other thread you mentioned that you have written some code
> independently. Do you have it in some public git repository?
>
> Kevin
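
For reference, a coroutine mutex along the lines Avi describes could be
sketched as below; the Coroutine queue link (co_queue_next) and the API
names are assumptions, not existing code. lock() never blocks the
thread: it queues the calling coroutine and yields, so the vcpu keeps
running, and unlock() hands the lock to the next waiter.

typedef struct CoMutex {
    bool locked;
    QTAILQ_HEAD(, Coroutine) queue;   /* coroutines waiting for the lock */
} CoMutex;

void co_mutex_lock(CoMutex *mutex)
{
    while (mutex->locked) {
        Coroutine *self = qemu_coroutine_self();

        QTAILQ_INSERT_TAIL(&mutex->queue, self, co_queue_next);
        qemu_coroutine_yield();       /* resumed by co_mutex_unlock() */
    }
    mutex->locked = true;
}

void co_mutex_unlock(CoMutex *mutex)
{
    Coroutine *next;

    mutex->locked = false;
    next = QTAILQ_FIRST(&mutex->queue);
    if (next) {
        QTAILQ_REMOVE(&mutex->queue, next, co_queue_next);
        qemu_coroutine_enter(next, NULL);
    }
}

With a global per-device mutex like this, a pure data request takes and
releases the lock around submission only, while a request that issues
synchronous metadata writes holds it across its bdrv_pread() and
bdrv_pwrite() yields, serializing exactly the cases Avi mentions.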




