Here's a draft of a plan that should improve qcow2 performance. It's
written in wiki syntax for eventual upload to wiki.qemu.org; lines
starting with # are numbered lists, not comments.
= Basics =
At a minimum, no operation should block the main thread. This
could be done in two ways: by extending the state machine so that each
blocking operation can be performed asynchronously
(<code>bdrv_aio_*</code>), or by threading, where each new operation is
handed off to a worker thread. Since a full state machine is
prohibitively complex, this document will discuss threading.
== Basic threading strategy ==
A first iteration of qcow2 threading adds a single mutex to an image.
The existing qcow2 code is then executed within a worker thread,
acquiring the mutex before starting any operation and releasing it
after completion. Concurrent operations will simply block until the
operation holding the mutex completes. For operations which are already
asynchronous, the blocking time will be negligible since the code will call
<code>bdrv_aio_{read,write}</code> and return, releasing the mutex.
The immediate benefit is that currently blocking operations no longer block
the main thread; instead, they only block the block operation itself, which
is blocking anyway.
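
The single-mutex scheme above can be sketched with POSIX threads. This is only an illustration under stated assumptions: <code>Image</code>, <code>Request</code>, <code>qcow2_do_operation</code> and <code>run_requests</code> are hypothetical stand-ins, not existing QEMU types or functions, and spawning one thread per request stands in for whatever worker pool is eventually used.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_REQUESTS 16

/* Hypothetical stand-in for an open qcow2 image (not BlockDriverState). */
typedef struct Image {
    pthread_mutex_t mutex;   /* the single per-image mutex */
    int ops_completed;       /* image state, protected by mutex */
} Image;

/* Hypothetical request handed off to a worker thread. */
typedef struct Request {
    Image *img;
} Request;

/* Stand-in for the existing, unmodified (blocking) qcow2 code. */
static void qcow2_do_operation(Image *img)
{
    img->ops_completed++;
}

/* Worker entry point: take the mutex, run the old code, release it. */
static void *worker(void *opaque)
{
    Request *req = opaque;

    pthread_mutex_lock(&req->img->mutex);
    qcow2_do_operation(req->img);
    pthread_mutex_unlock(&req->img->mutex);
    return NULL;
}

/*
 * Hand n requests to n worker threads and wait for all of them.
 * Returns the number of operations the image saw.
 */
static int run_requests(int n)
{
    Image img = { .ops_completed = 0 };
    pthread_t tids[MAX_REQUESTS];
    Request reqs[MAX_REQUESTS];
    int i;

    assert(n >= 0 && n <= MAX_REQUESTS);
    pthread_mutex_init(&img.mutex, NULL);
    for (i = 0; i < n; i++) {
        reqs[i].img = &img;
        if (pthread_create(&tids[i], NULL, worker, &reqs[i]) != 0) {
            abort();
        }
    }
    for (i = 0; i < n; i++) {
        pthread_join(tids[i], NULL);
    }
    pthread_mutex_destroy(&img.mutex);
    return img.ops_completed;
}
```

The caller (the main thread in this sketch) never takes the image mutex itself; only the workers contend for it, so a slow operation delays other operations on the same image but not the caller.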
== Eliminating the threading penalty ==
We can eliminate pointless context switches by using the worker thread
context we're in to issue the I/O. This is trivial for synchronous calls
(<code>bdrv_read</code> and <code>bdrv_write</code>); we simply issue
the I/O from the same thread we're currently in. The underlying raw block
format driver threading code needs to recognize that we're in a worker
thread context so that it doesn't use a worker thread of its own, perhaps
by checking a thread-local variable to see whether it is in the main thread
or an I/O worker thread.
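
One way the thread-local check could look is sketched below. Everything here is an assumption for illustration: <code>in_io_worker</code>, <code>raw_issue_io</code> and <code>issue_from_worker</code> are hypothetical names, and <code>__thread</code> is the GCC/Clang spelling of thread-local storage (C11 calls it <code>_Thread_local</code>).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-thread flag. Zero-initialized, so false in the
 * main thread and in any thread that never sets it. */
static __thread bool in_io_worker;

typedef enum { ISSUED_INLINE, HANDED_OFF } IssuePath;

/* Stand-in for the raw driver's submission path. */
static IssuePath raw_issue_io(void)
{
    if (in_io_worker) {
        /* Already in a worker: do the blocking I/O in this thread. */
        return ISSUED_INLINE;
    }
    /* Main thread: a real driver would queue to a worker here. */
    return HANDED_OFF;
}

/* Worker entry point: mark the thread, then issue the I/O. */
static void *worker(void *opaque)
{
    IssuePath *result = opaque;

    assert(result != NULL);
    in_io_worker = true;
    *result = raw_issue_io();
    return NULL;
}

/* Run raw_issue_io() from a fresh worker and report which path it took. */
static IssuePath issue_from_worker(void)
{
    pthread_t tid;
    IssuePath result = HANDED_OFF;

    if (pthread_create(&tid, NULL, worker, &result) != 0) {
        abort();
    }
    pthread_join(tid, NULL);
    return result;
}
```

Because the flag is per thread, setting it in a worker does not affect what the main thread sees.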
For asynchronous operations, this is harder. We may add
<code>bdrv_queue_aio_read</code> and <code>bdrv_queue_aio_write</code>
to replace a
 bdrv_aio_read();
 mutex_unlock(bs.mutex);
 return;
sequence. Alternatively, we can just eliminate asynchronous calls. To
retain concurrency, we drop the mutex while performing the operation,
converting a <code>bdrv_aio_read</code> to:
 mutex_unlock(bs.mutex);
 bdrv_read();
 mutex_lock(bs.mutex);
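
A slightly fuller sketch of the unlock/read/lock pattern follows, again with hypothetical names throughout (<code>Image</code>, <code>blocking_read</code>, <code>qcow2_read_locked</code>, <code>read_request</code>). The stand-in for the blocking read probes the mutex with <code>pthread_mutex_trylock</code> purely to demonstrate that other requests could acquire it during the I/O; real code would not do this.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an open image. */
typedef struct Image {
    pthread_mutex_t mutex;
    int lookups;               /* metadata work, protected by mutex */
    bool lock_free_during_io;  /* demo only: was the mutex available? */
} Image;

/* Stand-in for a synchronous bdrv_read(); fills buf with a pattern. */
static int blocking_read(Image *img, unsigned char *buf, size_t len)
{
    /* Demo only: check that another request could take the mutex now. */
    if (pthread_mutex_trylock(&img->mutex) == 0) {
        img->lock_free_during_io = true;
        pthread_mutex_unlock(&img->mutex);
    }
    memset(buf, 0xab, len);
    return 0;
}

/* Called with img->mutex held; returns with it held again. */
static int qcow2_read_locked(Image *img, unsigned char *buf, size_t len)
{
    int ret;

    img->lookups++;                      /* metadata work under the lock */

    pthread_mutex_unlock(&img->mutex);   /* let other requests run */
    ret = blocking_read(img, buf, len);
    pthread_mutex_lock(&img->mutex);     /* state may have changed meanwhile */
    return ret;
}

/* What a worker would run for one read request. */
static int read_request(Image *img, unsigned char *buf, size_t len)
{
    int ret;

    assert(img != NULL && buf != NULL);
    pthread_mutex_lock(&img->mutex);
    ret = qcow2_read_locked(img, buf, len);
    pthread_mutex_unlock(&img->mutex);
    return ret;
}
```

One consequence worth keeping in mind: anything read from shared image state before the unlock may be stale after the mutex is reacquired, so code following the read has to revalidate whatever it still depends on.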