qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v1 01/18] block/io: add bdrv_aio_{preadv, pwrite


From: Pavel Butsykin
Subject: Re: [Qemu-devel] [PATCH v1 01/18] block/io: add bdrv_aio_{preadv, pwritev}
Date: Thu, 24 Nov 2016 18:10:25 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 24.11.2016 15:36, Kevin Wolf wrote:
Am 24.11.2016 um 11:58 hat Pavel Butsykin geschrieben:
On 23.11.2016 17:28, Kevin Wolf wrote:
Am 15.11.2016 um 07:36 hat Pavel Butsykin geschrieben:
It's just byte-based wrappers over bdrv_co_aio_prw_vector(), which provide
  a byte-based interface for AIO read/write.

Signed-off-by: Pavel Butsykin <address@hidden>

I'm in the process to phase out the last users of bdrv_aio_*() so that
this set of interfaces can be removed. I'm doing this because it's an
unnecessary redundancy, we have too many wrapper functions that expose
the same functionality with different syntax. So let's not add new
users.

At first sight, you don't even seem to use bdrv_aio_preadv() for actual
parallelism, but you often have a pattern like this:

     void foo_cb(void *opaque)
     {
         ...
         qemu_coroutine_enter(acb->co);
     }

     void caller()
     {
         ...
         acb = bdrv_aio_preadv(...);
         qemu_coroutine_yield();
     }

The code will actually become a lot simpler if you use bdrv_co_preadv()
instead because you don't have to have a callback, but you get pure
sequential code.

The part that actually has some parallelism, pcache_readahead_request(),
already creates its own coroutine, so it runs in the background without
using callback-style interfaces.

I used bdrv_co_preadv(), because it conveniently solves the partial
cache hit. To solve the partial cache hit, we need to split a request
into smaller parts, make asynchronous requests and wait for all
requests in one place.

Do you propose to create a coroutine for each part of request? It
seemed to me that bdrv_co_preadv() is a wrapper that allows us to get
rid of the same code.

It's actually the other way round, bdrv_co_preadv() is the "native"
block layer API, and bdrv_aio_*() are wrappers providing an alternative
interface.


Sorry, I mixed up bdrv_co_preadv() and drv_aio_preadv() :)

Also I forgot that in this version, pcache can't split one request into many small requests(as in RFC version).

I'm looking at pcache_co_readahead(), for example. It looks like this:

     bdrv_aio_preadv(bs->file, node->common.offset, &readahead_acb.qiov,
                     node->common.bytes, pcache_aio_readahead_cb,
                     &readahead_acb);
     qemu_coroutine_yield();

And then we have pcache_aio_readahead_cb(), which ends in:

     qemu_coroutine_enter(acb->co);

So here the callback style doesn't buy you anything, it just rips the
code apart in two function. There is no parallelism here anyway,
pcache_co_readahead() doesn't do anything until the callback reenters
it. This is a very obvious example where bdrv_co_preadv() will simplify
the code.


I agree.

It's similar with the other bdrv_aio_preadv() calls, which are in
pcache_co_preadv():

         if (bytes > s->max_aio_size) {
             bdrv_aio_preadv(bs->file, offset, qiov, bytes,
                             pcache_aio_read_cb, &acb);
             goto out;
         }

         update_req_stats(s->req_stats, offset, bytes);

         status = pcache_lookup_data(&acb);
         if (status == CACHE_MISS) {
             bdrv_aio_preadv(bs->file, offset, qiov, bytes,
                             pcache_aio_read_cb, &acb);
         } else if (status == PARTIAL_CACHE_HIT) {
             assert(acb.part.qiov.niov != 0);
             bdrv_aio_preadv(bs->file, acb.part.offset, &acb.part.qiov,
                             acb.part.bytes, pcache_aio_read_cb, &acb);
         }

         pcache_readahead_request(&acb);

         if (status == CACHE_HIT && --acb.ref == 0) {
             return 0;
         }

     out:
         qemu_coroutine_yield();

Here you have mainly the pcache_readahead_request() call between
bdrv_aio_preadv() and the yield. It only spawns a new coroutine, which
works in the background, so I think you can move it to before the reads
and then the reads can trivially become bdrv_co_preadv() and the
callback can again be inlined instead of ripping the function in two
parts.


The bdrv_aio_pwritev() call in pcache_co_pwritev() is just the same
thing and using the coroutine version results in obvious code
improvements.


And I think this are all uses of bdrv_aio_*() in the pcache driver, so
converting it to use bdrv_co_*() instead isn't only possible, but will
improve the legibility of your code, too. It's a clear win in all three
places.


Yes, you're right, this scheme was acceptable when the caller was
bdrv_aio_*(), but now it just confuses. Thanks for the explanation!

Kevin




reply via email to

[Prev in Thread] Current Thread [Next in Thread]