[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context pr
From: |
Denis Plotnikov |
Subject: |
Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section" |
Date: |
Tue, 11 Dec 2018 16:55:00 +0000 |
On 07.12.2018 15:26, Kevin Wolf wrote:
> Am 05.12.2018 um 13:23 hat Denis Plotnikov geschrieben:
>> At the time, the "drained section" doesn't protect Block Driver State
>> from the requests appearing in the vCPU threads.
>> This could lead to the data loss because of request coming to
>> an unexpected BDS.
>>
>> For example, when a request comes to ide controller from the guest,
>> the controller creates a request coroutine and executes the coroutine
>> in the vCPU thread. If another thread(iothread) has entered the
>> "drained section" on a BDS with bdrv_drained_begin, which protects
>> BDS' AioContext from external requests, and released the AioContext
>> because of finishing some coroutine by the moment of the request
>> appearing at the ide controller, the controller acquires the AioContext
>> and executes its request without any respect to the entered
>> "drained section" producing any kinds of data inconsistency.
>>
>> The patch prevents this case by putting requests from external threads to
>> the queue on AioContext while the context is protected for external requests
>> and executes those requests later on the external requests protection
>> removing.
>>
>> Also, the patch marks requests generated in a vCPU thread as external ones
>> to make use of the request postponing.
>>
>> How to reproduce:
>> 1. start vm with an ide disk and a linux guest
>> 2. in the guest run: dd if=... of=... bs=4K count=100000 oflag=direct
>> 3. (qemu) drive_mirror "disk-name"
>> 4. wait until block job can receive block_job_complete
>> 5. (qemu) block_job_complete "disk-name"
>> 6. blk_aio_p[read|write]v may appear in vCPU context (here is the race)
>>
>> Signed-off-by: Denis Plotnikov <address@hidden>
>
> This is getting closer, but I'd like to see two more major changes:
>
>> diff --git a/include/block/aio.h b/include/block/aio.h
>> index 0ca25dfec6..8512bda44e 100644
>> --- a/include/block/aio.h
>> +++ b/include/block/aio.h
>> @@ -19,6 +19,7 @@
>> #include "qemu/event_notifier.h"
>> #include "qemu/thread.h"
>> #include "qemu/timer.h"
>> +#include "qemu/coroutine.h"
>>
>> typedef struct BlockAIOCB BlockAIOCB;
>> typedef void BlockCompletionFunc(void *opaque, int ret);
>> @@ -130,6 +131,11 @@ struct AioContext {
>> QEMUTimerListGroup tlg;
>>
>> int external_disable_cnt;
>> + /* Queue to store the requests coming when the context is disabled for
>> + * external requests.
>> + * Don't use a separate lock for protection relying the context lock
>> + */
>> + CoQueue postponed_reqs;
>
> Why involve the AioContext at all? This could all be kept at the
> BlockBackend level without extending the layering violation that
> aio_disable_external() is.
>
> BlockBackends get notified when their root node is drained, so hooking
> things up there should be as easy, if not even easier than in
> AioContext.
Just want to make sure that I understood correctly what you meant by
"BlockBackends get notified". Did you mean that bdrv_drain_end calls
child's role callback blk_root_drained_end by calling
bdrv_parent_drained_end?
In case if it's so, it won't work if resume postponed requests in
blk_root_drained_end since we can't know if external is disabled for the
context because the counter showing that is decreased only after roles'
drained callbacks are finished at bdrv_do_drained_end.
Please correct me if I'm wrong.
Looking at the patch again, I think that it might be useful to keep the
requests in the structure that limits their execution and also protects
the access (context acquire/release) although it's indeed the layering
violation but at least we can store the parts related at the same place
and later on move somewhere else alongside the request restrictor.
Denis
>
>> /* Number of AioHandlers without .io_poll() */
>> int poll_disable_cnt;
>> @@ -483,6 +489,15 @@ static inline void aio_timer_init(AioContext *ctx,
>> */
>> int64_t aio_compute_timeout(AioContext *ctx);
>>
>> +/**
>> + * aio_co_enter:
>> + * @ctx: the context to run the coroutine
>> + * @co: the coroutine to run
>> + *
>> + * Enter a coroutine in the specified AioContext.
>> + */
>> +void aio_co_enter(AioContext *ctx, struct Coroutine *co);
>> +
>> /**
>> * aio_disable_external:
>> * @ctx: the aio context
>> @@ -491,9 +506,17 @@ int64_t aio_compute_timeout(AioContext *ctx);
>> */
>> static inline void aio_disable_external(AioContext *ctx)
>> {
>> + aio_context_acquire(ctx);
>> atomic_inc(&ctx->external_disable_cnt);
>> + aio_context_release(ctx);
>> }
>
> This acquire/release pair looks rather useless?
>
>> +static void run_postponed_co(void *opaque)
>> +{
>> + AioContext *ctx = (AioContext *) opaque;
>> +
>> + qemu_co_queue_restart_all(&ctx->postponed_reqs);
>> +}
>> /**
>> * aio_enable_external:
>> * @ctx: the aio context
>> @@ -504,12 +527,17 @@ static inline void aio_enable_external(AioContext *ctx)
>> {
>> int old;
>>
>> + aio_context_acquire(ctx);
>> old = atomic_fetch_dec(&ctx->external_disable_cnt);
>> assert(old > 0);
>> if (old == 1) {
>> + Coroutine *co = qemu_coroutine_create(run_postponed_co, ctx);
>> + aio_co_enter(ctx, co);
>> +
>> /* Kick event loop so it re-arms file descriptors */
>> aio_notify(ctx);
>> }
>> + aio_context_release(ctx);
>> }
>>
>> /**
>> @@ -564,15 +592,6 @@ void aio_co_schedule(AioContext *ctx, struct Coroutine
>> *co);
>> */
>> void aio_co_wake(struct Coroutine *co);
>>
>> -/**
>> - * aio_co_enter:
>> - * @ctx: the context to run the coroutine
>> - * @co: the coroutine to run
>> - *
>> - * Enter a coroutine in the specified AioContext.
>> - */
>> -void aio_co_enter(AioContext *ctx, struct Coroutine *co);
>> -
>> /**
>> * Return the AioContext whose event loop runs in the current thread.
>> *
>> diff --git a/include/block/block.h b/include/block/block.h
>> index 7f5453b45b..0685a73975 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -83,8 +83,14 @@ typedef enum {
>> */
>> BDRV_REQ_SERIALISING = 0x80,
>>
>> + /*
>> + * marks requests comming from external sources,
>> + * e.g block requests from dma running in the vCPU thread
>> + */
>> + BDRV_REQ_EXTERNAL = 0x100,
>
> I don't like this one because BlockBackends should be used _only_ from
> external sources.
>
> I know that this isn't quite true today and there are still block jobs
> calling BlockBackend internally while handling a BDS request, but I hope
> with Vladimir's backup patches going it, this can go away.
>
> So I suggest that for the time being, we invert the flag and have a
> BDRV_REQ_INTERNAL instead, which is only used for BlockBackend requests,
> hopefully doesn't have to be used in many places and which can go away
> eventually.
>
>> /* Mask of valid flags */
>> - BDRV_REQ_MASK = 0xff,
>> + BDRV_REQ_MASK = 0xfff,
>> } BdrvRequestFlags;
>>
>> typedef struct BlockSizes {
>
> Kevin
>
--
Best,
Denis
- [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Denis Plotnikov, 2018/12/05
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Kevin Wolf, 2018/12/07
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Denis Plotnikov, 2018/12/10
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section",
Denis Plotnikov <=
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Kevin Wolf, 2018/12/12
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Denis Plotnikov, 2018/12/13
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Kevin Wolf, 2018/12/13
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Denis Plotnikov, 2018/12/14
- Re: [Qemu-devel] [PATCH] blk: postpone request execution on a context protected with "drained section", Denis Plotnikov, 2018/12/18