[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH for-1.4 stable] block: handle spurious coroutine
From: |
Kevin Wolf |
Subject: |
Re: [Qemu-devel] [PATCH for-1.4 stable] block: handle spurious coroutine entries |
Date: |
Mon, 11 Feb 2013 14:00:58 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0 |
Am 11.02.2013 13:42, schrieb Paolo Bonzini:
> Il 11/02/2013 13:29, Kevin Wolf ha scritto:
>> The bug is not in this function but in process_incoming_migration(). It
>> should never blindly enter a coroutine which is in an unknown state.
>
> process_incoming_migration() is the function that _creates_ the
> coroutine and enters it for the first time, so there shouldn't be any
> problem there.
>
> The function we're looking at should be, as you write correctly,
> enter_migration_coroutine().
Yes, Stefan's commit message confused me.
>> If it can reenter here, it can reenter anywhere in block drivers, and
>> adding a workaround to one place that just yields again if it got an
>> early reentrance doesn't fix the real bug.
>>
>> Which is the yield that corresponds to the enter in
>> enter_migration_coroutine()? We need to add some state that can be used
>> to make sure the enter happens only for this specific yield.
>
> In this case the corresponding yield is in qemu-coroutine-io.c, and it
> is driven by read-availability of the migration file descriptor. So it
> is possible to test before entering the coroutine, but really ugly (not
> just because it would cost one extra system call).
So the problem is that qemu-coroutine-io.c has a bad interface that
yields without scheduling the reenter itself. Wouldn't it better if it
was qemu_co_sendv_recvv() which registered an fd handler and handled the
yield/reenter stuff transparently?
> But why is enter_migration_coroutine() being called at all? bdrv_write
> should be a synchronous call, it should run its own qemu_aio_wait()
> event loop and that loop doesn't include the migration file descriptor.
Why doesn't it include the migration fd? Wouldn't we get such behaviour
only if we had different AioContexts?
Kevin