qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH for-1.4 stable] block: handle spurious coroutine


From: Kevin Wolf
Subject: Re: [Qemu-devel] [PATCH for-1.4 stable] block: handle spurious coroutine entries
Date: Mon, 11 Feb 2013 14:29:41 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120605 Thunderbird/13.0

Am 11.02.2013 14:17, schrieb Stefan Hajnoczi:
> On Mon, Feb 11, 2013 at 1:42 PM, Paolo Bonzini <address@hidden> wrote:
>> Il 11/02/2013 13:29, Kevin Wolf ha scritto:
>>> The bug is not in this function but in process_incoming_migration(). It
>>> should never blindly enter a coroutine which is in an unknown state.
>>
>> process_incoming_migration() is the function that _creates_ the
>> coroutine and enters it for the first time, so there shouldn't be any
>> problem there.
>>
>> The function we're looking at should be, as you write correctly,
>> enter_migration_coroutine().
>>
>>> If it can reenter here, it can reenter anywhere in block drivers, and
>>> adding a workaround to one place that just yields again if it got an
>>> early reentrance doesn't fix the real bug.
>>>
>>> Which is the yield that corresponds to the enter in
>>> enter_migration_coroutine()? We need to add some state that can be used
>>> to make sure the enter happens only for this specific yield.
>>
>> In this case the corresponding yield is in qemu-coroutine-io.c, and it
>> is driven by read-availability of the migration file descriptor.  So it
>> is possible to test before entering the coroutine, but really ugly (not
>> just because it would cost one extra system call).
>>
>> But why is enter_migration_coroutine() being called at all?  bdrv_write
>> should be a synchronous call, it should run its own qemu_aio_wait()
>> event loop and that loop doesn't include the migration file descriptor.
> 
> qemu_aio_wait() is not called because we are already in a coroutine -
> the migration coroutine.  bdrv_write() yields instead of looping on
> qemu_aio_wait().

Right. But that's not really a problem, then the main loop will call the
necessary AIO callbacks that reenter the coroutines - it's a superset of
qemu_aio_wait().

> If we want coroutine_fn to be composable, they must handle spurious
> entries.  We could add code to clear the socket fd handler while we're
> not waiting for it but I think handling spurious entries is the most
> robust solution.

Either clear the fd handler or modify the handler function to check the
state of the coroutine before it reenters (like block_job_resume() does,
for example)

So far the rule has been that each enter corresponds to a well-known
yield. If you really want to change this rule, you'd have to fix things
all over the place because this is an assumption that most if not all
coroutine based code in qemu relies on. I don't want to be the one to
review this change, and I don't think it's a good idea either...

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]