qemu-devel

Re: QMP (without OOB) function running in thread different from the main


From: Fiona Ebner
Subject: Re: QMP (without OOB) function running in thread different from the main thread as part of aio_poll
Date: Wed, 26 Apr 2023 16:31:34 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0

On 20.04.23 at 08:55, Paolo Bonzini wrote:
> 
> 
> On Thu, 20 Apr 2023, 08:11, Markus Armbruster <armbru@redhat.com
> <mailto:armbru@redhat.com>> wrote:
> 
>     So, splicing in a bottom half unmoored monitor commands from the main
>     loop.  We weren't aware of that, as our commit messages show.
> 
>     I guess the commands themselves don't care; all they need is the BQL.
> 
>     However, did we unwittingly change what can get blocked?  Before,
>     monitor commands could block only the main thread.  Now they can also
>     block vCPU threads.  Impact?
> 
> 
> Monitor commands could always block vCPU threads through the BQL(*).
> However, aio_poll() only runs in the vCPU threads in very special cases,
> typically associated with resetting a device, which causes a blk_drain() on
> the device's BlockBackend. So it is not a performance issue.
> 

AFAIU, all generated coroutine wrappers use aio_poll. In my backtrace
aio_poll happens via blk_pwrite for a pflash device. So a bit more often
than "very special cases" ;)
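
To illustrate why a monitor command can end up running in whichever thread
calls such a wrapper, here is a minimal, self-contained sketch in plain C. It
is a toy model, not QEMU code: the sketch_* names, the fixed-size queue and the
integer thread ids are all made up for illustration. The assumption it encodes
is that the wrapper polls the AioContext until its own request completes, and
that polling runs whatever bottom halves are pending, including one that
dispatches a QMP command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_BH 8

enum { SKETCH_MAIN_THREAD = 0, SKETCH_VCPU_THREAD = 1 };

typedef void SketchBHFunc(void *opaque);

/* Hypothetical stand-in for an AioContext: just a FIFO of bottom halves. */
typedef struct SketchCtx {
    SketchBHFunc *fn[SKETCH_MAX_BH];
    void *opaque[SKETCH_MAX_BH];
    size_t head, tail;
} SketchCtx;

/* Records whether a bottom half ran and in which (model) thread. */
typedef struct SketchRecord {
    bool done;
    int ran_in_thread;
} SketchRecord;

/* Id of the thread that is currently inside sketch_poll(). */
static int sketch_polling_thread;

static void sketch_bh_schedule(SketchCtx *ctx, SketchBHFunc *fn, void *opaque)
{
    assert(ctx->tail < SKETCH_MAX_BH);
    ctx->fn[ctx->tail] = fn;
    ctx->opaque[ctx->tail] = opaque;
    ctx->tail++;
}

/* Run one pending bottom half in the calling thread; false if none pending. */
static bool sketch_poll(SketchCtx *ctx, int calling_thread)
{
    if (ctx->head == ctx->tail) {
        return false;
    }
    size_t i = ctx->head++;
    sketch_polling_thread = calling_thread;
    ctx->fn[i](ctx->opaque[i]);
    return true;
}

static void sketch_record_bh(void *opaque)
{
    SketchRecord *r = opaque;
    r->done = true;
    r->ran_in_thread = sketch_polling_thread;
}

/* Shape of a synchronous wrapper: start the request, then poll until done. */
static void sketch_blk_pwrite(SketchCtx *ctx, int calling_thread,
                              SketchRecord *io)
{
    /* Stands in for the I/O completion arriving later. */
    sketch_bh_schedule(ctx, sketch_record_bh, io);
    while (!io->done) {
        bool progress = sketch_poll(ctx, calling_thread);
        assert(progress);
        (void)progress;
    }
}

/* Returns the id of the thread in which the queued "QMP" bottom half ran. */
int sketch_where_does_qmp_run(void)
{
    SketchCtx ctx = {0};
    SketchRecord qmp = { false, -1 };
    SketchRecord io = { false, -1 };

    /* The monitor queued its dispatch bottom half before the guest write. */
    sketch_bh_schedule(&ctx, sketch_record_bh, &qmp);
    /* A vCPU thread now performs a synchronous write, e.g. to pflash. */
    sketch_blk_pwrite(&ctx, SKETCH_VCPU_THREAD, &io);
    return qmp.ran_in_thread;
}
```

In this model sketch_where_does_qmp_run() reports SKETCH_VCPU_THREAD: the
pending "QMP" bottom half is picked up by the poll loop of the thread doing
the write, not by the main loop.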

> However, liberal reuse of the main block layer AioContext could indeed
> be a *correctness* issue. I need to re-read Fiona's report instead of
> stopping at the first three lines because it's the evening. :)

For me, being called in a vCPU thread caused problems with a custom QMP
function patched in by Proxmox. The function uses a newly opened
BlockBackend and calls qemu_mutex_unlock_iothread() after which
qemu_get_current_aio_context() returns 0x0 (when running in the main
thread, it still returns the main thread's AioContext). It then calls
blk_pwritev which is also a generated coroutine wrapper and the
assert(qemu_get_current_aio_context() == qemu_get_aio_context());
in the else branch of the AIO_WAIT_WHILE_INTERNAL macro fails.
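
The difference between the two threads can be modelled as below. This is a
simplified sketch based on how I understand qemu_get_current_aio_context() to
behave: return the thread's own AioContext if it has one, else the main
AioContext while the BQL is held, else NULL. The model_* names and the explicit
per-thread struct are hypothetical and only stand in for QEMU's thread-local
state; nothing here is actual QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for AioContext; only its address matters here. */
typedef struct ModelAioContext {
    int unused;
} ModelAioContext;

/* Per-thread state that QEMU keeps in thread-local variables. */
typedef struct ModelThread {
    ModelAioContext *my_ctx; /* set for the main thread, NULL for a vCPU */
    bool bql_held;
} ModelThread;

static ModelAioContext model_main_ctx;

/* Assumed lookup order: own context, then main context if BQL is held. */
static ModelAioContext *model_get_current_aio_context(const ModelThread *t)
{
    assert(t != NULL);
    if (t->my_ctx) {
        return t->my_ctx;
    }
    if (t->bql_held) {
        return &model_main_ctx;
    }
    return NULL;
}

/*
 * Models the custom QMP handler: it is entered with the BQL held, drops it,
 * and then calls a wrapper that requires the current context to be the main
 * one. Returns true if that requirement would be met.
 */
static bool model_wrapper_assert_holds(ModelThread *t)
{
    t->bql_held = true;  /* state on entry to the handler */
    t->bql_held = false; /* models qemu_mutex_unlock_iothread() */
    return model_get_current_aio_context(t) == &model_main_ctx;
}

bool model_handler_ok_in_main_thread(void)
{
    ModelThread main_thread = { &model_main_ctx, false };
    return model_wrapper_assert_holds(&main_thread);
}

bool model_handler_ok_in_vcpu_thread(void)
{
    ModelThread vcpu_thread = { NULL, false };
    return model_wrapper_assert_holds(&vcpu_thread);
}
```

Under these assumptions model_handler_ok_in_main_thread() is true and
model_handler_ok_in_vcpu_thread() is false, which matches what I see: the
handler only trips the assertion when it happens to run in a vCPU thread.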

Sounds like there's room for improvement in our code :/ I'm not aware of
something similar in upstream QEMU.

Thanks to Markus for the detailed history lesson!

Best Regards,
Fiona



