Re: [Qemu-stable] Recent patches for 2.4


From: Peter Lieven
Subject: Re: [Qemu-stable] Recent patches for 2.4
Date: Tue, 04 Aug 2015 13:57:01 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 04.08.2015 at 13:53, Paolo Bonzini wrote:

On 04/08/2015 11:22, Peter Lieven wrote:
edec47c main-loop: fix qemu_notify_event for aio_notify optimization
Part of the above AioContext series.
So either the whole series or none of them I guess?
It's a separate bug, and theoretically it's there in 2.3.1 as well, but
no one ever reproduced it (it would hang in make check) so not
worthwhile.
Can you give me a pointer to what the symptoms were?
If a thread tries to wake up the main thread using qemu_notify_event(),
the main thread will never wake up.  This for example could happen if
the first thread calls qemu_set_fd_handler() or timer_mod().
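To illustrate the kind of cross-thread wake-up described above, here is a minimal generic sketch using a pipe that the loop thread polls. This is not QEMU's implementation of qemu_notify_event(); the names notifier_init, notifier_notify and notifier_wait are hypothetical and exist only for this example. If the notify step is skipped or lost, the waiting side simply never sees the pipe become readable, which matches the symptom of a loop that never wakes up.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical notifier: the loop thread polls rfd, other threads write to wfd. */
struct notifier {
    int rfd;
    int wfd;
};

/* Create the pipe and make both ends non-blocking. Returns 0 on success, -1 on error. */
static int notifier_init(struct notifier *n)
{
    int fds[2];

    assert(n != NULL);
    if (pipe(fds) < 0) {
        return -1;
    }
    if (fcntl(fds[0], F_SETFL, O_NONBLOCK) < 0 ||
        fcntl(fds[1], F_SETFL, O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/* Called from any thread: make the read end readable so the poller wakes up. */
static void notifier_notify(struct notifier *n)
{
    char byte = 1;
    ssize_t r = write(n->wfd, &byte, 1);

    (void)r; /* a full pipe already guarantees a pending wake-up */
}

/* Called from the loop thread. Returns 1 if woken, 0 on timeout, -1 on error. */
static int notifier_wait(struct notifier *n, int timeout_ms)
{
    struct pollfd p = { .fd = n->rfd, .events = POLLIN };
    char buf[64];
    int r = poll(&p, 1, timeout_ms);

    if (r <= 0) {
        return r;
    }
    /* Drain everything so the next wait blocks until a new notification. */
    while (read(n->rfd, buf, sizeof(buf)) > 0) {
        /* discard */
    }
    return 1;
}
```

A loop thread would call notifier_wait() with its poll timeout, and any other thread would call notifier_notify() after changing state the loop needs to see.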

I have a qemu-img convert job on x86_64 that reproducibly hangs on
bdrv_drain_all at the end of the convert process.
I convert from nfs:// to local storage here. I am trying to figure out which BS
reports busy. Qemu here is still 2.2.1.
qemu-img does not use main-loop, so this cannot be the cause.

The AioContext bugs only happen when you have a thread executing the
main loop and one thread executing aio_poll, so they can also be
excluded as the cause of qemu-img problems.

Okay, what I found out is that in aio_poll I get revents = POLLIN for
the nfs file descriptor, but there is no data available on the socket.
As a consequence, progress is true and we loop here forever.

I have seen that it is a common bug in Linux to return POLLIN on an fd
even when there is no data available. I don't have this problem in general;
otherwise no qemu-img or qemu process would ever terminate when
nfs is involved. But in this special case it happens reproducibly.
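One way to check for this condition outside of QEMU is to compare what poll() reports with the number of bytes the kernel says are pending. The following is a minimal sketch, assuming Linux and a socket fd; bytes_pending and readable_with_data are hypothetical helper names, not QEMU or libnfs functions. Note that POLLIN with zero pending bytes is also what a stream socket reports when the peer has closed the connection, so a zero count alone does not prove a kernel bug.

```c
#include <assert.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Number of bytes readable on fd right now, or -1 on error. */
static int bytes_pending(int fd)
{
    int n = 0;

    assert(fd >= 0);
    if (ioctl(fd, FIONREAD, &n) < 0) {
        return -1;
    }
    return n;
}

/*
 * Poll fd once for POLLIN.
 * Returns 1 if POLLIN is set and data is actually pending,
 * 0 if fd is not readable or POLLIN came with no pending data,
 * -1 on error.
 */
static int readable_with_data(int fd, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int r = poll(&p, 1, timeout_ms);
    int n;

    if (r < 0) {
        return -1;
    }
    if (r == 0 || !(p.revents & POLLIN)) {
        return 0;
    }
    n = bytes_pending(fd);
    if (n < 0) {
        return -1;
    }
    return n > 0 ? 1 : 0;
}
```

A caller that treats a 0 result as "no progress" instead of looping again would avoid spinning on a descriptor that keeps reporting POLLIN without data.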

Peter



