Re: [Qemu-devel] [libvirt] [PATCH] qemu: Fix shutdown regression


From: Anthony Liguori
Subject: Re: [Qemu-devel] [libvirt] [PATCH] qemu: Fix shutdown regression
Date: Tue, 20 Sep 2011 15:12:59 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110516 Lightning/1.0b2 Thunderbird/3.1.10

On 09/20/2011 02:03 PM, Eric Blake wrote:
On 09/20/2011 12:52 PM, Anthony Liguori wrote:
On 09/20/2011 01:01 PM, Eric Blake wrote:
On 09/20/2011 11:39 AM, Jiri Denemark wrote:
The commit that prevents disk corruption on domain shutdown
(96fc4784177ecb70357518fa863442455e45ad0e) causes a regression with QEMU
0.14.* and 0.15.* because of a bug in QEMU that was fixed
only recently in QEMU git. With affected QEMU binaries, domains cannot
be shut down properly and stay in a paused state. This patch tries to
avoid this by sending SIGKILL to 0.1[45].* QEMU processes, though we
wait a bit longer between sending SIGTERM and SIGKILL to reduce the
possibility of virtual disk corruption.
---
src/qemu/qemu_capabilities.c | 7 +++++++
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_process.c | 19 +++++++++++++------
3 files changed, 21 insertions(+), 6 deletions(-)
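
The escalation described in the commit message can be sketched roughly as follows. This is a Python illustration only, not the libvirt C code from the patch: the function names, the grace period, and the choice to leave unaffected versions running after a timeout are all assumptions made for the example.

```python
import signal
import subprocess


def parse_version(text):
    """Turn a dotted version string such as '0.14.1' into (0, 14, 1)."""
    return tuple(int(part) for part in text.split("."))


def needs_sigkill_fallback(version):
    """True for the 0.14.* and 0.15.* series named in the commit message."""
    return version[:2] in ((0, 14), (0, 15))


def stop_process(proc, grace_seconds, allow_kill):
    """Send SIGTERM, wait up to grace_seconds, then optionally send SIGKILL.

    proc is a subprocess.Popen object. Returns 'terminated', 'killed',
    or 'still running' (assumed behaviour when SIGKILL is not allowed).
    """
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"
    except subprocess.TimeoutExpired:
        if not allow_kill:
            return "still running"
        proc.kill()  # SIGKILL on POSIX systems
        proc.wait()
        return "killed"
```

A caller might use `stop_process(proc, 15.0, needs_sigkill_fallback(parse_version("0.15.0")))`, where the 15-second grace period is an arbitrary placeholder rather than the value used in the patch.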

ACK. But it would be nice if upstream qemu could give us a more reliable
indication of whether the qemu SIGTERM bug is fixed, so that we don't corrupt
data on a patched 0.14 or 0.15 qemu.

Can you be a lot more specific about what bug you mean?


https://bugzilla.redhat.com/show_bug.cgi?id=739895

That just got applied last week, so no, it's not in any release right now.


That is, as part of fixing the bug in qemu, we should also update -help text or
something similar, so that libvirt can avoid making decisions solely on version
numbers.

The version number *is* the right way to make decisions. We've gone
through this dozens of times.

The fact that distros backport all sorts of stuff means that you need to
maintain a matrix of versions with features. It's not our (upstream
QEMU's) responsibility to tell you the differences that exist in forks
of QEMU.

Version numbers are lousy, precisely because they are not granular enough.
That's why the autoconf philosophy frowns so heavily on version checks, and
prefers feature checks instead.

We want to know which features are present,

Features and bugs are different things. I'm all for providing ways to detect whether we support certain commands in QMP, command line options, etc.
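
As an illustration of the kind of feature detection meant here, a management tool could inspect the reply to QMP's `query-commands` and test for a command by name. The sketch below is in Python and works on a hand-written sample reply, not output captured from a real QEMU; the helper name is an assumption.

```python
import json


def supported_commands(reply_text):
    """Return the set of command names listed in a query-commands reply."""
    reply = json.loads(reply_text)
    return {entry["name"] for entry in reply["return"]}


# Hand-written sample in the shape of a query-commands reply (abbreviated).
SAMPLE_REPLY = (
    '{"return": [{"name": "quit"}, '
    '{"name": "system_powerdown"}, '
    '{"name": "query-status"}]}'
)

commands = supported_commands(SAMPLE_REPLY)
has_powerdown = "system_powerdown" in commands
```

This answers "is this command available?", which is a different question from "is this bug fixed?".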

 not which versions introduced which
features. In this case, we want to know about a particular feature (SIGTERM is
not broken), which we know exists later than 0.15, but which might also exist as
a backport in 0.14 or 0.15.

No, you want to know whether d9389b9664df561db796b18eb8309fffe58faf8b exists in this build of QEMU. But what makes d9389b more important than d296363 or db118fe72?

If you want to know whether a bug is fixed that is important to *you*, then you should check the git log corresponding to that version and embed that info in libvirt. Then libvirt is entirely empowered to add whatever bug fixes you think are important to the table that you maintain.
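
A table of that kind could look something like the following Python sketch. The layout, the function name, and the "exampledistro" entry are hypothetical and only show the shape of the idea; the only real identifier is the commit hash quoted above.

```python
# Commit hash of the SIGTERM fix, as quoted in this thread.
SIGTERM_FIX = "d9389b9664df561db796b18eb8309fffe58faf8b"

# Hypothetical table maintained on the libvirt side:
# (vendor, version string) -> fix commits known to be in that build.
KNOWN_FIXES = {
    ("upstream", "0.14.1"): frozenset(),
    ("upstream", "0.15.0"): frozenset(),
    # Made-up distro build that carries a backport of the fix.
    ("exampledistro", "0.15.0-3"): frozenset({SIGTERM_FIX}),
}


def has_fix(vendor, version, commit):
    """Return True only if the table says this build contains the commit.

    Builds missing from the table are treated pessimistically (no fix).
    """
    return commit in KNOWN_FIXES.get((vendor, version), frozenset())
```

Under this scheme the question "is SIGTERM safe here?" becomes a lookup such as `has_fix("upstream", "0.15.0", SIGTERM_FIX)`, and the cost is that someone has to keep the table current for every build they care about.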

If qemu tells us that information, then upstream
libvirt can make the decision correctly regardless of how distros backport the
patch. But if qemu does not expose the information, then upstream libvirt must
be pessimistic, and you've now forced the distros to do double-duty - they must
backport both the qemu fix, and write a distro-specific libvirt patch that
alters the version matrix to play with the distro build of qemu.

Or distros could use the QEMU stable branch as their base and invest in backporting to QEMU stable instead of maintaining private backport trees.

Regards,

Anthony Liguori



