[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.2?] block/qcow2-bitmap: fix crash bug in qcow2_co_remov

From: Eric Blake
Subject: Re: [PATCH for-4.2?] block/qcow2-bitmap: fix crash bug in qcow2_co_remove_persistent_dirty_bitmap
Date: Thu, 5 Dec 2019 15:53:45 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2

On 12/5/19 2:16 PM, John Snow wrote:

Last minute edit: hmm, actually, transaction action introduced in
4.2, so crash is not a regression, only broken block-dirty-bitmap-remove
command is a regression... Maybe it's OK for stable.

Libvirt REALLY wants to use transaction bitmap management (and require
qemu 4.2) for its incremental backup patches that Peter is almost ready
to merge in.  I'm trying to ascertain:

When exactly can this bug hit?  Can you give a sequence of QMP commands
that will trigger it for easy reproduction?  Are there workarounds (such
as checking that a bitmap exists prior to attempting to remove it)?  If
it does NOT get fixed in rc5, how will libvirt be able to probe whether
the fix is in place?

It looks like:

- You need to have at least one bitmap
- You need to use transactional remove
- you need to misspell the bitmap name
- The remove will fail (return -EINVAL) but doesn't set errp
- The caller chokes on incomplete information, state->bitmap is NULL

So in libvirt's case, as long as libvirt manages bitmaps completely, it's a bug in libvirt to request deletion of a bitmap that doesn't exist. Or, someone messes with a qcow2 image of an offline guest behind libvirt's back without updating libvirt's metadata of what bitmaps should exist, and then if libvirt fails to check that a bitmap actually exists, a user may be able to coerce libvirt into requesting a bitmap deletion that will cause a qemu crash, but that's the user's fault for going behind libvirt's back. Or, libvirt could add code that instead of trying to blindly delete a bitmap, it first makes a QMP call to ensure the bitmap still exists (annoying, but harmless even when the bug is fixed), instead of blaming the bug on the user operating behind libvirt's back.

The bug is nasty, but feels to be enough of a corner case that I think deferring to 5.0 with CC: stable (and then downstreams can backport it at will) is the right approach; no need to hold up 4.2 if this is the only flaw. But I'm also not opposed to it going in 4.2 if we have anything else serious.

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

reply via email to

[Prev in Thread] Current Thread [Next in Thread]