qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command


From: Eric Blake
Subject: Re: [Qemu-devel] [PATCH v4 10/10] Add the drive-reopen command
Date: Wed, 14 Mar 2012 07:11:01 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.1) Gecko/20120216 Thunderbird/10.0.1

On 03/14/2012 03:34 AM, Kevin Wolf wrote:

>> That said, reopen is really hard to be implemented as a transaction
>> without breaking that rule. For example in the blkmirror case you'd
>> need to open the destination image in r/w while the mirroring is in
>> action (already having the same image in r/w mode).
>>
>> There are several solutions here but they are either really hard to
>> implement or non definitive. For example:
>>
>> * We could try to implement the reopen command for each special case,
>>   eg: blkmirror, reopening the same image, etc... and in such cases
>>   reusing the same bs that we already have. The downside is that this
>>   command will be coupled with all this special cases.
> 
> The problem we're trying to solve is that we have a graph of open
> BlockDriverStates (connected by bs->file, backing file relations etc.)
> and we want to transform it into a different graph of open BDSs
> atomically, reusing zero, one or many of the existing BDSs, and possibly
> with changed properties (cache mode, read-only, etc.)
> 
> What we can have reasonably easily (there are patches floating around,
> they just need to be completed), is a bdrv_reopen() that changes flags
> on one given BDS, without changing the file it's backed by. This is
> already broken up into prepare/commit/abort stages as we need it to
> reopen VMDK's split images safely.
> 
> In theory this should be enough to build the new graph by opening yet
> unused BDSs, preparing the reopen of reused ones and only if all of that
> was successful, committing the bdrv_reopen and changing the relations
> between the nodes. I hope that at the same time it's clear that this
> isn't exactly trivial to implement.

Agreed that 1) it looks implementable, and 2) it does not look trivial.

> 
>> * We could use the transaction APIs without actually making it
>>   transaction (if we fail in the middle we can't rollback). The only
>>   advantage of this is that we'd provide a consistent API to libvirt
>>   and we would postpone the problem to the future. Anyway I strongly
>>   discourage this as it's completely unsafe and it's going to break
>>   the transaction semantic. Moreover it's a solution that relies too
>>   much on the hope of finding something appropriate in the future.
> 
> This is not an option. Advertising transactional behaviour and not
> implementing it is just plain wrong.

Absolutely concur.  From libvirt's perspective, I will be assuming that:

if 'transaction' exists, then it supports 'blockdev-snapshot-sync' and
'drive-mirror' (assuming that these are the only two commands that are
in 'transaction' in qemu 1.1), but nothing else is safe to use without a
further probe.  Then, if down the road, we go through the pain of making
'drive-reopen' transactionable, then there will be some new
introspection command (one idea was an optional argument to
query-commands which limits output to just the commands that can also
occur in a 'transaction'), and libvirt will have to use that
introspection before using the additional features.

> 
>> * We could leave it as it is, a distinct command that is not part of
>>   the transaction and that it's closing the old image before opening
>>   the new one.
> 
> Yes, this would be the short-term preliminary solution. I would tend to
> leave it to downstreams to implement it as an extension, though.

Correct - I'm fine with libvirt using the direct, non-atomic,
'drive-reopen' for the immediate needs of oVirt while we work on a more
complete solution for down the road.

> 
>> This is not completely correct, the main intent was to not spread one
>> image chain across two storage domains (making it incomplete if one of
>> them was missing). In the next oVirt release a VM can have different
>> disks on different storage domains, so this wouldn't be a special case
>> but just a normal situation.
> 
> The problem with this kind of argument is that we're not developing only
> for oVirt, but need to look for what makes sense for any management tool
> (or even just direct users of qemu).

Indeed - that's why I still think it's dangerous to not have an atomic
'drive-reopen', even if oVirt can be coded to work in spite of an
initial implementation being non-atomic.  But there's a difference
between wish-lists (atomic reopen) and practicality (what can we
implement now), and I'm not going to insist on the impossible :)

-- 
Eric Blake   address@hidden    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]