qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [RFC PATCH 06/10] block: Allow changing the backing fil


From: Alberto Garcia
Subject: Re: [Qemu-block] [RFC PATCH 06/10] block: Allow changing the backing file on reopen
Date: Wed, 12 Sep 2018 17:09:48 +0200
User-agent: Notmuch/0.18.2 (http://notmuchmail.org) Emacs/24.4.1 (i586-pc-linux-gnu)

On Thu 21 Jun 2018 03:06:22 PM CEST, Kevin Wolf wrote:
>> > Actually, do we ever use bdrv_reopen() for flags other than
>> > read-only?  Maybe we should get rid of that flags nonsense and
>> > simply make it a bdrv_reopen_set_readonly() taking a boolean.
>> 
>> I think that's a good idea. There's however one case where other
>> flags are changed: reopen_backing_file() in block/replication.c that
>> also clears BDRV_O_INACTIVE. I'd need to take a closer look to that
>> code to see what we can do about it.
>
> Uh, and that works?
>
> Clearing BDRV_O_INACTIVE with bdrv_reopen() (which means, without
> calling bdrv_invalidate_cache()) is surely suspicious. Probably this
> code is buggy.
>
> Another reason for removing flags from the interface: We didn't do any
> sanity checks for the flag changes.

I'm working on removing the flags parameter from bdrv_reopen() and
friends, and I think this is the only clearly strange thing on the way.

The documentation is in https://wiki.qemu.org/Features/BlockReplication
or in docs/block-replication.txt.

As far as I can see there's the following backing chain:

  [secondary] <- [hidden] <- [active]

There's an NBD server writing on [secondary], a backup job copying data
from [secondary] to [hidden], and the VM writing to [active].

The reopen_backing_file() function in question simply puts the two
backing images in read-write mode (clearing BDRV_O_INACTIVE along the
way) so the NBD server and the backup job can write to them.

This is done by replication_start(), which can only be reached with the
QMP command "xen-set-replication". There's a comment that says "The
caller of the function MUST make sure vm stopped" but no one seems to
check that the vm is indeed stopped (?). This API was btw added half a
year after the replication driver, and before this point the only user
of replication_start() was a unit test.

Without having tested the COLO / replication code before, I don't see
any reason why it would make sense to clear the BDRV_O_INACTIVE flag on
the backing files, or why that flag would be there in the first place
(and I'm not even talking about whether that flag is supposed to be
set/cleared with a simple bdrv_reopen()). So I'm thinking to simply
remove it from there.

If anyone else has more information I'd be happy to hear it.

Berto



reply via email to

[Prev in Thread] Current Thread [Next in Thread]