Re: bitmap migration bug with -drive while block mirror runs


From: Vladimir Sementsov-Ogievskiy
Subject: Re: bitmap migration bug with -drive while block mirror runs
Date: Wed, 2 Oct 2019 12:22:01 +0000

02.10.2019 14:11, Kevin Wolf wrote:
> On 02.10.2019 at 12:46, Peter Krempa wrote:
>> On Tue, Oct 01, 2019 at 12:07:54 -0400, John Snow wrote:
>>>
>>>
>>> On 10/1/19 11:57 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> 01.10.2019 17:10, John Snow wrote:
>>>>>
>>>>>
>>>>> On 10/1/19 10:00 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> Otherwise: I have a lot of cloudy ideas on how to solve this, but
>>>>>>> ultimately what we want is to be able to find the "addressable" name for
>>>>>>> the node the bitmap is attached to, which would be the name of the first
>>>>>>> ancestor node that isn't a filter. (OR, the name of the block-backend
>>>>>>> above that node.)
>>>>>> Not the name of ancestor node, it will break mapping: it must be
>>>>>> name of the node itself or name of parent (may be through several
>>>>>> filters) block-backend
>>>>>>
>>>>>
>>>>> Ah, you are right of course -- because block-backends are the only
>>>>> "nodes" for which we actually descend the graph and add the bitmap to
>>>>> its child.
>>>>>
>>>>> So the real back-resolution mechanism is:
>>>>>
>>>
>>> Amendment:
>>>     - If our local node-name N is well-formed, use this.
>>
>> I'd like to re-iterate that the necessity to keep node names same on
>> both sides of migration is unexpected, undocumented and in some cases
>> impossible.
> 
> I think the (implicitly made) requirement is not that all node-names are
> kept the same, but only the node-names of those nodes for which
> migration transfers some state.
> 
> It seems to me that bitmap migration is the first case of putting
> something in the migration stream that isn't related to a frontend, but
> to the backend, so the usual device hierarchy to address information
> doesn't work here. And it seems the implications of this weren't really
> considered sufficiently, resulting in the design problem we're
> discussing now.
> 
> What we need to transfer is dirty bitmaps, which can be attached to any
> node in the block graph. If we accept that the way to transfer this is
> the migration stream, we need a way to tell which bitmap belongs to
> which node. Matching node-name is the obvious answer, just like a
> matching device tree hierarchy is used for frontends.
> 
> If we don't want to use the migration stream for backends, we would need
> to find another way to transfer the bitmaps. I would welcome removing
> backend data from the migration stream, but if this includes
> non-persistent bitmaps, I don't see what the alternative could be.

But how do we migrate persistent bitmaps if the storage is not shared?

And even with only persistent bitmaps and shared storage: bitmap data may
be large, and storing and loading it during migration downtime will
lengthen that downtime.

> 
>> If you want to mandate that they must be kept the same please document
>> it and also note the following:
>>
>> - during migrations the storage layout may change e.g. a backing chain
>>    may become flattened, thus keeping node names stable beyond the top
>>    layer is impossible
> 
> You don't want to transfer bitmaps of nodes that you're going to drop.
> I'm not an expert for these bitmaps, but I think this just means you
> would have to disable any bitmaps on the backing files to be dropped on
> the source host before you migrate.

You mean remove them. But yes, either way it's not a problem: if the
corresponding node doesn't exist on the target, we don't need any bitmaps
for it.

> 
>> - in some cases (readonly image in a cdrom not present on destination,
>>    thus not relevant here probably) it may even become impossible to
>>    create any node thus keeping the top node may be impossible
> 
> Same thing, you don't want to transfer a bitmap for a node that
> disappears.
> 
>> - it should be documented when and why this happens and how management
>>    tools are supposed to do it
>>
>> - please let me know what's actually expected, since libvirt
>>    didn't enable blockdev yet we can fix any unexpected expectations
>>
>> - Document it so that the expectations don't change after this.
> 
> Yes, we need a good and ideally future-proof rule of which node-names
> need to stay the same. Currently it's only bitmaps, but might we get
> another feature later where we want to transfer more backend data?
> 
>> - Ideally node names will not be bound to anything and freely
>>    changeable. If necessary we can provide a map to qemu during migration
>>    which is probably less painful and more straightforward than keeping
>>    them in sync somehow ...
> 
> A map feels painful for the average user (and for the QEMU
> implementation), even if it looks convenient for libvirt. If anything,
> I'd make it optional and default to 1:1 mappings for anything that isn't
> explicitly mapped.
> 

Hmm, I don't think an optional map is painful.

What about the following:

1. If a map is provided:
- migrate only bitmaps in the nodes specified by the map
- bitmaps are migrated only according to the map; block device names are
  not involved at all

2. If no map is provided:
- For nodes bound to a named block backend, directly or through several
  filters, use the name of that block backend.
- For other nodes, use the node-name.
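
The two cases above can be sketched roughly as follows. This is only an
illustration in Python, not QEMU code: the Node class, its fields, and the
helper names backend_above() and migration_alias() are all hypothetical,
and the graph model is simplified to what the rule needs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a block graph node (not a QEMU type)."""
    node_name: str
    is_filter: bool = False
    # Name of a block backend attached directly to this node, if any.
    backend_name: Optional[str] = None
    # Nodes that use this node as their child.
    parents: List["Node"] = field(default_factory=list)


def backend_above(node: Node) -> Optional[str]:
    """Name of a named block backend bound to `node` directly or through
    a chain of filter parents; None if there is no such backend."""
    if node.backend_name:
        return node.backend_name
    for parent in node.parents:
        if parent.is_filter:
            name = backend_above(parent)
            if name:
                return name
    return None


def migration_alias(node: Node,
                    mapping: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Name under which the node's bitmaps would be migrated.
    None means the bitmaps of this node are not migrated."""
    if mapping is not None:
        # Case 1: only mapped nodes are migrated, backend names are ignored.
        return mapping.get(node.node_name)
    # Case 2: prefer the block backend name, fall back to the node-name.
    return backend_above(node) or node.node_name


# A node created by -drive, with a mirror filter inserted between it and
# the block backend "drive0" (assumed names, for illustration only).
disk = Node("node0")
mirror_top = Node("mirror-top", is_filter=True, backend_name="drive0")
disk.parents.append(mirror_top)

print(migration_alias(disk))                       # backend name is used
print(migration_alias(disk, {"node0": "alias0"}))  # map entry is used
print(migration_alias(disk, {}))                   # not in map: not migrated
```

With an explicit map, a node that is missing from it is simply skipped,
which matches "migrate only bitmaps in the nodes specified by the map".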

===

And I think [2.] should be done now to fix the current bug, and [1.] may
be postponed until we really need it.

-- 
Best regards,
Vladimir
