
Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy


From: Avi Kivity
Subject: Re: [Qemu-devel] Re: [patch 2/3] Add support for live block copy
Date: Tue, 01 Mar 2011 11:39:27 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101209 Fedora/3.1.7-0.35.b3pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.7

On 02/28/2011 08:12 PM, Anthony Liguori wrote:


On Feb 28, 2011 11:47 AM, "Avi Kivity" <address@hidden> wrote:
>
> On 02/28/2011 07:33 PM, Anthony Liguori wrote:
>>
>>
>> >
>> > You're just ignoring what I've written.
>>
>> No, you're just impervious to my subtle attempt to refocus the discussion on solving a practical problem.
>>
>> There are a lot of good, reasonably straightforward changes we can make that have a high return on investment.
>>
>
> Is making qemu the authoritative source of configuration information a straightforward change? Is the return on it high? Is the investment low?

I think this is where we fundamentally disagree. My position is that QEMU is already the authoritative source. Having a state file doesn't change anything.

Do a hot unplug of a network device with upstream libvirt with acpiphp unloaded, consult libvirt and then consult the monitor to see who has the right view of the guest's config.


libvirt is right and the monitor is wrong.

On real hardware, calling _EJ0 doesn't affect the configuration one little bit (if I understand it correctly). It just turns off power to the slot. If you power-cycle, the card will be there.

In the real world, the authoritative source of configuration is a human with a screwdriver. The virtualized equivalent is the management tool.

To me, that's the definition of authoritative.

> "No" to all three (ignoring for the moment whether it is good or not, which we were debating).
>
>
>> The only suggestion I'm making beyond Marcelo's original patch is that we use a structured format and that we make it possible to use the same file to solve this problem in multiple places.
>>
>
> No, you're suggesting a lot more than that.

That's exactly what I'm suggesting from a technical perspective.


Unless I'm hallucinating, you're suggesting quite a bit more. A revolution in how qemu is to be managed.

>> I don't think this creates a fundamental break in how management tools interact with QEMU. I don't think introducing RAID support in the block layer is a reasonable alternative.
>>
>>
>
> Why not?

Because it's a lot of complexity and code that can go wrong while only solving the race for one specific case. Not to mention that we double the IOP rate.


IMO it's of similar complexity. The number of I/Os doesn't change (reads stay the same, and any write to a block that has already been mirrored needs to be re-mirrored in both cases). We do gain lower-latency switchover, and we package the code as a block format driver instead of core block code. We decouple the dependencies from live migration.

> Something that avoids the whole state thing altogether:
>
> - instead of atomically switching when live copy is done, keep on issuing writes to both the origin and the live copy
> - issue a notification to management
> - management receives the notification, and issues an atomic blockdev switch command

> this is really the RAID-1 solution but without the state file (credit Dor). An advantage is that there is no additional latency when trying to catch up to the dirty bitmap.

It still suffers from the two generals problem. You cannot solve this without making one node reliable and that takes us back to it being either QEMU (posted event and state file) or the management tool (sync event).



It works without either. If qemu fails, you simply re-mirror everything. If the management tool fails, it re-subscribes to the mirror-complete event, queries whether it already happened in its absence, and if it did, requests the switchover.
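
A minimal sketch of that recovery argument, written as a toy Python model rather than QEMU code. The names MirrorJob, query_complete, switch and management_recover are hypothetical and exist only for illustration. The sketch assumes a sequential bulk copy and mirrors any guest write that lands on a block the copy has already passed.

```python
class MirrorJob:
    """Toy model of the mirrored-write scheme (hypothetical, not QEMU code)."""

    def __init__(self, origin):
        self.origin = origin               # block contents of the source image
        self.copy = [None] * len(origin)   # destination image
        self.next_block = 0                # bulk-copy cursor
        self.complete = False              # queryable "mirror-complete" state
        self.active = "origin"             # image the guest currently uses

    def guest_write(self, block, data):
        if self.active == "origin":
            self.origin[block] = data
            # Blocks the bulk copy has already passed must be mirrored now;
            # later blocks are picked up when the cursor reaches them.
            if block < self.next_block:
                self.copy[block] = data
        else:
            # After the switch only the copy is written.
            self.copy[block] = data

    def copy_step(self):
        if self.next_block < len(self.origin):
            self.copy[self.next_block] = self.origin[self.next_block]
            self.next_block += 1
        if self.next_block == len(self.origin):
            # This is where the notification to management would be sent.
            self.complete = True

    def query_complete(self):
        return self.complete

    def switch(self):
        # Issued by management; repeating it is harmless.
        if not self.complete:
            raise RuntimeError("mirror not complete")
        self.active = "copy"


def management_recover(job):
    """What a restarted management tool would do (hypothetical helper)."""
    if job.query_complete():
        job.switch()      # the event fired while we were away
        return True
    return False          # still copying; wait for the event
```

If the qemu side fails instead, the model's answer is simply to construct a new MirrorJob over the origin, which restarts the copy from block 0. No state has to survive the crash in either case, because writes keep going to the origin until management asks for the switch.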

--
error compiling committee.c: too many arguments to function



