From: Denis V. Lunev
Subject: Re: [Qemu-devel] [PATCH 4/6] dirty-bitmaps: clean-up bitmaps loading and migration logic
Date: Wed, 1 Aug 2018 23:47:31 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 08/01/2018 09:56 PM, John Snow wrote:
>
> On 08/01/2018 02:42 PM, Denis V. Lunev wrote:
>> On 08/01/2018 08:40 PM, Dr. David Alan Gilbert wrote:
>>> * John Snow (address@hidden) wrote:
>>>> On 08/01/2018 06:20 AM, Dr. David Alan Gilbert wrote:
>>>>> * John Snow (address@hidden) wrote:
>>>>>
>>>>> <snip>
>>>>>
>>>>>> I'd rather do something like this:
>>>>>> - Always flush bitmaps to disk on inactivate.
>>>>> Does that increase the time taken by the inactivate measurably?
>>>>> If it's small relative to everything else that's fine; it's just I
>>>>> always worry a little since I think this happens after we've stopped the
>>>>> CPU on the source, so is part of the 'downtime'.
>>>>>
>>>>> Dave
>>>>> --
>>>>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
>>>>>
>>>> I'm worried that if we don't, we're leaving behind unusable, partially
>>>> complete files. That's a bad design and we shouldn't push for it just
>>>> because it's theoretically faster.
>>> Oh I don't care about theoretical speed; but if it's actually unusably
>>> slow in practice then it needs fixing.
>>>
>>> Dave
>> This is not "theoretical" speed. This is real practical speed and
>> instability.
> It's theoretical until I see performance testing numbers; do you have
> any? How much faster does the pivot happen by avoiding making the qcow2
> consistent on close?
>
> I don't argue that it's faster to just simply not write data, but what's
> not obvious is how much time it actually saves in practice and if that's
> worth doing unintuitive and undocumented things like "the source file
> loses bitmaps after a migration because it was faster."

Also, frankly speaking, I do not understand the goal of this purism.

There are two main cases: shared and non-shared storage. On shared
storage:
- normally migration finishes successfully. The source is shut down,
  the target is started. The bitmap data in the file on shared storage
  is __IMMEDIATELY__ marked as stale on the target, i.e. you save the
  CBT on the source (I/O over the networked FS), load the CBT on the
  target (I/O over the networked FS), and mark the CBT as stale (I/O
  again). The CBT data that was written is thrown away. (A sketch of
  this sequence follows below.)
- failed migration. OK, we have CBT data written on the source, CBT
  data read back on the source, CBT data marked stale. Thus any CBT
  on disk while the VM is running is pure overhead.
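
To make the redundant round-trips concrete, here is a minimal,
compilable sketch of that hand-off on shared storage. All names in it
(Bitmap, store_cbt, load_cbt) are invented for the illustration; this
is not the real QEMU bitmap code or API.

/* Hypothetical sketch of the shared-storage hand-off described above. */
#include <stdio.h>
#include <stdbool.h>

typedef struct Bitmap {
    const char *name;
    bool in_image;   /* a serialized copy exists in the image file       */
    bool stale;      /* the serialized copy is marked in-use / untrusted */
} Bitmap;

/* Source side, during inactivate -- this runs inside migration downtime. */
static void store_cbt(Bitmap *b)
{
    b->in_image = true;                       /* write over the networked FS */
    printf("source: stored %s in image\n", b->name);
}

/* Target side, when it opens the same image on shared storage. */
static void load_cbt(Bitmap *b)
{
    printf("target: loaded %s from image\n", b->name);  /* read it back */
    b->stale = true;      /* a third round of I/O: the copy written a moment
                             ago is immediately marked stale while the VM runs */
    printf("target: marked %s stale\n", b->name);
}

int main(void)
{
    Bitmap cbt = { .name = "cbt0", .in_image = false, .stale = false };
    store_cbt(&cbt);   /* downtime spent on the source ... */
    load_cbt(&cbt);    /* ... buys nothing once the target takes over */
    return 0;
}

Every step in that sequence is I/O over the networked file system, and
the data written in the first step is never used as trusted data by the
target.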

The same applies when we use non-shared-storage migration. In that
case the situation is even worse: you save the CBT and throw it away
as soon as migration completes.

Please also note that saving the CBT gives us almost no protection
against power loss: the power would have to fail at exactly the right
moment for the data to survive, and most likely we would have to drop
the CBT completely anyway.

Den

Normally migration is executed as follows:
- source is gently shut down, all da


