Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance
From: Orit Wasserman
Subject: Re: [Qemu-devel] [PATCH RFC 0/4] Curling: KVM Fault Tolerance
Date: Thu, 12 Sep 2013 10:37:57 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130805 Thunderbird/17.0.8
On 09/11/2013 04:54 AM, address@hidden wrote:
> Hi,
>
>>The first is that if the VM failure happens in the middle of the live
>>migration, the backup VM state will be inconsistent, which means you can't
>>fail over to it.
>
> Yes, I have been concerned about this problem. That is why we need a prefetch
> buffer.
>
You are right, I missed that.
>>Solving it is not simple as you need some transaction mechanism that will
>>change the backup VM state only when the transaction completes (the live
>>migration completes). Kemari has something like that.
>
> The backup VM state will be loaded only once the data for one whole
> migration has been prefetched. Otherwise, the VM state will not be loaded.
> So the backup VM is ensured to have a consistent state, like a checkpoint.
> However, how close this checkpoint is to the point of the VM failure depends
> on the workload and bandwidth.
>
At the moment, in your implementation the prefetch buffer can be very large
(several copies of the guest memory size).
Are you planning to address this issue?
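
To make the commit-on-completion idea discussed above concrete, here is a
minimal sketch in Python (chosen for brevity). It is not code from the Curling
patches: the class and method names are hypothetical, and a byte string stands
in for real migration data. It only shows that the backup's visible state
changes when a whole round has arrived, and that a partial round is discarded.

```python
class PrefetchBuffer:
    """Hypothetical staging buffer: expose a round only once it is complete."""

    def __init__(self):
        self._staged = []       # chunks of the round still being received
        self.committed = None   # last complete checkpoint, or None

    def receive(self, chunk: bytes) -> None:
        # Data from an in-progress round is staged, never applied directly.
        self._staged.append(chunk)

    def end_of_round(self) -> None:
        # The whole round arrived: replace the checkpoint in one step.
        self.committed = b"".join(self._staged)
        self._staged = []

    def abort_round(self) -> None:
        # Primary failed mid-round: drop partial data, keep the old checkpoint.
        self._staged = []
```

In this sketch the buffer holds at most one staged round plus one committed
checkpoint; how the real prefetch buffer is bounded is the open question.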
>>The second is that, sadly, live migration doesn't always converge; this means
>>that the backup VM won't have a consistent state to fail over to. You need to
>>detect such a case and throttle down the guest to force convergence.
>
> Yes, that's a problem. AFAIK, QEMU already has an auto-convergence feature.
How about activating it automatically when fault tolerance is in use?
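
For reference, a small sketch of what enabling it could look like from a
management script. The QMP command `migrate-set-capabilities` and the
`auto-converge` capability are QEMU's own; the Python helper around them is
hypothetical and only builds the JSON message, it does not talk to a QMP
socket.

```python
import json


def auto_converge_command(enabled: bool = True) -> str:
    """Build the QMP message that toggles the auto-converge capability."""
    cmd = {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                {"capability": "auto-converge", "state": enabled}
            ]
        },
    }
    return json.dumps(cmd)
```

The returned string would be sent over the QMP connection before the
migration is started.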
> From another perspective, if many migrations could not converge, maybe the
> workload is high and the bandwidth is low, and it is not recommended to use
> FT in general.
>
I agree, but we need some way to notify the user of such a problem.
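
One possible shape for such a check, as a rough sketch only: the function
name, the threshold, and the idea of sampling the remaining dirty data once
per round are all assumptions, not something taken from the patches or from
QEMU.

```python
def is_stalled(remaining_bytes, max_stalled_rounds=3):
    """Return True if the most recent rounds failed to shrink the dirty data.

    remaining_bytes: remaining dirty data sampled once per migration round.
    max_stalled_rounds: hypothetical threshold of consecutive bad rounds.
    """
    stalled = 0
    for prev, cur in zip(remaining_bytes, remaining_bytes[1:]):
        # Count consecutive rounds with no progress; reset on any progress.
        stalled = stalled + 1 if cur >= prev else 0
    return stalled >= max_stalled_rounds
```

When this returns True, the management layer could warn the user that the
backup checkpoint is falling behind.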
Regards,
Orit