qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 00/12] rdma: migration support


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v7 00/12] rdma: migration support
Date: Thu, 13 Jun 2013 17:40:31 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6

Il 13/06/2013 17:17, Michael R. Hines ha scritto:
> On 06/13/2013 04:06 PM, Paolo Bonzini wrote:
>> Regarding the timestamp problem, it should be fixed in the RDMA code.
>> You did find a bug, but xyz_start_outgoing_migration should be
>> asynchronous and the pinning should happen in the setup phase.  This is
>> because the setup phase is already running outside the big QEMU lock and
>> the guest would not be frozen.
> 
> I think you misunderstood the symptom. The pinning is *already*
> happening in the setup phase (xyz_start_outgoing_migration), not
> inside the the migration_thread().

I was referring to ram_control_before_iterate(f, RAM_CONTROL_SETUP).
Even better, that would be a good reason to remove the

+     * Please leave in place. These calls generate reserved messages in
+     * the RDMA protocol in order to pre-register RDMA memory in the
+     * future before the bulk round begins.

comment that I don't like. :)

xyz_start_outgoing_migration is the connect phase if you want to give it
a name.

The connect phase runs with the big QEMU lock taken, so any interrupt
will freeze VCPUs until the pinning ends.

>> (1) move the pinning to the setup phase
>
> This is already done in the existing patchset.

See above, it isn't. :)

>> (2) add a debug mode where every pass unpins all the memory and
>> restarts.  Speed doesn't matter, this is so that the protocol supports
>> it from the beginning, and any caching heuristics need to be done on the
>> source side.  As all debug modes, it will be somewhat prone to bitrot,
>> but at least there is a reference implementation for anyone who laters
>> wants to add caching.
>>
>> I think (2) is very important so that, for example, during fault
>> tolerance you can reduce a bit the pinned size for smaller workloads,
>> even without ballooning.
> 
> I agree that this is a necessary feature (dynamic source registration), but
> it is a lot more complicated than a simple unpin of everything before
> every pass.
> As you suggested, I would rather not introduce unused code, but rather
> wait until
> someone in the future has a full-functional, testable, implementation.

I know it's complicated.  But I think it needs to be part of the design
that the complication lies only in the source.  Starting with a
simple-minded (borderline stupid) implementation that is as slow as TCP
is enough to complete this missing part of the design.  Everything else
is "just" an optimization on top.

>>      (*) for example, why the introduction of acct_update_position?  Is
>>      it a fix for a bug that always existed, or driven by some other
>>      changes?
> 
> This is important because RDMA writes do not happen sycnrhonously. It is 
> impossible
> to update the accounting inside of save_live_iterate() because the RDMA 
> operations
> are still outstanding.
> 
> It is only until they have completed later that we can actually know
> whether what the accounting statistics really are.

So it was always there.  Good, but you should have written it in the
cover letter.

Juan, are you going to pick this up, or should Anthony do it?  (Michael:
in any case, do not send a pull request yourself).

Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]