
From: Peter Xu
Subject: Re: [PATCH v3 23/23] migration: Use multifd before we check for the zero page
Date: Wed, 15 Dec 2021 09:39:34 +0800

On Mon, Dec 13, 2021 at 10:03:53AM +0100, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Thu, Dec 02, 2021 at 06:38:27PM +0100, Juan Quintela wrote:
> >> This needs to be improved to be compatible with old versions.
> >
> > Any plan to let the new binary work with the old binary?
> 
> Yes, but I was waiting for 7.0 to get out.  Basically I need to do:
> 
> if (old)
>     run the old code
> else
>     new code
> 
> This needs to be done in only a couple of places, but I need the
> machine_type 7.0 created to put the property there.

OK.  We can also have the tunable be false by default until the new machine
type arrives; then the series won't need to be blocked by the machine type
patch and it'll be only one last patch to be adjusted there.
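The gating Juan sketches above, combined with the default-off tunable, could look roughly like this minimal sketch. The property name (`multifd_zero_page`) and the struct are hypothetical illustrations for this discussion, not the actual patch; in real QEMU the default would be wired to the machine-type compat machinery once the 7.0 machine type exists:

```c
#include <stdbool.h>

/*
 * Sketch of the "if (old) run the old code, else new code" branch.
 * The property name and struct are hypothetical; the real property
 * would default to false on pre-7.0 machine types.
 */
typedef struct MigrationFlags {
    bool multifd_zero_page;   /* false on old machine types */
} MigrationFlags;

/* Returns 0 for the legacy path, 1 for the new multifd zero-page path. */
static int choose_page_path(const MigrationFlags *f)
{
    if (!f->multifd_zero_page) {
        /* old code: keep the wire format old QEMUs understand */
        return 0;
    }
    /* new code: zero-page detection moves into the multifd threads */
    return 1;
}
```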

> 
> > Maybe boost the version field for multifd packet (along with a
> > multifd_version=2 parameter and only set on new machine types)?
> 
> For now, we only need to add a flag for the ZERO_PAGE functionality.  If
> we are on an older QEMU, we just don't test for zero pages.  On reception,
> we can just accept everything, i.e. if there are no zero pages, everything
> is ok.

Do you mean zero detection for multifd=on only?  Otherwise it could regress
old machine types in some very common scenarios, e.g. idle guests, IIUC.
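On the receive side, the "accept everything" approach Juan describes might look like the following sketch. The flag name and bit value are made up for illustration; the real ones would be defined by the series itself:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet flag bit; not the value from the actual series. */
#define MULTIFD_FLAG_ZERO_PAGE (1u << 1)

/*
 * Receiver sketch: a page is treated as zero only if the sender set the
 * flag.  An old QEMU never sets it, so its stream decodes exactly as
 * before -- "if there are no zero pages, everything is ok".
 */
static bool page_is_zero(uint32_t packet_flags)
{
    return (packet_flags & MULTIFD_FLAG_ZERO_PAGE) != 0;
}
```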

> 
> > PS: We should really have some handshake mechanism between src/dst; I
> > have dreamt of it for a long time..  Then we would only need to specify
> > the capabilities/parameters on the src side, and we would never see
> > incompatible migrations failing randomly, because the handshake would
> > guarantee no stupid mistakes..  Sad.
> 
> That has been on my ToDo list for too long, just need the time to do
> it.  It would make everything much, much easier.
> 
> >> But .... if we don't care about RDMA, why do we care about
> >> control_save_page()?
> >
> > Could anyone help to explain why we don't care?  I still see bugfixes 
> > coming..
> 
> That sentence was in a context: we don't care about RDMA while we are on
> multifd.  If multifd ever supports RDMA, it would be a new
> implementation that doesn't use such hooks.
> 
> IMVHO, the RDMA implementation in QEMU is quite bad.  For historical
> reasons, it needed to use the qemu_file abstraction for communication,
> so it gives up the ability to do direct copies of pages.
> So, if one is required to mlock all the guest memory on both sides to
> use RDMA, the *right* thing to do from my point of view is just to
> "remotely" read the page without any overhead.
> 
> Yes, that requires quite a few changes; I was not suggesting that it
> was a trivial task.

I see!

Thanks,

-- 
Peter Xu



