Re: [PATCH v3 23/23] migration: Use multifd before we check for the zero page
From: Peter Xu
Date: Wed, 15 Dec 2021 09:39:34 +0800
On Mon, Dec 13, 2021 at 10:03:53AM +0100, Juan Quintela wrote:
> Peter Xu <peterx@redhat.com> wrote:
> > On Thu, Dec 02, 2021 at 06:38:27PM +0100, Juan Quintela wrote:
> >> This needs to be improved to be compatible with old versions.
> >
> > Any plan to let new binary work with old binary?
>
> Yes, but I was waiting for 7.0 to get out. Basically I need to do:
>
> if (old)
> run the old code
> else
> new code
>
> this needs to be done only in a couple of places, but I need the
> machine_type 7.0 created to put the property there.
OK.  We can also have the tunable default to false until the new machine
type arrives; then the series won't be blocked by the machine type patch,
and only the last patch will need adjusting there.
>
> > Maybe boost the version field for multifd packet (along with a
> > multifd_version=2 parameter and only set on new machine types)?
>
> For now, we only need to add a flag for the ZERO_PAGE functionality.  If
> we are on an older QEMU, just don't test for zero pages.  On reception, we
> can just accept everything, i.e. if there are no zero pages, everything
> is ok.
Do you mean zero-page detection for multifd=on only?  Otherwise it could
regress old machine types in some very common scenarios, iiuc, e.g. idle guests.
>
> > PS: We should really have some handshake mechanism between src/dst, I've
> > dreamt of it for a long time..  So that we only need to specify the
> > capability/parameters on src someday, and we'll never see incompatible
> > migration failing randomly, because the handshake should guarantee no
> > stupid mistake..  Sad.
>
> That has been on my ToDo list for too long, just need the time to do
> it. It would make everything much, much easier.
>
> >> But .... if we don't care about RDMA, why do we care about
> >> control_save_page()?
> >
> > Could anyone help to explain why we don't care? I still see bugfixes
> > coming..
>
> The sentence was inside a context: we don't care about RDMA while we are
> on multifd.  If multifd ever supports RDMA, it would be a new
> implementation that doesn't use such hooks.
>
> IMVHO, the RDMA implementation in QEMU is quite bad.  For historic
> reasons, it needed to use the qemu_file abstraction for communication,
> so it gives up the ability to do direct copies of pages.
> So, if one is required to mlock all the guest memory on both sides to
> use RDMA, the *right* thing to do from my point of view is just to
> "remotely" read the page without any overhead.
>
> Yes, that requires quite a bit of changes, I was not suggesting that it
> was a trivial task.
I see!
Thanks,
--
Peter Xu