Re: [Qemu-devel] [PATCH 1/2] Postcopy: Force allocation of all-zero prec
From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH 1/2] Postcopy: Force allocation of all-zero precopy pages
Date: Thu, 27 Apr 2017 11:20:37 +0800
User-agent: Mutt/1.5.24 (2015-08-30)
On Wed, Apr 26, 2017 at 09:37:43PM +0200, Andrea Arcangeli wrote:
> Hello,
>
> On Wed, Apr 26, 2017 at 08:04:43PM +0100, Dr. David Alan Gilbert wrote:
> > * Christian Borntraeger (address@hidden) wrote:
> > > On 04/26/2017 08:37 PM, Dr. David Alan Gilbert (git) wrote:
> > > > From: "Dr. David Alan Gilbert" <address@hidden>
> > > >
> > > > When an all-zero page is received during the precopy
> > > > phase of a postcopy-enabled migration we must force
> > > > allocation otherwise accesses to the page will still
> > > > get blocked by userfault.
> > > >
> > > > Symptom:
> > > > a) If the page is accessed by a device during device-load
> > > > then we get a deadlock as the source finishes sending
> > > > all its pages but the destination device-load is still
> > > > paused and so doesn't clean up.
> > > >
> > > > b) If the page is accessed later, then the thread will stay
> > > > paused until the end of migration rather than carrying on
> > > > running, until we release userfault at the end.
> > > >
> > > > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > > > Reported-by: Christian Borntraeger <address@hidden>
> > >
> > > CC stable? after all the guest hangs on both sides
> > >
> > > Has survived 40 migrations (usually failed at the 2nd)
> > > Tested-by: Christian Borntraeger <address@hidden>
> >
> > Great...but.....
> > Andrea (added to the mail) says this shouldn't be necessary.
> > The read we were doing in the is_zero_range() should have been sufficient
> > to get the page mapped and that zero page should have survived.
> >
> > So - I guess that's back a step, we need to figure out why the
> > page disappears for you.
>
> Yes, reading during precopy is enough to fill the hole and prevent
> userfault missing faults from triggering.
>
> Somehow the pagetable must be mapped by a zeropage or a hugezeropage
> or a regular page allocated during a previous precopy pass or a
> pre-zeroed subpage part of a THP.
>
> Even if the hugezeropage is split later by a MADV_DONTNEED when
> postcopy starts, it will become 4k zeropages.
>
> After a read succeeds, nothing (except MADV_DONTNEED or other explicit
> syscalls which qemu would need to invoke explicitly between
> is_zero_range and UFFDIO_REGISTER) should be able to bring the
> pagetable back to its "pte_none/pmd_none" state that will then trigger
> missing userfaults during postcopy later.
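As a rough illustration of the "read to fill the hole" idea described above, here is a minimal sketch in C. The helper name touch_pages_by_read() and its parameters are hypothetical and are not part of QEMU. The code only performs one read per page; whether the kernel then maps a zeropage (so the pte is no longer in the pte_none state) is taken from the description above and is not something this code checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): read the first byte of every
 * page in [base, base + len). The volatile qualifier keeps the compiler
 * from dropping the loads, so each page really is accessed once.
 *
 * Returns the number of pages whose first byte read back as zero.
 */
static size_t touch_pages_by_read(const void *base, size_t len,
                                  size_t page_size)
{
    const volatile uint8_t *p = base;
    size_t zero_pages = 0;

    assert(base != NULL);
    assert(page_size > 0); /* a zero stride would never terminate */

    for (size_t off = 0; off < len; off += page_size) {
        if (p[off] == 0) {
            zero_pages++;
        }
    }
    return zero_pages;
}
```

On a freshly mmap()ed anonymous region this would be called once before the range is registered with userfaultfd. The sketch assumes nothing such as MADV_DONTNEED runs on the range in between.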
Whatever the final solution turns out to be (after seeing Juan's
comment, I am curious whether is_zero_page() now behaves differently
on Power)... Dave, would it be worth mentioning this read side effect
in a comment in ram_handle_compressed()? Otherwise, IMHO, it may be
hard for many people to notice quickly.
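To make the suggestion concrete, something like the sketch below is what I have in mind. It is a self-contained approximation and not the actual QEMU code: is_zero_range_sketch() and handle_compressed_sketch() are hypothetical stand-ins for is_zero_range() and ram_handle_compressed(), and the comment wording is only a proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for is_zero_range(): stops at the first non-zero
 * byte, otherwise reads the whole range. */
static bool is_zero_range_sketch(const uint8_t *p, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical stand-in for ram_handle_compressed().
 *
 * Note: the zero check below is not only an optimisation that avoids
 * writing to a page that is already zero. The read itself has a side
 * effect: it touches the destination page, and postcopy relies on that
 * read so the page does not raise a missing userfault later.
 */
static void handle_compressed_sketch(void *host, uint8_t ch, size_t size)
{
    assert(host != NULL);

    if (ch != 0 || !is_zero_range_sketch(host, size)) {
        memset(host, ch, size);
    }
}
```

With a comment like that in place, anyone tempted to drop the is_zero_range() call when ch == 0 would at least see why the read matters.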
Thanks,
--
Peter Xu