
Re: [Qemu-devel] Dual userfaultfd behavior


From: Alexey Perevalov
Subject: Re: [Qemu-devel] Dual userfaultfd behavior
Date: Mon, 10 Apr 2017 19:31:32 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

Hello,



On 03/18/2017 02:27 AM, Mike Kravetz wrote:
> On 03/15/2017 06:47 AM, Alexey Perevalov wrote:
>> Hi Andrea,
>>
>> Thank you for such a thorough design description.
>>
>> The main question is who will do the RFC patches: you, Mike, or, if you
>> don't mind, I could try.
> Sorry for not replying sooner; I have been away from e-mail.
>
> I have some other projects which require attention before I could start
> any in-depth work on this issue.
Ok, if nobody has started, let me begin to code that.
I already implemented a straightforward approach, but it
didn't take page migration into account.
>> On 03/14/2017 12:46 AM, Andrea Arcangeli wrote:
>>> I think what is needed is an extension to UFFDIO_COPY so that it will
>>> not fail if used with a lower granularity than the hugetlbfs page
>>> size. Then a special feature flag should be set in
>>> uffdio_api.features to tell userland it can use a lower granularity
>>> than the hugetlbfs page size.
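A rough userland sketch of probing for such a feature bit through the
existing UFFDIO_API handshake. The handshake and struct uffdio_api are
current kernel ABI; the feature name and bit value below are only
placeholders for the proposed extension and are not in
linux/userfaultfd.h:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

/* Hypothetical feature bit for the proposed extension; placeholder value. */
#define UFFD_FEATURE_HUGETLB_PARTIAL_COPY (1ULL << 63)

static int open_uffd_and_probe(void)
{
    int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
    if (uffd < 0) {
        perror("userfaultfd");
        return -1;
    }

    struct uffdio_api api;
    memset(&api, 0, sizeof(api));
    api.api = UFFD_API;
    api.features = 0;        /* ask the kernel to report what it supports */
    if (ioctl(uffd, UFFDIO_API, &api) < 0) {
        perror("UFFDIO_API");
        close(uffd);
        return -1;
    }

    if (!(api.features & UFFD_FEATURE_HUGETLB_PARTIAL_COPY)) {
        fprintf(stderr, "no sub-hugepage UFFDIO_COPY, fall back to full pages\n");
    }
    return uffd;
}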

>>> UFFDIO_COPY will still allocate a 1GB page, but it will copy only part
>>> of it and then map only the copied part with a pte of 4k granularity (or
>>> 2MB granularity if the uffdio_copy.dst/len parameters allow for
>>> hugepmds). This way, if there's a missing fault in the part that is not
>>> copied yet, it'll still trigger a page fault and we'll call
>>> handle_userfault from hugetlb_no_page, which will trigger a new
>>> UFFDIO_COPY and map another fragment of the 1GB page.
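For illustration, the receiving side's fault-service loop under that
proposed behavior could look roughly like the sketch below. The uffd_msg
handling and the uffdio_copy layout are existing ABI, but a current
kernel would reject the sub-hugepage length with EINVAL; page_source() is
a hypothetical stand-in for fetching the chunk from the migration stream,
and the area is assumed to be already registered in missing mode on a
blocking descriptor:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

#define COPY_GRANULE 4096UL      /* could be 2MB where hugepmds are wanted */

void *page_source(uint64_t offset);   /* hypothetical: data for this chunk */

static int service_faults(int uffd, uint64_t area_start)
{
    for (;;) {
        struct uffd_msg msg;
        ssize_t n = read(uffd, &msg, sizeof(msg));   /* blocking fd assumed */
        if (n != sizeof(msg))
            return -1;
        if (msg.event != UFFD_EVENT_PAGEFAULT)
            continue;

        /* Round the faulting address down to the chosen copy granularity. */
        uint64_t dst = msg.arg.pagefault.address & ~(COPY_GRANULE - 1);

        struct uffdio_copy copy;
        memset(&copy, 0, sizeof(copy));
        copy.dst  = dst;
        copy.src  = (uint64_t)(uintptr_t)page_source(dst - area_start);
        copy.len  = COPY_GRANULE;   /* smaller than the hugetlbfs page size */
        copy.mode = 0;

        /* EINVAL on current kernels; would map one 4k fragment of the
         * already-allocated huge page once the extension exists. */
        if (ioctl(uffd, UFFDIO_COPY, &copy) < 0)
            return -1;
    }
}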
>>> The same code then should allow us to map a 2MB hugetlbfs page with 4k
>>> granularity in UFFDIO_COPY, so userland can choose whether to do
>>> postcopy live migration on hugetlbfs with 2MB or 4kb granularity (on a
>>> very slow network, 4kb would be preferable, for example, for latency).
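To put a rough number on that latency trade-off (illustrative only,
assuming a 1 Gbit/s link and ignoring protocol overhead): each postcopy
fault stalls the vCPU until its chunk arrives, which is about 33
microseconds for a 4k copy versus about 17 milliseconds for a 2MB copy:

#include <stdio.h>

int main(void)
{
    const double link_bps = 1e9;                    /* 1 Gbit/s link */
    const double sizes[]  = { 4096.0, 2097152.0 };  /* 4k and 2MB copies */

    for (int i = 0; i < 2; i++)
        printf("%9.0f bytes -> %.3f ms on the wire\n",
               sizes[i], sizes[i] * 8.0 / link_bps * 1e3);
    return 0;   /* prints ~0.033 ms and ~16.777 ms */
}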

>>> Mapping a 1GB page without a hugepud or mapping a 2MB page without a
>>> hugepmd breaks all sorts of invariants in the hugetlbfs code, and page
>>> migration wouldn't be able to cope with it either. Solving the fallout
>>> from such breakage is basically what is required to implement this.
> Agree!

> I was wondering if it might be easier or more straightforward to split
> the target vma.  You would then create a new vma of the copy page size
> to be used during the copy.  In this way the mappings and page table
> entries remain as expected.

> Of course, you would still want to allocate a huge page of the original
> target vma size and map that as smaller page sizes are copied.  I don't
> think there is anything that does something similar today.

> When the copy is done (or aborted), we then create/convert a new vma for
> the huge page and merge it into the target vma(s).

> Not sure if that would be any easier.  It was just the first thing that
> popped into my head.



--
Best regards,
Alexey Perevalov


