Re: [Qemu-devel] [PATCH v0 0/7] Background snapshots

From: Mike Kravetz
Subject: Re: [Qemu-devel] [PATCH v0 0/7] Background snapshots
Date: Tue, 14 Aug 2018 16:16:33 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

On 08/13/2018 12:00 PM, Dr. David Alan Gilbert wrote:
> cc'ing in Mike*2
> * Denis Plotnikov (address@hidden) wrote:
>> On 26.07.2018 12:23, Peter Xu wrote:
>>> On Thu, Jul 26, 2018 at 10:51:33AM +0200, Paolo Bonzini wrote:
>>>> On 25/07/2018 22:04, Andrea Arcangeli wrote:
>>>>> It may look like the uffd-wp model is a wish-list feature, similar to an
>>>>> optimization, but without the uffd-wp model when the WP fault is
>>>>> triggered by kernel code, the sigsegv model falls apart and requires
>>>>> all kinds of ad-hoc changes just for this single feature. Plus uffd-wp
>>>>> has other benefits: it makes it all reliable in terms of not
>>>>> increasing the number of vmas in use during the snapshot. Finally it
>>>>> makes it faster too with no mmap_sem for reading and no sigsegv
>>>>> signals.
>>>>> The non-cooperative features got merged first because there was much
>>>>> activity on the kernel side on that front, but this is just an ideal
>>>>> time to nail down the remaining issues in uffd-wp I think. That I
>>>>> believe is time better spent than trying to emulate it with sigsegv
>>>>> and changing all drivers to send new events down to qemu specific to
>>>>> the sigsegv handling. We considered this before doing uffd for
>>>>> postcopy too but overall it's unreliable and more work (no single
>>>>> change was then needed to KVM code with uffd to handle postcopy and
>>>>> here it should be the same).
>>>> I totally agree.  The hard part in userfaultfd was the changes to the
>>>> kernel get_user_pages API, but the payback was huge because _all_ kernel
>>>> uses (KVM, vhost-net, syscalls, etc.) just work with userfaultfd.  Going
>>>> back to mprotect would be a huge mistake.
>>> Thanks for explaining the bits.  I'd say I wasn't aware of the
>>> difference before I started the investigation (and only now did I
>>> notice that major difference between mprotect and userfaultfd).  I'm
>>> really glad that it's much clearer (at least for me) which way we
>>> should choose.
>>> Now I'm thinking whether we can move the userfault write protect work
>>> forward.  The latest discussion I saw so far is in 2016, when someone
>>> from Huawei tried to use the write protect feature for that old
>>> version of live snapshot but reported an issue:
>>>    https://lists.gnu.org/archive/html/qemu-devel/2016-12/msg01127.html
>>> Is that the latest status for userfaultfd wr-protect?
>>> If so, I'm thinking whether I can try to re-verify the work (I tried
>>> his QEMU repository but I failed to compile somehow, so I plan to
>>> write some even simpler code to try) to see whether I can get the same
>>> KVM error he encountered.
>>> Thoughts?
>> Just to sum up everything said so far.
>> Using mprotect is a bad idea because the VM's memory can be accessed from a
>> number of places (KVM, vhost, ...), each of which would need its own special
>> care to track memory accesses and notify QEMU, which makes using mprotect
>> unacceptable.
>> Tracking accesses to protected memory can be done via userfaultfd's WP mode,
>> which isn't available right now.
>> So, the reasonable conclusion is to wait until the WP mode is available and
>> build the background snapshot on top of userfaultfd-wp.
>> But work on adding the WP mode has been pending for quite a long time already.
>> Is there any way to estimate when it could be available?
> I think a question is whether anyone is actively working on it; I
> suspect really it's on a TODO list rather than moving at the moment.

I am not working on it, and it is not on my TODO list.

However, if someone starts making progress I will jump in and work on
hugetlbfs support.  My intention would be to not let hugetlbfs support
'fall behind' general uffd support.

Mike Kravetz
