Re: [Qemu-devel] [PATCH 10/17] mm: rmap preparation for remap_anon_pages
From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [PATCH 10/17] mm: rmap preparation for remap_anon_pages
Date: Tue, 7 Oct 2014 15:37:10 +0200
User-agent: Mutt/1.5.23 (2014-03-12)
Hi Kirill,
On Tue, Oct 07, 2014 at 02:10:26PM +0300, Kirill A. Shutemov wrote:
> On Fri, Oct 03, 2014 at 07:08:00PM +0200, Andrea Arcangeli wrote:
> > There's one constraint enforced to allow this simplification: the
> > source pages passed to remap_anon_pages must be mapped only in one
> > vma, but this is not a limitation when used to handle userland page
> > faults with MADV_USERFAULT. The source addresses passed to
> > remap_anon_pages should be set as VM_DONTCOPY with MADV_DONTFORK to
> > avoid any risk of the mapcount of the pages increasing, if fork runs
> > in parallel in another thread, before or while remap_anon_pages runs.
>
> Have you considered triggering COW instead of adding limitation on
> pages' mapcount? The limitation looks artificial from interface POV.
I haven't considered it, mostly because I see it as a feature that it
returns -EBUSY. I prefer to avoid the risk of userland getting a
successful retval but internally the kernel silently behaving
non-zerocopy by mistake because some userland bug forgot to set
MADV_DONTFORK on the src_vma.
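
For reference, the userland side of that requirement can be sketched as below. This is only an illustration and not code from the patchset: alloc_dontfork_src() is a hypothetical helper name, and it only uses the existing mmap/madvise calls to mark an anonymous source area as MADV_DONTFORK so that a concurrent fork cannot raise the mapcount of its pages.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of the patchset): allocate an anonymous
 * private area to be used as a source buffer and mark it MADV_DONTFORK,
 * so the vma is not copied into a child if another thread forks.
 * Returns NULL on failure.
 */
static void *alloc_dontfork_src(size_t len)
{
	void *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* Sets VM_DONTCOPY on the vma covering [p, p + len). */
	if (madvise(p, len, MADV_DONTFORK) != 0) {
		munmap(p, len);
		return NULL;
	}
	return p;
}
```

If the madvise step is forgotten, the strict behavior described here means the caller would see -EBUSY instead of a silent copy.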
COW would not be zerocopy, so it's not ok. We get sub-1msec latency for
userfaults over 10gbit and we don't want to risk wasting CPU
caches.
I did however consider allowing the strict behavior (i.e. the
feature) to be extended later in a backwards compatible way. We could
provide a non-zerocopy behavior with a RAP_ALLOW_COW flag that would
then turn the -EBUSY error into a copy.
It's also more complex to implement the COW now, and it would make the
code that really matters harder to review. So it may be preferable to
extend this later in a backwards compatible way with a new
RAP_ALLOW_COW flag.
The current handling of the flags is already written in a way that should
allow backwards compatible extension with RAP_ALLOW_*:
#define RAP_ALLOW_SRC_HOLES (1UL<<0)

SYSCALL_DEFINE4(remap_anon_pages,
		unsigned long, dst_start, unsigned long, src_start,
		unsigned long, len, unsigned long, flags)
[..]
	long err = -EINVAL;
[..]
	if (flags & ~RAP_ALLOW_SRC_HOLES)
		return err;
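
To make the extension argument concrete, here is a standalone userland sketch of the same mask check. It is an illustration under assumptions, not the kernel code: RAP_ALLOW_COW does not exist yet, its bit value (1UL<<1) is made up, and rap_check_flags() plus the two RAP_KNOWN_FLAGS_* masks are hypothetical names. The point is that a kernel rejecting every unknown bit with -EINVAL today lets a later kernel assign a meaning to that bit without changing the behavior seen by existing callers.

```c
#include <errno.h>

/* Flag taken from the excerpt above. */
#define RAP_ALLOW_SRC_HOLES (1UL<<0)
/* Hypothetical future flag; the bit value is an assumption. */
#define RAP_ALLOW_COW       (1UL<<1)

/* Flags understood by the current code and by a hypothetical later one. */
#define RAP_KNOWN_FLAGS_V1  (RAP_ALLOW_SRC_HOLES)
#define RAP_KNOWN_FLAGS_V2  (RAP_ALLOW_SRC_HOLES | RAP_ALLOW_COW)

/*
 * Hypothetical helper mirroring the check in the syscall: any bit
 * outside the known mask is rejected with -EINVAL, otherwise 0.
 */
static long rap_check_flags(unsigned long flags, unsigned long known)
{
	if (flags & ~known)
		return -EINVAL;
	return 0;
}
```

With this shape, userland can probe for the new flag: passing RAP_ALLOW_COW fails with -EINVAL against the V1 mask and is accepted against the V2 mask, while callers passing 0 or RAP_ALLOW_SRC_HOLES see no difference.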