qemu-devel
Re: [Qemu-devel] [PATCH v0 0/7] Background snapshots


From: Andrea Arcangeli
Subject: Re: [Qemu-devel] [PATCH v0 0/7] Background snapshots
Date: Wed, 25 Jul 2018 16:04:56 -0400
User-agent: Mutt/1.10.1 (2018-07-13)

On Wed, Jul 25, 2018 at 08:17:37PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (address@hidden) wrote:
> > On Fri, Jun 29, 2018 at 12:53:59PM +0100, Dr. David Alan Gilbert wrote:
> > > * Denis Plotnikov (address@hidden) wrote:
> > > > The patch set adds the ability to make external snapshots while VM is 
> > > > running.
> > > 
> > > cc'ing in Andrea since this uses sigsegv's to avoid userfault-wp that
> > > isn't there yet.
> > > 
> > > Hi Denis,
> > >   How robust are you finding this SEGV based trick; for example what
> > > about things like the kernel walking vhost queues or similar kernel
> > > nasties?
> > 
> > (I'm commenting on this old series to keep the discussion together)
> > 
> > If we want to make this series really work for people, we should
> > possibly need to know whether it could work with vhost (otherwise we
> > might need to go back to userfaultfd write-protection).
> > 
> > I dug a bit into the vhost-net IO; it should be using two ways to
> > write to guest memory:
> > 
> > - copy_to_user(): this should possibly still be able to be captured by
> >   mprotect() (after some confirmation from Paolo, but still we'd
> >   better try it out)
> 
> What confuses me here is who is going to get the signal from this and
> how we recover from the signal - or does it come back as an error
> on the vhost fd somehow?

The problem is that vhost-net would have to start handling every such
fault manually, by trapping copy_to_user() copying less than the full
buffer (a non-zero return) or put_user() returning -EFAULT.

Those errors would then need to be forwarded by vhost-net to qemu
userland, so that qemu can copy the data out and then call mprotect().
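
To illustrate what "trapping" means here, below is a minimal userland
sketch (not vhost-net code). It assumes Linux and uses read(2) on a
pipe as a stand-in for any kernel path that does copy_to_user() into
the protected range. The function names are made up for the example.
A store from userland reaches the SIGSEGV handler, while the kernel
write only fails with EFAULT and no signal is delivered.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile sig_atomic_t segv_hits;  /* faults seen by the handler */
static char *wp_page;                    /* the write-protected page */
static size_t wp_len;

/* SIGSEGV handler: stands in for "save the page, then unprotect it". */
static void on_segv(int sig, siginfo_t *si, void *uctx)
{
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t start = (uintptr_t)wp_page;

    (void)sig;
    (void)uctx;
    if (addr < start || addr >= start + wp_len)
        _exit(1);                        /* unrelated crash: give up */
    segv_hits++;
    if (mprotect(wp_page, wp_len, PROT_READ | PROT_WRITE) != 0)
        _exit(1);
}

/* Map one page, touch it, install the handler, write-protect the page. */
static int setup_wp_page(void)
{
    struct sigaction sa;

    wp_len = (size_t)sysconf(_SC_PAGESIZE);
    wp_page = mmap(NULL, wp_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (wp_page == MAP_FAILED)
        return -1;
    memset(wp_page, 0, wp_len);

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return mprotect(wp_page, wp_len, PROT_READ);
}

/* Userland store: returns how many times the handler ran. */
int user_write_hits(void)
{
    int hits;

    if (setup_wp_page() != 0)
        return -1;
    segv_hits = 0;
    *(volatile char *)wp_page = 'u';     /* faults, handler unprotects, retried */
    hits = segv_hits;
    munmap(wp_page, wp_len);
    return hits;
}

/* Kernel store via read(2): returns errno of the read, reports handler hits. */
int kernel_write_errno(int *hits)
{
    int fds[2], err = 0;

    assert(hits != NULL);
    if (setup_wp_page() != 0 || pipe(fds) != 0)
        return -1;
    if (write(fds[1], "k", 1) != 1)
        return -1;
    segv_hits = 0;
    if (read(fds[0], wp_page, 1) < 0)    /* copy_to_user() into the WP page */
        err = errno;
    *hits = segv_hits;
    close(fds[0]);
    close(fds[1]);
    munmap(wp_page, wp_len);
    return err;
}
```

Calling user_write_hits() and then kernel_write_errno() from a small
main() shows the asymmetry: in the second case the caller of read(2)
is the only one who learns about the fault, which is the position
vhost-net would be in.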

That's not conceptually different from having uffd-wp send the
message, except that uffd-wp requires zero changes to vhost-net and
to every other piece of kernel code that may have to write to the
write-protected memory.

The uffd-wp model may look like a wishlist feature, something akin to
an optimization, but without it the sigsegv model falls apart as soon
as the WP fault is triggered by kernel code, and it requires all kinds
of ad-hoc changes just for this single feature. uffd-wp has other
benefits too: it makes the whole thing reliable, because it does not
increase the number of vmas in use during the snapshot. Finally, it is
faster as well, with no mmap_sem for reading and no sigsegv signals.
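
The vma growth with mprotect is easy to observe from userland. A rough
sketch, assuming Linux with /proc mounted (the helper names are
invented for the example): protect a whole anonymous mapping, then
lift the protection on scattered single pages the way a sigsegv
handler would, and count the lines in /proc/self/maps before and
after.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count lines (one per vma) in /proc/self/maps, using only a stack buffer. */
long count_vmas(void)
{
    char buf[4096];
    long lines = 0;
    ssize_t n, i;
    int fd = open("/proc/self/maps", O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        for (i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;
    close(fd);
    return lines;
}

/*
 * Write-protect an npages mapping as a whole, then unprotect every other
 * page as if the guest had written to it. Returns how many vmas were added.
 */
long vma_growth_after_unprotect(size_t npages)
{
    size_t psz = (size_t)sysconf(_SC_PAGESIZE);
    long before, after;
    char *base;
    size_t i;

    assert(npages >= 3);
    base = mmap(NULL, npages * psz, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;
    if (mprotect(base, npages * psz, PROT_READ) != 0)
        return -1;                       /* snapshot start: all pages WP */

    before = count_vmas();
    for (i = 1; i < npages; i += 2)      /* scattered "guest writes" */
        if (mprotect(base + i * psz, psz, PROT_READ | PROT_WRITE) != 0)
            return -1;
    after = count_vmas();

    munmap(base, npages * psz);
    return (before < 0 || after < 0) ? -1 : after - before;
}
```

Each page unprotected in the middle of a protected range splits the
surrounding vma, so the count grows roughly with the number of
scattered pages touched. With a large guest that can run into the
vm.max_map_count limit.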

The non-cooperative features got merged first because there was a lot
of activity on the kernel side on that front, but I think this is an
ideal time to nail down the remaining issues in uffd-wp. I believe
that is time better spent than trying to emulate it with sigsegv and
changing all drivers to send new events, specific to the sigsegv
handling, down to qemu. We considered the sigsegv approach before
doing uffd for postcopy too, but overall it's unreliable and more work
(not a single change to the KVM code was needed to handle postcopy
with uffd, and here it should be the same).

Thanks,
Andrea


