From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] snabbswitch integration with QEMU for userspace ethernet I/O
Date: Wed, 29 May 2013 16:21:43 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Wed, May 29, 2013 at 12:08:59PM +0300, Michael S. Tsirkin wrote:
> On Wed, May 29, 2013 at 09:49:29AM +0200, Stefan Hajnoczi wrote:
> > On Tue, May 28, 2013 at 08:17:42PM +0300, Michael S. Tsirkin wrote:
> > > On Tue, May 28, 2013 at 12:00:38PM -0500, Anthony Liguori wrote:
> > > > Julian Stecklina <address@hidden> writes:
> > > > 
> > > > > On 05/28/2013 12:10 PM, Luke Gorrie wrote:
> > > > >> On 27 May 2013 11:34, Stefan Hajnoczi <address@hidden
> > > > >> <mailto:address@hidden>> wrote:
> > > > >> 
> > > > >>     vhost_net is about connecting a virtio-net speaking process to a
> > > > >>     tun-like device.  The problem you are trying to solve is
> > > > >>     connecting a virtio-net speaking process to Snabb Switch.
> > > > >> 
> > > > >> 
> > > > >> Yep!
> > > > >
> > > > > Since I am on a similar path as Luke, let me share another idea.
> > > > >
> > > > > What about extending qemu to allow PCI device models to be
> > > > > implemented in another process?
> > > > 
> > > > We aren't going to support any interface that enables out-of-tree
> > > > devices.  This is just plugins in a different form with even more
> > > > downsides.  You cannot easily keep track of dirty memory info, and the
> > > > guest-physical-to-host address translation is difficult to keep in sync
> > > > (imagine the complexity of memory hotplug).
> > > > 
> > > > Basically, it's easy to hack up but extremely hard to do something that
> > > > works correctly overall.
> > > > 
> > > > There isn't a compelling reason to implement something like this other
> > > > than avoiding getting code into QEMU.  Best to just submit your device
> > > > to QEMU for inclusion.
> > > > 
> > > > If you want to avoid copying in a vswitch, better to use something like
> > > > vmsplice as I outlined in another thread.
> > > > 
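(Purely as an illustration of the mechanism being referred to here, not
Anthony's actual proposal from the other thread: a minimal sketch of
handing a user buffer to the kernel with vmsplice(2), which a consumer
could then splice(2) out to e.g. a tap fd.  The packet buffer is made up
and error handling is abbreviated.)

#define _GNU_SOURCE          /* for vmsplice() and SPLICE_F_* flags */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
    int pipefd[2];
    static char pkt[2048];              /* stand-in for a packet buffer */
    struct iovec iov = { .iov_base = pkt, .iov_len = sizeof(pkt) };
    ssize_t n;

    memset(pkt, 0xab, sizeof(pkt));
    if (pipe(pipefd) < 0) {
        perror("pipe");
        return EXIT_FAILURE;
    }

    /* Hand the pages to the pipe instead of copying them; SPLICE_F_GIFT
     * means the kernel may keep referencing them, so the caller must not
     * reuse the buffer afterwards. */
    n = vmsplice(pipefd[1], &iov, 1, SPLICE_F_GIFT);
    if (n < 0) {
        perror("vmsplice");
        return EXIT_FAILURE;
    }
    printf("queued %zd bytes into the pipe without a copy\n", n);

    /* A consumer could now move the data on without copying, e.g.:
     * splice(pipefd[0], NULL, tap_fd, NULL, n, SPLICE_F_MOVE);      */
    return EXIT_SUCCESS;
}
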
> > > > > This is not as hard as it may sound.
> > > > > qemu would open a domain socket to this process and map VM memory over
> > > > > to the other side. This can be accomplished by having file descriptors
> > > > > in qemu to VM memory (reusing -mem-path code) and passing those over
> > > > > the domain socket. The other side can then just mmap them. The socket
> > > > > would also be used for configuration and I/O by the guest on the PCI
> > > > > I/O/memory regions. You could also use this to do IRQs or use
> > > > > eventfds, whatever works better.
> > > > >
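(Just to make the fd-passing step concrete, and explicitly not existing
QEMU code: the backing fd for guest RAM travels over the UNIX domain
socket as SCM_RIGHTS ancillary data, and the device/switch process then
mmap()s it MAP_SHARED.  The helper names are invented and error handling
is minimal.)

#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Sender side (qemu): pass the -mem-path backing fd to the peer. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctrl.buf, .msg_controllen = sizeof(ctrl.buf),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receiver side (device process): take the fd and map guest RAM. */
static void *recv_and_map(int sock, size_t ram_size)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctrl.buf, .msg_controllen = sizeof(ctrl.buf),
    };
    struct cmsghdr *cmsg;
    int fd;

    if (recvmsg(sock, &msg, 0) != 1) {
        return MAP_FAILED;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_type != SCM_RIGHTS) {
        return MAP_FAILED;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));

    /* MAP_SHARED: guest stores become visible here and vice versa,
     * which is what makes zero-copy packet access possible. */
    return mmap(NULL, ram_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
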
> > > > > To have a zero copy userspace switch, the switch would offer
> > > > > virtio-net devices to any qemu that wants to connect to it and
> > > > > implement the complete device logic itself. Since it has access to
> > > > > all guest memory, it can just do memcpy for packet data. Of course,
> > > > > this only works for 64-bit systems, because you need vast amounts of
> > > > > virtual address space. In my experience, doing this in userspace is
> > > > > _way less painful_.
> > > > >
> > > > > If you can get away with polling in the switch, the overhead of doing
> > > > > all this in userspace is zero. And as long as you can rate-limit
> > > > > explicit notifications over the socket, even that overhead should be
> > > > > okay.
> > > > >
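(Again only a sketch of the rate-limiting idea, with invented names
(process_rings(), kick_fd), not an existing interface: the switch
busy-polls the rings while there is traffic and only falls back to the
eventfd when idle, so a burst of guest kicks collapses into one wakeup.)

#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/* Provided elsewhere: moves packets, returns nonzero if it did any work. */
int process_rings(void);

void switch_loop(int kick_fd)
{
    struct pollfd pfd = { .fd = kick_fd, .events = POLLIN };
    uint64_t cnt;

    for (;;) {
        /* Busy path: keep polling the vrings while traffic is flowing;
         * no syscalls and no notifications, so the overhead is zero. */
        while (process_rings()) {
            /* keep going */
        }

        /* Idle path: re-enable guest notifications (omitted) and block
         * until the guest kicks us again.  A single read() drains any
         * number of coalesced kicks, which bounds the notification rate. */
        poll(&pfd, 1, -1);
        read(kick_fd, &cnt, sizeof(cnt));
    }
}
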
> > > > > Opinions?
> > > > 
> > > > I don't see any compelling reason to do something like this.  It's
> > > > jumping through a tremendous number of hoops to avoid putting code that
> > > > belongs in QEMU in tree.
> > > > 
> > > > Regards,
> > > > 
> > > > Anthony Liguori
> > > > 
> > > > >
> > > > > Julian
> > > 
> > > OTOH an in-tree device that runs in a separate process would
> > > be useful e.g. for security.
> > > For example, we could limit a virtio-net device process
> > > to only access tap and vhost files.
> > 
> > Limiting it to tap or vhost files only is good for security.  I'm not sure
> > it has many advantages over a QEMU process under SELinux though.
> 
> At the moment SELinux necessarily gives QEMU rights to
> e.g. access the filesystem.
> This process would only get access to tap and vhost.
> 
> We can also run it as a different user.
> Defence in depth.
> 
> We can also limit e.g. the CPU of this process aggressively
> (as it's not doing anything on the data path).
> 
> I could go on.
> 
> And it's really easy too, until you want to use it in production,
> at which point you need to cover lots of
> nasty details like hotplug and migration.
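
(Purely to illustrate the kind of confinement Michael describes, not an
existing QEMU or libvirt mechanism; the uid and binary path are made up:
a launcher drops the device process to a dedicated unprivileged user and
gives it SCHED_IDLE so it only gets CPU time nothing else wants.)

#define _GNU_SOURCE           /* for SCHED_IDLE */
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    struct sched_param sp = { .sched_priority = 0 };

    /* Aggressively deprioritize; the policy is inherited across exec. */
    if (sched_setscheduler(0, SCHED_IDLE, &sp) < 0) {
        perror("sched_setscheduler");
    }

    /* Drop to an unprivileged uid/gid reserved for this device process
     * (65534 here is just a placeholder). */
    if (setgid(65534) < 0 || setuid(65534) < 0) {
        perror("drop privileges");
        return 1;
    }

    execl("/usr/libexec/virtio-net-dev", "virtio-net-dev", (char *)NULL);
    perror("execl");
    return 1;
}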

I think there are diminishing returns.  Once QEMU is isolated so that it
cannot open arbitrary files and only has access to the resources granted
by the management tool on startup, I'm not sure it's worth the
complexity and performance cost of splitting the model up into even
smaller pieces.  IMO there isn't a trust boundary here that's worth
isolating (compare with sshd privilege separation, where separate uids
really make sense and are necessary; giving QEMU multiple uids that lack
the capabilities to do much doesn't win much over the SELinux setup).

> > Obviously when the switch process has shared memory access to multiple
> > guests' RAM, the security is worse than a QEMU process solution but
> > better than a vhost kernel solution.
> > So the security story is not a clear win.
> > 
> > Stefan
> 
> How exactly you pass packets between guest and host is very unlikely to
> affect your security in a meaningful way.
> 
> Except, if you lose networking, or if it's just slow beyond any measure,
> you are suddenly more secure against network-based attacks.

The fact that a single switch process has shared memory access to all
guests' RAM is critical.  If the switch process is exploited, then that
exposes other guests' data!  (Think of a multi-tenant host with guests
belonging to different users.)

Stefan


