From: Stefan Hajnoczi
Subject: Re: RFC: use VFIO over a UNIX domain socket to implement device offloading
Date: Mon, 15 Jun 2020 11:49:29 +0100

On Tue, Jun 09, 2020 at 11:25:41PM -0700, John G Johnson wrote:
> > On Jun 2, 2020, at 8:06 AM, Alex Williamson <alex.williamson@redhat.com> 
> > wrote:
> > 
> > On Wed, 20 May 2020 17:45:13 -0700
> > John G Johnson <john.g.johnson@oracle.com> wrote:
> > 
> >>> I'm confused by VFIO_USER_ADD_MEMORY_REGION vs VFIO_USER_IOMMU_MAP_DMA.
> >>> The former seems intended to provide the server with access to the
> >>> entire GPA space, while the latter indicates an IOVA to GPA mapping of
> >>> those regions.  Doesn't this break the basic isolation of a vIOMMU?
> >>> This essentially says to me "here's all the guest memory, but please
> >>> only access these regions for which we're providing DMA mappings".
> >>> That invites abuse.
> >>> 
> >> 
> >>    The purpose behind separating QEMU into multiple processes is
> >> to provide an additional layer of protection for the infrastructure against
> >> a malign guest, not for the guest against itself, so preventing a server
> >> that has been compromised by a guest from accessing all of guest memory
> >> adds no additional benefit.  We don’t even have an IOMMU in our current
> >> guest model for this reason.
> > 
> > One of the use cases we see a lot with vfio is nested assignment, ie.
> > we assign a device to a VM where the VM includes a vIOMMU, such that
> > the guest OS can then assign the device to userspace within the guest.
> > This is safe to do AND provides isolation within the guest exactly
> > because the device only has access to memory mapped to the device, not
> > the entire guest address space.  I don't think it's just the hypervisor
> > you're trying to protect, we can't assume there are always trusted
> > drivers managing the device.
> > 
> 
>       We intend to support an IOMMU.  The question seems to be whether
> it’s implemented in the server or client.  The current proposal has it
> in the server, à la vhost-user, but we are fine with moving it.

It's challenging to implement a fast and secure IOMMU. The simplest
approach is secure but not fast: add protocol messages for
DMA_READ(iova, length) and DMA_WRITE(iova, buffer, length).
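
To make this concrete, here is a rough sketch of what those two messages
could carry. The struct names and field layout below are only
illustrative, not taken from the RFC's actual wire format:

/*
 * Hypothetical payloads for the DMA_READ/DMA_WRITE messages mentioned
 * above. The client (QEMU) validates the IOVA against the current
 * vIOMMU mappings before servicing each request.
 */
#include <stdint.h>

struct dma_read_req {
    uint64_t iova;      /* IOVA the device wants to read from */
    uint64_t length;    /* number of bytes requested */
};                      /* reply: 'length' bytes of data, or an error */

struct dma_write_req {
    uint64_t iova;      /* IOVA the device wants to write to */
    uint64_t length;    /* number of bytes that follow */
    uint8_t  data[];    /* payload appended to the message */
};

Since every access is mediated by the client, the mapping check happens
on each request, which is where both the security and the per-access
cost come from.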

An issue with file descriptor passing is that it's hard to revoke access
once the file descriptor has been passed. memfd supports sealing with
fcntl(F_ADD_SEALS), but that doesn't revoke existing mmap(PROT_WRITE)
mappings in other processes.
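
For reference, a minimal sketch of the sealing API and where it falls
short (error handling trimmed; the memfd name is arbitrary):

/*
 * Sealing prevents *future* operations on the memfd, but it cannot tear
 * down a writable mapping that a peer process already holds. In fact,
 * F_SEAL_WRITE fails with EBUSY while any shared writable mapping exists.
 */
#define _GNU_SOURCE
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    int fd = memfd_create("guest-ram", MFD_ALLOW_SEALING);
    if (fd < 0 || ftruncate(fd, 4096) < 0) {
        perror("memfd setup");
        return 1;
    }

    /* ... fd gets passed to the server over the UNIX socket, which may
     * then mmap() it with PROT_WRITE ... */

    if (fcntl(fd, F_ADD_SEALS,
              F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0) {
        perror("F_ADD_SEALS");  /* EBUSY if a writable mapping exists */
    }

    close(fd);
    return 0;
}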

Memory Protection Keys don't seem to be useful here either and their
availability is limited (see pkeys(7)).

One crazy idea is to use KVM as a sandbox for running the device emulation
and let the vIOMMU control the page tables instead of the guest. That
way the hardware MMU provides memory translation, but I think this is
impractical because the guest environment is too different from the
Linux userspace environment.

As a starting point, adding DMA_READ/DMA_WRITE messages would provide the
functionality and the security. Unfortunately every DMA access then becomes
a protocol round trip, so performance will suffer.

Stefan
