
Re: [Qemu-devel] rfc: vhost user enhancements for vm2vm communication


From: Nakajima, Jun
Subject: Re: [Qemu-devel] rfc: vhost user enhancements for vm2vm communication
Date: Tue, 6 Oct 2015 14:42:34 -0700

Hi Michael,

Looks like the discussions tapered off, but do you have a plan to
implement this if people are eventually fine with it? We want to
extend this to support multiple VMs.

On Mon, Aug 31, 2015 at 11:35 AM, Nakajima, Jun <address@hidden> wrote:
> On Mon, Aug 31, 2015 at 7:11 AM, Michael S. Tsirkin <address@hidden> wrote:
>> Hello!
>> During the KVM forum, we discussed supporting virtio on top
>> of ivshmem. I have considered it, and came up with an alternative
>> that has several advantages over that - please see below.
>> Comments welcome.
>
> Hi Michael,
>
> I like this, and it should be able to achieve what I presented at KVM
> Forum (vhost-user-shmem).
> Comments below.
>
>>
>> -----
>>
>> Existing solutions to userspace switching between VMs on the
>> same host are vhost-user and ivshmem.
>>
>> vhost-user works by mapping memory of all VMs being bridged into the
>> switch memory space.
>>
>> By comparison, ivshmem works by exposing a shared region of memory
>> to all VMs.
>> VMs are required to use this region to store packets. The switch only
>> needs access to this region.
>>
>> Another difference between vhost-user and ivshmem surfaces when polling
>> is used. With vhost-user, the switch is required to handle
>> data movement between VMs; if polling is used, this means that one host
>> CPU needs to be sacrificed for this task.
>>
>> This is easiest to understand when one of the VMs is
>> used with VF pass-through. This can be schematically shown below:
>>
>> +-- VM1 --------------+            +---VM2-----------+
>> | virtio-pci          +-vhost-user-+ virtio-pci -- VF | -- VFIO -- IOMMU -- NIC
>> +---------------------+            +-----------------+
>>
>>
>> With ivshmem in theory communication can happen directly, with two VMs
>> polling the shared memory region.
>>
>>
>> I won't spend time listing advantages of vhost-user over ivshmem.
>> Instead, having identified two advantages of ivshmem over vhost-user,
>> below is a proposal to extend vhost-user to gain the advantages
>> of ivshmem.
>>
>>
>> 1. virtio in the guest can be extended to allow support
>> for IOMMUs. This provides the guest with full flexibility
>> about which memory is readable or writable by each device.
>
> I assume that you meant VFIO only for virtio by "use of VFIO".  To get
> VFIO working for general direct I/O (including VFs) in guests, as you
> know, we need to virtualize the IOMMU (e.g. VT-d) and the interrupt
> remapping table on x86 (i.e. nested VT-d).
>
>> By setting up a virtio device for each other VM we need to
>> communicate with, the guest gets full control of its security: from
>> mapping all memory (like with current vhost-user), to mapping only
>> the buffers used for networking (like ivshmem), to transient
>> mappings for the duration of a data transfer only.
>
> And I think that we can use VMFUNC to have such transient mappings.
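
To make the granularity point concrete, here is a minimal sketch (my
assumption of how a guest driver could behave once virtio honors a
virtual IOMMU): only buffers mapped through the standard Linux DMA API
become visible to the peer, and they can be unmapped again right after
the transfer. The virtio/vIOMMU integration is the part that still
needs the new work.

#include <linux/dma-mapping.h>
#include <linux/device.h>
#include <linux/errno.h>

/*
 * Sketch only: map a packet buffer for the duration of one transfer.
 * With a virtual IOMMU in place, the peer can reach *only* this
 * mapping, not the rest of guest memory.
 */
static int xmit_one_buffer(struct device *dev, void *buf, size_t len)
{
        dma_addr_t iova;

        iova = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
        if (dma_mapping_error(dev, iova))
                return -ENOMEM;

        /* ... post 'iova' in a virtqueue descriptor and kick ... */

        /* In a real driver this would happen in the TX completion path. */
        dma_unmap_single(dev, iova, len, DMA_TO_DEVICE);
        return 0;
}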
>
>> This also allows use of VFIO within guests, for improved
>> security.
>>
>> vhost-user would need to be extended to send the
>> mappings programmed by the guest IOMMU.
>
> Right. We need to think about cases where other VMs (VM3, etc.) join
> the group or some existing VM leaves.
> PCI hot-plug should work there (as you point out at "Advantages over
> ivshmem" below).
>
>>
>> 2. qemu can be extended to serve as a vhost-user client:
>> it would receive remote VM mappings over the vhost-user protocol and
>> map them into another VM's memory.
>> This mapping can take, for example, the form of
>> a BAR of a PCI device, which I'll call here vhost-pci -
>> with bus addresses allowed
>> by VM1's IOMMU mappings being translated into
>> offsets within this BAR in VM2's physical
>> memory space.
>
> I think it's sensible.
>
>>
>> Since the translation can be a simple one, VM2
>> can perform it within its vhost-pci device driver.
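
Agreed that the translation is simple. Assuming the vhost-pci device
keeps a small table of the windows it has been told about (names below
are hypothetical), the lookup in VM2's driver could be as little as:

#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one VM1 IOMMU mapping exposed through the
 * vhost-pci BAR seen by VM2. */
struct vhost_pci_window {
        uint64_t iova;          /* bus address used in VM1's descriptors */
        uint64_t size;          /* size of the mapping */
        uint64_t bar_offset;    /* offset of the mapping inside the BAR */
};

/* Translate a VM1 bus address into a pointer inside VM2's mapped BAR.
 * 'bar' is the mapped BAR base; returns NULL if the range is not
 * covered by any window, i.e. not allowed by VM1's IOMMU. */
static void *vhost_pci_translate(uint8_t *bar,
                                 const struct vhost_pci_window *w, size_t n,
                                 uint64_t iova, uint64_t len)
{
        for (size_t i = 0; i < n; i++) {
                if (iova >= w[i].iova &&
                    iova + len <= w[i].iova + w[i].size)
                        return bar + w[i].bar_offset + (iova - w[i].iova);
        }
        return NULL;
}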
>>
>> While this setup would be the most useful with polling,
>> VM1's ioeventfd can also be mapped to
>> VM2's irqfd, and vice versa, so that the VMs
>> can trigger interrupts to each other without the need
>> for a helper thread on the host.
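
And the ioeventfd/irqfd pairing is already expressible with the
existing KVM API. Roughly (error paths trimmed, and assuming one
management process holds both VM fds):

#include <stdint.h>
#include <linux/kvm.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>

/* Wire a doorbell register in VM1 to an interrupt (GSI) in VM2: a write
 * by VM1 to 'addr' signals the eventfd, which KVM injects into VM2 as
 * an irq - no host thread in the middle. */
static int wire_doorbell(int vm1_fd, int vm2_fd, uint64_t addr, uint32_t gsi)
{
        int efd = eventfd(0, EFD_CLOEXEC);
        struct kvm_ioeventfd io = { .addr = addr, .len = 4, .fd = efd };
        struct kvm_irqfd irq = { .fd = efd, .gsi = gsi };

        if (efd < 0)
                return -1;
        if (ioctl(vm1_fd, KVM_IOEVENTFD, &io) < 0)
                return -1;
        return ioctl(vm2_fd, KVM_IRQFD, &irq);
}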
>>
>>
>> The resulting channel might look something like the following:
>>
>> +-- VM1 --------------+  +---VM2-----------+
>> | virtio-pci -- iommu +--+ vhost-pci -- VF | -- VFIO -- IOMMU -- NIC
>> +---------------------+  +-----------------+
>>
>> Comparing the two diagrams, a vhost-user thread on the host is
>> no longer required, reducing the host CPU utilization when
>> polling is active.  At the same time, VM2 cannot access all of VM1's
>> memory - it is limited by the IOMMU configuration set up by VM1.
>>
>>
>> Advantages over ivshmem:
>>
>> - more flexibility: endpoint VMs do not have to place data at any
>>   specific locations to use the device, which in practice likely
>>   means fewer data copies.
>> - better standardization/code reuse:
>>   virtio changes within guests would be fairly easy to implement
>>   and would also benefit other backends besides vhost-user;
>>   standard hotplug interfaces can be used to add and remove these
>>   channels as VMs are added or removed.
>> - migration support:
>>   it's easy to implement since ownership of memory is well defined.
>>   For example, during migration VM2 can notify the hypervisor of VM1
>>   by updating the dirty bitmap each time it writes into VM1's memory.
>
> Also, the ivshmem functionality could be implemented with this proposal:
> - vswitch (or some VM) allocates memory regions in its address space, and
> - it arranges for the IOMMU mappings on the VMs to be translated into those regions
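
On the migration bullet above: since VM2 knows exactly which VM1 pages
it writes through the BAR, the dirty tracking really is trivial.
Purely illustrative - 'log' stands for the bitmap shared with VM1's
hypervisor, indexed by guest-physical page number:

#include <stdint.h>

#define VHOST_PCI_PAGE_SHIFT 12

/* Mark one VM1 guest-physical page dirty in the shared log bitmap
 * whenever VM2 writes into it through the vhost-pci BAR. */
static void log_dirty_page(uint64_t *log, uint64_t guest_addr)
{
        uint64_t page = guest_addr >> VHOST_PCI_PAGE_SHIFT;

        /* Set the bit atomically so concurrent writers don't race. */
        __atomic_fetch_or(&log[page / 64], 1ULL << (page % 64),
                          __ATOMIC_RELAXED);
}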
>
>>
>> Thanks,
>>
>> --
>> MST



-- 
Jun
Intel Open Source Technology Center


