From: Felipe Franciosi
Subject: Re: [RFC v4 PATCH 00/49] Initial support of multi-process qemu - status update
Date: Fri, 20 Dec 2019 16:00:12 +0000

Heya,

> On Dec 20, 2019, at 3:25 PM, Alex Williamson <address@hidden> wrote:
> 
> On Fri, 20 Dec 2019 14:14:33 +0000
> Felipe Franciosi <address@hidden> wrote:
> 
>>> On Dec 20, 2019, at 9:50 AM, Paolo Bonzini <address@hidden> wrote:
>>> 
>>> On 20/12/19 10:47, Stefan Hajnoczi wrote:  
>>>>> If it doesn't create too large a burden to support both, then I think
>>>>> it is very desirable. IIUC, this positions a kernel-based solution as the
>>>>> optimized/optimal path, and a userspace UNIX-socket-based option as the
>>>>> generic "works everywhere" fallback.
>>>> I'm slightly in favor of the kernel implementation because it keeps us
>>>> better aligned with VFIO.  That means solving problems in one place only
>>>> and less reinventing the wheel.  
>>> 
>>> I think there are going to be some differences with VFIO anyway.
>>> 
>>> For example, VFIO currently requires pinning user memory.  Is that a
>>> limitation for muser too?  If so, that would be a big disadvantage; if
>>> not, however, management tools need to learn that muser devices, unlike
>>> other VFIO devices, do not prevent overcommit.
>> 
>> More or less. We pin them today, but I think we don't really have to.
>> I created an issue to look into it:
>> https://github.com/nutanix/muser/issues/28 
>> 
>> In any case, if Qemu is ballooning and calls UNMAP_DMA for memory that
>> has been ballooned out, then we would release it.
> 
> That's exactly the problem with ballooning and vfio: it doesn't unmap
> memory, it just zaps it out of the VM, to be demand-faulted back in
> later.  It's very vCPU-centric.  Memory hotplug is the only case where
> we'll see a memory region get unmapped.
> 
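
(For clarity, the UNMAP_DMA call I meant is the type1
VFIO_IOMMU_UNMAP_DMA ioctl. A minimal sketch of that call follows; the
dma_unmap() helper and its parameters are illustrative, not Qemu's
actual code.)

    #include <linux/vfio.h>
    #include <sys/ioctl.h>
    #include <stdio.h>
    #include <stdint.h>

    /* Tear down a previously established IOVA -> vaddr mapping. */
    static int dma_unmap(int container_fd, uint64_t iova, uint64_t size)
    {
        struct vfio_iommu_type1_dma_unmap unmap = {
            .argsz = sizeof(unmap),
            .iova  = iova,   /* guest-physical address of the region */
            .size  = size,   /* length of the region */
        };

        if (ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap) < 0) {
            perror("VFIO_IOMMU_UNMAP_DMA");
            return -1;
        }
        return 0;
    }
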
>> The reason we keep it pinned is to support libmuser restarts. IIRC,
>> VFIO doesn't need to pin pages for mdev devices (that's the job of the
>> mdev driver on the other end via vfio_pin_pages()). It only keeps the
>> DMA entries in an RB tree.
>> 
>> If my understanding is right, then we can probably just keep the map
>> Qemu registered (without holding the pages) and call vfio_pin_pages()
>> on demand when libmuser restarts.
>> 
>> For context, this is how the DMA memory registration works today:
>> 
>> 1) Qemu calls ioctl(vfio_fd, IOMMU_MAP_DMA, &vm_map);
>> 
>> 2) The iommu driver notifies muser.ko
>> 
>> 3) Muser.ko pins the pages (in get_dma_map(), called from below)
>> (https://github.com/nutanix/muser/blob/master/kmod/muser.c#L711)
> 
> Yikes, it pins every page??  vfio_pin_pages() intends for the vendor
> driver to be much smarter than this :-\  Thanks,

We can't afford a kernel round trip every time we need to translate
GPAs, so that's how we solved it. There's an action item to pin in
groups of 512 (which is the limit we saw in vfio_pin_pages()). Can you
elaborate on the problems with this approach and whether there's
something better we can do?
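
In the meantime, roughly what I have in mind for the batching is
below. A sketch only, based on my reading of the in-kernel
vfio_pin_pages() API; the function and variable names are
illustrative, not the actual muser.ko code:

    #include <linux/kernel.h>
    #include <linux/iommu.h>
    #include <linux/slab.h>
    #include <linux/vfio.h>  /* vfio_pin_pages(), VFIO_PIN_PAGES_MAX_ENTRIES */

    /*
     * Pin a contiguous range of user PFNs in batches of
     * VFIO_PIN_PAGES_MAX_ENTRIES (512 with 4K pages), which is the
     * per-call limit enforced by vfio_pin_pages().
     */
    static int muser_pin_range(struct device *dev, unsigned long first_pfn,
                               unsigned long npages, unsigned long *phys_pfns)
    {
        unsigned long *user_pfns;
        unsigned long done = 0;
        int ret = 0;

        user_pfns = kmalloc_array(VFIO_PIN_PAGES_MAX_ENTRIES,
                                  sizeof(*user_pfns), GFP_KERNEL);
        if (!user_pfns)
            return -ENOMEM;

        while (done < npages) {
            int batch = min_t(unsigned long, npages - done,
                              VFIO_PIN_PAGES_MAX_ENTRIES);
            int i;

            for (i = 0; i < batch; i++)
                user_pfns[i] = first_pfn + done + i;

            ret = vfio_pin_pages(dev, user_pfns, batch,
                                 IOMMU_READ | IOMMU_WRITE,
                                 phys_pfns + done);
            if (ret != batch) {
                ret = ret < 0 ? ret : -EFAULT;
                break;
            }
            done += batch;
            ret = 0;
        }

        kfree(user_pfns);
        return ret;
    }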

F.

> 
> Alex
> 
>> 4) Muser.ko notifies libmuser about the memory registration
>> (The iommu driver context goes to sleep, hence the pinning)
>> 
>> 5) Libmuser wakes up and calls mmap() on muser.ko
>> 
>> 6) Muser.ko inserts the VM memory in libmuser's context
>> (https://github.com/nutanix/muser/blob/master/kmod/muser.c#L543)
>> 
>> 7) Libmuser tells muser.ko that it's done
>> 
>> 8) The muser.ko iommu callback context that was sleeping wakes up
>> 
>> 9) Muser.ko places the memory in a "dma_list" for the mudev and returns.
>> 
>> We could potentially modify the last step to unpin and keep only what
>> we need for a future call to vfio_pin_pages(), but I need to check if
>> that works.
>> 
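
(To make step 1 above concrete, this is roughly what the map call
looks like from userspace. A minimal sketch only; the dma_map()
helper and its parameters are illustrative, not Qemu's actual code.)

    #include <linux/vfio.h>
    #include <sys/ioctl.h>
    #include <stdio.h>
    #include <stdint.h>

    /*
     * Register a VM memory region with the type1 IOMMU backend:
     * iova (guest-physical) -> vaddr (the process' mapping of guest RAM).
     */
    static int dma_map(int container_fd, void *vaddr, uint64_t iova,
                       uint64_t size)
    {
        struct vfio_iommu_type1_dma_map map = {
            .argsz = sizeof(map),
            .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
            .vaddr = (uintptr_t)vaddr,
            .iova  = iova,
            .size  = size,
        };

        if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map) < 0) {
            perror("VFIO_IOMMU_MAP_DMA");
            return -1;
        }
        return 0;
    }
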
>> Cheers,
>> Felipe
>> 
>>> 
>>> Paolo



