Re: [RFC v4 PATCH 00/49] Initial support of multi-process qemu - status update
Thu, 2 Jan 2020 11:03:22 +0000
> On Jan 2, 2020, at 10:42 AM, Stefan Hajnoczi <address@hidden> wrote:
> On Fri, Dec 20, 2019 at 10:22:37AM +0000, Daniel P. Berrangé wrote:
>> On Fri, Dec 20, 2019 at 09:47:12AM +0000, Stefan Hajnoczi wrote:
>>> On Thu, Dec 19, 2019 at 12:55:04PM +0000, Daniel P. Berrangé wrote:
>>>> On Thu, Dec 19, 2019 at 12:33:15PM +0000, Felipe Franciosi wrote:
>>>>>> On Dec 19, 2019, at 11:55 AM, Stefan Hajnoczi <address@hidden> wrote:
>>>>>> On Tue, Dec 17, 2019 at 10:57:17PM +0000, Felipe Franciosi wrote:
>>>>>>>> On Dec 17, 2019, at 5:33 PM, Stefan Hajnoczi <address@hidden> wrote:
>>>>>>>> On Mon, Dec 16, 2019 at 07:57:32PM +0000, Felipe Franciosi wrote:
>>>>>>>>>> On 16 Dec 2019, at 20:47, Elena Ufimtseva <address@hidden> wrote:
>>>>>>>>>> On Fri, Dec 13, 2019 at 10:41:16AM +0000, Stefan Hajnoczi wrote:
>>>>> To be clear: I'm very happy to have a userspace-only option for this,
>>>>> I just don't want to ditch the kernel module (yet, anyway). :)
>>>> If it doesn't create too large a burden to support both, then I think
>>>> it is very desirable. IIUC, this treats a kernel-based solution as the
>>>> optimized/optimal solution, and a userspace UNIX socket based option as
>>>> the generic "works everywhere" fallback solution.
>>> I'm slightly in favor of the kernel implementation because it keeps us
>>> better aligned with VFIO. That means solving problems in one place only
>>> and less reinventing the wheel.
>>> Knowing that a userspace implementation is possible is a plus though.
>>> Maybe that option will become attractive in the future and someone will
>>> develop it. In fact, a userspace implementation may be a cool Google
>>> Summer of Code project idea that I'd like to co-mentor.
>> If it is technically viable as an approach, then I think we should be
>> treating a fully unprivileged muser-over-UNIX socket as a higher priority
>> than just "maybe a GSoC student will want to do it".
>> Libvirt is getting a strong message from the KubeVirt project that they
>> want to be running both libvirtd and QEMU fully unprivileged. This allows
>> their containers to be unprivileged. Anything that requires privileges
>> requires jumping through extra hoops, writing custom code in KubeVirt to
>> do things outside libvirt in side-loaded privileged containers, and this
>> limits how and where those features can be used.
> Okay this makes sense.
> There needs to be a consensus on whether to go with a qdev-over-socket
> approach that is QEMU-specific and strongly discourages third-party
> device distribution or a muser-over-socket approach that offers a stable
> API for VMM interoperability and third-party device distribution.
The reason I dislike yet another offloading protocol (i.e. there is
vhost, there is vfio, and then there would be qdev-over-socket) is
that we keep reinventing the wheel. I very much prefer picking
something solid (e.g. VFIO) and continuing to invest in it.
> Interoperability between VMMs and also DPDK/SPDK is important because
> they form today's open source virtualization community. No one project
> or codebase covers all use cases or interesting developments. If we are
> short-sighted and prevent collaboration then we'll become isolated.
> On the other hand, I'm personally opposed to proprietary vendors that
> contribute very little to open source. We make that easier by offering
> a stable API for third-party devices. A stable API discourages open
> source contributions while allowing proprietary vendors to benefit from
> the work that the open source community is doing.
I appreciate the concern. However, my opinion is that vendors cannot
be stopped by providing them with unstable APIs. There are plenty of
examples where projects were forked and maintained separately to keep
certain things under control and that is bad for everyone. The
community doesn't get contributions back, and vendors have extra pain
to maintain the forks. Furthermore, service vendors will always get
away with murder by copying whatever they like and using it however they
please (since they are not sharing the software).
I would rather look at examples like KVM. It's a relatively stable API
with several proprietary users. Nevertheless, we see loads of
contributions to it (perhaps fewer than we would want, but plenty).
> One way to choose a position is to balance up the open source vs
> proprietary applications of a stable API. At this point in time I think
> the DPDK/SPDK and rust-vmm communities bring enough to the table that
> it's worth fostering collaboration through a stable API. The benefit of
> having the stable API is large enough that the disadvantage of making
> life easier for proprietary vendors can be accepted.
I agree with you, as per the reasoning above.
> This is just a more elaborate explanation for the "the cat is out of the
> bag" comments that have already been made on licensing. Does anyone
> still disagree or want to discuss further?
> If there is agreement that a stable API is okay then I think the
> practical way to do this is to first merge a cleaned-up version of
> multi-process QEMU as an unstable experimental API. Once it's being
> tested and used we can write a protocol specification and publish it as
> a stable interface when the spec has addressed most use cases.
> Does this sound good?
In that case, wouldn't it be preferable to revive our proposal from
Edinburgh (KVM Forum 2018)? Our prototypes moved more of the QEMU VFIO
code to "common" and added a "user" backend underneath it, similar to
how vhost-user-scsi moved some of vhost-scsi to vhost-scsi-common and
added vhost-user-scsi. It was PCI-centric, but it doesn't have to
be. The other side can be implemented in libmuser to make this easier.
I even recall highlighting that vhost-user could be moved underneath
that later, greatly simplifying lots of other QEMU code.
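
To make the layering concrete, here is a rough sketch in Python (for brevity only). Everything in it is hypothetical: the `Backend` interface, the wire format, and all names are made up for illustration and are not the Edinburgh prototype, the VFIO API, or libmuser. It only shows the shape of the idea: "common" code written once against a backend interface, with one backend touching the device regions directly and a "user" backend forwarding the same operations over a UNIX socket.

```python
import socket
import struct
from abc import ABC, abstractmethod

# Made-up wire format: op, region index, offset, length (no padding, 17 bytes).
HDR = struct.Struct("<BIQI")
OP_READ, OP_WRITE = 0, 1


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed")
        buf += chunk
    return bytes(buf)


class Backend(ABC):
    """Hypothetical interface that the shared ("common") code is written against."""

    @abstractmethod
    def read(self, region, offset, size): ...

    @abstractmethod
    def write(self, region, offset, data): ...


class LocalBackend(Backend):
    """Stand-in for the in-kernel path: regions are accessed directly."""

    def __init__(self, regions):
        self.regions = regions

    def read(self, region, offset, size):
        return bytes(self.regions[region][offset:offset + size])

    def write(self, region, offset, data):
        self.regions[region][offset:offset + len(data)] = data


class UserBackend(Backend):
    """Same operations, forwarded to a device process over a UNIX socket."""

    def __init__(self, sock, pump):
        self.sock = sock
        self.pump = pump  # lets the device side serve one request (no threads here)

    def read(self, region, offset, size):
        self.sock.sendall(HDR.pack(OP_READ, region, offset, size))
        self.pump()
        return recv_exact(self.sock, size)

    def write(self, region, offset, data):
        self.sock.sendall(HDR.pack(OP_WRITE, region, offset, len(data)) + data)
        self.pump()
        recv_exact(self.sock, 1)  # one-byte acknowledgement


def serve_one(sock, regions):
    """Device side: handle a single request against its own regions."""
    op, region, offset, length = HDR.unpack(recv_exact(sock, HDR.size))
    if op == OP_READ:
        sock.sendall(bytes(regions[region][offset:offset + length]))
    else:
        regions[region][offset:offset + length] = recv_exact(sock, length)
        sock.sendall(b"\x00")


def config_read16(backend, offset):
    """'Common' code: identical regardless of which backend sits underneath."""
    return int.from_bytes(backend.read(0, offset, 2), "little")


def demo():
    results = []
    for kind in ("local", "user"):
        regions = {0: bytearray(16)}
        socks = ()
        if kind == "local":
            backend = LocalBackend(regions)
        else:
            socks = socket.socketpair()  # AF_UNIX pair on Unix-like systems
            vmm, dev = socks
            backend = UserBackend(vmm, lambda: serve_one(dev, regions))
        backend.write(0, 4, b"\x34\x12")
        results.append(config_read16(backend, 4))
        for s in socks:
            s.close()
    return results
```

Calling `demo()` runs the same write and 16-bit read through both backends. In a real design the device side would of course be a separate, unprivileged process rather than a function called in-line, and the protocol would need to cover interrupts, DMA mappings, and so on.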