qemu-devel

Re: [PATCH RFC v2 00/12] virtio-net: add support for SR-IOV emulation


From: Akihiko Odaki
Subject: Re: [PATCH RFC v2 00/12] virtio-net: add support for SR-IOV emulation
Date: Tue, 12 Dec 2023 18:34:59 +0900
User-agent: Mozilla Thunderbird

On 2023/12/12 13:12, Jason Wang wrote:
On Mon, Dec 11, 2023 at 4:29 PM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:

On 2023/12/11 16:26, Jason Wang wrote:
On Mon, Dec 11, 2023 at 1:30 PM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:

On 2023/12/11 11:52, Jason Wang wrote:
On Sun, Dec 10, 2023 at 12:06 PM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:

Introduction
------------

This series is based on the RFC series submitted by Yui Washizu[1].
See also [2] for the context.

This series enables SR-IOV emulation for virtio-net. It is useful
for testing SR-IOV support in the guest, or for exposing several vDPA
devices in a VM. vDPA devices can also provide an L2 switching feature
for offloading, though allowing the guest to configure such a feature
is out of scope.

The PF side code resides in virtio-pci. The VF side code resides in
the PCI common infrastructure, but it is restricted to work only for
virtio-net-pci because of lack of validation.

User Interface
--------------

A user can configure an SR-IOV-capable virtio-net device by adding
virtio-net-pci functions to a bus. Below is a command line example:
     -netdev user,id=n -netdev user,id=o
     -netdev user,id=p -netdev user,id=q
     -device pcie-root-port,id=b
     -device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f
     -device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f
     -device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f
     -device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f

The VFs specify the paired PF with the "sriov-pf" property. The PF must
be added after all VFs. It is the user's responsibility to ensure that
the VFs have function numbers larger than that of the PF, and that the
function numbers have a consistent stride.
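
The constraint on function numbers can be checked mechanically. As a
minimal sketch (this is not QEMU code; `derive_vf_layout` and its
arguments are hypothetical names used only for illustration), the
following Python function derives the first-VF offset and the VF stride
from a PF function number and a list of VF function numbers, and rejects
layouts where a VF does not sit above the PF or where the spacing is
uneven.

```python
def derive_vf_layout(pf_fn, vf_fns):
    """Return (first_vf_offset, vf_stride) for the given function numbers.

    Raises ValueError if the VFs are not all above the PF or if they
    are not evenly spaced.
    """
    if not vf_fns:
        raise ValueError("at least one VF is required")

    fns = sorted(vf_fns)
    if len(set(fns)) != len(fns):
        raise ValueError("duplicate VF function numbers")
    if fns[0] <= pf_fn:
        raise ValueError("VF function numbers must be larger than the PF's")

    offset = fns[0] - pf_fn
    if len(fns) == 1:
        # With a single VF there is no spacing to check; 0 is an
        # arbitrary placeholder chosen for this sketch.
        return offset, 0

    stride = fns[1] - fns[0]
    for prev, cur in zip(fns, fns[1:]):
        if cur - prev != stride:
            raise ValueError("VF function numbers have an inconsistent stride")
    return offset, stride
```

For the command line above (PF at function 0, VFs at functions 1, 2
and 3), `derive_vf_layout(0, [3, 2, 1])` yields an offset of 1 and a
stride of 1.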

This does not seem user friendly. Is there any reason we can't just
allow the user to specify the stride here?

It should be possible to assign addr automatically without requiring
the user to specify the stride. I'll try that in the next version.


Btw, I vaguely remember that QEMU allows params to be accepted as a
list. If this is true, can we accept a list of netdevs here?

Yes, rocker does that. But the problem is not just about getting
parameters needed for VFs, which I forgot to mention in the cover letter
and will explain below.



Keeping VF instances
--------------------

A problem with SR-IOV emulation is that it needs to hotplug the VFs as
the guest requests. Previously, this behavior was implemented by
realizing and unrealizing VFs at runtime. However, this strategy does
not work well for the proposed virtio-net emulation; in this proposal,
device options passed in the command line must be maintained as VFs
are hotplugged, but they are consumed when the machine starts and not
available after that, which makes realizing VFs at runtime impossible.

Could we store the device options in the PF?

I wrote that the issue is storing the device options, but the problem
is actually more about realizing VFs at runtime instead of at
initialization time.

Realizing VFs at runtime has two major problems. One is that it delays
the validation of options; invalid options will only be noticed when
the guest requests to realize VFs.

If the PCI spec allows VF creation to fail, then it should not be
a problem.

I doubt the spec considers such a failure at all. VF enablement should
always work on real hardware. Reporting configuration errors at
runtime is not user-friendly either.

I'm not sure which options we should care about. Did you mean the
netdev options or the virtio-net specific ones?

If the VFs stick to the same options as the PF (except for SR-IOV),
they should be validated during PF initialization.

I'm aware that it's necessary to validate netdev options and PCI
function numbers (a.k.a. addr/devfn). I'm not sure whether the other
options may result in an invalid VF configuration.

That said, I think it's better to let the VF realization code validate
the configuration at PF realization: it's less error-prone and
potentially requires less code. It also benefits the existing SR-IOV
devices (igb and nvme), so I'm going to push that change forward whether
or not it turns out to be needed for virtio-net SR-IOV emulation.

Assuming the change to realize VFs early happens for igb and nvme, most
of the changes needed *only* by virtio-net SR-IOV emulation are made by
patch 10, "pcie_sriov: Allow user to create SR-IOV device".
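
To illustrate the difference between validating at PF realization and
validating when the guest enables VFs, here is a toy Python model. It
is not QEMU's API: `ToyPF`, `ToyVF`, `set_num_vfs` and the option
dictionaries are all hypothetical. Every VF is validated and
instantiated up front, so the later guest-driven step only toggles
instances that are already known to be valid.

```python
class ToyVF:
    """A VF instance that exists from PF realization onwards."""

    def __init__(self, fn, netdev):
        self.fn = fn
        self.netdev = netdev
        self.enabled = False  # toggled later on the guest's request


class ToyPF:
    def __init__(self, fn, vf_options, known_netdevs):
        self.fn = fn
        self.vfs = []
        # Validate every VF now, at PF realization, so configuration
        # errors surface at startup instead of when the guest asks
        # for VFs.
        for opts in vf_options:
            if opts["netdev"] not in known_netdevs:
                raise ValueError(f"unknown netdev {opts['netdev']!r}")
            if opts["fn"] <= fn:
                raise ValueError("VF function number must exceed the PF's")
            self.vfs.append(ToyVF(opts["fn"], opts["netdev"]))

    def set_num_vfs(self, n):
        """Model the guest enabling the first n VFs."""
        if not 0 <= n <= len(self.vfs):
            raise ValueError("requested more VFs than were configured")
        # No option parsing happens here: only pre-validated instances
        # are switched on or off.
        for i, vf in enumerate(self.vfs):
            vf.enabled = i < n
```

In this model a typo in a netdev id fails in the `ToyPF` constructor,
while `set_num_vfs` can only fail if the guest asks for more VFs than
exist.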




netdevs also warn that they are not used
at initialization time, not knowing that they will be used by VFs later.

We could invent things to calm down this false positive.

References to other QEMU objects in the option may also die before VFs
are realized.

Is there anything other than netdev that we need to consider?

You will also want to set a distinct mac for each VF. Other properties
do not matter much in my opinion.

QEMU doesn't check for mac duplication now, so it's up to the management layer.

Right. mac is not important; it's just nice to have.

Regards,
Akihiko Odaki


