qemu-block

Re: [PATCH 00/15] hw/nvme: SR-IOV with Virtualization Enhancements


From: Klaus Jensen
Subject: Re: [PATCH 00/15] hw/nvme: SR-IOV with Virtualization Enhancements
Date: Tue, 2 Nov 2021 08:24:53 +0100

On Oct 27 18:49, Lukasz Maniak wrote:
> On Tue, Oct 26, 2021 at 08:20:12PM +0200, Klaus Jensen wrote:
> > On Oct  7 18:23, Lukasz Maniak wrote:
> > > Hi,
> > > 
> > > This series of patches is an attempt to add support for the following
> > > sections of NVMe specification revision 1.4:
> > > 
> > > 8.5 Virtualization Enhancements (Optional)
> > >     8.5.1 VQ Resource Definition
> > >     8.5.2 VI Resource Definition
> > >     8.5.3 Secondary Controller States and Resource Configuration
> > >     8.5.4 Single Root I/O Virtualization and Sharing (SR-IOV)
> > > 
> > > The NVMe controller's Single Root I/O Virtualization and Sharing
> > > implementation is based on patches introducing SR-IOV support for PCI
> > > Express proposed by Knut Omang:
> > > https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05155.html
> > > 
> > > However, from what I was able to find in the archives, Knut's patches
> > > have not yet been merged into QEMU, since no working device had made use
> > > of them up to this point:
> > > https://lists.gnu.org/archive/html/qemu-devel/2017-10/msg02722.html
> > > 
> > > In terms of design, the Physical Function (PF) controller and the Virtual
> > > Function (VF) controllers are almost independent, with a few exceptions:
> > > the PF handles flexible resource allocation for all of its children (the
> > > VFs have read-only access to this data), and the PF explicitly propagates
> > > reset to its VFs. Since MMIO access is serialized, no extra precautions
> > > are required to handle concurrent resets, and access to the secondary
> > > controller state does not need to be atomic.
> > > 
> > > A controller with full SR-IOV support must be capable of handling the
> > > Namespace Management command. Since a series adding that functionality is
> > > already under review, this patch set does not duplicate the effort. Note,
> > > however, that the NS management patches are not required to test the
> > > SR-IOV support.
> > > 
> > > We tested the patches on Ubuntu 20.04.3 LTS with kernel 5.4.0. We hit
> > > various issues with nvme-cli (the list and virt-mgmt commands) across
> > > releases from version 1.09 to master, so we pinned a known-good nvme-cli
> > > commit for testing: a50a0c1.
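> > > 
> > > For reference, a sketch of pinning nvme-cli to that commit (assuming
> > > the upstream linux-nvme/nvme-cli repository and its plain-make build):
> > > 
> > > git clone https://github.com/linux-nvme/nvme-cli.git
> > > cd nvme-cli
> > > git checkout a50a0c1
> > > make && sudo make install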
> > > 
> > > The implementation is not 100% finished and certainly not bug free; we
> > > are already aware of some issues, e.g. the AER-related interaction with
> > > namespaces, or unexpected (?) kernel behavior in more complex reset
> > > scenarios. However, our SR-IOV implementation already supports the
> > > typical SR-IOV use cases, so we believe the patches are ready to share
> > > with the community.
> > > 
> > > Hope you find some time to review the work we did, and share your
> > > thoughts.
> > > 
> > > Kind regards,
> > > Lukasz
> > 
> > Hi Lukasz,
> > 
> > If possible, can you share a brief guide on testing this? I keep hitting
> > an assert
> > 
> >   qemu-system-x86_64: ../hw/pci/pci.c:1215: pci_register_bar: Assertion `!pci_is_vf(pci_dev)' failed.
> > 
> > when I try to modify sriov_numvfs. This should be fixed anyway, but I
> > might be doing something wrong in the first place.
> 
> Hi Klaus,
> 
> Let me share the details of the steps I took to bring up 7 fully
> functional VF controllers; I was not able to reproduce the assert.
> 
> I rebased the v1 patches onto the current master (931ce30859) to rule
> out any recent regression.
> 
> I configured the build as follows:
> ./configure \
> --target-list=x86_64-softmmu \
> --enable-kvm
> 
> Then I launched QEMU using these options:
> ./qemu-system-x86_64 \
> -m 4096 \
> -smp 4 \
> -drive file=qemu-os.qcow2 \
> -nographic \
> -enable-kvm \
> -machine q35 \
> -device pcie-root-port,slot=0,id=rp0 \
> -device nvme-subsys,id=subsys0 \
> -device nvme,serial=1234,id=nvme0,subsys=subsys0,bus=rp0,sriov_max_vfs=127,sriov_max_vq_per_vf=2,sriov_max_vi_per_vf=1
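> 
> As a quick sanity check that the PF came up with its SR-IOV capability,
> the following can be run in the guest as root (a sketch, assuming the
> PF enumerates at 0000:01:00.0, as in the sysfs paths used below):
> 
> lspci -s 01:00.0 -vvv | grep -A 3 'SR-IOV'
> cat /sys/bus/pci/devices/0000:01:00.0/sriov_totalvfs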
> 
> Next, I issued the commands below as root to configure the VFs (a loop
> form of the per-controller steps is given after the listing):
> nvme virt-mgmt /dev/nvme0 -c 0 -r 1 -a 1 -n 0
> nvme virt-mgmt /dev/nvme0 -c 0 -r 0 -a 1 -n 0
> nvme reset /dev/nvme0
> echo 1 > /sys/bus/pci/rescan
> 
> nvme virt-mgmt /dev/nvme0 -c 1 -r 1 -a 8 -n 1
> nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 8 -n 2
> nvme virt-mgmt /dev/nvme0 -c 1 -r 0 -a 9 -n 0
> nvme virt-mgmt /dev/nvme0 -c 2 -r 1 -a 8 -n 1
> nvme virt-mgmt /dev/nvme0 -c 2 -r 0 -a 8 -n 2
> nvme virt-mgmt /dev/nvme0 -c 2 -r 0 -a 9 -n 0
> nvme virt-mgmt /dev/nvme0 -c 3 -r 1 -a 8 -n 1
> nvme virt-mgmt /dev/nvme0 -c 3 -r 0 -a 8 -n 2
> nvme virt-mgmt /dev/nvme0 -c 3 -r 0 -a 9 -n 0
> nvme virt-mgmt /dev/nvme0 -c 4 -r 1 -a 8 -n 1
> nvme virt-mgmt /dev/nvme0 -c 4 -r 0 -a 8 -n 2
> nvme virt-mgmt /dev/nvme0 -c 4 -r 0 -a 9 -n 0
> nvme virt-mgmt /dev/nvme0 -c 5 -r 1 -a 8 -n 1
> nvme virt-mgmt /dev/nvme0 -c 5 -r 0 -a 8 -n 2
> nvme virt-mgmt /dev/nvme0 -c 5 -r 0 -a 9 -n 0
> nvme virt-mgmt /dev/nvme0 -c 6 -r 1 -a 8 -n 1
> nvme virt-mgmt /dev/nvme0 -c 6 -r 0 -a 8 -n 2
> nvme virt-mgmt /dev/nvme0 -c 6 -r 0 -a 9 -n 0
> nvme virt-mgmt /dev/nvme0 -c 7 -r 1 -a 8 -n 1
> nvme virt-mgmt /dev/nvme0 -c 7 -r 0 -a 8 -n 2
> nvme virt-mgmt /dev/nvme0 -c 7 -r 0 -a 9 -n 0
> 
> echo 7 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
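> 
> FWIW, the per-controller virt-mgmt invocations above can be collapsed
> into a loop; a sketch, equivalent to the commands listed for
> controllers 1 through 7:
> 
> for c in $(seq 1 7); do
>     nvme virt-mgmt /dev/nvme0 -c $c -r 1 -a 8 -n 1
>     nvme virt-mgmt /dev/nvme0 -c $c -r 0 -a 8 -n 2
>     nvme virt-mgmt /dev/nvme0 -c $c -r 0 -a 9 -n 0
> done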
> 
> If you apply only up to patch 05 inclusive, then this single command
> should do the whole job:
> echo 7 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
> 
> The OS I used for the host and guest was Ubuntu 20.04.3 LTS.
> 
> Can you share the call stack of the assert, or the configuration you
> are trying to run?
> 
> Thanks,
> Lukasz
> 

Hi Lukasz,

Thanks, this all works for me, and in general the series looks pretty
good. I don't have any big reservations about it (the hw/nvme parts).

However, about that assert:

I followed the procedure correctly, but if the device has a CMB, then
changing sriov_numvfs trips the assert in QEMU. That is, add
`cmb_size_mb=64` to the controller parameters to reproduce it.
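
For the record, reproducing it amounts to the device line from your mail
with the CMB parameter appended (the rest of the launch options
unchanged), e.g.:

-device nvme,serial=1234,id=nvme0,subsys=subsys0,bus=rp0,sriov_max_vfs=127,sriov_max_vq_per_vf=2,sriov_max_vi_per_vf=1,cmb_size_mb=64

echo 7 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

The echo is what hits the assert above.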


