From: Klaus Jensen
Subject: Re: [PATCH 10/15] hw/nvme: Make max_ioqpairs and msix_qsize configurable in runtime
Date: Wed, 3 Nov 2021 13:11:15 +0100

On Oct 21 15:40, Łukasz Gieryk wrote:
> On Wed, Oct 20, 2021 at 09:06:06PM +0200, Klaus Jensen wrote:
> > On Oct  7 18:24, Lukasz Maniak wrote:
> > > From: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
> > > 
> > > The Nvme device defines two properties: max_ioqpairs, msix_qsize. Having
> > > them as constants is problematic for SR-IOV support.
> > > 
> > > The SR-IOV feature introduces virtual resources (queues, interrupts)
> > > that can be assigned to the PF and its dependent VFs. Each device, following
> > > a reset, should work with the configured number of queues. A single
> > > constant is no longer sufficient to hold the whole state.
> > > 
> > > This patch tries to solve the problem by introducing additional
> > > variables in NvmeCtrl’s state. The variables for, e.g., managing queues
> > > are therefore organized as:
> > > 
> > >  - n->params.max_ioqpairs – no changes, constant set by the user.
> > > 
> > >  - n->max_ioqpairs - (new) value derived from n->params.* in realize();
> > >                      constant through device’s lifetime.
> > > 
> > >  - n->(mutable_state) – (not a part of this patch) user-configurable,
> > >                         specifies number of queues available _after_
> > >                         reset.
> > > 
> > >  - n->conf_ioqpairs - (new) used in all the places instead of the ‘old’
> > >                       n->params.max_ioqpairs; initialized in realize()
> > >                       and updated during reset() to reflect user’s
> > >                       changes to the mutable state.
> > > 
> > > Since the number of available I/O queues and interrupts can change at
> > > runtime, buffers for the SQs/CQs and the MSI-X-related structures are
> > > allocated large enough to handle the limits, completely avoiding
> > > complicated reallocation. A helper function (nvme_update_msixcap_ts)
> > > updates the corresponding capability register to signal configuration
> > > changes.
> > > 
> > > Signed-off-by: Łukasz Gieryk <lukasz.gieryk@linux.intel.com>
> > 
> > Instead of this, how about adding new parameters, say, sriov_vi_private
> > and sriov_vq_private. Then, max_ioqpairs and msix_qsize are still the
> > "physical" limits and the new parameters just reserve some for the
> > primary controller, the rest being available for flexible resources.
> 
> Compare your configuration:
> 
>     max_ioqpairs     = 26
>     sriov_max_vfs    = 4
>     sriov_vq_private = 10
> 
> with mine:
> 
>     max_ioqpairs        = 10
>     sriov_max_vfs       = 4
>     sriov_max_vq_per_vf = 4
> 
> In your version, if I wanted to change max_vfs but keep the same number
> of flexible resources per VF, then I would have to do some math and
> update max_ioqpairs. And then I would also have to adjust the other
> interrupt-related parameter, as it's also affected. In my opinion
> it's quite inconvenient.

True, that is probably inconvenient, but we have tools to do this math
for us. I very much prefer to be explicit in these parameters.

Also, see my comment on patch 12. If we keep this meaning of
max_ioqpairs, then we have reasonable defaults for the number of private
resources (if no flexible resources are required) and I think we can
control all parameters in the capabilities structures (with a little
math).
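
To make that math concrete, with the numbers from the two example
configurations above (and assuming the flexible resources end up spread
evenly across the VFs):

    With explicit private resources (first configuration):
        flexible queues        = max_ioqpairs - sriov_vq_private = 26 - 10 = 16
        flexible queues per VF = 16 / sriov_max_vfs              = 16 / 4  =  4

    With per-VF limits (second configuration):
        total queues = max_ioqpairs + sriov_max_vfs * sriov_max_vq_per_vf
                     = 10 + 4 * 4 = 26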

>  
> Now, even if I changed the semantics of the params, I would still need most
> of this patch. (Let's keep the discussion on whether the max_* fields are
> necessary in the other thread).
> 
> Without virtualization, the maximum number of queues is constant. The user
> (i.e., the nvme kernel driver) can only query this value (e.g., 10) and
> has to stay within this limit.
> 
> With virtualization, the flexible resources kick in. Let's continue with
> the sample numbers defined earlier (10 private + 16 flexible resources).
> 
> 1) The device boots, all 16 flexible queues are assigned to the primary
>    controller.
> 2) The nvme kernel driver queries the limit (10+16=26) and can create/use
>    up to this many queues.
> 3) The user, via the virtualization management command, unbinds some (let's
>    say 2) of the flexible queues from the primary controller and assigns
>    them to a secondary controller.
> 4) After reset, the Physical Function device reports a different limit
>    (24), and when the Virtual Function device shows up, it will report 1
>    (the admin queue consumed the other resource).
> 
> So I need an additional variable in the state to store the intermediate
> limit (24 or 1), as none of the existing params holds the correct value,
> and all the places that validate limits must work on that value.
> 

I do not contest that you need additional state to keep track of
assigned resources. That seems totally reasonable.
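
For reference, here is a rough sketch of the kind of helper and reset-time
update the commit message describes (nvme_update_msixcap_ts and
conf_ioqpairs are named in the commit message; conf_msix_qsize is the
analogous interrupt-side field, and the function bodies plus the
nvme_current_v{q,i}_resources() accessors are assumptions for illustration,
not the actual patch):

    /* Sketch only, in the context of hw/nvme/ctrl.c. */
    static void nvme_update_msixcap_ts(PCIDevice *dev, uint32_t table_size)
    {
        uint8_t *config;

        if (!msix_present(dev)) {
            return;
        }

        config = dev->config + dev->msix_cap;

        /* the MSI-X Table Size field is encoded as N - 1 */
        pci_set_word_by_mask(config + PCI_MSIX_FLAGS, PCI_MSIX_FLAGS_QSIZE,
                             table_size - 1);
    }

    static void nvme_ctrl_reset(NvmeCtrl *n)
    {
        /* ... existing reset code ... */

        /*
         * Pick up changes made via the virtualization management command;
         * the accessors below are hypothetical stand-ins for however the
         * mutable state ends up being stored.
         */
        n->conf_ioqpairs = nvme_current_vq_resources(n) - 1; /* one VQ for the admin queue */
        n->conf_msix_qsize = nvme_current_vi_resources(n);

        nvme_update_msixcap_ts(&n->parent_obj, n->conf_msix_qsize);
    }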


