qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/1] msix: correct pba size calculation used for


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH 1/1] msix: correct pba size calculation used for msix_exclusive_bar initialization
Date: Mon, 14 Jan 2019 18:14:13 -0500

On Mon, Dec 17, 2018 at 11:15:46AM +0800, Dongli Zhang wrote:
> Hi Michael,
> 
> CCed Jason as this is related to commit a0ccd2123ee2 "pci: remove hard-coded 
> bar
> size in msix_init_exclusive_bar()"
> 
> Please see my reply inline.
> 
> On 12/17/2018 10:24 AM, Michael S. Tsirkin wrote:
> > On Mon, Dec 17, 2018 at 07:34:39AM +0800, Dongli Zhang wrote:
> >> The bar_pba_size is more than what the pba is expected to have, although
> >> this usually would not affect the bar_size used for dev->msix_exclusive_bar
> >> initialization.
> >>
> >> Signed-off-by: Dongli Zhang <address@hidden>
> > 
> > 
> > If this does ever have an effect, we need a compat config
> > for old machine types.
> > Could you explain a bit more? Are there configs affected?
> > What are these?
> 
> Here I copy parts of msix_init_exclusive_bar() and msix_init().
> 
> 'bar_pba_size' is computed (line 348) in msix_init_exclusive_bar(), while
> 'pba_size' is computed (line 291) in msix_init().
> 
> msix_init_exclusive_bar() is one of callers of msix_init().
> 
> I think both 'bar_pba_size' and 'pba_size' are indicating the size of pba. The
> difference is, while 'bar_pba_size' is used to initialize the size of
> MemoryRegion (dev->msix_exclusive_bar), 'pba_size' is used to initialize the
> config space in msix_init().
> 
> However, the equations to compute the two variables are different. Why would 
> we
> use different size for MemoryRegion and config space?

The math in msix_init_exclusive_bar allocates
too much memory in some cases.

For example consider nentries = 8.
msix_exclusive_bar will give us bar_pba_size = 16.
So 16 bytes.  However 8 bytes would be enough - this
is all that the spec requires.

So in practice bar_pba_size sometimes allocates an
extra 8 bytes but never more.

Since each MSIX entry size is 16 bytes, and since
we make sure that table+pba is a power of two,
this always leaves a multiple of 16 bytes for the PBA,
so extra 8 bytes have no effect.

I have merged your patch with the above explanation.

Thanks for the contribution!



> 
> Thanks to line 365, both equations would reach at the same bar_size.
> 
> For instance, assuming nvme with 1333 queues,
> 
> with "uint32_t bar_pba_size = (nentries / 8 + 1) * 8;", bar_pba_size=1336 
> (line
> 348) and bar_size=32768 (line 365).
> 
> with "uint32_t bar_pba_size = QEMU_ALIGN_UP(nentries, 64) / 8;",
> bar_pba_size=168 (line 348) and bar_size=32768 (line 365).
> 
> That's why I mentioned "this usually would not affect the bar_size used for
> dev->msix_exclusive_bar initialization."
> 
> 
> 
> 341 int msix_init_exclusive_bar(PCIDevice *dev, unsigned short nentries,
> 342                             uint8_t bar_nr, Error **errp)
> 343 {
> 344     int ret;
> 345     char *name;
> 346     uint32_t bar_size = 4096;
> 347     uint32_t bar_pba_offset = bar_size / 2;
> 348     uint32_t bar_pba_size = (nentries / 8 + 1) * 8;
> 349
> 350     /*
> 351      * Migration compatibility dictates that this remains a 4k
> 352      * BAR with the vector table in the lower half and PBA in
> 353      * the upper half for nentries which is lower or equal to 128.
> 354      * No need to care about using more than 65 entries for legacy
> 355      * machine types who has at most 64 queues.
> 356      */
> 357     if (nentries * PCI_MSIX_ENTRY_SIZE > bar_pba_offset) {
> 358         bar_pba_offset = nentries * PCI_MSIX_ENTRY_SIZE;
> 359     }
> 360
> 361     if (bar_pba_offset + bar_pba_size > 4096) {
> 362         bar_size = bar_pba_offset + bar_pba_size;
> 363     }
> 364
> 365     bar_size = pow2ceil(bar_size);
> 366
> 367     name = g_strdup_printf("%s-msix", dev->name);
> 368     memory_region_init(&dev->msix_exclusive_bar, OBJECT(dev), name, 
> bar_size);
> 369     g_free(name);
> 370
> 371     ret = msix_init(dev, nentries, &dev->msix_exclusive_bar, bar_nr,
> 372                     0, &dev->msix_exclusive_bar,
> 373                     bar_nr, bar_pba_offset,
> 374                     0, errp);
> 375     if (ret) {
> 376         return ret;
> 377     }
> 378
> 379     pci_register_bar(dev, bar_nr, PCI_BASE_ADDRESS_SPACE_MEMORY,
> 380                      &dev->msix_exclusive_bar);
> 381
> 382     return 0;
> 383 }
> 
> 
> 269 int msix_init(struct PCIDevice *dev, unsigned short nentries,
> 270               MemoryRegion *table_bar, uint8_t table_bar_nr,
> 271               unsigned table_offset, MemoryRegion *pba_bar,
> 272               uint8_t pba_bar_nr, unsigned pba_offset, uint8_t cap_pos,
> 273               Error **errp)
> 274 {
> 275     int cap;
> 276     unsigned table_size, pba_size;
> 277     uint8_t *config;
> 278
> 279     /* Nothing to do if MSI is not supported by interrupt controller */
> 280     if (!msi_nonbroken) {
> 281         error_setg(errp, "MSI-X is not supported by interrupt 
> controller");
> 282         return -ENOTSUP;
> 283     }
> 284
> 285     if (nentries < 1 || nentries > PCI_MSIX_FLAGS_QSIZE + 1) {
> 286         error_setg(errp, "The number of MSI-X vectors is invalid");
> 287         return -EINVAL;
> 288     }
> 289
> 290     table_size = nentries * PCI_MSIX_ENTRY_SIZE;
> 291     pba_size = QEMU_ALIGN_UP(nentries, 64) / 8;
> 
> 
> Dongli Zhang
> 
> 
> > 
> > 
> >> ---
> >>  hw/pci/msix.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> >> index 702dac4..234c0a3 100644
> >> --- a/hw/pci/msix.c
> >> +++ b/hw/pci/msix.c
> >> @@ -345,7 +345,7 @@ int msix_init_exclusive_bar(PCIDevice *dev, unsigned 
> >> short nentries,
> >>      char *name;
> >>      uint32_t bar_size = 4096;
> >>      uint32_t bar_pba_offset = bar_size / 2;
> >> -    uint32_t bar_pba_size = (nentries / 8 + 1) * 8;
> >> +    uint32_t bar_pba_size = QEMU_ALIGN_UP(nentries, 64) / 8;
> >>  
> >>      /*
> >>       * Migration compatibility dictates that this remains a 4k
> >> -- 
> >> 2.7.4



reply via email to

[Prev in Thread] Current Thread [Next in Thread]