qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI


From: Jan Kiszka
Subject: Re: [Qemu-devel] MSI-X bug with ivshmem since msix_reset moved to PCI
Date: Fri, 24 Aug 2012 10:39:15 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2012-08-24 10:36, Michael S. Tsirkin wrote:
> On Fri, Aug 24, 2012 at 10:15:33AM +0200, Jan Kiszka wrote:
>> On 2012-08-24 10:11, Michael S. Tsirkin wrote:
>>> On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
>>>> On 2012-08-24 01:13, Cam Macdonell wrote:
>>>>> Hi Jan,
>>>>>
>>>>> I've bisected a bug in which MSI interrupts are not being delivered to
>>>>> the following patch, where msix_reset was moved in tot he PCI core.
>>>>>
>>>>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
>>>>> Author: Jan Kiszka <address@hidden>
>>>>> Date:   Tue May 15 20:09:56 2012 -0300
>>>>>
>>>>>     msi: Invoke msi/msix_reset from PCI core
>>>>>
>>>>>     There is no point in pushing this burden to the devices, they tend to
>>>>>     forget to call them (like intel-hda, ahci, xhci did). Instead, reset
>>>>>     functions are now called from pci_device_reset. They do nothing if
>>>>>     MSI/MSI-X is not in use.
>>>>>
>>>>> I've been debugging and it seems that when msix_notify() is triggered
>>>>> the second test in the "if" fails
>>>>>
>>>>> /* Send an MSI-X message */
>>>>> void msix_notify(PCIDevice *dev, unsigned vector)
>>>>> {
>>>>>     MSIMessage msg;
>>>>>
>>>>>     if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
>>>>>         return;
>>>>>
>>>>>    …
>>>>> }
>>>>>
>>>>> here is some MSI-X debugging statements
>>>>>
>>>>> msix_init
>>>>> IVSHMEM: msix initialized (1 vectors)
>>>>> IVSHMEM: using vector 0
>>>>> IVSHMEM: ivshmem_reset
>>>>> IVSHMEM: using vector 0
>>>>> msix_reset
>>>>> msix_free_irq_entries 0x7fd52d1cea20
>>>>>
>>>>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
>>>>> may be the cause.
>>>>
>>>> I suppose you mean it sets the msix_entry_used array to 0.
>>>>
>>>>>
>>>>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
>>>>> by the msix_reset?
>>>>
>>>> Actually, the whole msix vector usage tracking is useless today, this
>>>> just shows its downsides (in the absence of benefits). Megasas is
>>>> affected by this problem as well, virtio not as it calls msix_vector_use
>>>> during the configuration process the guest driver triggers.
>>>>
>>>> Two options:
>>>>  - I can send my removal patch for msix_vector_use/unuse that I was
>>>>    only planning for 1.3 so far, and we kill this pitfall earlier.
>>>>  - We re-add msix_vector_use calls to the affected device models for
>>>>    1.2 and drop them later again for 1.3 when removing usage tracking.
>>>> [The third option to keep the usage tracking is a non-option for me. ;)]
>>>>
>>>> Michael?
>>>
>>> Second option seems more prudent to me. Can you send a patch pls?
>>
>> In fact, it's not as easy. ivshmem already tries to restore the usage
>> flag but fails due to reset handler ordering. I do not see ATM where
>> there is some "enable device" during re-initialization, at least for
>> ivshmem. Haven't checked megasas yet.
>>
>> I tend to think removing is simpler and less risky. Please have a look
>> at the patch I sent earlier.
>>
>> Jan
>>
>>
> 
> Actually virtio is the only one that changes vector use
> so the only one that needs to reset it.
> 
> I am thinking we should find a way to move vector use
> out from core to virtio-pci.
> But for now something like the below should do the trick.
> Untested unfortunately, will test virtio Sunday
> but would appreciate review and testing reports
> esp with ivshmem etc.
> 
> -->
> 
> msix: avoid need to use/unuse vectors for most devices
> 
> The facility to use/unuse vectors is helpful for virtio
> but little else since everyone just seems to use
> vectors in their init function.
> 
> Avoid clearing msix vector use info on reset and load.
> For virtio, clear it explicitly.
> 
> Signed-off-by: Michael S. Tsirkin <address@hidden>
> 
> ---
> 
> diff --git a/hw/msix.c b/hw/msix.c
> index 800fc32..d040cc2 100644
> --- a/hw/msix.c
> +++ b/hw/msix.c
> @@ -340,6 +340,15 @@ static void msix_free_irq_entries(PCIDevice *dev)
>      }
>  }
>  
> +static void msix_clear_all_vectors(PCIDevice *dev)
> +{
> +    int vector;
> +
> +    for (vector = 0; vector < dev->msix_entries_nr; ++vector) {
> +        msix_clr_pending(dev, vector);
> +    }
> +}
> +
>  /* Clean up resources for the device. */
>  void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion 
> *pba_bar)
>  {
> @@ -394,7 +403,7 @@ void msix_load(PCIDevice *dev, QEMUFile *f)
>          return;
>      }
>  
> -    msix_free_irq_entries(dev);
> +    msix_clear_all_vectors(dev);
>      qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE);
>      qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8);
>      msix_update_function_masked(dev);
> @@ -440,7 +449,7 @@ void msix_reset(PCIDevice *dev)
>      if (!msix_present(dev)) {
>          return;
>      }
> -    msix_free_irq_entries(dev);
> +    msix_clear_all_vectors(dev);
>      dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
>           ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
>      memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE);
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 125eded..ca0b204 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -131,6 +131,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile 
> *f)
>      if (ret) {
>          return ret;
>      }
> +    msix_unuse_all_vectors(&proxy->pci_dev);
>      msix_load(&proxy->pci_dev, f);
>      if (msix_present(&proxy->pci_dev)) {
>          qemu_get_be16s(f, &proxy->vdev->config_vector);
> @@ -246,6 +247,7 @@ void virtio_pci_reset(DeviceState *d)
>      VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
>      virtio_pci_stop_ioeventfd(proxy);
>      virtio_reset(proxy->vdev);
> +    msix_unuse_all_vectors(&proxy->pci_dev);
>      proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
>  }
>  
> 

This just prolongs the life of a wrong API, but I'm fine for 1.2 if it
helps. However, I will push on removing all this for 1.3.

Jan

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]