qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PULL 61/73] hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register


From: Peter Xu
Subject: Re: [PULL 61/73] hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register
Date: Tue, 25 Apr 2023 20:42:17 -0400

Hi, Michael, Jonathan,

On Tue, Mar 07, 2023 at 08:13:53PM -0500, Michael S. Tsirkin wrote:
> From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> This register in AER should be both writeable and should
> have a default value with a couple of the errors masked
> including the Uncorrectable Internal Error used by CXL for
> it's error reporting.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Message-Id: <20230302133709.30373-2-Jonathan.Cameron@huawei.com>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Reviewed-by: Fan Ni <fan.ni@samsung.com>
> ---
>  include/hw/pci/pcie_regs.h | 3 +++
>  hw/pci/pcie_aer.c          | 4 ++++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/include/hw/pci/pcie_regs.h b/include/hw/pci/pcie_regs.h
> index 1fe0bdd25b..4972106c42 100644
> --- a/include/hw/pci/pcie_regs.h
> +++ b/include/hw/pci/pcie_regs.h
> @@ -141,6 +141,9 @@ typedef enum PCIExpLinkWidth {
>                                           PCI_ERR_UNC_ATOP_EBLOCKED |    \
>                                           PCI_ERR_UNC_TLP_PRF_BLOCKED)
>  
> +#define PCI_ERR_UNC_MASK_DEFAULT        (PCI_ERR_UNC_INTN | \
> +                                         PCI_ERR_UNC_TLP_PRF_BLOCKED)
> +
>  #define PCI_ERR_UNC_SEVERITY_DEFAULT    (PCI_ERR_UNC_DLP |              \
>                                           PCI_ERR_UNC_SDN |              \
>                                           PCI_ERR_UNC_FCP |              \
> diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
> index 9a19be44ae..909e027d99 100644
> --- a/hw/pci/pcie_aer.c
> +++ b/hw/pci/pcie_aer.c
> @@ -112,6 +112,10 @@ int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver, 
> uint16_t offset,
>  
>      pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
>                   PCI_ERR_UNC_SUPPORTED);
> +    pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
> +                 PCI_ERR_UNC_MASK_DEFAULT);
> +    pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
> +                 PCI_ERR_UNC_SUPPORTED);

This breaks the simplest migration from QEMU 8.0->7.2 binaries on all
machine types I think as long as the cap is present, e.g. the default
e1000e provided by the default q35 machine can already hit it with all
default cmdline:

  ./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]

7.2 binary will have empty wmask for PCI_ERR_UNCOR_MASK, meanwhile I think
it can also see a non-zero value, then the migration will fail at:

vmstate_load 0000:00:02.0/e1000e, e1000e                                        
           
qemu-7.2: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 
cmask: ff wmask: 0 w1cmask:0
qemu-7.2: Failed to load PCIDevice:config   
qemu-7.2: Failed to load e1000e:parent_obj                                      
          
qemu-7.2: error while loading state for instance 0x0 of device 
'0000:00:02.0/e1000e'      
qemu-7.2: load of migration failed: Invalid argument

We probably at least want to have the default value to be still zero, and
we'd need to make sure it'll not be modified by the guest, iiuc.

Below oneliner works for me and makes the migration work again:

===8<===
diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 103667c368..563a37b79c 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -113,7 +113,7 @@ int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver, uint16_t 
offset,
     pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
                  PCI_ERR_UNC_SUPPORTED);
     pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
-                 PCI_ERR_UNC_MASK_DEFAULT);
+                 0/*PCI_ERR_UNC_MASK_DEFAULT*/);
     pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
                  PCI_ERR_UNC_SUPPORTED);
===8<===

Anyone could have a look on a solid solution from PCI side?

Copy Juan and Leonardo.

Thanks,

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]