From: Cédric Le Goater
Subject: Re: [Qemu-devel] [PATCH] vfio/pci: do not set the PCIDevice 'has_rom' attribute
Date: Mon, 9 Jul 2018 09:04:47 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0

On 07/06/2018 07:16 PM, Alex Williamson wrote:
> On Fri,  6 Jul 2018 18:36:14 +0200
> Cédric Le Goater <address@hidden> wrote:
> 
>> PCI devices needing a ROM allocate an optional MemoryRegion with
>> pci_add_option_rom(). pci_del_option_rom() does the cleanup when the
>> device is destroyed. The only action taken by this routine is to call
>> vmstate_unregister_ram() which clears the id string of the optional
>> ROM RAMBlock and now, also flags the RAMBlock as non-migratable. This
>> was recently added by commit b895de502717 ("migration: discard
>> non-migratable RAMBlocks").
>>
>> VFIO devices do their own loading of the PCI option ROM in
>> vfio_pci_size_rom(). The memory region is switched to an I/O region
>> and the PCI attribute 'has_rom' is set but the RAMBlock of the ROM
>> region is not allocated. When the associated PCI device is deleted,
>> pci_del_option_rom() calls vmstate_unregister_ram() which tries to
>> flag a NULL RAMBlock, leading to a SEGV.
>>
>> It seems that 'has_rom' was set to have memory_region_destroy()
>> called, but since commit 469b046ead06 ("memory: remove
>> memory_region_destroy") this is not necessary anymore as the
>> MemoryRegion is freed automagically.
>>
>> Remove the PCIDevice 'has_rom' attribute setting in vfio.
>>
>> Signed-off-by: Cédric Le Goater <address@hidden>
> 
> I think the segfault can be attributed to:
> 
> fa53a0e53efd ("memory: drop find_ram_block()")
> 
> Prior to that vmstate_unregister_ram() called
> memory_region_get_ram_addr() which would have resulted in
> RAM_ADDR_INVALID.  This would have been passed to
> qemu_ram_unset_idstr() which would have used find_ram_block() to lookup
> the RAMBlock, which would be NULL for the invalid address, safely
> avoiding any sort of segfault.

Yes, but since then, commit b895de502717 ("migration: discard non-migratable
RAMBlocks") added:

 void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev)
 {
     qemu_ram_unset_idstr(mr->ram_block);
+    qemu_ram_unset_migratable(mr->ram_block);
 }

and qemu_ram_unset_migratable() does not check that the block is valid,
so the NULL RAMBlock is dereferenced.
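
To make the call path concrete, here is a rough standalone sketch. It is not
QEMU code: every Model* name and the flag value are made up for illustration,
and I am assuming that pci_del_option_rom() is gated on 'has_rom' and that the
idstr helper tolerates a NULL block. It shows the two ways out discussed in
this thread: checking the RAMBlock before changing its flag (the earlier
exec.c patch) and not setting 'has_rom' in the first place (this patch).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_RAM_MIGRATABLE (1u << 0)   /* hypothetical flag bit */

/* Simplified stand-ins for the QEMU types; only the fields used here. */
typedef struct ModelRAMBlock {
    char idstr[64];
    uint32_t flags;
} ModelRAMBlock;

typedef struct ModelMemoryRegion {
    ModelRAMBlock *ram_block;   /* NULL for an I/O region such as the VFIO ROM */
} ModelMemoryRegion;

typedef struct ModelPCIDevice {
    bool has_rom;
    ModelMemoryRegion rom;
} ModelPCIDevice;

/* Clears the id string; a NULL block is ignored. */
static void model_unset_idstr(ModelRAMBlock *rb)
{
    if (rb) {
        rb->idstr[0] = '\0';
    }
}

/* Guarded variant: reports failure instead of dereferencing NULL. */
static bool model_unset_migratable(ModelRAMBlock *rb)
{
    if (!rb) {
        return false;
    }
    rb->flags &= ~MODEL_RAM_MIGRATABLE;
    return true;
}

/* Mirrors the shape of vmstate_unregister_ram() after b895de502717. */
static bool model_unregister_ram(ModelMemoryRegion *mr)
{
    model_unset_idstr(mr->ram_block);
    return model_unset_migratable(mr->ram_block);
}

/* Assumed gate: without has_rom, the RAMBlock path is never reached. */
static bool model_del_option_rom(ModelPCIDevice *pdev)
{
    if (!pdev->has_rom) {
        return false;
    }
    pdev->has_rom = false;
    return model_unregister_ram(&pdev->rom);
}
```

With has_rom left false, as in the patch, model_del_option_rom() returns before
touching rom.ram_block at all, so the NULL check is only a second line of
defence.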

C.

> TL;DR, I'll add the above commit with a Fixes: tag for stable and
> downstream releases, looks good otherwise.  Thanks,
> 
> Alex
> 
>> ---
>>
>>  Tested on a KVM POWER9 pseries machine and a Mellanox MT27710
>>  Ethernet controller. Performed a couple of plug/unplug, migrated, and
>>  did a couple more unplug/plug before powering off.
>>
>>  The same tests were done with the previous patches which were
>>  addressing the issue at a different level:
>>
>>    1. [PATCH] exec.c: check RAMBlock validity before changing its flag
>>       https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00009.html
>>
>>    2. [PATCH] pci: remove pci_del_option_rom()
>>       https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg01651.html
>>
>>  Do we still want to remove pci_del_option_rom()?
>>
>>  I caught this bug while deleting a passthrough device from a pseries
>>  machine. Here is the stack:
>>    
>>    #0  qemu_ram_unset_migratable (rb=0x0) at /home/legoater/work/qemu/qemu-xive-3.0.git/exec.c:1994
>>    #1  0x000000010072def0 in vmstate_unregister_ram (mr=0x101796af0, dev=<optimized out>)
>>    #2  0x0000000100694e5c in pci_del_option_rom (pdev=0x101796330)
>>    #3  pci_qdev_unrealize (dev=<optimized out>, errp=<optimized out>)
>>    #4  0x00000001005ff910 in device_set_realized (obj=0x101796330, value=<optimized out>, errp=0x0)
>>    #5  0x00000001007a487c in property_set_bool (obj=0x101796330, v=<optimized out>, name=<optimized out>,
>>    #6  0x00000001007a7878 in object_property_set (obj=0x101796330, v=0x7fff70033110,
>>    #7  0x00000001007aaf1c in object_property_set_qobject (obj=0x101796330, value=<optimized out>,
>>    #8  0x00000001007a7b90 in object_property_set_bool (obj=0x101796330, value=<optimized out>,
>>    #9  0x00000001005fcdd8 in device_unparent (obj=0x101796330)
>>    #10 0x00000001007a6dd0 in object_finalize_child_property (obj=<optimized out>, name=<optimized out>,
>>    #11 0x00000001007a50c0 in object_property_del_child (obj=0x10111f800, child=0x101796330,
>>    #12 0x0000000100425cc0 in spapr_phb_remove_pci_device_cb (dev=0x101796330)
>>    #13 0x0000000100427974 in spapr_drc_release (drc=0x1017e2df0)
>>    #14 0x0000000100429098 in spapr_drc_detach (drc=0x1017e2df0)
>>    #15 0x00000001004294e0 in drc_isolate_physical (drc=0x1017e2df0)
>>    #16 0x000000010042a50c in rtas_set_isolation_state (state=0, idx=<optimized out>)
>>  
>>  hw/vfio/pci.c |    1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index a1577dea7fdb..6cbb8fa0549d 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -990,7 +990,6 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
>>      pci_register_bar(&vdev->pdev, PCI_ROM_SLOT,
>>                       PCI_BASE_ADDRESS_SPACE_MEMORY, &vdev->pdev.rom);
>>  
>> -    vdev->pdev.has_rom = true;
>>      vdev->rom_read_failed = false;
>>  }
>>  
> 



