
QOM address space handling


From: Mark Cave-Ayland
Subject: QOM address space handling
Date: Tue, 10 Nov 2020 11:14:39 +0000

Hi all,

This email follows on from my investigation of intermittent Travis-CI failures in device-introspect-test during "make check", which appeared when I tried to add the patch at https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06093.html to my last qemu-sparc pull request.

The patch itself seems fairly harmless: moving the sun4u-iommu device as a QOM child of the sabre PCI host bridge device. So why was "make check" randomly segfaulting on Travis-CI?

The hardest part was trying to reproduce the issue to debug it: eventually after a number of Travis-CI runs I discovered I could generate the same problem locally if I ran "make check" around 15-20 times in a row, and that gave me a backtrace that looked like this:

0x0000000000614b69 in address_space_init (as=0x16f684d8, root=0x16f68530, name=0x9a1db2 "iommu-as") at ../softmmu/memory.c:2780
2780        QTAILQ_INSERT_TAIL(&address_spaces, as, address_spaces_link);
(gdb) bt
#0  0x0000000000614b69 in address_space_init (as=0x16f684d8, root=0x16f68530, name=0x9a1db2 "iommu-as") at ../softmmu/memory.c:2780
#1  0x00000000005b8f6a in iommu_init (obj=0x16f681c0) at ../hw/sparc64/sun4u_iommu.c:301
#2  0x000000000070a997 in object_init_with_type (obj=0x16f681c0, ti=0x1629fac0) at ../qom/object.c:375

With the debugger attached I was able to figure out what was happening: the sun4u-iommu device creates the iommu-as address space during instance init, but doesn't have a corresponding instance finalize to remove it. Once the object is freed, this leaves a dangling pointer in the address_spaces QTAILQ.

Normally this doesn't matter because IOMMUs are created once during machine init, but device-introspect-test instantiates sun4u-iommu directly (and, with the patch, sabre also adds it as a child object during instance init), which adds more dangling pointers to the address_spaces list. Every so often the dangling pointers end up pointing to memory that gets reused by another QOM object, eventually causing random segfaults during instance finalize and/or property iteration.

There are 2 possible solutions here: 1) ensure QOM objects that add address spaces during instance init have a corresponding instance finalize function to remove them or 2) move the creation of address spaces from instance init to realize.
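
To make the two options concrete, here is a minimal, self-contained sketch in plain C. It is not QEMU code: TAILQ from <sys/queue.h> stands in for QTAILQ, and every name in it (ToyAS, ToyDev, toy_as_init(), toy_as_destroy(), the opt1_*/opt2_* hooks) is hypothetical, chosen only to mirror address_space_init(), its destroy counterpart, and the instance init/realize/instance finalize hooks. Teardown of a realized device under option 2 is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Toy stand-in for AddressSpace; only the list linkage matters here. */
typedef struct ToyAS {
    const char *name;
    TAILQ_ENTRY(ToyAS) link;
} ToyAS;

/* Toy stand-in for the global address_spaces list. */
TAILQ_HEAD(ToyASList, ToyAS);
static struct ToyASList toy_address_spaces =
    TAILQ_HEAD_INITIALIZER(toy_address_spaces);

/* Register an address space on the global list. */
static void toy_as_init(ToyAS *as, const char *name)
{
    as->name = name;
    TAILQ_INSERT_TAIL(&toy_address_spaces, as, link);
}

/* Unregister it again, so the list holds no pointer into the object. */
static void toy_as_destroy(ToyAS *as)
{
    TAILQ_REMOVE(&toy_address_spaces, as, link);
}

static size_t toy_as_count(void)
{
    size_t n = 0;
    ToyAS *as;

    TAILQ_FOREACH(as, &toy_address_spaces, link) {
        n++;
    }
    return n;
}

/* Toy device embedding its address space, like the IOMMU state does. */
typedef struct ToyDev {
    ToyAS iommu_as;
    bool as_registered;
} ToyDev;

/* Option 1: register in instance init, unregister in instance finalize. */
static void opt1_instance_init(ToyDev *d)
{
    toy_as_init(&d->iommu_as, "iommu-as");
    d->as_registered = true;
}

static void opt1_instance_finalize(ToyDev *d)
{
    if (d->as_registered) {
        toy_as_destroy(&d->iommu_as);
        d->as_registered = false;
    }
}

/* Option 2: instance init touches no global state; realize registers. */
static void opt2_instance_init(ToyDev *d)
{
    d->as_registered = false;
}

static void opt2_realize(ToyDev *d)
{
    toy_as_init(&d->iommu_as, "iommu-as");
    d->as_registered = true;
}

/* Mimic introspection: init then finalize, never realize.
 * Each helper returns the list size after the object is gone. */
static size_t opt1_introspect_cycle(void)
{
    ToyDev d;

    opt1_instance_init(&d);
    opt1_instance_finalize(&d);
    return toy_as_count();
}

static size_t opt2_introspect_cycle(void)
{
    ToyDev d;

    opt2_instance_init(&d);   /* nothing to undo afterwards */
    return toy_as_count();
}
```

In this toy model both options leave the list empty after an introspection-style create/destroy cycle. The difference is where the bookkeeping lives: option 1 pairs every init with a finalize, while option 2 keeps instance init free of global side effects.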

Does anyone have any arguments for which solution is preferred?

As part of this work I hacked up an address_space_count() function in memory.c that returns the size of the address_spaces QTAILQ, and added a printf() to display the value during instance init and finalize, which demonstrates the problem nicely. This means it should be possible to add a similar check to device-introspect-test in future to prevent similar errors from happening again.
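
For illustration, a check along those lines could look roughly like the following. This is a standalone sketch and not the code I hacked into memory.c: it uses TAILQ from <sys/queue.h> in place of QTAILQ, and all the demo_*, leaky_* and fixed_* names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical miniature of an address space and the global list. */
typedef struct DemoAS {
    TAILQ_ENTRY(DemoAS) link;
} DemoAS;

TAILQ_HEAD(DemoASList, DemoAS);
static struct DemoASList demo_address_spaces =
    TAILQ_HEAD_INITIALIZER(demo_address_spaces);

/* Analogue of the address_space_count() helper: walk the list and count. */
static size_t demo_address_space_count(void)
{
    size_t n = 0;
    DemoAS *as;

    TAILQ_FOREACH(as, &demo_address_spaces, link) {
        n++;
    }
    return n;
}

/* Hypothetical device with an embedded address space. */
typedef struct DemoDev {
    DemoAS as;
} DemoDev;

typedef void (*DemoHook)(DemoDev *dev);

/* Instance init that registers the address space. */
static void leaky_init(DemoDev *dev)
{
    TAILQ_INSERT_TAIL(&demo_address_spaces, &dev->as, link);
}

/* Finalize that forgets to unregister: the behaviour described above. */
static void leaky_finalize(DemoDev *dev)
{
    (void)dev;
}

/* Finalize that unregisters the address space again. */
static void fixed_finalize(DemoDev *dev)
{
    TAILQ_REMOVE(&demo_address_spaces, &dev->as, link);
}

/* The check itself: an init/finalize cycle must not change the count. */
static bool demo_cycle_is_balanced(DemoDev *dev, DemoHook init,
                                   DemoHook finalize)
{
    size_t before = demo_address_space_count();

    init(dev);
    finalize(dev);
    return demo_address_space_count() == before;
}
```

Calling demo_cycle_is_balanced() with leaky_init/leaky_finalize reports an imbalance, while the same init paired with fixed_finalize does not. The sketch never frees the device, so the leftover list entry stays valid memory here; in the real failure the object is freed and the entry dangles.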


ATB,

Mark.


