[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
QOM address space handling
From: |
Mark Cave-Ayland |
Subject: |
QOM address space handling |
Date: |
Tue, 10 Nov 2020 11:14:39 +0000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.4.0 |
Hi all,
This email follows on from my investigation of intermittent Travis-CI failures in
make check's device-introspect test when trying to add the patch at
https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06093.html to my last
qemu-sparc pull request.
The patch itself seems fairly harmless: moving the sun4u-iommu device as a QOM child
of the sabre PCI host bridge device. So why was "make check" randomly segfaulting on
Travis-CI?
The hardest part was trying to reproduce the issue to debug it: eventually after a
number of Travis-CI runs I discovered I could generate the same problem locally if I
ran "make check" around 15-20 times in a row, and that gave me a backtrace that
looked like this:
0x0000000000614b69 in address_space_init (as=0x16f684d8,
root=0x16f68530, name=0x9a1db2 "iommu-as") at ../softmmu/memory.c:2780
2780 QTAILQ_INSERT_TAIL(&address_spaces, as, address_spaces_link);
(gdb) bt
#0 0x0000000000614b69 in address_space_init (as=0x16f684d8,
root=0x16f68530, name=0x9a1db2 "iommu-as") at
../softmmu/memory.c:2780
#1 0x00000000005b8f6a in iommu_init (obj=0x16f681c0) at
../hw/sparc64/sun4u_iommu.c:301
#2 0x000000000070a997 in object_init_with_type (obj=0x16f681c0,
ti=0x1629fac0) at ../qom/object.c:375
With the debugger attached I was able to figure out what was happening: the
sun4u-iommu device creates the iommu-as address space during instance init, but
doesn't have a corresponding instance finalize to remove it which leaves a dangling
pointer in the address_spaces QTAILQ.
Normally this doesn't matter because IOMMUs are created once during machine init, but
device-introspect-test instantiates sun4u-iommu (and with the patch sabre also adds
it as a child object during instance init) which adds more dangling pointers to the
address_spaces list. Every so often the dangling pointers end up pointing to memory
that gets reused by another QOM object, eventually causing random segfaults during
instance finalize and/or property iteration.
There are 2 possible solutions here: 1) ensure QOM objects that add address spaces
during instance init have a corresponding instance finalize function to remove them
or 2) move the creation of address spaces from instance init to realize.
Does anyone have any arguments for which solution is preferred?
As part of this work I hacked up an address_space_count() function in memory.c that
returns the size of the address_spaces QTAILQ and added a printf() to display the
value during instance init and finalize which demonstrates the problem nicely. This
means it should be possible to add a similar to check to device-introspect-test in
future to prevent similar errors from happening again.
ATB,
Mark.
- QOM address space handling,
Mark Cave-Ayland <=