On Thu, 30 Sep 2021, Laurent Vivier wrote:
Failover needs to detect the end of the PCI unplug to start migration
after the VFIO card has been unplugged.
To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and reset in
pcie_unplug_device().
But since
17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35")
we have switched to ACPI unplug, so these functions are no longer called
and the flag is never set. Failover migration therefore cannot detect whether
the card is really unplugged and acts as if the unplug were done as soon as it
has started, so it doesn't wait for the end of the unplug before starting the
migration. We don't see any problem when we test this because ACPI unplug is
faster than PCIe native hotplug, and by the time the migration really starts
the unplug operation is already done.
See c000a9bd06ea ("pci: mark device having guest unplug request pending")
and a99c4da9fc2a ("pci: mark devices partially unplugged").
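
The set-on-request, reset-on-completion handshake described above can be
sketched as a small standalone model. This is only an illustration: ToyDevice,
unplug_pending and the toy_* functions are hypothetical names, not QEMU's
structures or API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not QEMU's DeviceState. */
typedef struct ToyDevice {
    bool unplug_pending;   /* set on unplug request, cleared on completion */
    bool present;          /* device is still attached to the bus */
} ToyDevice;

/* Models the unplug-request callback: it only records the request. */
static void toy_unplug_request(ToyDevice *dev)
{
    dev->unplug_pending = true;
}

/* Models completion of the unplug once the guest has released the device. */
static void toy_unplug_done(ToyDevice *dev)
{
    dev->present = false;
    dev->unplug_pending = false;
}

/* Migration may start only when no unplug is still in flight. */
static bool toy_migration_can_start(const ToyDevice *dev)
{
    return !dev->unplug_pending;
}

/* Walks the request -> completion sequence and checks each state. */
static void toy_demo(void)
{
    ToyDevice dev = { .unplug_pending = false, .present = true };

    toy_unplug_request(&dev);
    assert(!toy_migration_can_start(&dev));  /* unplug still in flight */

    toy_unplug_done(&dev);
    assert(toy_migration_can_start(&dev));   /* safe to migrate now */
}
```

In this model, if the request path never sets unplug_pending (the situation the
message describes after the switch to ACPI unplug),
toy_migration_can_start() returns true immediately, even though the device is
still present.
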
OK, so I have a basic question about the partially_hotplugged flag in the
device struct (no comments explaining it were added in a99c4da9fc2a39847).
It seems we return early from pcie_unplug_device() when this flag has been
set by failover_unplug_primary() in virtio-net. What is the purpose of this
flag? It seems we are not doing a full unplug of the primary device?
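
The early return being asked about can be modelled roughly as follows. This is
a sketch under stated assumptions, not the QEMU code: ToyPciDevice, realized
and toy_pcie_unplug_device() are hypothetical, and whether the pending flag is
cleared on the early-return path is an assumption of the sketch. Only the name
partially_hotplugged is taken from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a PCI device; not QEMU's PCIDevice. */
typedef struct ToyPciDevice {
    bool partially_hotplugged; /* name from the thread; meaning assumed */
    bool unplug_pending;       /* hypothetical "unplug in flight" marker */
    bool realized;             /* hypothetical: host-side object still exists */
} ToyPciDevice;

/*
 * Models the unplug step. Returns true if the device was fully torn
 * down, false if the function returned early and kept the device.
 */
static bool toy_pcie_unplug_device(ToyPciDevice *dev)
{
    if (dev->partially_hotplugged) {
        /* Early return: the unplug is reported as finished, but the
         * host-side device object is left in place (assumption). */
        dev->unplug_pending = false;
        return false;
    }

    /* Full unplug: the device object goes away as well. */
    dev->unplug_pending = false;
    dev->realized = false;
    return true;
}
```

With partially_hotplugged set, the model leaves realized untouched, which is
the "not a full unplug" behaviour the question refers to.
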