qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 18/24] qdev: hotplug: provide do_unplug handl


From: David Hildenbrand
Subject: Re: [Qemu-devel] [PATCH v4 18/24] qdev: hotplug: provide do_unplug handler
Date: Tue, 2 Oct 2018 11:49:09 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0

On 01/10/2018 15:24, Igor Mammedov wrote:
> On Fri, 28 Sep 2018 14:21:33 +0200
> David Hildenbrand <address@hidden> wrote:
> 
>> On 27/09/2018 15:01, Igor Mammedov wrote:
>>> On Wed, 26 Sep 2018 11:42:13 +0200
>>> David Hildenbrand <address@hidden> wrote:
>>>   
>>>> The unplug and unplug_request handlers are special: They are not
>>>> executed when unrealizing a device, but rather trigger the removal of a
>>>> device from device_del() via object_unparent() - to effectively
>>>> unrealize a device.
>>>>
>>>> If such a device has a child bus and another device attached to
>>>> that bus (e.g. how virtio devices are created with their proxy device),
>>>> we will not get a call to the unplug handler. As we want to support
>>>> hotplug handlers (and especially also some unplug logic to undo resource
>>>> assignment) for such devices, we cannot simply call the unplug handler
>>>> when unrealizing - it has a different semantic ("trigger removal").
>>>>
>>>> To handle this scenario, we need a do_unplug handler, that will be
>>>> executed for all devices with a hotplug handler.  
>>> could you clarify what would be call flow for unplug in this case
>>> starting from 'device_del'?  
>>
>> Let's work it through for virtio-pmem:
>>
>> qemu-system-x86_64 -machine pc -m 8G,maxmem=20G \
>>   [...] \
>>   -object memory-backend-file,id=mem1,share,mem-path=/dev/zero,size=4G \
>>   -device virtio-pmem-pci,id=vp1,memdev=mem1 -monitor stdio
>>
>> info qtree gives us:
>>
>>    bus: pci.0
>>       type PCI
>>       dev: virtio-pmem-pci, id "vp1"
>>      [...]
>>         bus: virtio-bus
>>           type virtio-pci-bus
>>           dev: virtio-pmem, id ""
>>             memaddr = 9663676416 (0x240000000)
>>             memdev = "/objects/mem1"
>>          [...]
>>
>> "device_del vp1":
>>
>> qmp_device_del(vp1)->qdev_unplug(vp1)->hotplug_handler_unplug_request(vp1)
>>
>> piix4_device_unplug_request_cb(vp1)->acpi_pcihp_device_unplug_cb(vp1)
>>
>> -> Guest has to process the request and respond  
>>
>> acpi_pcihp_eject_slot(vp1)->object_unparent(vp1)
> that's one of the possible call flows, unplug could also originate
> from shpc or native pci-e hot-plug.
> PCI unplug hasn't ever been factored out from old PCI device/bus code,
> so PCIDevice::unrealize takes care of parent resource teardown.
> (well, there wasn't any reason to factor it out till we started
> talking about hybrid devices).
> We probably should do the same refactoring like it was done for
> pc-dimm/cpu unplug
> (see qdev_get_hotplug_handler()+hotplug_handler_unplug() usage)
> 
>> Now, this triggers the unplug of the device hierarchy:
>>
>> object_unparent(vp1)->device_unparent(vp1)>device_set_realized(vp1, 0)
>>
>> ->bus_set_realized(virtio-bus, 0)->device_set_realized(virtio-pmem, 0)  
>>
>> This is the place where this hooks is comes into play:
>>
>> ->hotplug_handler_do_unplug(virtio-pmem)->machine  
>> handler->virtio_pmem_do_unplug(virtio-pmem)
>>
>> Followed by object_unparent(virtio-bus)->bus_unparent(virtio-bus)
>> Followed by object_unparent(virtio-pmem)->device_unparent(virtio-pmem)
>>
>>
>> At this place, the hierarchy is gone. Hotplug succeeded and the
>> virtio-pmem device (memory device) has been properly unplugged.
> I'm concerned that both plug and unplug flows are implicit
> and handled as if it were separate devices without enforcing
> a particular ordering of (un)plug handlers.
> It would work right now but it looks rather fragile to me.

In my ideal world, the plug+unplug handlers would only perform checks
and essentially trigger an object_unparent(). (either directly or by
some guest action).

Inside object_unparent(), the call flow of unrealize steps is defined.
By moving the "real unplug" part into "do_unplug" and therefor
essentially calling it when unrealizing, we could generalize this for
all unplug handlers.

I think, order of realization and therefore the order of hotplug handler
calls is strictly defined already. Same applies to unrealization if we
would factor the essential parts out into e.g. "do_unplug". That order
is strictly encoded in device_set_realized() and bus_set_realized().

> 
> If I remember right, the suggested and partially implemented idea
> in one of your previous series was to override default hotplug
> handler with a machine one for plugged in device [1][2].
> However impl. wasn't exactly what I've suggested since it matches
> all memory-devices.
> 
> 1) qdev: let machine hotplug handler to override bus hotplug handler
> 2) pc: route all memory devices through  the machine hotplug handler
> 
> So lets reiterate, we have TYPE_VIRTIO_PMEM and TYPE_VIRTIO_PMEM_PCI
> the former implements TYPE_MEMORY_DEVICE interface and the later is
> a wrapper PCI/whatnot device shim.
> So when you plug that composite device you'd get 2 independent
> plug hooks called, which makes it unrelable/broken design.

Can you elaborate why this is broken? I don't consider the
realize/unrealize order broken, and that is where we plug into. But yes,
we logically plug a device hierarchy and therefore get a separate
hotplug handler calls.

> 
> My next question would be why TYPE_VIRTIO_PMEM_PCI can't implement 
> TYPE_MEMORY_DEVICE interface and TYPE_VIRTIO_PMEM be a simple VIRTIO
> device without any hotplug hooks (so shim device would proxy all
> memory-device logic to its child)?
> /huh, then you don't need get_device_id() as well/

I had the same idea while going through different options. Then we would
have to forward all calls directly to the child. We cannot reuse
TYPE_MEMORY_DEVICE, so we would either need a new interface or define
the functions we want manually for each such device.

> 
> That way using [2] and [1 - modulo it should match only concrete type]
> machine would be able to override hotplug handlers for TYPE_VIRTIO_PMEM_PCI
> and explicitly call machine + pci hotplug handlers in necessary order.
> 
> flow would look like:
>   [acpi|shcp|native pci-e eject]->  
>        hotplug_ctrl = qdev_get_hotplug_handler(dev);
>        hotplug_handler_unplug(hotplug_ctrl, dev, &local_err); ->
>             machine_unplug()
>                machine_virtio_pci_pmem_cb(): 
>                   // we now that's device has 2 stage hotplug handlers,
>                   // so we can arrange hotplug sequence in necessary order
>                   hotplug_ctrl2 = qdev_get_bus_hotplug_handler(dev);
> 
>                   //then do unplug in whatever order that's correct,
>                   // I'd assume tear down/stop PCI device first, flushing
>                   // command virtio command queues and that unplug memory 
> itself.
>                   hotplug_handler_unplug(hotplug_ctrl2, dev, &local_err);
>                   memory_device_unplug()
> 
> Similar logic applies to device_add/device_del paths, with a difference that
> origin point would be monitor/qmp.

Let's see. User calls device_del(). That triggers an unplug_request. For
virtio-pmem, there is nothing to do.

eject hook is called by the guest. For now we do an object_unparent.
This would now be wrong. We would have to call a proper hotplug handler
chain (I guess that's the refactoring you mentioned above).

> 
> Point is to have a single explicit callback chain that applies to a concrete
> device type. That way if ever change an ordering of calling plug callbacks
> in qdev core, the expected for a device callback sequence would still
> remain in place ensuring that device (un)plugged as expected.

I haven't tested yet if this will work, but I can give it a try. I
learned that in QEMU things often seem easier than they actually are :)

> 
> Also it should be easier to trace for a reader, than 2 disjoint callbacks of
> composite device (which only know to author of that device and then only
> till he/she remembers how that thing works).

In my view it makes things slightly more complicated, because you have
to follow a hotplug handler chain that plugs devices via proxy devices.
(e.g. passing through TYPE_MEMORY_DEVICE calls to a child, and therefore
essentially hotplugging the child), instead of only watching out for
which device get's hotplugged and finding exactly one hotplug handler.
Of course, for a device hierarchy, multiple devices get logically
hotplugged.

-- 

Thanks,

David / dhildenb



reply via email to

[Prev in Thread] Current Thread [Next in Thread]