From: Markus Armbruster
Subject: Re: [PATCH QEMU v25 17/17] qapi: Add VFIO devices migration stats in Migration stats
Date: Thu, 25 Jun 2020 07:51:30 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Kirti Wankhede <kwankhede@nvidia.com> writes:

> On 6/23/2020 12:51 PM, Markus Armbruster wrote:
>> QAPI review only.
>>
>> The only changes since I reviewed v23 are the rename of VfioStats member
>> @bytes to @transferred, and the move of MigrationInfo member @vfio next
>> to @ram and @disk.  Good.  I'm copying my other questions in the hope of
>> getting answers :)
>>
>> Kirti Wankhede <kwankhede@nvidia.com> writes:
>>
>>> Added amount of bytes transferred to the target VM by all VFIO devices
>>>
>>> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
>> [...]
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index d5000558c6c9..952864b05455 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -146,6 +146,18 @@
>>>               'active', 'postcopy-active', 'postcopy-paused',
>>>               'postcopy-recover', 'completed', 'failed', 'colo',
>>>               'pre-switchover', 'device', 'wait-unplug' ] }
>>> +##
>>> +# @VfioStats:
>>> +#
>>> +# Detailed VFIO devices migration statistics
>>> +#
>>> +# @transferred: amount of bytes transferred to the target VM by VFIO devices
>>> +#
>>> +# Since: 5.1
>>> +#
>>> +##
>>> +{ 'struct': 'VfioStats',
>>> +  'data': {'transferred': 'int' } }
>>
>> Pardon my ignorance...  What exactly do VFIO devices transfer to the
>> target VM? How is that related to MigrationInfo member @ram? 
>>
>
> Sorry, I missed replying to your question on the earlier version.

Happens :)

> A VFIO device transfers the device's state, data from the VFIO device,
> and guest memory pages pinned for DMA operations.
> For example, in the case of a GPU, the device state is the GPU's current
> state, which is saved and then restored during resume, and the device
> data is the data from the onboard framebuffer. Pinned memory is marked
> dirty and transferred to the target VM as part of global dirty page
> tracking for RAM.
> A VFIO device can add a significant amount of data to the migration
> stream (depending on the FB size, in GB), so the transferred byte count
> is an important parameter to monitor.

Can we work this into documentation somehow?

Have you considered adding something on VFIO migration to docs/?  Then a
link with a short description could suffice here.

>> MigrationStats has much more information, and some of it is pretty
>> useful to track how migration is doing, in particular whether it
>> converges, and how fast.  Absent in VfioStats due to "not implemented",
>> or due to "can't be done"?
>>
>
> The VFIO device migration interface is the same as RAM's migration
> interface (using SaveVMHandlers). Convergence is already taken care of
> by the .save_live_pending hook, where *res_precopy_only is set to the
> VFIO device's pending_bytes, migration->pending_bytes.
>
> How fast - I'm not sure how this can be calculated.

My concern is providing management applications the means they need to
monitor migration.  Have you solicited input from management application
developers on what's needed?

"Same as RAM's migration" makes me suspect the same stats are needed.
This may well be a subset of the stats provided for RAM.

Missing stats we need can be added on top, as long as it's done in a
timely manner.  But we better know how to compute them, or how to do
without.
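
To illustrate what a "how fast" figure could look like on the management
side: a minimal sketch, assuming the application polls query-migrate
itself and keeps (timestamp, bytes) samples of @transferred. The function
name and the sample layout are hypothetical, not an existing QEMU or
libvirt interface.

```python
from typing import Tuple

# One sample: (monotonic timestamp in seconds, VfioStats @transferred in bytes).
Sample = Tuple[float, int]


def transfer_rate(prev: Sample, curr: Sample) -> float:
    """Average bytes per second between two samples of @transferred."""
    (t0, b0), (t1, b1) = prev, curr
    dt = t1 - t0
    if dt <= 0:
        # Samples must be strictly ordered in time to give a meaningful rate.
        raise ValueError("samples must be strictly increasing in time")
    return (b1 - b0) / dt
```

This only yields throughput. Judging convergence would still need a
remaining-bytes figure, which the patch as posted does not expose.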

> Thanks,
> Kirti
>
>> Byte counts should use QAPI type 'size'.  Many existing ones don't.
>> Since MigrationStats uses 'int', I'll let the migration maintainers
>> decide whether they want 'int' or 'size' here.
>>
>>>   ##
>>>   # @MigrationInfo:
>>> @@ -207,11 +219,16 @@
>>>   #
>>>   # @socket-address: Only used for tcp, to know what the real port is (Since 4.0)
>>>   #
>>> +# @vfio: @VfioStats containing detailed VFIO devices migration statistics,
>>> +#        only returned if VFIO device is present, migration is supported by all
>>> +#         VFIO devices and status is 'active' or 'completed' (since 5.1)
>>> +#
>>>   # Since: 0.14.0
>>>   ##
>>>   { 'struct': 'MigrationInfo',
>>>     'data': {'*status': 'MigrationStatus', '*ram': 'MigrationStats',
>>>              '*disk': 'MigrationStats',
>>> +           '*vfio': 'VfioStats',
>>>              '*xbzrle-cache': 'XBZRLECacheStats',
>>>              '*total-time': 'int',
>>>              '*expected-downtime': 'int',
>>
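
Since '*vfio' is an optional member of MigrationInfo, a client has to
cope with its absence. A minimal sketch of that, assuming the usual QMP
reply shape with a top-level "return" object and the member names from
the patch above. The helper name and the sample replies are hypothetical.

```python
from typing import Optional


def vfio_transferred(reply: dict) -> Optional[int]:
    """Return VfioStats @transferred from a query-migrate reply, or None.

    The member is absent when no VFIO device is present, when a VFIO device
    does not support migration, or when status is neither 'active' nor
    'completed' (per the doc comment in the patch).
    """
    info = reply.get("return", {})
    vfio = info.get("vfio")
    if vfio is None:
        return None
    return int(vfio["transferred"])


# Hypothetical replies, for illustration only.
active = {"return": {"status": "active", "vfio": {"transferred": 1048576}}}
setup = {"return": {"status": "setup"}}
```

Here vfio_transferred(active) yields the byte count, while
vfio_transferred(setup) yields None rather than raising.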



