qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 03/16] migration: Move setup_time to mig_stats


From: David Edmondson
Subject: Re: [PATCH v2 03/16] migration: Move setup_time to mig_stats
Date: Tue, 16 May 2023 12:07:32 +0100

Juan Quintela <quintela@redhat.com> writes:

> David Edmondson <david.edmondson@oracle.com> wrote:
>> Juan Quintela <quintela@redhat.com> writes:
>>
>>> It is a time that needs to be cleaned each time cancel migration.
>>> Once there create migration_time_since() to calculate how time since a
>>> time in the past.
>>>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>>
>>> ---
>>>
>>> Rename to migration_time_since (cédric)
>>> ---
>>>  migration/migration-stats.h | 13 +++++++++++++
>>>  migration/migration.h       |  1 -
>>>  migration/migration-stats.c |  7 +++++++
>>>  migration/migration.c       |  9 ++++-----
>>>  4 files changed, 24 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/migration/migration-stats.h b/migration/migration-stats.h
>>> index e782f1b0df..21402af9e4 100644
>>> --- a/migration/migration-stats.h
>>> +++ b/migration/migration-stats.h
>>> @@ -75,6 +75,10 @@ typedef struct {
>>>       * Number of bytes sent during precopy stage.
>>>       */
>>>      Stat64 precopy_bytes;
>>> +    /*
>>> +     * How long has the setup stage took.
>>> +     */
>>> +    Stat64 setup_time;
>>
>> Is this really a Stat64? It doesn't appear to need the atomic update
>> feature.
>
> What this whole Migration Atomic Counters series try to do is that
> everything becomes atomic and then we can use everything everywhere.
>
> Before this series we had (I am simplifying here):
>
> - transferred, precopy_bytes, postcopy_bytes, downtime_bytes -> atomic,
>   you can use it anywhere
>
> - qemu_file transferred -> you can only use it from the main migration
>   thread
>
> - qemu_file rate_limit -> you can only use it from the main migration
>   thread
>
> And we had to update the three counters in every place that we did a
> write wehad to update all of them.
>
> You can see the contorsions that we go to to update the rate_limit and
> the qemu_file transferred fields.
>
> After the series (you need to get what it is already on the tree, this
> series, QEMUFileHooks cleanup, and another serie on my tree waiting for
> this to be commited), you got three counters:
>
> - qemu_file: atomic, everytime we do a qemu_file write we update it
> - multifd_bytes: atomic, everytime that we do a write in a multifd
>   channel, we update it.
> - rdma_bytes: atomic, everytime we do a write through RDMA we update it.
>
> And that is it.
>
> Both rate_limit and transferred are derived from these three counters:
>
> - at any point in time migration_transferred_bytes() returns the amount
>   of bytes written since the start of the migration:
>      qemu_file_bytes + multifd_bytes + rdma_bytes.
>
> - transferred on this period:
>        at_start_of_period = migration_transferred_bytes().
>        trasferred_in_this_period = migration_transferred_bytes() - 
> at_start_of_period;
>
> - Similar for precopy_bytes, postcopy_bytes and downtime_bytes.  When we
>   move from one stage to the next, we store what is the value of the
>   previous stage.
>
> The counters that we use to calculate the rate limit are updated around
> 10 times per second (can be a bit bigger at the end of periods,
> iterations, ...)  So performance is not extra critical.
>
> But as we have way less atomic operations (really one per real write),
> we don't really care a lot if we do some atomic operations when a normal
> operation will do.
>
> I.e. I think we have two options:
>
> - have the remaining counters that are only used in the main migration
>   thread not be atomic.  Document them and remember to do the correct
>   thing everytime we use it.  If we need to use it in another thread,
>   just change it to atomic.
>
> - Make all counters atomic. No need to document anything.  And you can
>   call any operation/counter/... in migration-stats.c from anywhere.
>
> I think that the second option is better.  But I can hear reasons from
> people that think that the 1st one is better.

For the counters, no argument - making them all atomic seems like the
right way forward.

start_time isn't a counter, and isn't manipulated at multiple points in
the code by different actors.

I don't hate it being a Stat64, it just seems odd when the other 'time'
related variables are not.

> Comments?
>
> Later, Juan.
-- 
You can't hide from the flipside.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]