qemu-devel

Re: [PATCH v4 06/10] migration: Introduce dirty-limit capability


From: Markus Armbruster
Subject: Re: [PATCH v4 06/10] migration: Introduce dirty-limit capability
Date: Mon, 27 Mar 2023 08:41:28 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Hyman Huang <huangy81@chinatelecom.cn> writes:

> On 2023/3/24 22:32, Markus Armbruster wrote:
>> Hyman Huang <huangy81@chinatelecom.cn> writes:
>> 
>>> On 2023/3/24 20:11, Markus Armbruster wrote:
>>>> huangy81@chinatelecom.cn writes:
>>>>
>>>>> From: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>>>>
>>>>> Introduce migration dirty-limit capability, which can
>>>>> be turned on before live migration and limit dirty
>>>>> page rate during live migration.
>>>>>
>>>>> Introduce migrate_dirty_limit function to help check
>>>>> if dirty-limit capability is enabled during live migration.
>>>>>
>>>>> Meanwhile, refactor vcpu_dirty_rate_stat_collect
>>>>> so that period can be configured instead of hardcoded.
>>>>>
>>>>> dirty-limit capability is kind of like auto-converge
>>>>> but using dirty limit instead of traditional cpu-throttle
>>>>> to throttle guest down. To enable this feature, turn on
>>>>> the dirty-limit capability before live migration using
>>>>> migrate-set-capabilities, and set the parameters
>>>>> "x-vcpu-dirty-limit-period", "vcpu-dirty-limit" suitably
>>>>> to speed up convergence.
>>>>>
>>>>> Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
>>>>> Acked-by: Peter Xu <peterx@redhat.com>
>>>> [...]
>>>>
>>>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>>>> index d33cc2d582..b7a92be055 100644
>>>>> --- a/qapi/migration.json
>>>>> +++ b/qapi/migration.json
>>>>> @@ -477,6 +477,8 @@
>>>>>    #                    will be handled faster.  This is a performance feature and
>>>>>    #                    should not affect the correctness of postcopy migration.
>>>>>    #                    (since 7.1)
>>>>> +# @dirty-limit: Use dirty-limit to throttle down guest if enabled.
>>>>> +#               (since 8.0)
>>>>
>>>> Feels too terse.  What exactly is used and how?  It's not the capability
>>>> itself (although the text sure sounds like it).  I guess it's the thing
>>>> you set with command set-vcpu-dirty-limit.
>>>>
>>>> Is that used only when the capability is set?
>>>
>>> Yes, live migration sets the "dirty-limit" value when that capability is set.
>>> The comment changes to "Apply the dirty page rate limit algorithm to
>>> throttle down the guest if the capability is set, rather than auto-converge".
>>>
>>> Please continue to polish the doc if needed. Thanks.
>>
>> Let's see whether I understand.
>>
>> Throttling happens only during migration.
>>
>> There are two throttling algorithms: "auto-converge" (default) and
>> "dirty page rate limit".
>>
>> The latter can be tuned with set-vcpu-dirty-limit.
>> Correct?
>
> Yes
>
>> What happens when migration capability dirty-limit is enabled, but the
>> user hasn't set a limit with set-vcpu-dirty-limit, or canceled it with
>> cancel-vcpu-dirty-limit?
>
> The dirty-limit capability uses the default value if the user hasn't set one.

What is the default value?  I can't find it in the doc comments.

> In the cancel-vcpu-dirty-limit path, canceling should be checked and not
> allowed while migration is in progress.

Can you change the dirty limit with set-vcpu-dirty-limit while migration
is in progress?  Let's see...

Has the dirty limit any effect while migration is not in progress?

> see the following code in commit:
> [PATCH v4 08/10] migration: Implement dirty-limit convergence algo
>
> --- a/softmmu/dirtylimit.c
> +++ b/softmmu/dirtylimit.c
> @@ -438,6 +438,8 @@ void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index,
>                                   int64_t cpu_index,
>                                   Error **errp)
>  {
> +    MigrationState *ms = migrate_get_current();
> +
>      if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
>          return;
>      }
> @@ -451,6 +453,15 @@ void qmp_cancel_vcpu_dirty_limit(bool has_cpu_index,
>          return;
>      }
>
> +    if (migration_is_running(ms->state) &&
> +        (!qemu_thread_is_self(&ms->thread)) &&
> +        migrate_dirty_limit() &&
> +        dirtylimit_in_service()) {
> +        error_setg(errp, "can't cancel dirty page limit while"
> +                   " migration is running");
> +        return;
> +    }

We can get here even when migration_is_running() is true.  Seems to
contradict your claim "no cancel while migration is in progress".  Am I
confused?

Please drop the superfluous parentheses around !qemu_thread_is_self().
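
To make the question concrete, here is a minimal sketch of the quoted guard
as a pure predicate.  The struct and the function name are hypothetical and
not part of QEMU; each field merely stands in for the result of the helper
named in its comment, as it appears in the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the four inputs the quoted guard reads. */
struct guard_inputs {
    bool migration_running;    /* migration_is_running(ms->state) */
    bool in_migration_thread;  /* qemu_thread_is_self(&ms->thread) */
    bool capability_enabled;   /* migrate_dirty_limit() */
    bool limit_in_service;     /* dirtylimit_in_service() */
};

/* Returns true when the quoted check would reject the command. */
static bool guard_rejects(struct guard_inputs in)
{
    return in.migration_running &&
           !in.in_migration_thread &&
           in.capability_enabled &&
           in.limit_in_service;
}
```

Under this reading, the command is rejected only when all four terms hold.
With migration running but the capability off, or with no limit in service,
or when called from the migration thread, control falls through the guard.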

> +
>      dirtylimit_state_lock();
>
>      if (has_cpu_index) {
> @@ -486,6 +497,8 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index,
>                                uint64_t dirty_rate,
>                                Error **errp)
>  {
> +    MigrationState *ms = migrate_get_current();
> +
>      if (!kvm_enabled() || !kvm_dirty_ring_enabled()) {
>          error_setg(errp, "dirty page limit feature requires KVM with"
>                     " accelerator property 'dirty-ring-size' set'");
> @@ -502,6 +515,15 @@ void qmp_set_vcpu_dirty_limit(bool has_cpu_index,
>          return;
>      }
>
> +    if (migration_is_running(ms->state) &&
> +        (!qemu_thread_is_self(&ms->thread)) &&
> +        migrate_dirty_limit() &&
> +        dirtylimit_in_service()) {
> +        error_setg(errp, "can't cancel dirty page limit while"
> +                   " migration is running");

Same condition, i.e. a dirty limit change is possible exactly when
cancel is.  Correct?

> +        return;
> +    }
> +
>      dirtylimit_state_lock();
>
>      if (!dirtylimit_in_service()) {

Maybe it's just me still not understanding things, but the entire
interface feels overly complicated.

Here's my current mental model of what you're trying to provide.

There are two throttling algorithms: "auto-converge" (default) and
"dirty page rate limit".  The user can select one.

The latter works with a user-configurable dirty limit.

Changing these configuration bits is only possible in certain states.
Which ones is not clear to me, yet.

Is this accurate and complete?
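
Spelled out as code, the model above might look like the following sketch.
The enum and function names are hypothetical, and the code only encodes the
model as stated here (capability off selects auto-converge, capability on
selects the dirty page rate limit); it is not a description of what QEMU
actually does.

```c
#include <stdbool.h>

/* Hypothetical names for the two throttling algorithms in the model. */
enum throttle_algo {
    THROTTLE_AUTO_CONVERGE,  /* assumed default */
    THROTTLE_DIRTY_LIMIT     /* tuned with set-vcpu-dirty-limit */
};

/* Pick the algorithm from the state of the dirty-limit capability. */
static enum throttle_algo select_throttle_algo(bool dirty_limit_cap)
{
    return dirty_limit_cap ? THROTTLE_DIRTY_LIMIT : THROTTLE_AUTO_CONVERGE;
}
```

The sketch deliberately leaves out the states in which the configuration
may change, since those are the part that is not yet clear.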

Are commands set-vcpu-dirty-limit, cancel-vcpu-dirty-limit,
query-vcpu-dirty-limit useful without this series?

If not, then committing them as stable interfaces was clearly premature.
