qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH V3 4/7] Migration/colo.c: Add new COLOExitReason


From: Zhang, Chen
Subject: Re: [Qemu-devel] [PATCH V3 4/7] Migration/colo.c: Add new COLOExitReason to handle all failover state
Date: Mon, 11 Mar 2019 09:08:50 +0000

-----Original Message-----
From: Markus Armbruster [mailto:address@hidden 
Sent: Sunday, March 10, 2019 1:06 AM
To: Zhang, Chen <address@hidden>
Cc: Li Zhijian <address@hidden>; Zhang Chen <address@hidden>; Dr. David Alan 
Gilbert <address@hidden>; Juan Quintela <address@hidden>; zhanghailiang 
<address@hidden>; Eric Blake <address@hidden>; qemu-dev <address@hidden>; 
Zhang, Chen <address@hidden>
Subject: Re: [Qemu-devel] [PATCH V3 4/7] Migration/colo.c: Add new 
COLOExitReason to handle all failover state

Zhang Chen <address@hidden > writes:

> From: Zhang Chen <address@hidden>
>
> In this patch we add the processing state for COLOExitReason, because 
> we have to identify COLO in the failover processing state or failover 
> error state. In the way, we can handle all the failover state.
> We have improved the description of the COLOExitReason by the way.
>
> Signed-off-by: Zhang Chen <address@hidden>
> ---
>  migration/colo.c    | 24 +++++++++++++-----------
>  qapi/migration.json | 15 +++++++++------
>  2 files changed, 22 insertions(+), 17 deletions(-)
>
> diff --git a/migration/colo.c b/migration/colo.c index 
> 89325952c7..dbe2b88807 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -267,7 +267,11 @@ COLOStatus *qmp_query_colo_status(Error **errp)
       switch (failover_get_state()) {

failover_get_state() returns FailoverStatus, i.e. one of FAILOVER_STATUS_NONE, 
_REQUIRE, _ACTIVE, _COMPLETED, _RELAUNCH.

Yes, have any concern here?

       case FAILOVER_STATUS_NONE:
           s->reason = COLO_EXIT_REASON_NONE;
           break;
       case FAILOVER_STATUS_REQUIRE:
>          s->reason = COLO_EXIT_REASON_REQUEST;
>          break;
>      default:
> -        s->reason = COLO_EXIT_REASON_ERROR;
> +        if (migration_in_colo_state()) {
> +            s->reason = COLO_EXIT_REASON_PROCESSING;
> +        } else {
> +            s->reason = COLO_EXIT_REASON_ERROR;
> +        }
>      }
>  
>      return s;

In which FailoverStatus can migration_in_colo_state() return true?

It is no close connection about the FailoverStatus,  except the 
"FAILOVER_STATUS_REQUIRE", 
other status still have the possibility be triggered here , if 
migration_in_colo_state() return true means
we still in COLO failover process, otherwise means we have exit COLO with some 
failover error.

> @@ -579,16 +583,13 @@ out:
       /*
        * There are only two reasons we can get here, some error happened
>       * or the user triggered failover.
>       */
>      switch (failover_get_state()) {
> -    case FAILOVER_STATUS_NONE:
> -        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
> -                                  COLO_EXIT_REASON_ERROR);
> -        break;
>      case FAILOVER_STATUS_COMPLETED:
>          qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
>                                    COLO_EXIT_REASON_REQUEST);
>          break;
>      default:
> -        abort();
> +        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
> +                                  COLO_EXIT_REASON_ERROR);
>      }
>  
>      /* Hope this not to be too long to wait here */

Change of behavior for FAILOVER_STATUS_REQUIRE, _ACTIVE, _COMPLETED,
_RELAUNCH: send a COLO_EXIT event instead of calling abort().  Why?

Because if COLO occur failover error, we want to make the VM back to normal 
running status rather than just crash VM.


> @@ -850,17 +851,18 @@ out:
>          error_report_err(local_err);
>      }
>  
> +    /*
> +     * There are only two reasons we can get here, some error happened
> +     * or the user triggered failover.
> +     */
>      switch (failover_get_state()) {
> -    case FAILOVER_STATUS_NONE:
> -        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
> -                                  COLO_EXIT_REASON_ERROR);
> -        break;
>      case FAILOVER_STATUS_COMPLETED:
>          qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
>                                    COLO_EXIT_REASON_REQUEST);
>          break;
>      default:
> -        abort();
> +        qapi_event_send_colo_exit(COLO_MODE_SECONDARY,
> +                                  COLO_EXIT_REASON_ERROR);
>      }
>  
>      if (fb) {

Same question.

> diff --git a/qapi/migration.json b/qapi/migration.json index 
> 7a795ecc16..089ed67807 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -983,19 +983,22 @@
>  ##
>  # @COLOExitReason:
>  #
> -# The reason for a COLO exit
> +# Describe the reason for COLO exit.

The old text was better.

OK, I will remove it.

>  #
> -# @none: no failover has ever happened. This can't occur in the -# 
> COLO_EXIT event, only in the result of query-colo-status.
> +# @none: failover has never happened. This state does not occur # in 
> +the COLO_EXIT event, and is only visible in the result of # 
> +query-colo-status.

This might be a small improvment.

>  #
> -# @request: COLO exit is due to an external request
> +# @request: COLO exit caused by an external request.
>  #
> -# @error: COLO exit is due to an internal error
> +# @error: COLO exit caused by an internal error.

I like the old text better.

Sorry, I'm not a native speaker.
I will remove it in next version.

> +#
> +# @processing: COLO is currently handling a failover (since 4.0).
>  #
>  # Since: 3.1
>  ##
>  { 'enum': 'COLOExitReason',
> -  'data': [ 'none', 'request', 'error' ] }
> +  'data': [ 'none', 'request', 'error' , 'processing' ] }
>  
>  ##
>  # @x-colo-lost-heartbeat:

The patch conflates addition of a new member with doc improvements.
Tolerable since both are small.

Yes, Just tiny fix...

Thanks
Zhang Chen




reply via email to

[Prev in Thread] Current Thread [Next in Thread]