Re: [Qemu-devel] [PATCH COLO-Frame v12 26/38] COLO failover: Shutdown re
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v12 26/38] COLO failover: Shutdown related socket fd when do failover
Date: Tue, 15 Dec 2015 09:44:08 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

* zhanghailiang (address@hidden) wrote:
> If the network connection between COLO's two sides is broken while the
> colo/colo incoming thread is blocked in 'read'/'write' on the socket fd,
> the thread will not detect the error until the connection times out,
> which can take a long time.
>
> Here we shut down all the related socket file descriptors in the failover
> BH to wake up the blocked operation. Besides, we should close the
> corresponding file descriptors only after the failover BH has shut them
> down, or there will be an error.
>
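To illustrate the mechanism the commit message relies on, here is a minimal standalone sketch, not QEMU code. It assumes a Linux/POSIX system and uses a plain socketpair() in place of the migration socket; the demo_* names are hypothetical. A reader thread sits in read() with no data ever arriving, and shutdown() from another thread makes that read() return immediately with EOF instead of waiting for a timeout.

```c
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Reader thread: blocks in read() until data arrives or the socket is shut down. */
static void *demo_blocked_reader(void *opaque)
{
    int fd = *(int *)opaque;
    char buf[16];
    ssize_t n = read(fd, buf, sizeof(buf));

    return (void *)(intptr_t)n;
}

/*
 * Returns what read() returned in the reader thread (0 means EOF),
 * or -2 if the demo itself could not be set up.
 */
static long demo_shutdown_wakes_reader(void)
{
    int sv[2];
    pthread_t tid;
    void *ret = NULL;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -2;
    }
    if (pthread_create(&tid, NULL, demo_blocked_reader, &sv[0]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -2;
    }

    /* Sleep ~100ms so the reader is most likely already blocked in read(). */
    poll(NULL, 0, 100);

    /* No data was ever written: only shutdown() can make read() return. */
    shutdown(sv[0], SHUT_RDWR);

    pthread_join(tid, &ret);
    close(sv[0]);
    close(sv[1]);
    return (long)(intptr_t)ret;
}
```

Calling demo_shutdown_wakes_reader() should return 0 here, i.e. the blocked read() sees EOF. Note that the fd is only closed after the reader has been joined, which mirrors the ordering requirement described above.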
> Signed-off-by: zhanghailiang <address@hidden>
> Signed-off-by: Li Zhijian <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
> ---
> v12:
> - Shut down both QEMUFile's fds even though they may use the same fd
>   (Dave's suggestion)
> v11:
> - Only shut down the fd once
>
> Signed-off-by: zhanghailiang <address@hidden>
> ---
> migration/colo.c | 42 ++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/migration/colo.c b/migration/colo.c
> index d06c14f..58531e7 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -60,6 +60,18 @@ static void secondary_vm_do_failover(void)
>          /* recover runstate to normal migration finish state */
>          autostart = true;
>      }
> +    /*
> +     * Make sure the colo incoming thread is not blocked in recv or send.
> +     * If mis->from_src_file and mis->to_src_file use the same fd,
> +     * the second shutdown() will return -1; we ignore this value,
> +     * it is harmless.
> +     */
> +    if (mis->from_src_file) {
> +        qemu_file_shutdown(mis->from_src_file);
> +    }
> +    if (mis->to_src_file) {
> +        qemu_file_shutdown(mis->to_src_file);
> +    }
>
>      old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
>                                     FAILOVER_STATUS_COMPLETED);
> @@ -82,6 +94,18 @@ static void primary_vm_do_failover(void)
>      migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
>                        MIGRATION_STATUS_COMPLETED);
>
> +    /*
> +     * Make sure the colo thread is not blocked in recv or send.
> +     * s->rp_state.from_dst_file and s->to_dst_file may use the
> +     * same fd, but shutting down the fd twice is harmless.
> +     */
> +    if (s->to_dst_file) {
> +        qemu_file_shutdown(s->to_dst_file);
> +    }
> +    if (s->rp_state.from_dst_file) {
> +        qemu_file_shutdown(s->rp_state.from_dst_file);
> +    }
> +
>      old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
>                                     FAILOVER_STATUS_COMPLETED);
>      if (old_state != FAILOVER_STATUS_HANDLING) {
> @@ -348,7 +372,7 @@ static void colo_process_checkpoint(MigrationState *s)
>      }
>
>  out:
> -    if (ret < 0) {
> +    if (ret < 0 || (!ret && !failover_request_is_active())) {
>          error_report("%s: %s", __func__, strerror(-ret));
>          qapi_event_send_colo_exit(COLO_MODE_PRIMARY, COLO_EXIT_REASON_ERROR,
>                                    true, strerror(-ret), NULL);
> @@ -360,6 +384,15 @@ out:
>      qsb_free(buffer);
>      buffer = NULL;
>
> +    /* Hopefully this loop will not spin for too long */
> +    while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
> +        ;
> +    }
> +    /*
> +     * Must be called after the failover BH has completed,
> +     * otherwise the failover BH may shut down the wrong fd, one that
> +     * was reused by another thread after we released it here.
> +     */
>      if (s->rp_state.from_dst_file) {
>          qemu_fclose(s->rp_state.from_dst_file);
>      }
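The fd-reuse hazard named in that comment can be shown with a small standalone sketch (not QEMU code; demo_fd_number_is_reused is a hypothetical name). It assumes a single-threaded process on Linux, where a new descriptor gets the lowest free number, so a socket opened right after close() receives the number that was just released.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if a socket created right after close() got the same
 * descriptor number as the one just closed, 0 if not, -1 on error.
 */
static int demo_fd_number_is_reused(void)
{
    int old_fd = socket(AF_UNIX, SOCK_STREAM, 0);
    int new_fd;
    int reused;

    if (old_fd < 0) {
        return -1;
    }
    close(old_fd);

    /* Stands in for "another thread" opening an unrelated descriptor. */
    new_fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (new_fd < 0) {
        return -1;
    }
    reused = (new_fd == old_fd);

    /* A late shutdown(old_fd, SHUT_RDWR) would now hit new_fd instead. */
    close(new_fd);
    return reused;
}
```

Under those assumptions the function returns 1, which is why the close has to wait until nothing can call shutdown() on the old number any more.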
> @@ -519,7 +552,7 @@ void *colo_process_incoming_thread(void *opaque)
>      }
>
>  out:
> -    if (ret < 0) {
> +    if (ret < 0 || (!ret && !failover_request_is_active())) {
>          error_report("colo incoming thread will exit, detect error: %s",
>                       strerror(-ret));
>          qapi_event_send_colo_exit(COLO_MODE_SECONDARY, COLO_EXIT_REASON_ERROR,
> @@ -539,6 +572,11 @@ out:
>       */
>      colo_release_ram_cache();
>
> +    /* Hopefully this loop will not spin for too long */
> +    while (failover_get_state() != FAILOVER_STATUS_COMPLETED) {
> +        ;
> +    }
> +    /* Must be called after the failover BH has completed */
>      if (mis->to_src_file) {
>          qemu_fclose(mis->to_src_file);
>      }
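For readers following the ordering argument, the wait-then-close pattern can be sketched outside QEMU roughly as below. This is only an illustration under assumptions: the demo_* names and states are hypothetical stand-ins for the failover state machine, a pthread stands in for the failover BH, and the sched_yield() in the loop is an addition of this sketch rather than something the patch does.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-ins for the failover states used in the patch. */
enum { DEMO_HANDLING, DEMO_COMPLETED };

static atomic_int demo_state = DEMO_HANDLING;
static int demo_fd = -1;

/* Plays the role of the failover BH: shut the fd down, then mark completion. */
static void *demo_failover_bh(void *opaque)
{
    (void)opaque;
    shutdown(demo_fd, SHUT_RDWR);
    atomic_store(&demo_state, DEMO_COMPLETED);
    return NULL;
}

/* Returns the result of close() on the fd (0 on success), or -1 on setup error. */
static int demo_close_after_failover(void)
{
    int sv[2];
    pthread_t bh;
    int rc;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    demo_fd = sv[0];
    atomic_store(&demo_state, DEMO_HANDLING);

    if (pthread_create(&bh, NULL, demo_failover_bh, NULL) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Do not release the fd until the "BH" can no longer touch it. */
    while (atomic_load(&demo_state) != DEMO_COMPLETED) {
        sched_yield();
    }

    rc = close(sv[0]);
    close(sv[1]);
    pthread_join(bh, NULL);
    return rc;
}
```

Because the state only becomes DEMO_COMPLETED after shutdown() has run, the close() can never race with it, so the descriptor number cannot be handed to someone else while the shutdown is still pending.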
> --
> 1.8.3.1
>
>
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK