From: zhanghailiang
Subject: [Qemu-devel] [PATCH COLO-Frame v17 21/34] COLO failover: Shutdown related socket fd when do failover
Date: Fri, 3 Jun 2016 15:52:33 +0800

If the network connection between the two COLO sides breaks while the
COLO thread or the COLO incoming thread is blocked in read() or write()
on the socket fd, the thread will not detect the error until the
connection times out, which can take a long time.

To avoid this, shut down all the related socket file descriptors in the
failover BH to wake up the blocked operation. In addition, the
corresponding file descriptors must only be closed after the failover BH
has shut them down; otherwise the BH may operate on an fd that has
already been closed, which results in an error.

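
The wake-up behaviour described above can be tried in isolation. The
following is a minimal sketch, not code from this patch: it assumes
Linux, uses a socketpair() in place of the migration socket, and the
names blocked_reader and demo_shutdown_wakes_reader are hypothetical.
It shows a thread blocked in read() returning as soon as another thread
calls shutdown() on the same fd.

```c
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Result of the blocked read(), written by the reader thread. */
static ssize_t reader_result = -2;

/* Hypothetical stand-in for the COLO thread: blocks in read() on the fd. */
static void *blocked_reader(void *opaque)
{
    int fd = *(int *)opaque;
    char buf[16];

    /* No peer ever writes, so this blocks until shutdown() is called. */
    reader_result = read(fd, buf, sizeof(buf));
    return NULL;
}

/* Returns the value read() produced after shutdown(); 0 means EOF. */
static ssize_t demo_shutdown_wakes_reader(void)
{
    int sv[2];
    pthread_t tid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (pthread_create(&tid, NULL, blocked_reader, &sv[0]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Give the reader time to block; only needed to make the demo clear. */
    usleep(100 * 1000);

    /* Stand-in for the failover BH: wake the reader without closing. */
    shutdown(sv[0], SHUT_RDWR);

    /* Close only after the reader has returned from read(). */
    pthread_join(tid, NULL);
    close(sv[0]);
    close(sv[1]);
    return reader_result;
}
```

Calling demo_shutdown_wakes_reader() should return promptly instead of
hanging, because the shutdown makes the pending read() report end of
file. The fd is deliberately closed only after the reader has finished.
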
Signed-off-by: zhanghailiang <address@hidden>
Signed-off-by: Li Zhijian <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Cc: Dr. David Alan Gilbert <address@hidden>
---
v17:
- Rename colo_sem to colo_exit_sem.
v13:
- Add Reviewed-by tag
- Use semaphore to notify colo/colo incoming loop that
failover work is finished.
v12:
- Shutdown both QEMUFile's fd though they may use the
same fd. (Dave's suggestion)
v11:
- Only shutdown fd for once
---
include/migration/migration.h | 3 +++
migration/colo.c | 43 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 566b2a5..74f49ee 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -113,6 +113,7 @@ struct MigrationIncomingState {
QemuThread colo_incoming_thread;
/* The coroutine we should enter (back) after failover */
Coroutine *migration_incoming_co;
+ QemuSemaphore colo_incoming_sem;
/* See savevm.c */
LoadStateEntry_Head loadvm_handlers;
@@ -181,6 +182,8 @@ struct MigrationState
QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest)
src_page_requests;
/* The RAMBlock used in the last src_page_request */
RAMBlock *last_req_rb;
+ /* The semaphore is used to notify COLO thread that failover is finished */
+ QemuSemaphore colo_exit_sem;
/* The last error that occurred */
Error *error;
diff --git a/migration/colo.c b/migration/colo.c
index db6534a..ff7b77b 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -60,6 +60,18 @@ static void secondary_vm_do_failover(void)
/* recover runstate to normal migration finish state */
autostart = true;
}
+ /*
+ * Make sure the COLO incoming thread is not blocked in recv() or send().
+ * If mis->from_src_file and mis->to_src_file use the same fd,
+ * the second shutdown() will return -1; we ignore this value
+ * because it is harmless.
+ */
+ if (mis->from_src_file) {
+ qemu_file_shutdown(mis->from_src_file);
+ }
+ if (mis->to_src_file) {
+ qemu_file_shutdown(mis->to_src_file);
+ }
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
@@ -68,6 +80,8 @@ static void secondary_vm_do_failover(void)
"secondary VM", old_state);
return;
}
+ /* Notify COLO incoming thread that failover work is finished */
+ qemu_sem_post(&mis->colo_incoming_sem);
/* For Secondary VM, jump to incoming co */
if (mis->migration_incoming_co) {
qemu_coroutine_enter(mis->migration_incoming_co, NULL);
@@ -82,6 +96,18 @@ static void primary_vm_do_failover(void)
migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
MIGRATION_STATUS_COMPLETED);
+ /*
+ * Wake up the COLO thread, which may be blocked in recv() or send().
+ * s->rp_state.from_dst_file and s->to_dst_file may use the same fd,
+ * but we still shut the fd down twice; it is harmless.
+ */
+ if (s->to_dst_file) {
+ qemu_file_shutdown(s->to_dst_file);
+ }
+ if (s->rp_state.from_dst_file) {
+ qemu_file_shutdown(s->rp_state.from_dst_file);
+ }
+
old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
FAILOVER_STATUS_COMPLETED);
if (old_state != FAILOVER_STATUS_HANDLING) {
@@ -89,6 +115,8 @@ static void primary_vm_do_failover(void)
old_state);
return;
}
+ /* Notify COLO thread that failover work is finished */
+ qemu_sem_post(&s->colo_exit_sem);
}
void colo_do_failover(MigrationState *s)
@@ -374,6 +402,14 @@ out:
COLO_EXIT_REASON_REQUEST, NULL);
}
+ /* Hopefully we do not have to wait here for too long */
+ qemu_sem_wait(&s->colo_exit_sem);
+ qemu_sem_destroy(&s->colo_exit_sem);
+ /*
+ * Must be called after the failover BH has completed,
+ * or the failover BH may shut down the wrong fd, one that has
+ * been re-used by another thread after we release it here.
+ */
if (s->rp_state.from_dst_file) {
qemu_fclose(s->rp_state.from_dst_file);
}
@@ -382,6 +418,7 @@ out:
void migrate_start_colo_process(MigrationState *s)
{
qemu_mutex_unlock_iothread();
+ qemu_sem_init(&s->colo_exit_sem, 0);
migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
colo_process_checkpoint(s);
@@ -421,6 +458,8 @@ void *colo_process_incoming_thread(void *opaque)
Error *local_err = NULL;
int ret;
+ qemu_sem_init(&mis->colo_incoming_sem, 0);
+
migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_COLO);
@@ -551,6 +590,10 @@ out:
*/
colo_release_ram_cache();
+ /* Hopefully we do not have to wait here for too long */
+ qemu_sem_wait(&mis->colo_incoming_sem);
+ qemu_sem_destroy(&mis->colo_incoming_sem);
+ /* Must be called after failover BH is completed */
if (mis->to_src_file) {
qemu_fclose(mis->to_src_file);
}
--
1.8.3.1
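
The ordering that this patch enforces (the owning thread waits on a
semaphore before closing its fd, and the failover handler posts the
semaphore after shutting the fd down) can be sketched outside QEMU. This
is only an illustration under stated assumptions: Linux, POSIX unnamed
semaphores standing in for QemuSemaphore, a socketpair() standing in for
the migration socket, and hypothetical names (struct channel,
failover_handler, demo_close_after_failover).

```c
#include <pthread.h>
#include <semaphore.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical state shared by the owning thread and the failover handler. */
struct channel {
    int fd;
    sem_t exit_sem;      /* posted once the failover work is finished */
    int shutdown_ret;    /* return value of shutdown() in the handler */
};

/* Stand-in for the failover BH: shut the fd down, then notify the owner. */
static void *failover_handler(void *opaque)
{
    struct channel *ch = opaque;

    ch->shutdown_ret = shutdown(ch->fd, SHUT_RDWR);
    sem_post(&ch->exit_sem);
    return NULL;
}

/* Returns the handler's shutdown() result; 0 means the fd was still open. */
static int demo_close_after_failover(void)
{
    struct channel ch;
    int sv[2];
    pthread_t tid;
    char buf[16];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    ch.fd = sv[0];
    ch.shutdown_ret = -1;
    sem_init(&ch.exit_sem, 0, 0);

    if (pthread_create(&tid, NULL, failover_handler, &ch) != 0) {
        sem_destroy(&ch.exit_sem);
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Owner blocks here until the handler's shutdown() ends the read. */
    n = read(ch.fd, buf, sizeof(buf));
    (void)n;

    /* Do not release the fd until the handler says it is done with it. */
    sem_wait(&ch.exit_sem);
    close(ch.fd);

    pthread_join(tid, NULL);
    sem_destroy(&ch.exit_sem);
    close(sv[1]);
    return ch.shutdown_ret;
}
```

Without the sem_wait(), the owner could close the fd as soon as read()
returns, and a handler that ran late would call shutdown() on a closed
or re-used descriptor.
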