[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH COLO-Frame v11 24/39] COLO: Implement failover w
From: |
Hailiang Zhang |
Subject: |
Re: [Qemu-devel] [PATCH COLO-Frame v11 24/39] COLO: Implement failover work for Secondary VM |
Date: |
Fri, 11 Dec 2015 16:27:39 +0800 |
User-agent: |
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 |
On 2015/12/11 2:50, Dr. David Alan Gilbert wrote:
* zhanghailiang (address@hidden) wrote:
If users require SVM to takeover work, colo incoming thread should
exit from loop while failover BH helps backing to migration incoming
coroutine.
Signed-off-by: zhanghailiang <address@hidden>
Signed-off-by: Li Zhijian <address@hidden>
---
migration/colo.c | 42 +++++++++++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/migration/colo.c b/migration/colo.c
index 7a42fc6..f31e957 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -46,6 +46,33 @@ static bool colo_runstate_is_stopped(void)
return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
}
+static void secondary_vm_do_failover(void)
+{
+ int old_state;
+ MigrationIncomingState *mis = migration_incoming_get_current();
+
+ migrate_set_state(&mis->state, MIGRATION_STATUS_COLO,
+ MIGRATION_STATUS_COMPLETED);
+
+ if (!autostart) {
+ error_report("\"-S\" qemu option will be ignored in secondary side");
+ /* recover runstate to normal migration finish state */
+ autostart = true;
+ }
You might find libvirt will need something different for it to be
involved during the failover; but for now OK.
+ old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
+ FAILOVER_STATUS_COMPLETED);
+ if (old_state != FAILOVER_STATUS_HANDLING) {
+ error_report("Serious error while do failover for secondary VM,"
+ "old_state: %d", old_state);
Same suggestion as previous patch just to improve the error message.
OK, will fix it in next version.
+ return;
+ }
+ /* For Secondary VM, jump to incoming co */
+ if (mis->migration_incoming_co) {
+ qemu_coroutine_enter(mis->migration_incoming_co, NULL);
+ }
+}
+
static void primary_vm_do_failover(void)
{
MigrationState *s = migrate_get_current();
@@ -74,6 +101,8 @@ void colo_do_failover(MigrationState *s)
if (get_colo_mode() == COLO_MODE_PRIMARY) {
primary_vm_do_failover();
+ } else {
+ secondary_vm_do_failover();
}
}
@@ -404,6 +433,12 @@ void *colo_process_incoming_thread(void *opaque)
continue;
}
}
+
+ if (failover_request_is_active()) {
+ error_report("failover request");
+ goto out;
+ }
+
/* FIXME: This is unnecessary for periodic checkpoint mode */
ret = colo_ctl_put(mis->to_src_file, COLO_COMMAND_CHECKPOINT_REPLY,
0);
if (ret < 0) {
@@ -473,10 +508,11 @@ out:
qemu_fclose(fb);
}
qsb_free(buffer);
-
- qemu_mutex_lock_iothread();
+ /* Here, we can ensure BH is hold the global lock, and will join colo
+ * incoming thread, so here it is not necessary to lock here again,
+ * or there will be a deadlock error.
+ */
colo_release_ram_cache();
- qemu_mutex_unlock_iothread();
OK, I think I understand that - becuase we know there is a failover request
active, then it must be holding the lock?
Yes, we come here only when failover happened, and since Secondary VM
does failover in BH with holding iothread lock, and it will enter
migration_incoming_co
at the end. The migration_incoming_co() will wait for colo incoming thread to
finish.
So it can't try to get iothread lock, or there will be an deadlock error.
Other than the error message improvement:
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Dave
Thanks,
Hailiang
if (mis->to_src_file) {
qemu_fclose(mis->to_src_file);
--
1.8.3.1
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
.