|
From: | Hailiang Zhang |
Subject: | Re: [Qemu-devel] [PATCH COLO-Frame v12 10/38] COLO: Implement colo checkpoint protocol |
Date: | Tue, 12 Jan 2016 20:57:43 +0800 |
User-agent: | Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 |
On 2016/1/11 20:47, Markus Armbruster wrote:
Hailiang Zhang <address@hidden> writes:Hi Markus, On 2015/12/19 16:54, Markus Armbruster wrote:Jumping in at v12 for a bit of QAPI review (and whatever else catched my eye nearby), please pardon my ignorance of COLO in general, and previous review of this series in particular.Thanks all the same :)[...]diff --git a/migration/colo.c b/migration/colo.c index 0ab9618..0ce2a6e 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -10,10 +10,12 @@ * later. See the COPYING file in the top-level directory. */ +#include <unistd.h> #include "sysemu/sysemu.h" #include "migration/colo.h" #include "trace.h" #include "qemu/error-report.h" +#include "qemu/sockets.h" bool colo_supported(void) { @@ -34,6 +36,100 @@ bool migration_incoming_in_colo_state(void) return mis && (mis->state == MIGRATION_STATUS_COLO); } +static int colo_put_cmd(QEMUFile *f, uint32_t cmd) +{ + int ret; + + if (cmd >= COLO_COMMAND_MAX) {Needs a trivial rebase due to commit 7fb1cf1.+ error_report("%s: Invalid cmd", __func__); + return -EINVAL;Can this run in a context with different error handling needs? Or asked differently: who may ultimately handle this error? Whoever that may be, how does it need to report errors? Peeking ahead: the immediate callers don't handle this error, they just pass it on their callers. I'm asking because I'm trying to understand whether error_report() is appropriate here, or whether you need to use error_setg(), and leave the actual reporting to the spot that ultimately handles this error.Hmm, i know what you mean, we handled them all together after exit from the colo process loop, Use error_setg() seems to be a good idea, with this modification, we can also drop the return value. I will fix it in next version.+ } + qemu_put_be32(f, cmd); + qemu_fflush(f); + + ret = qemu_file_get_error(f); + trace_colo_put_cmd(COLOCommand_lookup[cmd]); + + return ret; +}Looks like @cmd is a COLOCommand. Why is the parameter type uint32_t?OK, i will change it to use enum COLOCommand.+ +static int colo_get_cmd(QEMUFile *f, uint32_t *cmd) +{ + int ret; + + *cmd = qemu_get_be32(f); + ret = qemu_file_get_error(f); + if (ret < 0) { + return ret; + } + if (*cmd >= COLO_COMMAND_MAX) { + error_report("%s: Invalid cmd", __func__); + return -EINVAL; + } + trace_colo_get_cmd(COLOCommand_lookup[*cmd]); + return 0; +}Same question. The "get" in the name suggests the function returns the value gotten, like similarly named function elsewhere in migration/ do.Do you mean it should return the cmd value directly, not though parameter way ? After we convert it to use error_setg() to indicate success or not, we can do like that. I will fix it.Sounds good to me.
I have fixed them in v13 version :)
[...]diff --git a/qapi-schema.json b/qapi-schema.json index c9ff34e..85f7800 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -720,6 +720,31 @@ { 'command': 'migrate-start-postcopy' } ## +# @COLOCommand +# +# The commands for COLO fault tolerance +# +# @checkpoint-ready: SVM is ready for checkpointing +# +# @checkpoint-request: PVM tells SVM to prepare for new checkpointing +# +# @checkpoint-reply: SVM gets PVM's checkpoint request +# +# @vmstate-send: VM's state will be sent by PVM. +# +# @vmstate-size: The total size of VMstate. +# +# @vmstate-received: VM's state has been received by SVM. +# +# @vmstate-loaded: VM's state has been loaded by SVM. +# +# Since: 2.6 +## +{ 'enum': 'COLOCommand', + 'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply', + 'vmstate-send', 'vmstate-size','vmstate-received', + 'vmstate-loaded' ] } +Space after 'vmstate-size', please.'vmstate-size' is not used in this patch. You may want to add it with its first use instead.OK, i will move it to the corresponding patch.Should this enum really be named "COLOCommand"? 'checkpoint-ready', 'checkpoint-request', 'vmstate-send' look like commands to me, but the others look like replies.Yes, COLOCommand is not so exact. what about name it COLOProtocol?A protocol specifies valid sequences of messages, and what they mean. This isn't a protocol, it's a message within a protocol. COLOMessage?
Yes, COLOMessage is more precise, i will fix it in next version.
# @MouseInfo: # # Information about a mouse device. diff --git a/trace-events b/trace-events index 5565e79..39fdd8d 100644 --- a/trace-events +++ b/trace-events @@ -1579,6 +1579,8 @@ postcopy_ram_incoming_cleanup_join(void) "" # migration/colo.c colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'" +colo_put_cmd(const char *msg) "Send '%s' cmd" +colo_get_cmd(const char *msg) "Receive '%s' cmd" # kvm-all.c kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"I like how this commit creates just the two state machines, and leaves filling in their actions to later commits. Helps ignorant rewiewers like me :)Do you mean i should split this patch ? Leave this patch with the simplest colo process, maybe just 'ready, request, reply', and add the other states in later patch?No, I *like* how you split up the work.
OK. Thanks. Hailiang
.
[Prev in Thread] | Current Thread | [Next in Thread] |