qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH COLO-Frame v12 10/38] COLO: Implement colo check


From: Hailiang Zhang
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v12 10/38] COLO: Implement colo checkpoint protocol
Date: Tue, 12 Jan 2016 20:57:43 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

On 2016/1/11 20:47, Markus Armbruster wrote:
Hailiang Zhang <address@hidden> writes:

Hi Markus,

On 2015/12/19 16:54, Markus Armbruster wrote:
Jumping in at v12 for a bit of QAPI review (and whatever else catched my
eye nearby), please pardon my ignorance of COLO in general, and previous
review of this series in particular.


Thanks all the same :)
[...]
diff --git a/migration/colo.c b/migration/colo.c
index 0ab9618..0ce2a6e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -10,10 +10,12 @@
    * later.  See the COPYING file in the top-level directory.
    */

+#include <unistd.h>
   #include "sysemu/sysemu.h"
   #include "migration/colo.h"
   #include "trace.h"
   #include "qemu/error-report.h"
+#include "qemu/sockets.h"

   bool colo_supported(void)
   {
@@ -34,6 +36,100 @@ bool migration_incoming_in_colo_state(void)
       return mis && (mis->state == MIGRATION_STATUS_COLO);
   }

+static int colo_put_cmd(QEMUFile *f, uint32_t cmd)
+{
+    int ret;
+
+    if (cmd >= COLO_COMMAND_MAX) {

Needs a trivial rebase due to commit 7fb1cf1.


+        error_report("%s: Invalid cmd", __func__);
+        return -EINVAL;

Can this run in a context with different error handling needs?

Or asked differently: who may ultimately handle this error?  Whoever
that may be, how does it need to report errors?

Peeking ahead: the immediate callers don't handle this error, they just
pass it on their callers.

I'm asking because I'm trying to understand whether error_report() is
appropriate here, or whether you need to use error_setg(), and leave the
actual reporting to the spot that ultimately handles this error.


Hmm, i know what you mean, we handled them all together after exit
from the colo process loop,
Use error_setg() seems to be a good idea, with this modification, we
can also drop the return
value. I will fix it in next version.


+    }
+    qemu_put_be32(f, cmd);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+    trace_colo_put_cmd(COLOCommand_lookup[cmd]);
+
+    return ret;
+}

Looks like @cmd is a COLOCommand.  Why is the parameter type uint32_t?


OK, i will change it to use enum COLOCommand.

+
+static int colo_get_cmd(QEMUFile *f, uint32_t *cmd)
+{
+    int ret;
+
+    *cmd = qemu_get_be32(f);
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        return ret;
+    }
+    if (*cmd >= COLO_COMMAND_MAX) {
+        error_report("%s: Invalid cmd", __func__);
+        return -EINVAL;
+    }
+    trace_colo_get_cmd(COLOCommand_lookup[*cmd]);
+    return 0;
+}

Same question.

The "get" in the name suggests the function returns the value gotten,
like similarly named function elsewhere in migration/ do.

Do you mean it should return the cmd value directly, not though parameter way ?
After we convert it to use error_setg() to indicate success or not, we
can do like that.
I will fix it.

Sounds good to me.


I have fixed them in v13 version :)

[...]
diff --git a/qapi-schema.json b/qapi-schema.json
index c9ff34e..85f7800 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -720,6 +720,31 @@
   { 'command': 'migrate-start-postcopy' }

   ##
+# @COLOCommand
+#
+# The commands for COLO fault tolerance
+#
+# @checkpoint-ready: SVM is ready for checkpointing
+#
+# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
+#
+# @checkpoint-reply: SVM gets PVM's checkpoint request
+#
+# @vmstate-send: VM's state will be sent by PVM.
+#
+# @vmstate-size: The total size of VMstate.
+#
+# @vmstate-received: VM's state has been received by SVM.
+#
+# @vmstate-loaded: VM's state has been loaded by SVM.
+#
+# Since: 2.6
+##
+{ 'enum': 'COLOCommand',
+  'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
+            'vmstate-send', 'vmstate-size','vmstate-received',
+            'vmstate-loaded' ] }
+

Space after 'vmstate-size', please.


'vmstate-size' is not used in this patch.  You may want to add it with
its first use instead.


OK, i will move it to the corresponding patch.

Should this enum really be named "COLOCommand"?  'checkpoint-ready',
'checkpoint-request', 'vmstate-send' look like commands to me, but the
others look like replies.


Yes, COLOCommand is not so exact. what about name it COLOProtocol?

A protocol specifies valid sequences of messages, and what they mean.
This isn't a protocol, it's a message within a protocol.  COLOMessage?


Yes, COLOMessage is more precise, i will fix it in next version.


   # @MouseInfo:
   #
   # Information about a mouse device.
diff --git a/trace-events b/trace-events
index 5565e79..39fdd8d 100644
--- a/trace-events
+++ b/trace-events
@@ -1579,6 +1579,8 @@ postcopy_ram_incoming_cleanup_join(void) ""

   # migration/colo.c
   colo_vm_state_change(const char *old, const char *new) "Change
'%s' => '%s'"
+colo_put_cmd(const char *msg) "Send '%s' cmd"
+colo_get_cmd(const char *msg) "Receive '%s' cmd"

   # kvm-all.c
   kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"

I like how this commit creates just the two state machines, and leaves
filling in their actions to later commits.  Helps ignorant rewiewers
like me :)



Do you mean i should split this patch ? Leave this patch with the
simplest colo process,
maybe just 'ready, request, reply', and add the other states in later patch?

No, I *like* how you split up the work.


OK.

Thanks.
Hailiang

.






reply via email to

[Prev in Thread] Current Thread [Next in Thread]