From: Hailiang Zhang
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v12 11/38] COLO: Add a new RunState RUN_STATE_COLO
Date: Tue, 12 Jan 2016 20:54:14 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

On 2016/1/11 21:16, Markus Armbruster wrote:
Hailiang Zhang <address@hidden> writes:

On 2015/12/19 17:27, Markus Armbruster wrote:
zhanghailiang <address@hidden> writes:

Guest will enter this state when paused to save/restore VM state
under colo checkpoint.

Cc: Eric Blake <address@hidden>
Cc: Markus Armbruster <address@hidden>
Signed-off-by: zhanghailiang <address@hidden>
Signed-off-by: Li Zhijian <address@hidden>
Signed-off-by: Gonglei <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Eric Blake <address@hidden>
---
   qapi-schema.json | 5 ++++-
   vl.c             | 8 ++++++++
   2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/qapi-schema.json b/qapi-schema.json
index 85f7800..0423b47 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -154,12 +154,15 @@
   # @watchdog: the watchdog action is configured to pause and has been triggered
   #
   # @guest-panicked: guest has been panicked as a result of guest OS panic
+#
+# @colo: guest is paused to save/restore VM state under colo checkpoint (since
+# 2.6)
   ##
   { 'enum': 'RunState',
     'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
               'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
               'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
-            'guest-panicked' ] }
+            'guest-panicked', 'colo' ] }

   ##
   # @StatusInfo:
diff --git a/vl.c b/vl.c
index f84fde8..fca630b 100644
--- a/vl.c
+++ b/vl.c
@@ -594,6 +594,7 @@ static const RunStateTransition runstate_transitions_def[] = {
       { RUN_STATE_INMIGRATE, RUN_STATE_WATCHDOG },
       { RUN_STATE_INMIGRATE, RUN_STATE_GUEST_PANICKED },
       { RUN_STATE_INMIGRATE, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_INMIGRATE, RUN_STATE_COLO },

       { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
       { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
@@ -603,6 +604,7 @@ static const RunStateTransition runstate_transitions_def[] = {

       { RUN_STATE_PAUSED, RUN_STATE_RUNNING },
       { RUN_STATE_PAUSED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_PAUSED, RUN_STATE_COLO},

       { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
       { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },
@@ -613,9 +615,12 @@ static const RunStateTransition runstate_transitions_def[] = {

       { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
       { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_COLO},

       { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },

+    { RUN_STATE_COLO, RUN_STATE_RUNNING },
+
       { RUN_STATE_RUNNING, RUN_STATE_DEBUG },
       { RUN_STATE_RUNNING, RUN_STATE_INTERNAL_ERROR },
       { RUN_STATE_RUNNING, RUN_STATE_IO_ERROR },
@@ -626,6 +631,7 @@ static const RunStateTransition runstate_transitions_def[] = {
       { RUN_STATE_RUNNING, RUN_STATE_SHUTDOWN },
       { RUN_STATE_RUNNING, RUN_STATE_WATCHDOG },
       { RUN_STATE_RUNNING, RUN_STATE_GUEST_PANICKED },
+    { RUN_STATE_RUNNING, RUN_STATE_COLO},

       { RUN_STATE_SAVE_VM, RUN_STATE_RUNNING },

@@ -636,9 +642,11 @@ static const RunStateTransition runstate_transitions_def[] = {
       { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
       { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
       { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_SUSPENDED, RUN_STATE_COLO},

       { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
       { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_WATCHDOG, RUN_STATE_COLO},

       { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING },
       { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE },

Pardon my ignorance, but could you explain the new run state in a bit
more detail for me?


OK. Normally, we only need to switch between the COLO and RUNNING states.
But we can't forbid users from issuing other commands while the VM is in the COLO state.

In every checkpoint, we have to pause the VM to send its state to the SVM. Before we
pause the VM, users may issue the 'stop' command, which changes the state to
'RUN_STATE_PAUSED'. We don't want to abort the VM because of this command.
(Actually, we will support the 'stop' command while the VM is in the COLO state.)
So we need the state transition 'RUN_STATE_PAUSED -> RUN_STATE_COLO'.
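
To illustrate why a missing pair matters, here is a minimal sketch of a
pair-based transition table expanded into a lookup matrix. It is not the
vl.c code: the demo_* names and the four-state subset are assumptions made
for this example, and only the (from, to) table shape is taken from
runstate_transitions_def[] as shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative subset of run states; names are hypothetical, not QEMU's. */
typedef enum {
    DEMO_RUNNING,
    DEMO_PAUSED,
    DEMO_FINISH_MIGRATE,
    DEMO_COLO,
    DEMO_STATE_MAX
} DemoState;

typedef struct {
    DemoState from;
    DemoState to;
} DemoTransition;

/* Allowed (from, to) pairs, in the same shape as runstate_transitions_def[]. */
static const DemoTransition demo_transitions[] = {
    { DEMO_RUNNING,        DEMO_PAUSED  },
    { DEMO_RUNNING,        DEMO_COLO    },
    { DEMO_PAUSED,         DEMO_RUNNING },
    { DEMO_PAUSED,         DEMO_COLO    },
    { DEMO_FINISH_MIGRATE, DEMO_RUNNING },
    { DEMO_FINISH_MIGRATE, DEMO_COLO    },
    { DEMO_COLO,           DEMO_RUNNING },
};

#define DEMO_N_TRANSITIONS \
    (sizeof(demo_transitions) / sizeof(demo_transitions[0]))

static bool demo_valid[DEMO_STATE_MAX][DEMO_STATE_MAX];

/* Expand the pair list into a lookup matrix; every other pair is invalid. */
static void demo_transitions_init(void)
{
    int from, to;
    size_t i;

    for (from = 0; from < DEMO_STATE_MAX; from++) {
        for (to = 0; to < DEMO_STATE_MAX; to++) {
            demo_valid[from][to] = false;
        }
    }
    for (i = 0; i < DEMO_N_TRANSITIONS; i++) {
        demo_valid[demo_transitions[i].from][demo_transitions[i].to] = true;
    }
}

/* Return true if the table permits moving from 'from' to 'to'. */
static bool demo_transition_allowed(DemoState from, DemoState to)
{
    assert(from < DEMO_STATE_MAX && to < DEMO_STATE_MAX);
    return demo_valid[from][to];
}
```

After demo_transitions_init(), demo_transition_allowed(DEMO_PAUSED, DEMO_COLO)
is true only because that pair is listed. A caller that treats a false result
as fatal would abort if the user's 'stop' arrived just before the checkpoint
pause and the pair were missing.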

What's the next state then?


We may switch to RUN_STATE_RUNNING. Actually, RUN_STATE_COLO is only used here to
indicate that the VM is stopped during the COLO process.

We enter the COLO state just after a full migration, whose last state will be
'RUN_STATE_FINISH_MIGRATE' or 'RUN_STATE_INMIGRATE'. Before we enter the
COLO loop, we may get 'x-colo-lost-heartbeat' and run into the
'RUN_STATE_COLO' pause, so we need the state transitions
'RUN_STATE_FINISH_MIGRATE -> RUN_STATE_COLO' and
'RUN_STATE_INMIGRATE -> RUN_STATE_COLO'.
We need 'RUN_STATE_SUSPENDED -> RUN_STATE_COLO' because the guest or
users may issue a standby command.
We need to ensure the VM does not crash.

Actually, we may need more states that can go to the 'colo' state; maybe
we should just follow the cases of the 'MIGRATE' state.

I believe we should fully work out the state transitions added by COLO.
I like to write that down in this form:

     (state, trigger) -> (action, state')


I'm a little confused. In runstate_transitions_def, a state transition is
expressed simply as a pair (state1, state2). Here we only switch to the
RUN_STATE_COLO state when we need to do something while the VM is paused.

Example:

     (running, checkpoint) -> (begin-checkpointing, colo)


Do you want me to add these new states to runstate_transitions_def?
What is the real status (running or stopped) of the VM in 'checkpoint' and
'colo' here?

with a suitable explanation of 'checkpoint' and 'begin-checkpointing'.

For brevity, multiple

     (state1, trigger) -> (action, state')
     (state2, trigger) -> (action, state')
     ...
     (stateN, trigger) -> (action, state')

can be abbreviated to

     ({state1, state2, stateN}, trigger) -> (action, state')

Example:

     ({running, paused, ...}, checkpoint) -> (begin-checkpointing, colo)

For clarity, chains of state transitions should be described in the
order they happen.

Pictures showing the states connected with transition arrows labelled
with the trigger can help.

Two properties to check:

1. Correctness: every state transition thus written down does the right
    thing.

2. Completeness: for every pair (state, trigger), we got a state
    transition, or an explanation why it cannot happen.
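
As a rough illustration of the completeness property, the
(state, trigger) -> (action, state') notation can be written as a rule table
and checked mechanically. This is only a sketch: the ts_* names, the reduced
state and trigger sets, and the 'checkpoint-done' rule are assumptions made
for the example; only the two 'checkpoint' rules come from the examples above.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TS_RUNNING, TS_PAUSED, TS_COLO, TS_STATE_COUNT } TsState;
typedef enum { TS_CHECKPOINT, TS_CHECKPOINT_DONE, TS_TRIGGER_COUNT } TsTrigger;
typedef enum { TS_BEGIN_CHECKPOINTING, TS_RESUME_GUEST } TsAction;

/* One rule: (state, trigger) -> (action, next). */
typedef struct {
    TsState state;
    TsTrigger trigger;
    TsAction action;
    TsState next;
} TsRule;

static const TsRule ts_rules[] = {
    /* ({running, paused, ...}, checkpoint) -> (begin-checkpointing, colo) */
    { TS_RUNNING, TS_CHECKPOINT, TS_BEGIN_CHECKPOINTING, TS_COLO },
    { TS_PAUSED,  TS_CHECKPOINT, TS_BEGIN_CHECKPOINTING, TS_COLO },
    /* Assumption: invented rule standing in for the COLO -> RUNNING edge. */
    { TS_COLO, TS_CHECKPOINT_DONE, TS_RESUME_GUEST, TS_RUNNING },
};

#define TS_N_RULES (sizeof(ts_rules) / sizeof(ts_rules[0]))

/* Find the rule for (state, trigger), or NULL if none is written down. */
static const TsRule *ts_lookup(TsState state, TsTrigger trigger)
{
    size_t i;

    assert(state < TS_STATE_COUNT && trigger < TS_TRIGGER_COUNT);
    for (i = 0; i < TS_N_RULES; i++) {
        if (ts_rules[i].state == state && ts_rules[i].trigger == trigger) {
            return &ts_rules[i];
        }
    }
    return NULL;
}

/* Completeness check: count (state, trigger) pairs that have no rule. */
static int ts_count_unhandled(void)
{
    int s, t, missing = 0;

    for (s = 0; s < TS_STATE_COUNT; s++) {
        for (t = 0; t < TS_TRIGGER_COUNT; t++) {
            if (ts_lookup((TsState)s, (TsTrigger)t) == NULL) {
                missing++;
            }
        }
    }
    return missing;
}
```

With this table, ts_count_unhandled() returns 3: (running, checkpoint-done),
(paused, checkpoint-done) and (colo, checkpoint) have no rule, so each would
need either a transition or an explanation of why it cannot happen.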

Thanks,
zhanghailiang

Your additions to runstate_transitions_def[] show we can go *from* state
'colo' only to state 'running', but we can go *to* state 'colo' from
various other states.  This may well be sane, but it's not *obviously*
sane :)
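
The asymmetry can be made concrete by counting edges. The sketch below lists
only the seven transitions the patch adds to runstate_transitions_def[] and
counts how many enter and leave 'colo'. The eg_* names are hypothetical and
the enum covers just the states those seven lines mention.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    EG_INMIGRATE,
    EG_PAUSED,
    EG_FINISH_MIGRATE,
    EG_RUNNING,
    EG_SUSPENDED,
    EG_WATCHDOG,
    EG_COLO,
    EG_STATE_MAX
} EgState;

typedef struct {
    EgState from;
    EgState to;
} EgEdge;

/* The seven '+' lines of the vl.c hunks, in patch order. */
static const EgEdge eg_colo_edges[] = {
    { EG_INMIGRATE,      EG_COLO    },
    { EG_PAUSED,         EG_COLO    },
    { EG_FINISH_MIGRATE, EG_COLO    },
    { EG_COLO,           EG_RUNNING },
    { EG_RUNNING,        EG_COLO    },
    { EG_SUSPENDED,      EG_COLO    },
    { EG_WATCHDOG,       EG_COLO    },
};

#define EG_N_EDGES (sizeof(eg_colo_edges) / sizeof(eg_colo_edges[0]))

/* Number of listed transitions that end in state 's'. */
static int eg_count_in(EgState s)
{
    size_t i;
    int n = 0;

    assert(s < EG_STATE_MAX);
    for (i = 0; i < EG_N_EDGES; i++) {
        if (eg_colo_edges[i].to == s) {
            n++;
        }
    }
    return n;
}

/* Number of listed transitions that start in state 's'. */
static int eg_count_out(EgState s)
{
    size_t i;
    int n = 0;

    assert(s < EG_STATE_MAX);
    for (i = 0; i < EG_N_EDGES; i++) {
        if (eg_colo_edges[i].from == s) {
            n++;
        }
    }
    return n;
}
```

Here eg_count_in(EG_COLO) is 6 and eg_count_out(EG_COLO) is 1, which is the
shape questioned above: six ways into 'colo', one way out.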
