Re: [Qemu-devel] [Spice-devel] client_migrate_info

From:	Yonit Halperin
Subject:	Re: [Qemu-devel] [Spice-devel] client_migrate_info - do we need a new command?
Date:	Wed, 14 Dec 2011 11:26:59 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:7.0) Gecko/20110927 Thunderbird/7.0

Hi,
On 12/13/2011 08:19 PM, Anthony Liguori wrote:

In our call today, Avi asked that we evaluate whether the interface for
client_migrate_info is the Right Interface before we introduce a new
command to work around the fact that async commands are broken.

I looked into this today and here's what I came to.

1) What are the failure scenarios?

The issue is qerror_report(). Roughly speaking, qerror_report either
prints to stderr or it associates an error with the current monitor
command.

The problem with this is that qerror_report() is used all over the code
base today and if an error occurs in a device that has nothing to do
with the command, instead of printing to stderr, the command will fail
with a bizarre error reason (even though it really succeeded).
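The failure mode described above can be modelled in a few lines. This is a toy sketch, not QEMU code: `Command`, `report_error`, and the single global `current_command` slot are hypothetical stand-ins for the monitor state that qerror_report() consults.

```python
import sys

# Hypothetical model, not QEMU code: one global slot for the command in flight.
current_command = None


class Command:
    def __init__(self, name):
        self.name = name
        self.error = None  # set if an error gets associated with this command


def report_error(message):
    """Attach the error to the in-flight command if there is one, else print it."""
    if current_command is not None:
        current_command.error = message
    else:
        print(message, file=sys.stderr)


def unrelated_device_fails():
    # A device that has nothing to do with the monitor command reports an error.
    report_error("disk: I/O error")


cmd = Command("client_migrate_info")
current_command = cmd        # the async command is still in flight
unrelated_device_fails()     # the unrelated error lands on the command
current_command = None
```

In this model `cmd.error` ends up holding the disk error even though the command itself did nothing wrong, and the longer a command stays in flight, the wider that window is.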

2) Does the command have the right semantics?

The command has the following doc:

client_migrate_info
------------------

Set the spice/vnc connection info for the migration target. The spice/vnc
server will ask the spice/vnc client to automatically reconnect using the
new parameters (if specified) once the vm migration finished successfully.

Arguments:

- "protocol": protocol: "spice" or "vnc" (json-string)
- "hostname": migration target hostname (json-string)
- "port": spice/vnc tcp port for plaintext channels (json-int, optional)
- "tls-port": spice tcp port for tls-secured channels (json-int, optional)
- "cert-subject": server certificate subject (json-string, optional)

Example:

-> { "execute": "client_migrate_info",
"arguments": { "protocol": "spice",
"hostname": "virt42.lab.kraxel.org",
"port": 1234 } }
<- { "return": {} }
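A management tool could assemble the same request programmatically. The sketch below only builds the JSON message from the documented arguments; the helper name `build_client_migrate_info` is made up for illustration, and sending the message over a QMP socket is left out.

```python
import json


def build_client_migrate_info(protocol, hostname, port=None,
                              tls_port=None, cert_subject=None):
    """Build the QMP request dict; optional arguments are omitted when None."""
    if protocol not in ("spice", "vnc"):
        raise ValueError("protocol must be 'spice' or 'vnc'")
    arguments = {"protocol": protocol, "hostname": hostname}
    optional = {"port": port, "tls-port": tls_port, "cert-subject": cert_subject}
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return {"execute": "client_migrate_info", "arguments": arguments}


message = build_client_migrate_info("spice", "virt42.lab.kraxel.org", port=1234)
wire = json.dumps(message)  # the text that would be written to the QMP socket
```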

Originally, the command was a normal sync command and my understanding
is that it simply posted notification to the clients. Apparently, users
of the interface need to actually know when the client has Ack'd this
operation because otherwise it's racy since a disconnect may occur
before the client processes the redirection.

It's racy because the migration can start before the client manages to connect to the migration target. And since the target is unresponsive during migration, the client will manage to connect to it only after migration completes; but that can take a while, and the client's ticket might expire by then.

OTOH, that means that what we really need is 1) tell connected clients
that they need to redirect 2) notification when/if connected clients are
prepared to redirect.

The trouble with using an async command for this is that the time between
(1) & (2) may be arbitrarily long. Since most QMP clients today always
use a NULL tag, that effectively means the monitor is blocked for an
arbitrarily long time while this operation is in flight.

I don't know if libspice uses a timeout for this operation, but if it
doesn't, this could block arbitrarily long. Even with tagging, we don't
have a way to cancel in flight commands so blocking for arbitrary time
periods is problematic.

We use a timeout of 10 seconds.

I think splitting this into two commands, one that requests the clients
to redirect and then an event that lets a tool know that the clients are
ready to migrate ends up being nicer. It means that we never end up with
a blocked QMP session and clients are more likely to properly deal with
the fact that an event may take arbitrarily long to happen.

Clients can also implement their own cancel logic by choosing to stop
waiting for an event to happen and then ignoring spurious events.
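On the tool side, the command-plus-event split with client-side cancel logic might look roughly like this. It is a sketch under assumptions: the event name `CLIENT_MIGRATE_READY` is invented (no such event is defined in this thread), and `next_message` stands for whatever function the tool uses to read the next parsed QMP message, returning None when nothing arrived in time.

```python
import time

READY_EVENT = "CLIENT_MIGRATE_READY"  # hypothetical event name


def wait_for_event(next_message, event_name, timeout, clock=time.monotonic):
    """Return the matching event, or None once `timeout` seconds have passed.

    next_message(remaining) is assumed to return the next parsed QMP
    message, or None if nothing arrived within `remaining` seconds.
    """
    deadline = clock() + timeout
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return None  # the tool gives up; the QMP session was never blocked
        msg = next_message(remaining)
        if msg is None:
            continue  # nothing arrived; loop back and re-check the deadline
        if msg.get("event") == event_name:
            return msg
        # Any other event or reply is not the one we wait for: skip it.


# Usage with canned messages instead of a real QMP socket.
incoming = iter([{"event": "RESUME"}, {"event": READY_EVENT, "data": {}}])
result = wait_for_event(lambda remaining: next(incoming, None),
                        READY_EVENT, timeout=10.0)
```

A tool that has given up simply stops calling `wait_for_event`; a ready event that arrives afterwards is then a spurious event it can ignore. The 10.0 here is only an example value chosen by the tool.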

So regardless of the async issue, I think splitting this command is the
right thing to do long term.

I just want to emphasize that using client_migrate_info for connecting to the target is more of a workaround. IMHO, the more complete solution would have been (similar to the one we have in RHEL5):

1) Add a migration notifier for pre-starting migration. Introduce a completion cb to these notifiers.

2) Actually start migration only after the completion cb is called.

Then, client_migrate_info can go back to being sync.
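The notifier-with-completion-callback idea can be sketched as follows. This is an illustration only, not the RHEL5 or QEMU implementation: the class `PreMigrationNotifiers` and its methods are hypothetical names, and each notifier is assumed to be a callable that receives a `done` callback to invoke once it is ready.

```python
class PreMigrationNotifiers:
    """Run pre-start notifiers; start migration only after all have completed."""

    def __init__(self, start_migration):
        self._start_migration = start_migration
        self._notifiers = []
        self._pending = 0
        self._started = False

    def add(self, notifier):
        # notifier(done): does its preparation, then calls done() when ready.
        self._notifiers.append(notifier)

    def request_migration(self):
        self._pending = len(self._notifiers)
        if self._pending == 0:
            self._start()
            return
        for notifier in list(self._notifiers):
            notifier(self._complete_one)

    def _complete_one(self):
        self._pending -= 1
        if self._pending == 0:
            self._start()

    def _start(self):
        if not self._started:
            self._started = True
            self._start_migration()


log = []
saved = []  # completion callbacks held until the "client" is ready
chain = PreMigrationNotifiers(lambda: log.append("migration started"))
chain.add(lambda done: saved.append(done))  # e.g. spice waiting for its client
chain.request_migration()
started_early = bool(log)  # nothing has started yet at this point
saved[0]()                 # the client has connected to the target
```

Because migration starts only from the last completion callback, the command that hands over the connection info no longer has to wait for the client itself.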

If we already plan to make changes, maybe they should be aimed at such a solution.

Btw, if we also had such a notifier for pre-finish migration (before starting the target vm), we could even make the client migration really seamless again.


Regards,
Yonit.


Regards,

Anthony Liguori