
Re: [Qemu-devel] When does live migration give up?

From: Alex Bligh
Subject: Re: [Qemu-devel] When does live migration give up?
Date: Wed, 4 Sep 2013 23:37:42 +0100

> Do you mean something like this?
>   destination
>      socket()
>      bind() to { sin_port = 0, sin_addr.s_addr = INADDR_ANY }
>      listen()
>      getsockname()
>      send address to source
>      accept()
>      start QEMU with file descriptor returned by accept
>   source
>      read address
>      socket()
>      connect()
>      pass socket file descriptor to QEMU and migrate to it
> Anything that doesn't use sin_port = 0 and getsockname() is prone to
> race conditions.
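
For reference, the destination half of the sequence quoted above might look roughly like the sketch below. It is only an illustration in Python, not what either side actually runs; the function name listen_on_ephemeral_port and the default host are assumptions made for the example.

```python
import socket

# Hypothetical helper, named here for illustration only.
def listen_on_ephemeral_port(host="0.0.0.0"):
    """Bind to port 0 so the kernel picks a free port, then report it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the kernel to choose an unused port atomically.
    sock.bind((host, 0))
    sock.listen(1)
    # getsockname() reveals which port the kernel actually assigned.
    port = sock.getsockname()[1]
    return sock, port
```

The returned port is what would be sent to the source, and the listening socket stays open until accept() returns, so there is no window in which another process can take the port.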

From memory, we bind() to a specific, randomly chosen port and, if
that fails, retry until bind() succeeds. This is because we
want the port to be within a given range. I believe that is
race-free, as only one bind() to a given port can succeed at a time.
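
A minimal sketch of that retry loop, written from the description above rather than from our actual code. The name bind_in_range, the retry limit and the choice to retry only on EADDRINUSE are assumptions for the example.

```python
import errno
import random
import socket

# Hypothetical helper; name, defaults and retry limit are illustrative.
def bind_in_range(low, high, host="0.0.0.0", max_attempts=100):
    """Try random ports in [low, high] until one bind() succeeds."""
    for _ in range(max_attempts):
        port = random.randint(low, high)
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError as exc:
            sock.close()
            # Retry only when the port is taken; re-raise anything else.
            if exc.errno != errno.EADDRINUSE:
                raise
            continue
        sock.listen(1)
        # Return the bound socket itself, not just the port number.
        return sock, port
    raise OSError(errno.EADDRINUSE, "no free port found in range")
```

This is only race-free if the socket that won the bind() is the one that is kept and used for the migration. If it were closed and the port number passed on to be bound again later, another process could take the port in between.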

>> Approx 10% of migrations die after many minutes on the
>> customer's platform. This does not appear to happen if migrations are
>> not carried out 50 at a time.
> Dying after many minutes usually means that the destination is not set
> up the same as the source, as you said below.

Hmmm. OK, I thought that produced an immediate error. Is there any way
of logging what went wrong, to stderr or similar?


> Paolo
>> We appear to be getting something other than 'ms' returned through the
>> monitoring system. Unhelpfully what that is is not logged.
>> Is there anything (apart from the socket closing prematurely) which can
>> cause a failed migration after many minutes? We've seen problems where
>> the destination is not set up the same as the source (e.g. different
>> numbers of NICs) but IIRC that fails much earlier.
>> To make things easier (cough), this is qemu 1.0 (as shipped with Ubuntu
>> Precise).

Alex Bligh
