Re: [PATCH] spapr: Migrate CAS reboot flag

From: Cédric Le Goater
Subject: Re: [PATCH] spapr: Migrate CAS reboot flag
Date: Tue, 21 Jan 2020 07:57:29 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2

On 1/21/20 4:41 AM, David Gibson wrote:
> On Wed, Jan 15, 2020 at 07:10:47PM +0100, Cédric Le Goater wrote:
>> On 1/15/20 6:48 PM, Greg Kurz wrote:
>>> Migration can potentially race with CAS reboot. If the migration thread
>>> completes migration after CAS has set spapr->cas_reboot but before the
>>> mainloop could pick up the reset request and reset the machine, the
>>> guest is migrated unrebooted and the destination doesn't reboot it
>>> either because it isn't aware a CAS reboot was needed (e.g., because a
>>> device was added before CAS). This likely results in a broken or hung
>>> guest.
>>> Even if it is small, the window between CAS and CAS reboot is enough to
>>> re-qualify spapr->cas_reboot as state that we should migrate. Add a new
>>> subsection for that and always send it when a CAS reboot is pending.
>>> This may cause migration to older QEMUs to fail but it is still better
>>> than ending up with a broken guest.
>>> The destination cannot honour the CAS reboot request from a post load
>>> handler because this must be done after the guest is fully restored.
>>> It is thus done from a VM change state handler.
>>> Reported-by: Lukáš Doktor <address@hidden>
>>> Signed-off-by: Greg Kurz <address@hidden>
>> Cédric Le Goater <address@hidden>
>> Nice work ! That was quite complex to catch !
> It is a very nice analysis.  However, I'm disinclined to merge this
> for the time being.
> My preferred approach would be to just eliminate CAS reboots
> altogether, since that has other benefits.  I'm feeling like this
> isn't super-urgent, since CAS reboots are extremely rare in practice,
> now that we've eliminated the one for the irq switchover.

Yes. A migration landing in the window between CAS and the CAS reboot
must be even rarer.


> However, if it's not looking like we'll be ready to do that as the
> qemu-5.0 release approaches, then I'll be more than willing to
> reconsider this.
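
For anyone following the thread, here is a rough standalone sketch of the
idea in the patch description: an optional subsection that is sent only
while a CAS reboot is pending, and a check made once the destination VM
is running. This is not the patch and it does not use QEMU's VMState API.
MachineModel, StreamModel and the model_* functions are hypothetical
names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the machine state; only the flag matters here. */
typedef struct {
    bool cas_reboot;                 /* set by CAS when a reboot is required */
} MachineModel;

/* Hypothetical migration stream carrying one optional subsection. */
typedef struct {
    bool has_cas_reboot_subsection;  /* was the subsection sent at all? */
    bool cas_reboot;                 /* payload of the subsection */
} StreamModel;

/* "needed" predicate: send the subsection only while a reboot is pending. */
static bool cas_reboot_needed(const MachineModel *m)
{
    return m->cas_reboot;
}

/* Source side: emit the subsection only if the predicate says so. */
static StreamModel model_save(const MachineModel *src)
{
    StreamModel s = { false, false };

    assert(src != NULL);
    if (cas_reboot_needed(src)) {
        s.has_cas_reboot_subsection = true;
        s.cas_reboot = src->cas_reboot;
    }
    return s;
}

/* Destination side: a missing subsection means no reboot is pending. */
static void model_load(MachineModel *dst, const StreamModel *s)
{
    assert(dst != NULL && s != NULL);
    dst->cas_reboot = s->has_cas_reboot_subsection && s->cas_reboot;
}

/*
 * Models the check made once the guest is fully restored and running.
 * Returns true if the destination should request a machine reset.
 */
static bool model_vm_running(const MachineModel *dst)
{
    return dst->cas_reboot;
}
```

In this model, a source with cas_reboot set produces a stream containing
the subsection, and the destination reports that a reset is needed once
it runs. A source with the flag clear sends nothing extra, so the stream
stays compatible with a receiver that does not know the subsection.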
