Re: [RFC PATCH 0/4] ppc: nested TCG migration (KVM-on-TCG)


From: Mark Cave-Ayland
Subject: Re: [RFC PATCH 0/4] ppc: nested TCG migration (KVM-on-TCG)
Date: Thu, 24 Feb 2022 21:00:24 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.6.0

On 24/02/2022 18:58, Fabiano Rosas wrote:

This series implements migration for a TCG pseries guest running a
nested KVM guest. This is just like migrating a pseries TCG guest, but
with some extra state to allow the nested guest to continue to run on
the destination.

Unfortunately the regular TCG migration scenario (not nested) is not
fully working, so I cannot be entirely sure the nested migration is
correct. I have included a couple of patches for the general migration
case that (I think?) improve the situation a bit, but I'm still seeing
hard lockups and other issues with more than one vCPU.

This is more of an early RFC to see if anyone spots something right
away. I haven't made much progress in debugging the general TCG
migration case, so if anyone has any input there as well I'd
appreciate it.

Thanks

Fabiano Rosas (4):
   target/ppc: TCG: Migrate tb_offset and decr
   spapr: TCG: Migrate spapr_cpu->prod
   hw/ppc: Take nested guest into account when saving timebase
   spapr: Add KVM-on-TCG migration support

  hw/ppc/ppc.c                    | 17 +++++++-
  hw/ppc/spapr.c                  | 19 ++++++++
  hw/ppc/spapr_cpu_core.c         | 77 +++++++++++++++++++++++++++++++++
  include/hw/ppc/spapr_cpu_core.h |  2 +-
  target/ppc/machine.c            | 61 ++++++++++++++++++++++++++
  5 files changed, 174 insertions(+), 2 deletions(-)
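
For reference, extra per-vCPU state like tb_offset and decr would normally travel as an optional VMState subsection in target/ppc/machine.c. The sketch below is illustrative only; the guard condition and field names are assumptions based on the commit subjects above, not the actual patch contents:

    /* Illustrative sketch of QEMU's optional-subsection pattern;
     * the fields here are placeholders, not the real patch. */
    #include "qemu/osdep.h"
    #include "migration/vmstate.h"
    #include "sysemu/kvm.h"

    static bool tcg_timers_needed(void *opaque)
    {
        /* Only send the subsection under TCG, where tb_offset and the
         * decrementer are maintained by QEMU rather than by KVM. */
        return !kvm_enabled();
    }

    static const VMStateDescription vmstate_tcg_timers = {
        .name = "cpu/tcg-timers",
        .version_id = 1,
        .minimum_version_id = 1,
        .needed = tcg_timers_needed,
        .fields = (VMStateField[]) {
            VMSTATE_INT64(tb_offset, PowerPCCPU),  /* hypothetical field */
            VMSTATE_UINT64(decr, PowerPCCPU),      /* hypothetical field */
            VMSTATE_END_OF_LIST()
        }
    };

A subsection like this would then be listed in vmstate_ppc_cpu's .subsections array, so that streams from older QEMUs that never sent it can still be accepted.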

FWIW I noticed a while ago that there were some issues with migrating the decrementer on Mac machines, which caused a hang on the destination under TCG (for MacOS on an x86 host in my case). Have a look at the following threads for reference:

https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg00546.html
https://lists.gnu.org/archive/html/qemu-devel/2016-01/msg04622.html

IIRC there is code that assumes any PPC migration is being done live, and so it adjusts the timebase on the destination to reflect wall-clock time by recalculating tb_offset. I haven't looked at the code for a while, but I think the outcome was that there need to be two phases in migration: the first migrates the timebase as-is for guests that are paused during migration, whilst the second notifies hypervisor-aware guest OSs such as Linux to make the timebase adjustment themselves, if required, while the guest is running.
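
The adjustment I'm describing looks roughly like the following; this is a from-memory sketch with illustrative names, not the exact hw/ppc/ppc.c code:

    /* Sketch of the "assume the migration was live" adjustment:
     * the source records the guest timebase and the wall-clock time,
     * and the destination replays the elapsed wall-clock time into
     * timebase ticks before deriving a new tb_offset. */
    #include "qemu/osdep.h"
    #include "qemu/timer.h"

    typedef struct MigratedTimebase {
        uint64_t guest_timebase;    /* guest TB captured on the source */
        int64_t time_of_the_day_ns; /* host wall clock at capture time */
    } MigratedTimebase;

    static int64_t recompute_tb_offset(const MigratedTimebase *tb,
                                       uint32_t tb_freq)
    {
        /* Wall-clock time that elapsed while the guest was in flight */
        int64_t migration_ns = qemu_clock_get_ns(QEMU_CLOCK_HOST)
                               - tb->time_of_the_day_ns;

        /* Convert the elapsed time into timebase ticks */
        uint64_t ticks = muldiv64(migration_ns, tb_freq,
                                  NANOSECONDS_PER_SECOND);

        /* Advance the guest timebase as if it had kept running (only
         * correct for a live migration), then derive the offset from
         * the destination host's own timebase. */
        return tb->guest_timebase + ticks - cpu_get_host_ticks();
    }

The "+ ticks" step is exactly what a guest that was paused across migration does not want: its timebase jumps forward by however long the migration took. Restoring tb_offset unchanged instead is the "as-is" behaviour needed for the first phase above.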


ATB,

Mark.


