Re: [Qemu-ppc] Problem with "savevm" on ppc64
From: David Gibson
Subject: Re: [Qemu-ppc] Problem with "savevm" on ppc64
Date: Mon, 24 Oct 2016 12:22:27 +1100
User-agent: Mutt/1.7.1 (2016-10-04)

On Fri, Oct 21, 2016 at 08:45:21AM +0200, Thomas Huth wrote:
> On 21.10.2016 07:26, David Gibson wrote:
> > On Thu, Oct 20, 2016 at 03:17:12PM +0200, Thomas Huth wrote:
> >> Hi all,
> >>
> >> I'm currently facing a strange problem with the "savevm" HMP command on
> >> ppc64 with TCG and the pseries machine. Steps for reproduction:
> >>
> >> 1) Create a disk image:
> >> qemu-img create -f qcow2 /tmp/test.qcow2 1M
> >>
> >> 2) Start QEMU (in TCG mode):
> >> qemu-system-ppc64 -nographic -vga none -m 256 -hda /tmp/test.qcow2
> >>
> >> 3) Hit "CTRL-a c" to enter the HMP monitor
> >>
> >> 4) Type "savevm test1" to save a snapshot
> >>
> >> The savevm command then hangs forever and the test.qcow2 image keeps
> >> growing and growing.
> >>
> >> It seems like qemu_savevm_state_iterate() does not make any more
> >> progress because ram_save_iterate() keeps returning 0 ... but why can
> >> that happen?
> >
> > Hmm. You don't mention installing anything on the disk image, so I'm
> > assuming the VM is just sitting in SLOF, unable to boot.
>
> Right. This is basically what is currently happening with the failing
> test tests/qemu-iotests/007 on ppc64.
>
> [...]
> > So, looking at this I think it's unsafe. htab_save_first_pass() never
> > examines dirty bits, so we could get:
> > htab_save_first_pass() called once, saves part of HPT
> > guest dirties an HPTE in the already saved region
> > enter migration completion stage
> > htab_save_first_pass() saves the remainder of the HPT, returns 1
> >
> > That would trigger the code to think the HPT migration is complete,
> > without ever saving the HPTE that got dirtied part way through.
>
> There's still htab_save_complete() which seems always to be called at
> the end - and this function calls htab_save_later_pass() again to save
> the remaining entries. But I am really no expert in this part of the
> code, so maybe I've also missed something here.

No, I think you're right: htab_save_complete() picks up the remaining
entries at the end, so that should be ok. It really reinforces that
the return value from the iterate function is basically useless; the
decision about whether to complete or not is made on other factors.

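To make that sequence concrete, here is a toy model in C of a first
pass that ignores dirty bits, followed by a completion pass that
honours them. It is only a sketch: struct toy_htab and all the toy_*
functions are made up for illustration and are not the spapr code,
and the real htab_save_*() functions differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_HPTE 8                /* toy table size, not a real HPT size */

struct toy_htab {
    int    entry[N_HPTE];       /* guest-visible contents */
    int    saved[N_HPTE];       /* what has been written to the stream */
    bool   dirty[N_HPTE];       /* set whenever the guest writes an entry */
    size_t next;                /* cursor of the first pass */
};

/* Guest write: update the entry and mark it dirty. */
void toy_guest_write(struct toy_htab *h, size_t i, int val)
{
    assert(i < N_HPTE);
    h->entry[i] = val;
    h->dirty[i] = true;
}

/*
 * First pass: copy up to 'budget' entries in order without ever
 * examining the dirty bits. Returns 1 once the cursor hits the end.
 */
int toy_first_pass(struct toy_htab *h, size_t budget)
{
    for (size_t n = 0; n < budget && h->next < N_HPTE; n++) {
        h->saved[h->next] = h->entry[h->next];
        h->dirty[h->next] = false;
        h->next++;
    }
    return h->next == N_HPTE;
}

/* Completion pass: resend every entry that is still marked dirty. */
void toy_later_pass(struct toy_htab *h)
{
    for (size_t i = 0; i < N_HPTE; i++) {
        if (h->dirty[i]) {
            h->saved[i] = h->entry[i];
            h->dirty[i] = false;
        }
    }
}

/* True if the saved copy matches what the guest currently sees. */
bool toy_in_sync(const struct toy_htab *h)
{
    for (size_t i = 0; i < N_HPTE; i++) {
        if (h->saved[i] != h->entry[i]) {
            return false;
        }
    }
    return true;
}
```

In this model, if the guest writes entry 1 between two calls of
toy_first_pass(&h, 4), the second call returns 1 while the saved copy
is stale; only the toy_later_pass() call brings it back in sync.
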
> > But.. then I looked further and got *really* confused.
> >
> > The comment above qemu_savevm_state_iterate() and the logic in
> > qemu_savevm_state() say that the iterate function returns >0 to
> > indicate that it is done and we can move onto the completion phase.
> >
> > But both ram_save_iterate() and block_save_iterate() seem to have that
> > inverted: they return >0 if they actually saved something.
>
> Yes, that confused me completely, too! Something is really fishy in the
> logic here. I hope that one of the migration experts can enlighten us
> how it is really meant to work...
>
> Thomas
>
>
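For what it's worth, the mismatch between the two readings of the
return value can be shown with a small toy model. Nothing below is
QEMU code: struct toy_ram, toy_drive() and the two toy_iterate_*()
handlers are hypothetical names, and the driver loop is only assumed
to behave like "keep iterating until the handler returns >0".

```c
#include <assert.h>

/* Toy RAM state: how many dirty pages are still waiting to be sent. */
struct toy_ram {
    int pending;
};

/* Reading A: a return value >0 means "nothing left, I am done". */
int toy_iterate_done_flag(struct toy_ram *r)
{
    if (r->pending > 0) {
        r->pending--;           /* send one page */
    }
    return r->pending == 0;
}

/* Reading B: a return value >0 means "I saved something this call". */
int toy_iterate_progress_flag(struct toy_ram *r)
{
    if (r->pending > 0) {
        r->pending--;           /* send one page */
        return 1;
    }
    return 0;
}

/*
 * Driver written against reading A: iterate until the handler
 * returns >0. The max_calls bound exists only so that the toy
 * cannot spin forever. Returns the number of calls made, or -1
 * if the bound was reached without the handler reporting >0.
 */
int toy_drive(int (*iterate)(struct toy_ram *), struct toy_ram *r,
              int max_calls)
{
    assert(max_calls > 0);
    for (int calls = 1; calls <= max_calls; calls++) {
        if (iterate(r) > 0) {
            return calls;
        }
    }
    return -1;
}
```

With three pages pending, the reading-A handler makes the driver stop
after three calls with nothing left to send. The reading-B handler
makes it stop after a single call with two pages still pending, and
with an idle guest (pending == 0) it returns 0 on every call, so the
driver never leaves the loop on its own.
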
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson