Re: [Qemu-devel] NVDIMM live migration broken?


From: Haozhong Zhang
Subject: Re: [Qemu-devel] NVDIMM live migration broken?
Date: Tue, 27 Jun 2017 22:30:01 +0800
User-agent: NeoMutt/20170428 (1.8.2)

On 06/26/17 13:56 +0100, Stefan Hajnoczi wrote:
> On Mon, Jun 26, 2017 at 10:05:01AM +0800, Haozhong Zhang wrote:
> > On 06/23/17 10:55 +0100, Stefan Hajnoczi wrote:
> > > On Fri, Jun 23, 2017 at 08:13:13AM +0800, address@hidden wrote:
> > > > On 06/22/17 15:08 +0100, Stefan Hajnoczi wrote:
> > > > > I tried live migrating a guest with NVDIMM on qemu.git/master
> > > > > (edf8bc984):
> > > > > 
> > > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > >          -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > >          -drive if=virtio,file=test.img,format=raw
> > > > > 
> > > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > >          -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > >          -drive if=virtio,file=test.img,format=raw \
> > > > >          -incoming tcp::1234
> > > > > 
> > > > >   (qemu) migrate tcp:127.0.0.1:1234
> > > > > 
> > > > > The guest kernel panics or hangs every time on the destination.  It
> > > > > happens as long as the nvdimm device is present - I didn't even
> > > > > mount it inside the guest.
> > > > > 
> > > > > Is migration expected to work?
> > > > 
> > > > Yes, I tested it on QEMU 2.8.0 several months ago and it worked. I'll
> > > > have a look at this issue.
> > > 
> > > Great, thanks!
> > > 
> > > David Gilbert suggested the following on IRC, it sounds like a good
> > > starting point for debugging:
> > > 
> > > Launch the destination QEMU with -S (vcpus will be paused) and after
> > > migration has completed, compare the NVDIMM contents on source and
> > > destination.
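> > > 
> > > For example (just a sketch, assuming the two QEMUs run in separate
> > > directories and both use the same share=on, file-backed memory
> > > backend as in the commands above, so the guest-visible NVDIMM
> > > contents land in the host files):
> > > 
> > >   $ md5sum src-dir/nvdimm.dat dst-dir/nvdimm.dat
> > > 
> > > Alternatively, dump the NVDIMM region from each QEMU monitor with
> > > pmemsave and compare the dumps (the guest physical address and size
> > > below are only illustrative; they depend on the guest memory layout):
> > > 
> > >   (qemu) pmemsave 0x100000000 0x40000000 nvdimm.dump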
> > > 
> > 
> > Which host and guest kernels are you testing? Is any workload running
> > in the guest during migration?
> > 
> > I just tested QEMU commit edf8bc984 with host/guest kernel 4.8.0, and
> > could not reproduce the issue.
> 
> I can still reproduce the problem on qemu.git edf8bc984.
> 
> My guest kernel is fairly close to yours.  The host kernel is newer.
> 
> Host kernel: 4.11.6-201.fc25.x86_64
> Guest kernel: 4.8.8-300.fc25.x86_64
> 
> Command-line:
> 
>   qemu-system-x86_64 \
>       -enable-kvm \
>       -cpu host \
>       -machine pc,nvdimm \
>       -m 1G,slots=4,maxmem=8G \
>       -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
>       -device nvdimm,id=nvdimm1,memdev=mem1 \
>       -drive if=virtio,file=test.img,format=raw \
>       -display none \
>       -serial stdio \
>       -monitor unix:/tmp/monitor.sock,server,nowait
> 
> Start migration at the guest login prompt.  You don't need to log in or
> do anything inside the guest.
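> 
> (With the monitor on a unix socket as in the command line above, one
> way to issue the migrate command, purely as an illustration, is:
> 
>   $ echo "migrate tcp:127.0.0.1:1234" | socat - UNIX-CONNECT:/tmp/monitor.sock
> 
> with the address matching the -incoming option of the destination
> QEMU.)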
> 
> There seems to be guest RAM corruption, because I get different
> backtraces inside the guest every time.
> 
> The problem goes away if I remove -device nvdimm.
> 

I managed to reproduce this bug. After bisecting between good v2.8.0
and bad edf8bc984, it looks like a regression introduced by commit
    6b6712efccd "ram: Split dirty bitmap by RAMBlock"
This commit may result in a guest crash after migration if any host
memory backend is used.
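
To illustrate (a simplified sketch of the idea behind that commit, not
QEMU's actual code): before 6b6712efccd the migration code kept one
global dirty bitmap indexed by absolute page number; after it, every
RAMBlock carries its own bitmap indexed by the page offset within that
block. Confusing the two index spaces for a block that does not start
at address 0 - such as a file-backed NVDIMM region - would corrupt
exactly that block's pages during migration:

    /* before: one bitmap covering all guest RAM,
     * indexed by the absolute page number */
    dirty = test_bit(addr >> TARGET_PAGE_BITS, dirty_bitmap);

    /* after: one bitmap per RAMBlock, indexed by the page
     * offset relative to the start of that block */
    dirty = test_bit((addr - block->offset) >> TARGET_PAGE_BITS,
                     block->bmap);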

Could you test whether the attached draft patch fixes this bug? If it
does, I will send a formal patch later.

Thanks,
Haozhong

Attachment: migration-fix.patch
Description: Text document

