Re: Live migration fails with "Mismatched RAM page size ram-node0 (local
Dr. David Alan Gilbert
Re: Live migration fails with "Mismatched RAM page size ram-node0 (local) 2097152 != 1526773257204281392"
Tue, 2 Feb 2021 16:51:23 +0000
* Damir Chanyshev (email@example.com) wrote:
> Qemu version 5.1 host os Debian 10.7
> Two exactly the same machines ( except ram size 380G and 1.5T )
> Live migration fails (from host with 380G ram to 1.5T) with errors like this:
> Feb 02 16:26:13 QEMU: kvm: load of migration failed: Invalid argument
> Feb 02 16:26:13 QEMU: kvm: error while loading state for
> instance 0x0 of device 'ram'
> Feb 02 16:26:13 QEMU: kvm: Mismatched RAM page size ram-node0
> (local) 2097152 != 1526773257204281392
> I think it's some overflow issue.
That's a fun error; I've not seen anyone manage to trigger that before.
Could you please post the qemu command line from both the source and the
destination?
My guess here is that the use of huge pages is different on the source
and destination; when the destination is using huge pages it will read
the page size of the block from the stream and compare it to the page
size it's using - they should match (if postcopy is enabled).
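As a rough sketch of that comparison (not QEMU's actual code): the helper below assumes the page size arrives as a big-endian 64-bit field, and the function name and `offset` parameter are made up for illustration. The message text just mirrors the log quoted above.

```python
import struct

def check_page_size(block_name, local_page_size, stream, offset=0):
    """Read an assumed big-endian 64-bit page size from `stream` at
    `offset` and compare it with the page size used locally."""
    (remote,) = struct.unpack_from(">Q", stream, offset)
    if remote != local_page_size:
        raise ValueError(
            f"Mismatched RAM page size {block_name} "
            f"(local) {local_page_size} != {remote}"
        )
    return remote

# Both sides using 2MB huge pages: the check passes.
ok = check_page_size("ram-node0", 2097152, struct.pack(">Q", 2097152))
```

If the 8 bytes at that position are not a page size at all, the same comparison fails with whatever number those bytes happen to encode.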
To me it looks like the destination is using 2MB huge pages
(probably explicitly from something like /dev/hugepages)
and maybe the source isn't; the source (because it's not using
hugepages) didn't bother sending the page size, so the destination
then reads some junk off the stream; that junk is probably the start of
the next RAMBlock's header, and it's probably a PCI device: that
huge number is hex 15303030303A3030, i.e. a length byte of 0x15 (a name
21 bytes long) followed by the ASCII text "0000:00", which looks like
the start of a PCI address; maybe for video RAM.
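The decoding can be checked with a few lines of Python. This is only a sketch: the block name `0000:00:02.0/vga.vram` is a hypothetical example of a 21-byte name starting with a PCI address, not something taken from the reporter's setup, and the "length byte followed by the name" layout is the assumption described above.

```python
import struct

junk = 1526773257204281392          # the "page size" from the error log

raw = struct.pack(">Q", junk)       # the 8 bytes as read off the stream
hex_form = raw.hex().upper()        # hex view of those bytes
name_len = raw[0]                   # first byte, read as a name length
prefix = raw[1:].decode("ascii")    # remaining 7 bytes, read as text

# Hypothetical next block: one length byte, then the name itself.
name = "0000:00:02.0/vga.vram"
header = bytes([len(name)]) + name.encode("ascii")
(misread,) = struct.unpack(">Q", header[:8])

print(hex_form, name_len, repr(prefix), misread == junk)
```

With that hypothetical name, the first 8 bytes of the header decode to exactly the number in the log, which fits the idea that the destination read a block name where it expected a page size.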
Or as a simple answer: if you've got postcopy enabled, and you're
using hugepages, make sure you use them consistently on source
and destination.
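One crude way to compare the two sides is to pull the memory backing path out of each QEMU command line. The sketch below only looks for `-mem-path` and `mem-path=` (as used with `memory-backend-file`); the helper names and the sample command lines are invented for illustration, and it does not detect other ways of getting huge pages, nor the actual page size behind a path.

```python
import re
import shlex

def mem_paths(cmdline):
    """Return the sorted mem-path values found in a QEMU command line."""
    paths = set()
    args = shlex.split(cmdline)
    for i, arg in enumerate(args):
        # Legacy form: -mem-path /dev/hugepages
        if arg == "-mem-path" and i + 1 < len(args):
            paths.add(args[i + 1])
        # Object form: -object memory-backend-file,...,mem-path=/dev/hugepages
        m = re.search(r"(?:^|,)mem-path=([^,]+)", arg)
        if m:
            paths.add(m.group(1))
    return sorted(paths)

def same_backing(src_cmdline, dst_cmdline):
    """True if both command lines name the same set of mem-path values."""
    return mem_paths(src_cmdline) == mem_paths(dst_cmdline)

# Invented example command lines, not the reporter's.
src = "qemu-system-x86_64 -m 4G -object memory-backend-ram,id=ram-node0,size=4G"
dst = ("qemu-system-x86_64 -m 4G -object "
       "memory-backend-file,id=ram-node0,size=4G,mem-path=/dev/hugepages")
print(same_backing(src, dst))
```

A `False` here is only a hint to look closer; the real answer still comes from comparing the full command lines from both hosts.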
> Damir Chanyshev
Dr. David Alan Gilbert / firstname.lastname@example.org / Manchester, UK