On Tue, Mar 13, 2018 at 03:57:34PM +0800, address@hidden wrote:
> From: Xiao Guangrong <address@hidden>
>
> Currently the page being compressed is allowed to be updated by
> the VM on the source QEMU, and correspondingly the destination QEMU
> just ignores the decompression error. However, we then completely miss
> the chance to catch real errors, and the VM is corrupted silently.
>
> To make the migration more robust, we copy the page to a buffer
> first to avoid it being written by the VM, then detect and handle
> both compression and decompression errors properly.
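The copy-then-compress idea in the commit message can be sketched roughly like this (illustrative Python standing in for QEMU's C code; `compress_page` and the zlib calls are my own stand-ins, not the actual implementation):

```python
import zlib

def compress_page(page: bytearray) -> bytes:
    """Sketch of the proposed fix: snapshot the guest page into a
    private buffer first, so concurrent writes by the VM cannot race
    with the compressor. zlib stands in for QEMU's deflate streams."""
    snapshot = bytes(page)          # stable copy, immune to later VM writes
    return zlib.compress(snapshot)  # any failure here is now a real error

# The VM may keep writing to `page` afterwards; the compressed data
# still decompresses to the snapshot taken at send time.
page = bytearray(b"\x5a" * 4096)
comp = compress_page(page)
page[0] = 0x00                      # "concurrent" guest write after the snapshot
assert zlib.decompress(comp) == b"\x5a" * 4096
```

With the snapshot in place, a compression or decompression error can no longer be caused by a racing guest write, so it can be treated as a genuine migration error.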
Not sure whether I missed anything important, but I'll just shoot my
thoughts as questions (again)...
Actually this raises a more general question: even without
compression, we can be sending a page that is being modified.
However, IMHO we don't need to worry about that, since if that page is
modified, we'll definitely send it again, so the new copy will
replace the old one. So on the destination side, even if decompress()
fails on a page, it'll be fine IMHO. Though right now we are copying the
corrupted buffer, and on that point I fully agree that we should not;
maybe we can just drop the page entirely?
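Dropping a page whose stream fails to decompress, and relying on the dirty-page resend to fix it up, could look like this on the destination side (a Python sketch with invented names such as `apply_compressed_page`, not QEMU's actual code):

```python
import zlib

PAGE_SIZE = 4096

def apply_compressed_page(ram: bytearray, addr: int, comp: bytes) -> bool:
    """Destination-side sketch: write the page into guest RAM only if the
    compressed stream is intact; otherwise drop it and rely on the source
    resending the page (it was dirtied, so it will be sent again)."""
    try:
        data = zlib.decompress(comp)
    except zlib.error:
        return False                 # drop the page instead of copying garbage
    if len(data) != PAGE_SIZE:
        return False                 # wrong size is also corruption
    ram[addr:addr + PAGE_SIZE] = data
    return True

ram = bytearray(2 * PAGE_SIZE)
good = zlib.compress(b"\x11" * PAGE_SIZE)
assert apply_compressed_page(ram, 0, good)          # intact stream: applied
assert not apply_compressed_page(ram, PAGE_SIZE, good[:-1])  # truncated: dropped
```

The second call leaves that page of `ram` untouched, which matches the "drop it entirely" idea: a later resend of the dirtied page overwrites it anyway.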
For non-compressed pages we can't detect that at all, so we'll copy the
page even if it is corrupted.
The special part for compression would be: would deflate() fail if
there is a concurrent update to the buffer being compressed? And would
that corrupt the whole compression stream, or would it only fail that
one deflate() call?
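For what it's worth, plain zlib-style deflate has no notion of what the buffer "should" contain: if the VM tears the buffer mid-read, deflate simply compresses the torn bytes and succeeds, and the output decompresses cleanly to the torn content. Whether QEMU's per-page stream setup can turn a concurrent update into a hard deflate() error is exactly the open question; this sketch only shows the baseline zlib behaviour (the torn-buffer scenario is simulated, not a real race):

```python
import zlib

PAGE = b"\xaa" * 4096                    # what the source meant to send
torn = b"\xaa" * 2048 + b"\xbb" * 2048   # buffer half-updated "mid-read"

comp = zlib.compress(torn)               # deflate does not fail on torn input
assert zlib.decompress(comp) == torn     # it faithfully encodes the torn bytes
assert zlib.decompress(comp) != PAGE     # silent corruption, no error anywhere

# An error only surfaces if the *compressed* stream itself is damaged:
detected = False
try:
    zlib.decompress(comp[:-1])           # truncated compressed stream
except zlib.error:
    detected = True                      # this is what the destination can catch
assert detected
```

So at least with stock zlib, a concurrent update would not fail deflate() or corrupt the stream framing; the damage stays confined to the page's contents, which is consistent with treating any actual decompression error as a real bug once the source compresses from a stable copy.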