* zhanghailiang (address@hidden) wrote:
Hi David,
When I migrated a VM in postcopy mode with the VM configured with the
'-realtime mlock=on' option,
it failed and reported "postcopy_ram_hosttest: remap_anon_pages not available: File
exists" on the destination.
Is it a bug in the userfaultfd API?
Thanks.
cc: Andrea
Steps to reproduce:
Source:
qemu-postcopy/qemu # x86_64-softmmu/qemu-system-x86_64 -msg timestamp=on \
-machine pc-i440fx-2.2,accel=kvm -m 1024 -realtime mlock=on -smp 4 \
-hda /mnt/sdb/pure_IMG/redhat/redhat-6.4-httpd.img -vnc :11 -monitor stdio
Destination:
qemu-postcopy/qemu # x86_64-softmmu/qemu-system-x86_64 -msg timestamp=on \
-machine pc-i440fx-2.2,accel=kvm -m 1024 -realtime mlock=on -smp 4 \
-hda /mnt/sdb/pure_IMG/redhat/redhat-6.4-httpd.img -vnc :12 -monitor stdio \
-incoming unix:/mnt/migrate.sock
(1) migrate_set_capability x-postcopy-ram on
(2) migrate -d unix:/mnt/migrate.sock
On the destination, it fails and reports:
address@hidden qemu_loadvm_state_main QEMU_VM_COMMAND ret: 0
address@hidden qemu_loadvm_state loop: section_type=6
address@hidden loadvm_postcopy_ram_handle_advise
postcopy_ram_hosttest: remap_anon_pages not available: File exists
address@hidden qemu_loadvm_state_main QEMU_VM_COMMAND ret: -1
Yes, I think I need to chat to Andrea about how that's supposed to work with
mlock.
I've added it to my list and we'll figure it out; I suspect on the destination
I need to avoid doing the mlockall until after postcopy completes.
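A minimal sketch of that deferral idea, assuming a hypothetical policy struct that records whether '-realtime mlock=on' was requested and whether an incoming postcopy is still running. None of these names come from QEMU or from the patch series; the only real call is the POSIX mlockall(), and this is an illustration of the suspicion above, not a tested fix:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical state, not QEMU's: was locking requested, and is an
 * incoming postcopy migration still in progress on this destination? */
struct mlock_policy {
    bool requested;
    bool postcopy_incoming_active;
};

/* Locking is only allowed once no incoming postcopy is active. */
static inline bool mlock_allowed_now(const struct mlock_policy *p)
{
    assert(p != NULL);
    return p->requested && !p->postcopy_incoming_active;
}

/* Returns 0 if memory was locked, or if locking is deferred or was not
 * requested; returns -1 if mlockall() itself fails. */
static inline int maybe_lock_memory(const struct mlock_policy *p)
{
    if (!mlock_allowed_now(p)) {
        return 0;   /* deferred: call again when postcopy completes */
    }
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }
    return 0;
}
```

With this shape the destination would call maybe_lock_memory() once at startup (a no-op while postcopy is pending) and again after postcopy has finished.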
And one more thing I want to know: ;)
Why must we start precopy before starting postcopy?
Can we do postcopy from the beginning of the migration?
You can send migrate_start_postcopy immediately after you send the migrate
command, which is very close to no-precopy; the original API had a timeout
and if you set it to 0 then it would do exactly no-precopy, but the current API
was preferred by reviewers, and is simpler.
In testing, the best performance comes from doing one full pass of precopy and
then starting postcopy; that way all of the kernel and other static stuff
has already moved to the destination, and there are far fewer page requests.
Thanks for the report,
Dave
Thanks,
zhanghailiang
On 2014/10/4 1:47, Dr. David Alan Gilbert (git) wrote:
From: "Dr. David Alan Gilbert" <address@hidden>
Hi,
This is the 4th cut of my version of postcopy; it is designed for use with
the Linux kernel additions just posted by Andrea Arcangeli here:
http://marc.info/?l=linux-kernel&m=141235633015100&w=2
(Note: This is a new version compared to my previous postcopy patchset; you'll
need to update the kernel to the new version.)
Other than the new kernel ABI (which is only a small change to the userspace
side), the major changes are:
a) Code for host page size != target page size
b) Support for migration over fd
From Cristian Klein; this is for libvirt support which Cristian recently
posted to the libvirt list.
c) It's now build bisectable and builds on 32bit
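To illustrate item (a): when the host page size (hps) is larger than the target page size (tps), one host page covers several target pages, so offsets have to be rounded to host-page boundaries before a page can be handled as a whole. A minimal sketch of that arithmetic follows; the helper names are hypothetical and are not taken from the patch series, and both sizes are assumed to be powers of two with hps >= tps:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers (not from the patches) showing the arithmetic
 * needed when host page size (hps) != target page size (tps). */

/* Number of target pages covered by one host page. */
static inline uint64_t target_pages_per_host_page(uint64_t hps, uint64_t tps)
{
    assert(tps != 0 && hps >= tps);
    assert((hps & (hps - 1)) == 0 && (tps & (tps - 1)) == 0);
    return hps / tps;
}

/* Round an offset down to the start of the host page containing it. */
static inline uint64_t host_page_start(uint64_t offset, uint64_t hps)
{
    assert(hps != 0 && (hps & (hps - 1)) == 0);
    return offset & ~(hps - 1);
}

/* Index of the target page at 'offset' within its host page. */
static inline uint64_t target_index_in_host_page(uint64_t offset,
                                                 uint64_t hps, uint64_t tps)
{
    assert(tps != 0);
    return (offset - host_page_start(offset, hps)) / tps;
}
```

For example, with a 64 KiB host page and 4 KiB target pages, each host page holds 16 target pages; with hps == tps every helper degenerates to the single-page case.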
Testing-wise, I've now done many thousands of postcopy migrations without
failure (both of idle and busy guests), so it seems pretty solid.
Must-TODO's:
1) A partially repeatable migration_cancel failure
2) virt_test's migrate.with_reboot test is failing
3) The ACPI fix in 2.1 that allowed migrating RAMBlocks to be larger than
the source feels like it needs looking at for postcopy.
4) Paolo's comments with respect to the wakeup_request/is_running code
in the migration thread
5) xbzrle needs disabling once in postcopy
Later-TODO's:
1) Control the rate of background page transfers during postcopy to
reduce their impact on the latency of postcopy requests.
2) Work with RDMA
3) Could the destination RP (return path) be made blocking? (As per
discussion with Paolo; I'm still worried that that changes too many
assumptions.)
V4:
Initial support for host page size != target page size
- tested heavily on hps==tps
- only partially tested on hps!=tps systems
- This involved quite a bit of rework around the discard code
Updated to new kernel userfault ABI
- It won't work with the previous version
Fix mis-optimisation of postcopy request for wrong RAMBlock
request for block A offset n
un-needed fault for block B/m (already received - no req sent)
request for block B/l - wrongly sent as request for A/l
Fix thinko in discard bitmap processing (missed last word of bitmap)
Symptom: remap failures near the top of RAM if postcopy started late
Fix bug that caused kernel page acknowledgments to be misaligned
May have meant the guest was paused for longer than required
Fix potential for crashing cleaning up failed RP
Fixes in docs (from Yang)
Handle migration by fd as sockets if they are sockets
Build tested on 32bit
Fully build bisectable (x86-64)
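The "missed last word of bitmap" fix in the list above is a classic off-by-one. One common way to hit it, assumed here purely for illustration, is to compute the word count with truncating division so that a final partial word is never visited. The sketch below uses hypothetical names and a plain array of unsigned long, not QEMU's bitmap helpers:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Words needed for nbits bits, rounded up so a partial last word counts.
 * Using nbits / BITS_PER_WORD here would silently drop that last word. */
static inline size_t bitmap_words(size_t nbits)
{
    return (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
}

/* Count set bits among the first nbits bits, including the last word. */
static inline size_t count_set_bits(const unsigned long *map, size_t nbits)
{
    size_t count = 0;

    assert(map != NULL || nbits == 0);
    for (size_t w = 0; w < bitmap_words(nbits); w++) {
        unsigned long word = map[w];
        size_t valid = nbits - w * BITS_PER_WORD;

        if (valid < BITS_PER_WORD) {
            word &= (1UL << valid) - 1;   /* ignore bits past nbits */
        }
        for (; word != 0; word &= word - 1) {
            count++;                      /* clear lowest set bit */
        }
    }
    return count;
}
```

A bit set just past a word boundary (for example bit BITS_PER_WORD in a bitmap of BITS_PER_WORD + 1 bits) is only seen when the word count is rounded up, which would match the reported symptom of failures near the top of RAM.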
Dave
Cristian Klein (1):
Handle bi-directional communication for fd migration
Dr. David Alan Gilbert (46):
QEMUSizedBuffer based QEMUFile
Tests: QEMUSizedBuffer/QEMUBuffer
Start documenting how postcopy works.
qemu_ram_foreach_block: pass up error value, and down the ramblock
name
improve DPRINTF macros, add to savevm
Add qemu_get_counted_string to read a string prefixed by a count byte
Create MigrationIncomingState
socket shutdown
Provide runtime Target page information
Return path: Open a return path on QEMUFile for sockets
Return path: socket_writev_buffer: Block even on non-blocking fd's
Migration commands
Return path: Control commands
Return path: Send responses from destination to source
Return path: Source handling of return path
qemu_loadvm errors and debug
ram_debug_dump_bitmap: Dump a migration bitmap as text
Rework loadvm path for subloops
Add migration-capability boolean for postcopy-ram.
Add wrappers and handlers for sending/receiving the postcopy-ram
migration messages.
QEMU_VM_CMD_PACKAGED: Send a packaged chunk of migration stream
migrate_init: Call from savevm
Allow savevm handlers to state whether they could go into postcopy
postcopy: OS support test
migrate_start_postcopy: Command to trigger transition to postcopy
MIG_STATE_POSTCOPY_ACTIVE: Add new migration state
qemu_savevm_state_complete: Postcopy changes
Postcopy page-map-incoming (PMI) structure
Postcopy: Maintain sentmap and calculate discard
postcopy: Incoming initialisation
postcopy: ram_enable_notify to switch on userfault
Postcopy: Postcopy startup in migration thread
Postcopy: Create a fault handler thread before marking the ram as
userfault
Page request: Add MIG_RPCOMM_REQPAGES reverse command
Page request: Process incoming page request
Page request: Consume pages off the post-copy queue
Add assertion to check migration_dirty_pages
postcopy_ram.c: place_page and helpers
Postcopy: Use helpers to map pages during migration
qemu_ram_block_from_host
Don't sync dirty bitmaps in postcopy
Host page!=target page: Cleanup bitmaps
Postcopy; Handle userfault requests
Start up a postcopy/listener thread ready for incoming page data
postcopy: Wire up loadvm_postcopy_ram_handle_{run,end} commands
End of migration for postcopy
Makefile.objs | 2 +-
arch_init.c | 739 +++++++++++++++++++++++++--
docs/migration.txt | 189 +++++++
exec.c | 76 ++-
hmp-commands.hx | 15 +
hmp.c | 7 +
hmp.h | 1 +
include/exec/cpu-common.h | 8 +-
include/migration/migration.h | 130 +++++
include/migration/postcopy-ram.h | 106 ++++
include/migration/qemu-file.h | 47 ++
include/migration/vmstate.h | 2 +-
include/qemu/sockets.h | 1 +
include/qemu/typedefs.h | 9 +-
include/sysemu/sysemu.h | 43 +-
migration-fd.c | 24 +-
migration-rdma.c | 4 +-
migration.c | 693 +++++++++++++++++++++++++-
postcopy-ram.c | 1016 ++++++++++++++++++++++++++++++++++++++
qapi-schema.json | 14 +-
qemu-file.c | 598 +++++++++++++++++++++-
qmp-commands.hx | 19 +
savevm.c | 881 +++++++++++++++++++++++++++++++--
tests/Makefile | 2 +-
tests/test-vmstate.c | 74 +--
util/qemu-sockets.c | 28 ++
26 files changed, 4550 insertions(+), 178 deletions(-)
create mode 100644 include/migration/postcopy-ram.h
create mode 100644 postcopy-ram.c
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK