Re: [Qemu-devel] [PATCH v4 00/47] Postcopy implementation
From: Cristian Klein
Subject: Re: [Qemu-devel] [PATCH v4 00/47] Postcopy implementation
Date: Wed, 8 Oct 2014 17:36:11 +0900
On 07 Oct 2014, at 17:12 , Dr. David Alan Gilbert <address@hidden> wrote:
> * Cristian Klein (address@hidden) wrote:
>> On 04 Oct 2014, at 4:21 , Dr. David Alan Gilbert <address@hidden> wrote:
>>
>>>
>>> I've updated our github at:
>>> https://github.com/orbitfp7/qemu/tree/wp3-postcopy
>>>
>>> to have this version.
>>>
>>> and it corresponds to the tag:
>>> https://github.com/orbitfp7/qemu/releases/tag/wp3-postcopy-v4
>>
>> Hi Dave,
>>
>> I just tested this version of post-copy using the libvirt patches I recently
>> posted and it works a lot better. The video streaming VM migrates with a
>> downtime of less than 1 second. Before post-copy finishes, the VM is a bit
>> slow but otherwise running well.
>>
>> I also tested the patches with a VM doing "ping" and the downtime was around
>> 0.6 seconds. I suspect that this delay could be caused by libvirt and not by
>> qemu. Note that libvirt is a bit special, in the sense that the VM is
>> migrated in suspended state and resumed only after the network was set up on
>> the destination. I will investigate and let you know.
>
> That's great news - although I'm not quite sure what caused the improvement;
> there were quite a few minor bug fixes and things, but nothing that I can
> think of that would directly contribute (except the patches I'd sent you,
> which you'd already tried).
Unfortunately, I made an error in my experiments (post-copy started too late).
I re-launched the experiments a few times. A ping VM observes a downtime of
about 2 seconds, whereas a video streaming VM observes about 4 seconds.
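
For anyone wanting to reproduce the ping numbers, here is a minimal sketch of
one way to estimate downtime from reply timestamps. It is not the script used
for the measurements above; the function name estimate_downtime, the sample
timestamps and the 0.2 s ping interval are all assumptions made for
illustration.

```python
def estimate_downtime(reply_times, interval=0.2):
    """Estimate downtime as the longest gap between consecutive ping
    replies, minus the nominal ping interval (both in seconds).

    reply_times: arrival times of successful replies, in increasing order.
    interval:    assumed time between ping requests (hypothetical default).
    """
    if len(reply_times) < 2:
        return 0.0
    gaps = [later - earlier
            for earlier, later in zip(reply_times, reply_times[1:])]
    # A gap equal to the ping interval means no reply was lost.
    return max(0.0, max(gaps) - interval)


if __name__ == "__main__":
    # Hypothetical trace: replies every 0.2 s, with a pause during migration.
    trace = [0.0, 0.2, 0.4, 2.6, 2.8, 3.0]
    print("estimated downtime: %.1f s" % estimate_downtime(trace))
```

The resolution of this estimate is bounded by the ping interval, so a shorter
interval gives a tighter figure at the cost of more traffic.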
Cristian
>>
>>> * Dr. David Alan Gilbert (git) (address@hidden) wrote:
>>>> From: "Dr. David Alan Gilbert" <address@hidden>
>>>>
>>>> Hi,
>>>> This is the 4th cut of my version of postcopy; it is designed for use with
>>>> the Linux kernel additions just posted by Andrea Arcangeli here:
>>>>
>>>> http://marc.info/?l=linux-kernel&m=141235633015100&w=2
>>>>
>>>> (Note: This is a new version compared to my previous postcopy patchset;
>>>> you'll need to update the kernel to the new version.)
>>>>
>>>> Other than the new kernel ABI (which is only a small change to the
>>>> userspace side), the major changes are:
>>>>
>>>> a) Code for host page size != target page size
>>>> b) Support for migration over fd
>>>> From Cristian Klein; this is for libvirt support which Cristian recently
>>>> posted to the libvirt list.
>>>> c) It's now build bisectable and builds on 32bit
>>>>
>>>> Testing wise; I've now done many thousands of postcopy migrations without
>>>> failure (both of idle and busy guests); so it seems pretty solid.
>>>>
>>>> Must-TODO's:
>>>> 1) A partially repeatable migration_cancel failure
>>>> 2) virt_test's migrate.with_reboot test is failing
>>>> 3) The ACPI fix in 2.1 that allowed migrating RAMBlocks to be larger than
>>>> the source feels like it needs looking at for postcopy.
>>>> 4) Paolo's comments with respect to the wakeup_request/is_running code
>>>> in the migration thread
>>>> 5) xbzrle needs disabling once in postcopy
>>>>
>>>> Later-TODO's:
>>>> 1) Control the rate of background page transfers during postcopy to
>>>> reduce their impact on the latency of postcopy requests.
>>>> 2) Work with RDMA
>>>> 3) Could destination RP be made blocking (as per discussion with Paolo;
>>>> I'm still worried that that changes too many assumptions)
>>>>
>>>>
>>>>
>>>> V4:
>>>> Initial support for host page size != target page size
>>>> - tested heavily on hps==tps
>>>> - only partially tested on hps!=tps systems
>>>> - This involved quite a bit of rework around the discard code
>>>> Updated to new kernel userfault ABI
>>>> - It won't work with the previous version
>>>> Fix mis-optimisation of postcopy request for wrong RAMBlock
>>>> request for block A offset n
>>>> un-needed fault for block B/m (already received - no req sent)
>>>> request for block B/l - wrongly sent as request for A/l
>>>> Fix thinko in discard bitmap processing (missed last word of bitmap)
>>>> Symptom: remap failures near the top of RAM if postcopy started late
>>>> Fix bug that caused kernel page acknowledgments to be misaligned
>>>> May have meant the guest was paused for longer than required
>>>> Fix potential for crashing cleaning up failed RP
>>>> Fixes in docs (from Yang)
>>>> Handle migration by fd as sockets if they are sockets
>>>> Build tested on 32bit
>>>> Fully build bisectable (x86-64)
>>>>
>>>>
>>>> Dave
>>>>
>>>> Cristian Klein (1):
>>>> Handle bi-directional communication for fd migration
>>>>
>>>> Dr. David Alan Gilbert (46):
>>>> QEMUSizedBuffer based QEMUFile
>>>> Tests: QEMUSizedBuffer/QEMUBuffer
>>>> Start documenting how postcopy works.
>>>> qemu_ram_foreach_block: pass up error value, and down the ramblock
>>>> name
>>>> improve DPRINTF macros, add to savevm
>>>> Add qemu_get_counted_string to read a string prefixed by a count byte
>>>> Create MigrationIncomingState
>>>> socket shutdown
>>>> Provide runtime Target page information
>>>> Return path: Open a return path on QEMUFile for sockets
>>>> Return path: socket_writev_buffer: Block even on non-blocking fd's
>>>> Migration commands
>>>> Return path: Control commands
>>>> Return path: Send responses from destination to source
>>>> Return path: Source handling of return path
>>>> qemu_loadvm errors and debug
>>>> ram_debug_dump_bitmap: Dump a migration bitmap as text
>>>> Rework loadvm path for subloops
>>>> Add migration-capability boolean for postcopy-ram.
>>>> Add wrappers and handlers for sending/receiving the postcopy-ram
>>>> migration messages.
>>>> QEMU_VM_CMD_PACKAGED: Send a packaged chunk of migration stream
>>>> migrate_init: Call from savevm
>>>> Allow savevm handlers to state whether they could go into postcopy
>>>> postcopy: OS support test
>>>> migrate_start_postcopy: Command to trigger transition to postcopy
>>>> MIG_STATE_POSTCOPY_ACTIVE: Add new migration state
>>>> qemu_savevm_state_complete: Postcopy changes
>>>> Postcopy page-map-incoming (PMI) structure
>>>> Postcopy: Maintain sentmap and calculate discard
>>>> postcopy: Incoming initialisation
>>>> postcopy: ram_enable_notify to switch on userfault
>>>> Postcopy: Postcopy startup in migration thread
>>>> Postcopy: Create a fault handler thread before marking the ram as
>>>> userfault
>>>> Page request: Add MIG_RPCOMM_REQPAGES reverse command
>>>> Page request: Process incoming page request
>>>> Page request: Consume pages off the post-copy queue
>>>> Add assertion to check migration_dirty_pages
>>>> postcopy_ram.c: place_page and helpers
>>>> Postcopy: Use helpers to map pages during migration
>>>> qemu_ram_block_from_host
>>>> Don't sync dirty bitmaps in postcopy
>>>> Host page!=target page: Cleanup bitmaps
>>>> Postcopy; Handle userfault requests
>>>> Start up a postcopy/listener thread ready for incoming page data
>>>> postcopy: Wire up loadvm_postcopy_ram_handle_{run,end} commands
>>>> End of migration for postcopy
>>>>
>>>> Makefile.objs | 2 +-
>>>> arch_init.c | 739 +++++++++++++++++++++++++--
>>>> docs/migration.txt | 189 +++++++
>>>> exec.c | 76 ++-
>>>> hmp-commands.hx | 15 +
>>>> hmp.c | 7 +
>>>> hmp.h | 1 +
>>>> include/exec/cpu-common.h | 8 +-
>>>> include/migration/migration.h | 130 +++++
>>>> include/migration/postcopy-ram.h | 106 ++++
>>>> include/migration/qemu-file.h | 47 ++
>>>> include/migration/vmstate.h | 2 +-
>>>> include/qemu/sockets.h | 1 +
>>>> include/qemu/typedefs.h | 9 +-
>>>> include/sysemu/sysemu.h | 43 +-
>>>> migration-fd.c | 24 +-
>>>> migration-rdma.c | 4 +-
>>>> migration.c | 693 +++++++++++++++++++++++++-
>>>> postcopy-ram.c | 1016 ++++++++++++++++++++++++++++++++++++++
>>>> qapi-schema.json | 14 +-
>>>> qemu-file.c | 598 +++++++++++++++++++++-
>>>> qmp-commands.hx | 19 +
>>>> savevm.c | 881 +++++++++++++++++++++++++++++++--
>>>> tests/Makefile | 2 +-
>>>> tests/test-vmstate.c | 74 +--
>>>> util/qemu-sockets.c | 28 ++
>>>> 26 files changed, 4550 insertions(+), 178 deletions(-)
>>>> create mode 100644 include/migration/postcopy-ram.h
>>>> create mode 100644 postcopy-ram.c
>>>>
>>>> --
>>>> 1.9.3
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
>>
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK