Re: [PATCH V9 00/46] Live Update
From: Steven Sistare
Subject: Re: [PATCH V9 00/46] Live Update
Date: Tue, 7 Feb 2023 14:01:36 -0500
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1

and of course I immediately screwed up and forgot to delete series numbers from
the email titles. I will resend.
- Steve
On 2/7/2023 1:44 PM, Steven Sistare wrote:
> To make forward progress on this series and reduce its size, I will be posting
> those of its patches that can be independently integrated and have some value
> on their own, to a reduced distribution of reviewers for each. This is what
> I plan to break out:
>
> migration: fix populate_vfio_info
>
> memory: RAM_NAMED_FILE flag
>
> memory: flat section iterator
>
> oslib: qemu_clear_cloexec
>
> migration: simplify blockers
>
> migration: simplify notifiers
>
> python/machine: QEMUMachine full_args
>
> python/machine: QEMUMachine reopen_qmp_connection
>
> qapi: strList_from_string
> qapi: QAPI_LIST_LENGTH
> qapi: strv_from_strList
> qapi: strList unit tests
>
> - Steve
>
> On 12/7/2022 10:48 AM, Steven Sistare wrote:
>> This series desperately needs review in its intersection with live migration.
>> The code in other areas has been reviewed and revised multiple times --
>> thank you!
>>
>> David, Juan, can you spare some time to review this? I have done my best to
>> order the patches logically (see the labelled groups in this email), and to
>> provide complete and clear cover letter and commit messages. Can I do anything
>> to facilitate, like doing a code walk through via zoom?
>>
>> And of course, I welcome anyone's feedback.
>>
>> Here is the original posting.
>>
>> https://lore.kernel.org/qemu-devel/1658851843-236870-1-git-send-email-steven.sistare@oracle.com/
>>
>> - Steve
>>
>> On 7/26/2022 12:09 PM, Steve Sistare wrote:
>>> This version of the live update patch series integrates live update into the
>>> live migration framework. The new interfaces are:
>>> * mode (migration parameter)
>>> * cpr-exec-args (migration parameter)
>>> * file (migration URI)
>>> * migrate-mode-enable (command-line argument)
>>> * only-cpr-capable (command-line argument)
>>>
>>> Provide the cpr-exec and cpr-reboot migration modes for live update. These
>>> save and restore VM state, with minimal guest pause time, so that qemu may be
>>> updated to a new version in between. The caller sets the mode parameter
>>> before invoking the migrate or migrate-incoming commands.
>>>
>>> In cpr-reboot mode, the migrate command saves state to a file, allowing
>>> one to quit qemu, reboot to an updated kernel, start an updated version of
>>> qemu, and resume via the migrate-incoming command. The caller must specify
>>> a migration URI that writes to and reads from a file. Unlike normal mode,
>>> the use of certain local storage options does not block the migration, but
>>> the caller must not modify guest block devices between the quit and restart.
>>> The guest RAM memory-backend must be shared, and the @x-ignore-shared
>>> migration capability must be set, to avoid saving it to the file. Guest RAM
>>> must be non-volatile across reboot, which can be achieved by backing it with
>>> a dax device, or /dev/shm PKRAM as proposed in
>>> https://lore.kernel.org/lkml/1617140178-8773-1-git-send-email-anthony.yznaga@oracle.com
>>> but this is not enforced. The restarted qemu arguments must match those used
>>> to initially start qemu, plus the -incoming option.
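
As a minimal sketch, the save side of the cpr-reboot flow described above could be expressed as QMP commands. The mode parameter and the file: URI are the interfaces proposed in this series. The function name cpr_reboot_save_commands is hypothetical, and mapping the HMP commands to migrate-set-capabilities, migrate-set-parameters, and migrate is an assumption about how a management client would drive the flow.

```python
import json


def cpr_reboot_save_commands(state_file):
    """Return the QMP commands that save VM state for a cpr-reboot update.

    Hypothetical helper: it only builds the command dictionaries and does
    not talk to a QEMU instance.
    """
    return [
        # Shared guest RAM stays in place, so keep it out of the state file.
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "x-ignore-shared", "state": True}]}},
        # "mode" is the migration parameter proposed by this series.
        {"execute": "migrate-set-parameters",
         "arguments": {"mode": "cpr-reboot"}},
        # The URI must write to a file that the restarted qemu can read.
        {"execute": "migrate",
         "arguments": {"uri": "file:" + state_file}},
    ]


if __name__ == "__main__":
    for cmd in cpr_reboot_save_commands("/tmp/qemu.sav"):
        print(json.dumps(cmd))
```

A real client would wait for the migration to complete before quitting qemu and rebooting. That step is omitted here.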
>>>
>>> The reboot mode supports vfio devices if the caller first suspends the guest,
>>> such as by issuing guest-suspend-ram to the qemu guest agent. The guest
>>> drivers' suspend methods flush outstanding requests and re-initialize the
>>> devices, and thus there is no device state to save and restore. After
>>> issuing migrate-incoming, the caller must issue a system_wakeup command to
>>> resume.
>>>
>>> In cpr-exec mode, the migrate command saves state to a file and directly
>>> exec's a new version of qemu on the same host, replacing the original process
>>> while retaining its PID. The caller must specify a migration URI that writes
>>> to and reads from a file, and resumes execution via the migrate-incoming
>>> command. Arguments for the new qemu process are taken from the cpr-exec-args
>>> migration parameter, and must include the -incoming option.
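
A small sketch of how a client might build the cpr-exec parameters and enforce the -incoming requirement before asking qemu to re-exec. The helper name cpr_exec_parameters is hypothetical, and passing cpr-exec-args over QMP as a list of strings is an assumption (the series adds strList helpers, but the wire format is not shown in this letter).

```python
import shlex


def cpr_exec_parameters(new_qemu_cmdline):
    """Build a migrate-set-parameters command for cpr-exec mode.

    Hypothetical helper. Assumes cpr-exec-args is a list of strings.
    """
    argv = shlex.split(new_qemu_cmdline)
    # The cover letter requires the new command line to include -incoming.
    if "-incoming" not in argv:
        raise ValueError("cpr-exec-args must include the -incoming option")
    return {
        "execute": "migrate-set-parameters",
        "arguments": {"mode": "cpr-exec", "cpr-exec-args": argv},
    }


if __name__ == "__main__":
    print(cpr_exec_parameters("qemu-system-x86_64 -incoming defer"))
```

Checking on the client side only catches the missing option early. It does not replace whatever validation qemu itself performs.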
>>>
>>> Guest RAM must be backed by a memory backend with share=on, but cannot be
>>> memory-backend-ram. The memory is re-mmap'd in the updated process, so guest
>>> ram is efficiently preserved in place, albeit with new virtual addresses.
>>> In addition, the '-migrate-mode-enable cpr-exec' option is required. This
>>> causes secondary guest ram blocks (those not specified on the command line)
>>> to be allocated by mmap'ing a memfd. The memfds are kept open across exec,
>>> their values are saved in special cpr state which is retrieved after exec,
>>> and they are re-mmap'd. Since guest RAM is not copied, and storage blocks
>>> are not migrated, the caller must disable all capabilities related to page
>>> and block copy. The implementation ignores all related parameters.
>>>
>>> The exec mode supports vfio devices by preserving the vfio container, group,
>>> device, and event descriptors across the qemu re-exec, and by updating DMA
>>> mapping virtual addresses using VFIO_DMA_UNMAP_FLAG_VADDR and
>>> VFIO_DMA_MAP_FLAG_VADDR as defined in
>>>
>>> https://lore.kernel.org/kvm/1611939252-7240-1-git-send-email-steven.sistare@oracle.com
>>> and integrated in Linux kernel 5.12.
>>>
>>> Here is an example of updating qemu from v7.0.50 to v7.0.51 using exec mode.
>>> The software update is performed while the guest is running to minimize
>>> downtime.
>>>
>>> window 1                                      | window 2
>>>                                               |
>>> # qemu-system-$arch ...                       |
>>>   -migrate-mode-enable cpr-exec               |
>>> QEMU 7.0.50 monitor - type 'help' ...         |
>>> (qemu) info status                            |
>>> VM status: running                            |
>>>                                               | # yum update qemu
>>> (qemu) migrate_set_parameter mode cpr-exec    |
>>> (qemu) migrate_set_parameter cpr-exec-args    |
>>>   qemu-system-$arch ... -incoming defer       |
>>> (qemu) migrate -d file:/tmp/qemu.sav          |
>>> QEMU 7.0.51 monitor - type 'help' ...         |
>>> (qemu) info status                            |
>>> VM status: paused (inmigrate)                 |
>>> (qemu) migrate_incoming file:/tmp/qemu.sav    |
>>> (qemu) info status                            |
>>> VM status: running                            |
>>>
>>>
>>> Here is an example of updating the host kernel using reboot mode.
>>>
>>> window 1                                      | window 2
>>>                                               |
>>> # qemu-system-$arch ... mem-path=/dev/dax0.0  |
>>>   -migrate-mode-enable cpr-reboot             |
>>> QEMU 7.0.50 monitor - type 'help' ...         |
>>> (qemu) info status                            |
>>> VM status: running                            |
>>>                                               | # yum update kernel-uek
>>> (qemu) migrate_set_parameter mode cpr-reboot  |
>>> (qemu) migrate -d file:/tmp/qemu.sav          |
>>> (qemu) quit                                   |
>>>                                               |
>>> # systemctl kexec                             |
>>> kexec_core: Starting new kernel               |
>>> ...                                           |
>>>                                               |
>>> # qemu-system-$arch mem-path=/dev/dax0.0 ...  |
>>>   -incoming defer                             |
>>> QEMU 7.0.51 monitor - type 'help' ...         |
>>> (qemu) info status                            |
>>> VM status: paused (inmigrate)                 |
>>> (qemu) migrate_incoming file:/tmp/qemu.sav    |
>>> (qemu) info status                            |
>>> VM status: running                            |
>>>
>>> Changes from V8 to V9:
>>> vfio:
>>> - free all cpr state during unwind in vfio_connect_container
>>> - change cpr_resave_fd to return void, and avoid new unwind cases
>>> - delete incorrect .unmigratable=1 in vmstate handlers
>>> - add route batching in vfio_claim_vectors
>>> - simplified vfio intx cpr code
>>> - fix commit message for 'recover from unmap-all-vaddr failure'
>>> - verify suspended runstate for cpr-reboot mode
>>> Other:
>>> - delete cpr-save, cpr-exec, cpr-load
>>> - delete ram block vmstate handlers that were added in V8
>>> - rename cpr-enable option to migrate-mode-enable
>>> - add file URI for migration
>>> - add mode and cpr-exec-args migration parameters
>>> - add per-mode migration blockers
>>> - add mode checks in migration notifiers
>>> - fix suspended runstate during migration
>>> - replace RAM_ANON flag with RAM_NAMED_FILE
>>> - support memory-backend-epc
>>>
>>> Steve Sistare (44):
>>>   migration: fix populate_vfio_info   --- reboot mode ---
>>>   memory: RAM_NAMED_FILE flag
>>>   migration: file URI
>>>   migration: mode parameter
>>>   migration: migrate-enable-mode option
>>>   migration: simplify blockers
>>>   migration: per-mode blockers
>>>   cpr: relax some blockers
>>>   cpr: reboot mode
>>>
>>>   qdev-properties: strList            --- exec mode ---
>>>   qapi: strList_from_string
>>>   qapi: QAPI_LIST_LENGTH
>>>   qapi: strv_from_strList
>>>   qapi: strList unit tests
>>>   migration: cpr-exec-args parameter
>>>   migration: simplify notifiers
>>>   migration: check mode in notifiers
>>>   memory: flat section iterator
>>>   oslib: qemu_clear_cloexec
>>>   vl: helper to request re-exec
>>>   cpr: preserve extra state
>>>   cpr: exec mode
>>>   cpr: add exec-mode blockers
>>>   cpr: ram block blockers
>>>   cpr: only-cpr-capable
>>>   cpr: Mismatched GPAs fix
>>>   hostmem-memfd: cpr support
>>>   hostmem-epc: cpr support
>>>
>>>   pci: export msix_is_pending         --- vfio for exec ---
>>>   vfio-pci: refactor for cpr
>>>   vfio-pci: cpr part 1 (fd and dma)
>>>   vfio-pci: cpr part 2 (msi)
>>>   vfio-pci: cpr part 3 (intx)
>>>   vfio-pci: recover from unmap-all-vaddr failure
>>>
>>>   chardev: cpr framework              --- misc for exec ---
>>>   chardev: cpr for simple devices
>>>   chardev: cpr for pty
>>>   python/machine: QEMUMachine full_args
>>>   python/machine: QEMUMachine reopen_qmp_connection
>>>   tests/avocado: add cpr regression test
>>>
>>>   vl: start on wakeup request         --- vfio for reboot ---
>>>   migration: fix suspended runstate
>>>   migration: notifier error reporting
>>>   vfio: allow cpr-reboot migration if suspended
>>>
>>> Mark Kanda, Steve Sistare (2):
>>>   vhost: reset vhost devices for cpr
>>>   chardev: cpr for sockets
>>>
>>> MAINTAINERS | 14 ++
>>> accel/xen/xen-all.c | 3 +
>>> backends/hostmem-epc.c | 18 +-
>>> backends/hostmem-file.c | 1 +
>>> backends/hostmem-memfd.c | 22 ++-
>>> backends/tpm/tpm_emulator.c | 11 +-
>>> block/parallels.c | 7 +-
>>> block/qcow.c | 7 +-
>>> block/vdi.c | 7 +-
>>> block/vhdx.c | 7 +-
>>> block/vmdk.c | 7 +-
>>> block/vpc.c | 7 +-
>>> block/vvfat.c | 7 +-
>>> chardev/char-mux.c | 1 +
>>> chardev/char-null.c | 1 +
>>> chardev/char-pty.c | 16 +-
>>> chardev/char-serial.c | 1 +
>>> chardev/char-socket.c | 48 +++++
>>> chardev/char-stdio.c | 31 +++
>>> chardev/char.c | 49 ++++-
>>> dump/dump.c | 4 +-
>>> gdbstub.c | 1 +
>>> hmp-commands.hx | 2 +-
>>> hw/9pfs/9p.c | 11 +-
>>> hw/core/qdev-properties-system.c | 12 ++
>>> hw/core/qdev-properties.c | 44 +++++
>>> hw/display/virtio-gpu-base.c | 8 +-
>>> hw/intc/arm_gic_kvm.c | 3 +-
>>> hw/intc/arm_gicv3_its_kvm.c | 3 +-
>>> hw/intc/arm_gicv3_kvm.c | 3 +-
>>> hw/misc/ivshmem.c | 8 +-
>>> hw/net/virtio-net.c | 10 +-
>>> hw/pci/msix.c | 2 +-
>>> hw/pci/pci.c | 12 ++
>>> hw/ppc/pef.c | 2 +-
>>> hw/ppc/spapr.c | 2 +-
>>> hw/ppc/spapr_events.c | 2 +-
>>> hw/ppc/spapr_rtas.c | 2 +-
>>> hw/remote/proxy.c | 7 +-
>>> hw/s390x/s390-virtio-ccw.c | 9 +-
>>> hw/scsi/vhost-scsi.c | 9 +-
>>> hw/vfio/common.c | 235 +++++++++++++++++++----
>>> hw/vfio/cpr.c | 177 ++++++++++++++++++
>>> hw/vfio/meson.build | 1 +
>>> hw/vfio/migration.c | 23 +--
>>> hw/vfio/pci.c | 336 ++++++++++++++++++++++++++++-----
>>> hw/vfio/trace-events | 1 +
>>> hw/virtio/vhost-vdpa.c | 6 +-
>>> hw/virtio/vhost.c | 32 +++-
>>> include/chardev/char-socket.h | 1 +
>>> include/chardev/char.h | 5 +
>>> include/exec/memory.h | 48 +++++
>>> include/exec/ram_addr.h | 1 +
>>> include/exec/ramblock.h | 1 +
>>> include/hw/pci/msix.h | 1 +
>>> include/hw/qdev-properties-system.h | 4 +
>>> include/hw/qdev-properties.h | 3 +
>>> include/hw/vfio/vfio-common.h | 12 ++
>>> include/hw/virtio/vhost.h | 1 +
>>> include/migration/blocker.h | 69 ++++++-
>>> include/migration/cpr-state.h | 30 +++
>>> include/migration/cpr.h | 20 ++
>>> include/migration/misc.h | 13 +-
>>> include/migration/vmstate.h | 2 +
>>> include/qapi/util.h | 28 +++
>>> include/qemu/osdep.h | 9 +
>>> include/sysemu/runstate.h | 2 +
>>> migration/cpr-state.c | 362 ++++++++++++++++++++++++++++++++++++
>>> migration/cpr.c | 85 +++++++++
>>> migration/file.c | 62 ++++++
>>> migration/file.h | 14 ++
>>> migration/meson.build | 3 +
>>> migration/migration.c | 268 +++++++++++++++++++++++---
>>> migration/ram.c | 24 ++-
>>> migration/target.c | 1 +
>>> migration/trace-events | 12 ++
>>> monitor/hmp-cmds.c | 59 +++---
>>> monitor/hmp.c | 3 +
>>> monitor/qmp.c | 4 +
>>> python/qemu/machine/machine.py | 14 ++
>>> qapi/char.json | 7 +-
>>> qapi/migration.json | 68 ++++++-
>>> qapi/qapi-util.c | 37 ++++
>>> qemu-options.hx | 50 ++++-
>>> replay/replay.c | 4 +
>>> softmmu/memory.c | 31 ++-
>>> softmmu/physmem.c | 100 +++++++++-
>>> softmmu/runstate.c | 42 ++++-
>>> softmmu/vl.c | 10 +
>>> stubs/cpr-state.c | 26 +++
>>> stubs/meson.build | 2 +
>>> stubs/migr-blocker.c | 9 +-
>>> stubs/migration.c | 33 ++++
>>> target/i386/kvm/kvm.c | 8 +-
>>> target/i386/nvmm/nvmm-all.c | 4 +-
>>> target/i386/sev.c | 2 +-
>>> target/i386/whpx/whpx-all.c | 3 +-
>>> tests/avocado/cpr.py | 176 ++++++++++++++++++
>>> tests/unit/meson.build | 1 +
>>> tests/unit/test-strlist.c | 81 ++++++++
>>> trace-events | 1 +
>>> ui/spice-core.c | 5 +-
>>> ui/vdagent.c | 5 +-
>>> util/oslib-posix.c | 9 +
>>> util/oslib-win32.c | 4 +
>>> 105 files changed, 2781 insertions(+), 330 deletions(-)
>>> create mode 100644 hw/vfio/cpr.c
>>> create mode 100644 include/migration/cpr-state.h
>>> create mode 100644 include/migration/cpr.h
>>> create mode 100644 migration/cpr-state.c
>>> create mode 100644 migration/cpr.c
>>> create mode 100644 migration/file.c
>>> create mode 100644 migration/file.h
>>> create mode 100644 stubs/cpr-state.c
>>> create mode 100644 stubs/migration.c
>>> create mode 100644 tests/avocado/cpr.py
>>> create mode 100644 tests/unit/test-strlist.c
>>>