
[Qemu-commits] [qemu/qemu] f54822: migration: introduce decompress-error-check


From: GitHub
Subject: [Qemu-commits] [qemu/qemu] f54822: migration: introduce decompress-error-check
Date: Mon, 04 Jun 2018 06:23:48 -0700

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: f548222c24342ca74689de7794f9006b43f86a54
      
https://github.com/qemu/qemu/commit/f548222c24342ca74689de7794f9006b43f86a54
  Author: Xiao Guangrong <address@hidden>
  Date:   2018-06-04 (Mon, 04 Jun 2018)

  Changed paths:
    M hw/arm/virt.c
    M hw/i386/pc_piix.c
    M hw/i386/pc_q35.c
    M include/hw/compat.h
    M migration/migration.c
    M migration/migration.h
    M migration/ram.c

  Log Message:
  -----------
  migration: introduce decompress-error-check

QEMU 3.0 enables a strict check for compression and decompression to
make migration more robust. The check depends on the source having
fixed the internal design that triggers the unexpected error conditions.

To make migration from an older QEMU version to QEMU 2.13 work, we
introduce this parameter to disable the error check on the
destination. Disabling the check is the default behavior for machine
types older than 2.13; alternatively, the strict check can be
enabled explicitly as follows:
      -M pc-q35-2.11 -global migration.decompress-error-check=true
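
The effect of the property on the destination can be sketched roughly as follows. This is a simplified standalone model, not the code from migration/ram.c: `struct incoming_state`, its fields, and `handle_decompress_result()` are hypothetical names chosen for illustration, and a negative `ret` stands in for a failed decompression of one page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the destination's migration state. */
struct incoming_state {
    bool decompress_error_check; /* models migration.decompress-error-check */
    int  file_error;             /* 0 while the stream is considered healthy */
};

/*
 * Handle the result of decompressing one page on the destination.
 * ret < 0 models a decompression failure. Returns true if the result
 * is accepted, false if the migration stream was marked as failed.
 */
static bool handle_decompress_result(struct incoming_state *s, int ret)
{
    assert(s != NULL);
    if (ret < 0 && s->decompress_error_check) {
        fprintf(stderr, "decompress data failed\n");
        s->file_error = ret;     /* strict mode: fail the migration */
        return false;
    }
    return true;                 /* success, or error ignored in compat mode */
}
```

With the check disabled (the default for the older machine types, per the message above), a failed decompression is tolerated; with it enabled, the same failure marks the stream as broken.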

Signed-off-by: Xiao Guangrong <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>


  Commit: b895de502717b83b4e5f089df617cb23530c4d2d
      
https://github.com/qemu/qemu/commit/b895de502717b83b4e5f089df617cb23530c4d2d
  Author: Cédric Le Goater <address@hidden>
  Date:   2018-06-04 (Mon, 04 Jun 2018)

  Changed paths:
    M exec.c
    M include/exec/cpu-common.h
    M migration/postcopy-ram.c
    M migration/ram.c
    M migration/savevm.c

  Log Message:
  -----------
  migration: discard non-migratable RAMBlocks

On the POWER9 processor, the XIVE interrupt controller can control
interrupt sources using MMIO to trigger events, to EOI or to turn off
the sources. Priority management and interrupt acknowledgment are also
controlled by MMIO in the presenter sub-engine.

These MMIO regions are exposed to guests in QEMU with a set of 'ram
device' memory mappings, similarly to VFIO, and the VMAs are populated
dynamically with the appropriate pages using a fault handler.

But, these regions are an issue for migration. We need to discard the
associated RAMBlocks from the RAM state on the source VM and let the
destination VM rebuild the memory mappings on the new host in the
post_load() operation just before resuming the system.

To achieve this goal, the following introduces a new RAMBlock flag
RAM_MIGRATABLE which is updated in the vmstate_register_ram() and
vmstate_unregister_ram() routines. This flag is then used by the
migration to identify RAMBlocks to discard on the source. Some checks
are also performed on the destination to make sure nothing invalid was
sent.
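
As a rough illustration of the flag-based filtering described above, here is a minimal standalone sketch. `struct ram_block`, the helper names, the block names and the bit value given to `RAM_MIGRATABLE` are assumptions made for this example; the real RAMBlock structure and flag definition live in the QEMU sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit value only; see the QEMU headers for the real one. */
#define RAM_MIGRATABLE (1u << 4)

/* Hypothetical, heavily simplified stand-in for QEMU's RAMBlock. */
struct ram_block {
    const char *idstr;
    uint32_t    flags;
    size_t      used_length;
};

/* Models what registering a block for migration does to the flag. */
static void ram_set_migratable(struct ram_block *rb)
{
    rb->flags |= RAM_MIGRATABLE;
}

/* Models what unregistering a block does to the flag. */
static void ram_unset_migratable(struct ram_block *rb)
{
    rb->flags &= ~RAM_MIGRATABLE;
}

static bool ram_is_migratable(const struct ram_block *rb)
{
    return (rb->flags & RAM_MIGRATABLE) != 0;
}

/* Source side: count only the bytes of blocks that will be sent. */
static size_t ram_bytes_to_migrate(const struct ram_block *blocks, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (!ram_is_migratable(&blocks[i])) {
            continue;            /* discarded from the RAM state */
        }
        total += blocks[i].used_length;
    }
    return total;
}
```

Blocks without the flag are simply skipped when walking the block list, which is how the 'ram device' mappings stay out of the migration stream in this model.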

This change impacts the boston, malta and jazz mips boards for which
migration compatibility is broken.

Signed-off-by: Cédric Le Goater <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>


  Commit: 0f073f44df109ea0910d67caede70dec95956ff6
      
https://github.com/qemu/qemu/commit/0f073f44df109ea0910d67caede70dec95956ff6
  Author: Dr. David Alan Gilbert <address@hidden>
  Date:   2018-06-04 (Mon, 04 Jun 2018)

  Changed paths:
    M migration/migration.c
    M qapi/migration.json

  Log Message:
  -----------
  migration: Don't activate block devices if using -S

Activating the block devices causes the locks to be taken on
the backing file.  If we're running with -S and the destination libvirt
hasn't yet started the destination with 'cont', it expects the locks to
still be untaken.

Don't activate the block devices if we're not going to autostart the VM;
'cont' will do that anyway.  This change is tied to the new
migration capability 'late-block-activate', which defaults to off, keeping
the old behaviour by default.
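
The decision described above can be modelled with a small predicate. This is a simplified sketch under stated assumptions, not the code in migration/migration.c: `should_activate_block_devices()` is a hypothetical helper, and `autostart` is taken to be false when the destination was started with -S.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether the destination should activate block devices (and
 * therefore take the image locks) when incoming migration completes.
 */
static bool should_activate_block_devices(bool late_block_activate,
                                          bool autostart)
{
    if (!late_block_activate) {
        return true;       /* capability off: old behaviour, always activate */
    }
    return autostart;      /* capability on: with -S, leave it to 'cont' */
}
```

Only the combination of the capability being on and the VM not autostarting defers activation, so existing setups keep their behaviour unless they opt in.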

bz: https://bugzilla.redhat.com/show_bug.cgi?id=1560854
Signed-off-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>


  Commit: f38f6d4155c0c5a3e96d81183362a40e2cc09b4c
      
https://github.com/qemu/qemu/commit/f38f6d4155c0c5a3e96d81183362a40e2cc09b4c
  Author: Lidong Chen <address@hidden>
  Date:   2018-06-04 (Mon, 04 Jun 2018)

  Changed paths:
    M migration/rdma.c

  Log Message:
  -----------
  migration: remove unnecessary variables len in QIOChannelRDMA

Because qio_channel_rdma_writev and qio_channel_rdma_readv may be invoked
by different threads concurrently, this patch removes the unnecessary variable
len from QIOChannelRDMA and uses a local variable instead.

Signed-off-by: Lidong Chen <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Daniel P. Berrangé <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>

Signed-off-by: Lidong Chen <address@hidden>


  Commit: c5e76115ccb4979cec795a8ae38becd07c2fde9f
      
https://github.com/qemu/qemu/commit/c5e76115ccb4979cec795a8ae38becd07c2fde9f
  Author: Lidong Chen <address@hidden>
  Date:   2018-06-04 (Mon, 04 Jun 2018)

  Changed paths:
    M migration/rdma.c
    M migration/trace-events

  Log Message:
  -----------
  migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect

When migration is cancelled during RDMA precopy, the source QEMU main thread
sometimes hangs.

The backtrace is:
    (gdb) bt
    #0  0x00007f249eabd43d in write () from /lib64/libpthread.so.0
    #1  0x00007f24a1ce98e4 in rdma_get_cm_event (channel=0x4675d10, event=0x7ffe2f643dd0) at src/cma.c:2189
    #2  0x00000000007b6166 in qemu_rdma_cleanup (rdma=0x6784000) at migration/rdma.c:2296
    #3  0x00000000007b7cae in qio_channel_rdma_close (ioc=0x3bfcc30, errp=0x0) at migration/rdma.c:2999
    #4  0x00000000008db60e in qio_channel_close (ioc=0x3bfcc30, errp=0x0) at io/channel.c:273
    #5  0x00000000007a8765 in channel_close (opaque=0x3bfcc30) at migration/qemu-file-channel.c:98
    #6  0x00000000007a71f9 in qemu_fclose (f=0x527c000) at migration/qemu-file.c:334
    #7  0x0000000000795b96 in migrate_fd_cleanup (opaque=0x3b46280) at migration/migration.c:1162
    #8  0x000000000093a71b in aio_bh_call (bh=0x3db7a20) at util/async.c:90
    #9  0x000000000093a7b2 in aio_bh_poll (ctx=0x3b121c0) at util/async.c:118
    #10 0x000000000093f2ad in aio_dispatch (ctx=0x3b121c0) at util/aio-posix.c:436
    #11 0x000000000093ab41 in aio_ctx_dispatch (source=0x3b121c0, callback=0x0, user_data=0x0) at util/async.c:261
    #12 0x00007f249f73c7aa in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
    #13 0x000000000093dc5e in glib_pollfds_poll () at util/main-loop.c:215
    #14 0x000000000093dd4e in os_host_main_loop_wait (timeout=28000000) at util/main-loop.c:263
    #15 0x000000000093de05 in main_loop_wait (nonblocking=0) at util/main-loop.c:522
    #16 0x00000000005bc6a5 in main_loop () at vl.c:1944
    #17 0x00000000005c39b5 in main (argc=56, argv=0x7ffe2f6443f8, envp=0x3ad0030) at vl.c:4752

Sometimes the RDMA_CM_EVENT_DISCONNECTED event never arrives after
rdma_disconnect.

According to the IB spec, once the active side sends a DREQ message, it should
wait for the DREP message, and only once that has arrived should it trigger a
DISCONNECT event. The DREP message can be dropped due to network issues.
For that case the spec defines a DREP_timeout state in the CM state machine: if
the DREP is dropped we should get a timeout, and a TIMEWAIT_EXIT event will be
triggered. Unfortunately the current kernel CM implementation doesn't include
the DREP_timeout state, and in the above scenario we will get neither DISCONNECT
nor TIMEWAIT_EXIT events.

So QEMU should not invoke rdma_get_cm_event, which may hang forever; the event
channel is also destroyed in qemu_rdma_cleanup.
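
The shape of the resulting cleanup path can be sketched with a small fake connection manager. Everything here is hypothetical (`struct fake_cm`, `fake_disconnect()`, `fake_destroy_channel()`, `conn_cleanup()`); the fake functions merely stand in for the disconnect request and the event-channel teardown and are not librdmacm calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fake connection manager that records what cleanup did. */
struct fake_cm {
    int  disconnect_calls;   /* how many disconnect requests were issued */
    int  event_waits;        /* how many times we blocked for a CM event */
    bool channel_destroyed;  /* whether the event channel was torn down */
};

/* Stands in for issuing the disconnect request (DREQ). */
static int fake_disconnect(struct fake_cm *cm)
{
    cm->disconnect_calls++;
    return 0;
}

/* Stands in for destroying the event channel. */
static void fake_destroy_channel(struct fake_cm *cm)
{
    cm->channel_destroyed = true;
}

/*
 * Cleanup that requests a disconnect but never blocks waiting for the
 * DISCONNECTED event: if the DREP is lost, that event may never come.
 */
static void conn_cleanup(struct fake_cm *cm, bool *connected)
{
    if (*connected) {
        (void)fake_disconnect(cm);   /* best effort; result not awaited */
        *connected = false;
    }
    fake_destroy_channel(cm);        /* pending events go away with it */
}
```

Because the channel is torn down unconditionally, there is nothing left to drain, and cleanup completes even when the peer's reply is lost.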

Signed-off-by: Lidong Chen <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>


  Commit: b74588a493c12c8d389f08004318c7d01ebfda70
      
https://github.com/qemu/qemu/commit/b74588a493c12c8d389f08004318c7d01ebfda70
  Author: Peter Maydell <address@hidden>
  Date:   2018-06-04 (Mon, 04 Jun 2018)

  Changed paths:
    M exec.c
    M hw/arm/virt.c
    M hw/i386/pc_piix.c
    M hw/i386/pc_q35.c
    M include/exec/cpu-common.h
    M include/hw/compat.h
    M migration/migration.c
    M migration/migration.h
    M migration/postcopy-ram.c
    M migration/ram.c
    M migration/rdma.c
    M migration/savevm.c
    M migration/trace-events
    M qapi/migration.json

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20180604' into staging

migration/next for 20180604

# gpg: Signature made Mon 04 Jun 2018 05:14:24 BST
# gpg:                using RSA key F487EF185872D723
# gpg: Good signature from "Juan Quintela <address@hidden>"
# gpg:                 aka "Juan Quintela <address@hidden>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20180604:
  migration: not wait RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect
  migration: remove unnecessary variables len in QIOChannelRDMA
  migration: Don't activate block devices if using -S
  migration: discard non-migratable RAMBlocks
  migration: introduce decompress-error-check

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/163670542fa3...b74588a493c1
      **NOTE:** This service has been marked for deprecation: https://developer.github.com/changes/2018-04-25-github-services-deprecation/

      Functionality will be removed from GitHub.com on January 31st, 2019.
