


From: Wen Congyang
Subject: Re: [Qemu-devel] [RFC PATCH v2 00/23] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service
Date: Wed, 29 Oct 2014 14:53:41 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

On 09/23/2014 05:23 PM, Yang Hongyang wrote:
> Virtual machine (VM) replication is a well known technique for
> providing application-agnostic software-implemented hardware fault
> tolerance "non-stop service". COLO is a high availability solution.
> Both primary VM (PVM) and secondary VM (SVM) run in parallel. They
> receive the same request from client, and generate response in parallel
> too. If the response packets from PVM and SVM are identical, they are
> released immediately. Otherwise, a VM checkpoint (on demand) is
> conducted. The idea was presented at Xen Summit 2012 and 2013,
> and in an academic paper at SoCC 2013. It was also presented at KVM Forum
> 2013:
> http://www.linux-kvm.org/wiki/images/1/1d/Kvm-forum-2013-COLO.pdf
> Please refer to the above document for detailed information.
> Please also refer to the previously posted RFC proposal:
> http://lists.nongnu.org/archive/html/qemu-devel/2014-06/msg05567.html
> 
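The compare-and-release rule described above can be sketched in C as follows. This is only a minimal illustration of the idea, not code from the patchset: the names colo_packet, colo_action, colo_packets_identical() and colo_compare() are hypothetical, and it assumes a plain byte-for-byte comparison of one response packet from each VM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types and names for illustration; not taken from the patchset. */
typedef struct {
    const unsigned char *data;  /* raw bytes of one response packet */
    size_t len;                 /* number of bytes in data */
} colo_packet;

typedef enum {
    COLO_RELEASE,     /* outputs match: release the response to the client */
    COLO_CHECKPOINT   /* outputs differ: conduct an on-demand VM checkpoint */
} colo_action;

/* Assumption: "identical" means same length and same bytes. */
static bool colo_packets_identical(const colo_packet *pvm,
                                   const colo_packet *svm)
{
    assert(pvm != NULL && svm != NULL);
    if (pvm->len != svm->len) {
        return false;
    }
    /* Skip memcmp for empty packets so data may be NULL in that case. */
    return pvm->len == 0 || memcmp(pvm->data, svm->data, pvm->len) == 0;
}

/* Decide what to do with one pair of PVM/SVM response packets. */
static colo_action colo_compare(const colo_packet *pvm,
                                const colo_packet *svm)
{
    return colo_packets_identical(pvm, svm) ? COLO_RELEASE : COLO_CHECKPOINT;
}
```

A caller would pass the PVM's and SVM's response to the same client request and act on the returned value: COLO_RELEASE means the packet can go out immediately, COLO_CHECKPOINT means the SVM has to be synced to the PVM first.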
> The patchset is also hosted on github:
> https://github.com/macrosheep/qemu/tree/colo_v0.5
> 
> v2:
> use QEMUSizedBuffer/QEMUFile as COLO buffer
> colo support is enabled by default
> add nic replication support
> addressed comments from Eric Blake and Dr. David Alan Gilbert
> 
> v1:
> implement the frame of colo
> 
> This patchset is an RFC, but it is ready to demo the COLO idea
> with QEMU-KVM.
> Steps to get an overview of COLO using this patchset:
> 1. configure
> 2. compile
> 3. just like QEMU's normal migration, run 2 QEMU VM:
>    - Primary VM 
>    - Secondary VM with -incoming tcp:[IP]:[PORT] option
> 4. on Primary VM's QEMU monitor, run following command:
>    migrate_set_capability colo on
>    migrate tcp:[IP]:[PORT]
> 5. done
> You will see two running VMs. Whenever you make changes to the PVM, the SVM
> will be synced to the PVM's state.
> 
> TODO list:
> 1. failover (will require heartbeat module: 
> http://www.linux-ha.org/wiki/Downloads)
> 2. disk replication[COLO Disk manager]

Hi all:

I am about to start implementing disk replication. Before I do, I think we
should decide how to implement it.

I have two ideas:
1. Implement it in qemu
   Advantage: very easy, and it will not take much time.
   Disadvantage: a virtio disk with vhost is not supported, because its disk
       I/O operations are not handled in qemu.

2. Update drbd to support COLO
   Advantage: we can use it for both KVM and Xen.
   Disadvantage: the implementation may be complex and may take a long time.
       (I have not read the drbd code, so I can't estimate the cost.)

I think we can start with option 1.
If you have any other ideas, please let me know.

Thanks
Wen Congyang

> 
> Any comments and feedback are warmly welcome.
> 
> Thanks,
> Yang
> 
> 
> Dr. David Alan Gilbert (1):
>   QEMUSizedBuffer/QEMUFile
> 
> Yang Hongyang (22):
>   configure: add CONFIG_COLO to switch COLO support
>   COLO: introduce an api colo_supported() to indicate COLO support
>   COLO migration: add a migration capability 'colo'
>   COLO info: use colo info to tell migration target colo is enabled
>   COLO save: integrate COLO checkpointed save into qemu migration
>   COLO restore: integrate COLO checkpointed restore into qemu restore
>   COLO: disable qdev hotplug
>   COLO ctl: implement API's that communicate with colo agent
>   COLO ctl: introduce is_slave() and is_master()
>   COLO ctl: implement colo checkpoint protocol
>   COLO ctl: add a RunState RUN_STATE_COLO
>   COLO ctl: implement colo save
>   COLO ctl: implement colo restore
>   COLO save: reuse migration bitmap under colo checkpoint
>   COLO ram cache: implement colo ram cache on slave
>   HACK: trigger checkpoint every 500ms
>   COLO nic: add command line switch
>   COLO nic: init/remove colo nic devices when add/cleanup tap devices
>   COLO nic: implement colo nic device interface support_colo()
>   COLO nic: implement colo nic device interface configure()
>   COLO nic: export colo nic APIs
>   COLO nic: setup/teardown colo nic devices
> 
>  Makefile.objs                      |   2 +
>  arch_init.c                        | 174 +++++++++++-
>  configure                          |  14 +
>  include/exec/cpu-all.h             |   1 +
>  include/migration/migration-colo.h |  36 +++
>  include/migration/migration.h      |  13 +
>  include/migration/qemu-file.h      |  28 ++
>  include/net/colo-nic.h             |  20 ++
>  include/net/net.h                  |   4 +
>  include/qemu/typedefs.h            |   1 +
>  migration-colo-comm.c              |  78 ++++++
>  migration-colo.c                   | 540 +++++++++++++++++++++++++++++++++++++
>  migration.c                        |  47 ++--
>  net/Makefile.objs                  |   1 +
>  net/colo-nic.c                     | 227 ++++++++++++++++
>  net/tap.c                          |  45 +++-
>  network-colo                       | 194 +++++++++++++
>  qapi-schema.json                   |  18 +-
>  qemu-file.c                        | 410 ++++++++++++++++++++++++++++
>  qemu-options.hx                    |  10 +-
>  stubs/Makefile.objs                |   1 +
>  stubs/migration-colo.c             |  34 +++
>  vl.c                               |  12 +
>  23 files changed, 1879 insertions(+), 31 deletions(-)
>  create mode 100644 include/migration/migration-colo.h
>  create mode 100644 include/net/colo-nic.h
>  create mode 100644 migration-colo-comm.c
>  create mode 100644 migration-colo.c
>  create mode 100644 net/colo-nic.c
>  create mode 100755 network-colo
>  create mode 100644 stubs/migration-colo.c
> 



