qemu-devel

Re: [Qemu-devel] COLO: how to flip a secondary to a primary?


From: Wen Congyang
Subject: Re: [Qemu-devel] COLO: how to flip a secondary to a primary?
Date: Mon, 25 Jan 2016 09:32:51 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0

On 01/23/2016 03:35 AM, Dr. David Alan Gilbert wrote:
> Hi,
>   I've been looking at what's needed to add a new secondary after
> a primary failed; from the block side it doesn't look as hard
> as I'd expected, perhaps you can tell me if I'm missing something!
> 
> The normal primary setup is:
> 
>    quorum
>       Real disk
>       nbd client

quorum
   real disk
   replication
      nbd client

> 
> The normal secondary setup is:
>    replication
>       active-disk
>       hidden-disk
>       Real-disk

IIRC, we can do it like this:
quorum
   replication
      active-disk
      hidden-disk
      real-disk

> 
> With a couple of minor code hacks; I changed the secondary to be:
> 
>    quorum
>       replication
>         active-disk
>         hidden-disk
>         Real-disk
>       dummy-disk

After failover:
quorum
   replication(old, mode is secondary)
     active-disk
     hidden-disk*
     real-disk*
   replication(new, mode is primary)
     nbd-client

In the newest version, we do an active commit of active-disk into real-disk,
so it will be:
quorum
   replication(old, mode is secondary)
     active-disk(it is real disk now)
   replication(new, mode is primary)
     nbd-client
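
To make the before/after layouts above easier to compare, here is a minimal Python sketch that models them as plain labelled trees. It is only a toy: the `Node` class and the `secondary_before_failover` and `fail_over` helpers are hypothetical names invented for this illustration, not QEMU code, and the sketch assumes the flat drawing used in the trees above (the disks listed directly under the replication node).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a toy block graph: a label plus child nodes, no real I/O."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def render(self, depth: int = 0) -> str:
        """Return the tree as indented text, three spaces per level."""
        lines = ["   " * depth + self.label]
        for child in self.children:
            lines.append(child.render(depth + 1))
        return "\n".join(lines)


def secondary_before_failover() -> Node:
    """Secondary layout with the replication node placed under a quorum."""
    return Node("quorum", [
        Node("replication (mode=secondary)", [
            Node("active-disk"),
            Node("hidden-disk"),
            Node("real-disk"),
        ]),
    ])


def fail_over(quorum: Node) -> Node:
    """Return the post-failover layout described for the newest version.

    The old replication child is kept, but its disks collapse into one node
    (active-disk committed into real-disk), and a new primary-mode
    replication child with an nbd client is appended.
    """
    old = quorum.children[0]
    committed = Node(old.label, [Node("active-disk (now the real disk)")])
    new_primary = Node("replication (mode=primary)", [Node("nbd-client")])
    return Node(quorum.label, [committed, new_primary])


if __name__ == "__main__":
    before = secondary_before_failover()
    print(before.render())
    print()
    print(fail_over(before).render())
```

The point of the sketch is only that the quorum node stays in place across failover, while its set of children changes underneath it.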

> 
> and then after the primary fails, I start a new secondary
> on another host and then on the old secondary do:
> 
>   nbd_server_stop
>   stop
>   x_block_change top-quorum -d children.0         # deletes use of real disk, leaves dummy
>   drive_del active-disk0
>   x_block_change top-quorum -a node-real-disk
>   x_block_change top-quorum -d children.1         # Seems to have deleted the dummy?!, the disk is now child 0
>   drive_add buddy driver=replication,mode=primary,file.driver=nbd,file.host=ibpair,file.port=8889,file.export=colo-disk0,node-name=nbd-client,if=none,cache=none
>   x_block_change top-quorum -a nbd-client
>   c
>   migrate_set_capability x-colo on
>   migrate -d -b tcp:ibpair:8888
> 
> and I think that means what was the secondary, has the same disk
> structure as a normal primary.
> That's not quite happy yet, and I've not figured out why - but the
> order/structure of the block devices looks right?
> 
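
The drive_add line in the quoted command list is long enough that a typo is easy to make. As a hedged sketch, the option string could be assembled from key/value pairs in Python before being pasted into the monitor. The helper names `hmp_options` and `drive_add_command` are hypothetical; the option names and values are copied from the command as quoted, and the comma doubling follows the usual QEMU option-string convention for values that contain a literal comma.

```python
def hmp_options(pairs):
    """Join (key, value) pairs into a comma-separated key=value string."""
    parts = []
    for key, value in pairs:
        text = str(value)
        # A literal comma inside a value is escaped by doubling it.
        text = text.replace(",", ",,")
        parts.append(f"{key}={text}")
    return ",".join(parts)


def drive_add_command(bus, pairs):
    """Build the full 'drive_add' monitor line for the given option pairs."""
    return f"drive_add {bus} {hmp_options(pairs)}"


# Options taken from the drive_add command quoted in this thread.
REPLICATION_PRIMARY_OPTS = [
    ("driver", "replication"),
    ("mode", "primary"),
    ("file.driver", "nbd"),
    ("file.host", "ibpair"),
    ("file.port", 8889),
    ("file.export", "colo-disk0"),
    ("node-name", "nbd-client"),
    ("if", "none"),
    ("cache", "none"),
]

if __name__ == "__main__":
    print(drive_add_command("buddy", REPLICATION_PRIMARY_OPTS))
```

Keeping the pairs in an ordered list rather than a dict literal makes the option order explicit and matches the order in the quoted command.
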
> Notes:
>    a) The dummy serves two purposes, 1) it works around the segfault
>       I reported in the other mail, 2) when I delete the real disk in the
>       first x_block_change it means the quorum still has 1 disk so doesn't
>       get upset.

I don't understand purpose 2.

>    b) I had to remove the restriction in quorum_start_replication
>       on which mode it would run in. 

IIRC, this check will be removed.

>    c) I'm not really sure everything knows it's in secondary mode yet, and
>       I'm not convinced whether the replication is doing the right thing.
>    d) The migrate -d -b eventually fails on the destination, not worked out
>       why yet.

Can you give me the error message?

>    e) Adding/deleting children on quorum is hard having to use the
>       children.0/1 notation when you've added children using node names -
>       it's worrying which number is which; is there a way to give them a name?

No. I think we can improve 'info block' output.

>    f) I've not thought about the colo-proxy that much yet - I guess that
>       existing connections need to keep their sequence number offset but
>       new connections made by what is now the primary don't need to do anything
>       special.

Hailiang or Zhijian can answer this question.

Thanks
Wen Congyang

> 
> Dave
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 
> 
> .
> 





