From: Wen Congyang
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH COLO-Block v6 07/16] Add new block driver interface to connect/disconnect the remote target
Date: Mon, 29 Jun 2015 09:05:41 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 06/27/2015 03:03 AM, Dr. David Alan Gilbert wrote:
> * Wen Congyang (address@hidden) wrote:
>> On 06/24/2015 10:07 PM, Dr. David Alan Gilbert wrote:
>>> * Wen Congyang (address@hidden) wrote:
>>>> At 2015/6/19 18:49, Stefan Hajnoczi Wrote:
>>>>> On Fri, Jun 19, 2015 at 08:54:56AM +0800, Wen Congyang wrote:
>>>>>> On 06/19/2015 12:06 AM, Stefan Hajnoczi wrote:
>>>>>>> On Thu, Jun 18, 2015 at 10:36:39PM +0800, Wen Congyang wrote:
>>>>>>>> At 2015/6/18 20:55, Stefan Hajnoczi Wrote:
>>>>>>>>> On Thu, Jun 18, 2015 at 04:49:12PM +0800, Wen Congyang wrote:
>>>>>>>>>> +void bdrv_connect(BlockDriverState *bs, Error **errp)
>>>>>>>>>> +{
>>>>>>>>>> +    BlockDriver *drv = bs->drv;
>>>>>>>>>> +
>>>>>>>>>> +    if (drv && drv->bdrv_connect) {
>>>>>>>>>> +        drv->bdrv_connect(bs, errp);
>>>>>>>>>> +    } else if (bs->file) {
>>>>>>>>>> +        bdrv_connect(bs->file, errp);
>>>>>>>>>> +    } else {
>>>>>>>>>> +        error_setg(errp, "this feature or command is not currently supported");
>>>>>>>>>> +    }
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +void bdrv_disconnect(BlockDriverState *bs)
>>>>>>>>>> +{
>>>>>>>>>> +    BlockDriver *drv = bs->drv;
>>>>>>>>>> +
>>>>>>>>>> +    if (drv && drv->bdrv_disconnect) {
>>>>>>>>>> +        drv->bdrv_disconnect(bs);
>>>>>>>>>> +    } else if (bs->file) {
>>>>>>>>>> +        bdrv_disconnect(bs->file);
>>>>>>>>>> +    }
>>>>>>>>>> +}
>>>>>>>>>
>>>>>>>>> Please add doc comments describing the semantics of these commands.
>>>>>>>>
>>>>>>>> Where should it be documented? In the header file?
>>>>>>>
>>>>>>> block.h doesn't document prototypes in the header file, please document
>>>>>>> the function definition in block.c.  (QEMU is not consistent here, some
>>>>>>> places do it the other way around.)
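>>>>>>>
>>>>>>> As an illustration (the wording below is mine, not from the patch),
>>>>>>> the doc comment in block.c might read something like:
>>>>>>>
>>>>>>>     /*
>>>>>>>      * bdrv_connect:
>>>>>>>      * @bs: the block driver state
>>>>>>>      * @errp: set on failure
>>>>>>>      *
>>>>>>>      * Connect to the remote target backing @bs (for example, an
>>>>>>>      * NBD server).  Drivers that do not implement the callback
>>>>>>>      * forward the request to bs->file; if there is no such child
>>>>>>>      * either, an error is set.
>>>>>>>      */
>>>>>>>     void bdrv_connect(BlockDriverState *bs, Error **errp)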
>>>>>>>
>>>>>>>>> Why are these operations needed when there is already a bs->drv == NULL
>>>>>>>>> case which means the BDS is not ready for read/write?
>>>>>>>>>
>>>>>>>>
>>>>>>>> The purpose is: don't connect to the NBD server when opening the NBD
>>>>>>>> client; connect/disconnect to the NBD server when we need to do it.
>>>>>>>>
>>>>>>>> IIUC, if bs->drv is NULL, it means that the driver has been ejected?
>>>>>>>> Here, connect/disconnect means connecting to/disconnecting from the
>>>>>>>> remote target (the target may be on another host).
>>>>>>>
>>>>>>> Connect/disconnect puts something on the QEMU command-line that isn't
>>>>>>> ready at startup time.
>>>>>>>
>>>>>>> How about using monitor commands to add objects when needed instead?
>>>>>>>
>>>>>>> That is cleaner because it doesn't introduce a new state (which is only
>>>>>>> implemented for nbd).
>>>>>>>
>>>>>>
>>>>>> The problem is that the NBD client is one child of the quorum, and a
>>>>>> quorum must have more than one child. The NBD server is not ready until
>>>>>> COLO is running.
>>>>>
>>>>> A monitor command to hot add/remove quorum children solves this problem
>>>>> and could also be used in other scenarios (e.g. user decides to take a
>>>>> quorum child offline).
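>>>>>
>>>>> No such monitor command existed at the time of this thread; QEMU later
>>>>> added the experimental x-blockdev-change QMP command, which does roughly
>>>>> this. Hot-adding a quorum child looks something like the following (the
>>>>> node names are made up):
>>>>>
>>>>>     -> { "execute": "x-blockdev-change",
>>>>>          "arguments": { "parent": "colo-disk0",
>>>>>                         "node": "nbd-client0" } }
>>>>>     <- { "return": {} }
>>>>>
>>>>> Removing a child uses the "child" argument instead of "node".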
>>>>>
>>>>
>>>> For the replication case, we always do checkpoints again and again after
>>>> migration. If the disk is not synced before migration, we will use disk
>>>> migration or a mirror job to sync it.
>>>
>>> Can you document the way that you use disk migration or mirror, together
>>> with your COLO setup?  I think it would make it easier to understand this
>>> restriction.
>>> At the moment I don't understand how you can switch from doing a disk
>>> migration into COLO mode - there seems to be a gap between the end of disk
>>> migration and the start of COLO.
>>
>> In my test, I use disk migration.
>>
>> After migration and before starting COLO, we will disable disk migration:
>> +    /* Disable block migration */
>> +    s->params.blk = 0;
>> +    s->params.shared = 0;
>> +    qemu_savevm_state_begin(trans, &s->params);
> 
> Ah, I hadn't realised you could do that; so do you just do:
> 
> migrate_set_parameter colo on
> migrate -d -b tcp:otherhost:port
> 
> How does the secondary know to feed that data straight into the disk without
> recording all the old data into the hidden-disk?

The hidden disk and the active disk will be made empty when block replication starts.

> 
>> If the user uses a mirror job, we don't cancel it now.
> 
> It would be good to get it to work with mirror; that seems preferred these
> days over the old block migration.

In normal migration, is the mirror job created and cancelled by libvirt?

> 
> With block migration or mirror, that pretty much allows you to replace a
> failed secondary without restarting, doesn't it?  The only thing stopping
> you is that the NBD client needs to point to the address of the new
> secondary.  Stefan's suggestion of being able to modify the quorum on the
> fly would seem to fix that problem.

Yes, I am implementing adding/removing a quorum child now.

> (The other thing I can see is that the secondary wouldn't have the conntrack
> data for open connections; but that's a separate problem).
> 
> A related topic was that a colleague was asking about starting QEMU up and
> only then setting the NBD target (he's trying to wire it into OpenStack);
> I suggested that instead of specifying the drive on the command line, it
> might work to use a drive_add to add the whole quorum/drive stack before
> starting the guest; however, again, something to modify the quorum would be
> easier.
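>
> For illustration, the kind of drive_add described above might look like
> this (the IDs, filenames and options are made up and untested):
>
>     (qemu) drive_add 0 if=none,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=primary.img,children.0.driver=raw
>
> The NBD-client child would then be attached later through whatever
> mechanism ends up existing for modifying the quorum.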
> 
> Finally, the last good reason I can see for being able to modify the quorum
> on the fly is that you'll need it for continuous FT, to be able to transmute
> a secondary into a new primary.

Yes, I think adding a filter on top is also needed.

Thanks
Wen Congyang

> 
> Dave
> 
>>
>>>
>>>> So we cannot start block replication while migration is running. We need
>>>> the NBD client to be not ready while migration is running, and to become
>>>> ready between the end of migration and the start of the checkpoint.
>>>> Using a monitor command to add the NBD client would cause larger
>>>> downtime. So if the NBD client has already been added (just not yet
>>>> connected to the NBD server), we can connect to the NBD server
>>>> automatically.
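>>>>
>>>> For reference, a minimal sketch of how a block driver would opt in to
>>>> this interface; apart from the bdrv_connect/bdrv_disconnect fields, the
>>>> names here are hypothetical and the bodies abridged:
>>>>
>>>>     /* e.g. in block/nbd.c */
>>>>     static void nbd_connect_server(BlockDriverState *bs, Error **errp)
>>>>     {
>>>>         /* establish the connection to the NBD server here */
>>>>     }
>>>>
>>>>     static void nbd_disconnect_server(BlockDriverState *bs)
>>>>     {
>>>>         /* tear down the connection; must not block on a dead network */
>>>>     }
>>>>
>>>>     static BlockDriver bdrv_nbd = {
>>>>         .format_name      = "nbd",
>>>>         /* ... other callbacks ... */
>>>>         .bdrv_connect     = nbd_connect_server,
>>>>         .bdrv_disconnect  = nbd_disconnect_server,
>>>>     };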
>>>
>>> Without the disk migration or mirror, I can't see the need for the delayed
>>> bdrv_connect.
>>> I can see that you want to do a disconnect at failover; however, you need
>>> to take care because if the network is broken at the point of failover you
>>> need to make sure nothing blocks.
>>
>> Disk migration is needed if the disk is not synced before migration.
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Dave
>>>
>>>>
>>>> Thanks
>>>> Wen Congyang
>>> --
>>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
>>>
>>
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 
