qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] migration: avoid starting a new migration task


From: Zhanghaoyu (A)
Subject: Re: [Qemu-devel] [PATCH] migration: avoid starting a new migration task while the previous one still exist
Date: Wed, 6 Nov 2013 01:50:45 +0000

>>>> Avoid starting a new migration task while the previous one still
>>> exist.
>>>
>>> Can you explain how to reproduce the problem?
>>>
>> When network disconnection between source and destination happened, 
>> the migration thread stuck at below stack,
>> Then I cancel the migration task, the migration state in qemu will be set to 
>> MIG_STATE_CANCELLED, so the migration job in libvirt quits.
>> Then I perform migration again, at this time, the network reconnected 
>> successfully, since the TCP timeout retransmission, above stack will not 
>> return immediately, so two migration tasks exist at the same time.
>> And still worse, source qemu will crash, because of accessing the NULL 
>> pointer in qemu_bh_schedule(s->cleanup_bh); statement in latter migration 
>> task, since the "s->cleanup_bh" had been deleted by previous migration task.
>
>Thanks for explaining.  CANCELLING looks like a useful addition.
>
>Why do you need both CANCELLING and COMPLETING?  The COMPLETED state should be 
>set only after all I/O is done.
>
There is a period of time from the timing of setting COMPLETED state to that of 
migration task exits,
so it's problematic in COMPLETED->CANCELLED transition, but if applying your 
below proposal, the problem gone.
do {
    old_state = s->state;
    if (old_state != MIG_STATE_SETUP && old_state != MIG_STATE_ACTIVE) {
        break;
    }
    migrate_set_state(s, old_state, MIG_STATE_CANCELLED);
} while (s->state != MIG_STATE_CANCELLED);

>I agree with Eric that the CANCELLING state should not be exposed via QMP.
>"info migrate" and "query-migrate" can keep showing "active" for maximum 
>backwards compatibility.
>
>More comments below.
>
>
>> -    if (s->state != MIG_STATE_COMPLETED) {
>> +    if (s->state != MIG_STATE_COMPLETING) {
>>          qemu_savevm_state_cancel();
>> +        if (s->state == MIG_STATE_CANCELLING) {
>> +            migrate_set_state(s, MIG_STATE_CANCELLING, 
>> MIG_STATE_CANCELLED); 
>> +        }
>
>I think you can remove the "if" and unconditionally call migrate_set_state.

Do you mean to remove the "if (s->state == MIG_STATE_CANCELLING)" ?
The s->state probably is MIG_STATE_ERROR here, is it okay to unconditionally 
call migrate_set_state?

Thanks,
Zhang Haoyu

>
>> +    }else {
>> +        migrate_set_state(s, MIG_STATE_COMPLETING, 
>> + MIG_STATE_COMPLETED);
>>      }
>>  
>>      notifier_list_notify(&migration_state_notifiers, s);  }
>>  
>> -static void migrate_set_state(MigrationState *s, int old_state, int 
>> new_state) -{
>> -    if (atomic_cmpxchg(&s->state, old_state, new_state) == new_state) {
>> -        trace_migrate_set_state(new_state);
>> -    }
>> -}
>> -
>>  void migrate_fd_error(MigrationState *s)  {
>>      DPRINTF("setting error state\n"); @@ -328,7 +337,7 @@ static void 
>> migrate_fd_cancel(MigrationState *s)  {
>>      DPRINTF("cancelling migration\n");
>>  
>> -    migrate_set_state(s, s->state, MIG_STATE_CANCELLED);
>> +    migrate_set_state(s, s->state, MIG_STATE_CANCELLING);
>
>Here probably we want something like
>
>    do {
>        old_state = s->state;
>        if (old_state != MIG_STATE_SETUP && old_state != MIG_STATE_ACTIVE) {
>            break;
>        }
>        migrate_set_state(s, old_state, MIG_STATE_CANCELLING);
>    } while (s->state != MIG_STATE_CANCELLING);
>
>to avoid a bogus COMPLETED->CANCELLED transition.  Please separate the patch 
>in two parts:
>
>(1) the first uses the above code, with CANCELLED instead of CANCELLING
>
>(2) the second, similar to the one you have posted, introduces the new 
>CANCELLING state
>
>Thanks,
>
>Paolo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]