qemu-devel
Re: [PATCH 2/2] migration: failover: continue to wait card unplug on err


From: Laurent Vivier
Subject: Re: [PATCH 2/2] migration: failover: continue to wait card unplug on error
Date: Wed, 30 Jun 2021 11:04:38 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 29/06/2021 19:50, Juan Quintela wrote:
> Laurent Vivier <lvivier@redhat.com> wrote:
>> If the user cancels the migration in the unplug-wait state,
>> QEMU will try to plug back the card and this fails because the card
>> is partially unplugged.
>> To avoid the problem, continue to wait for the card to unplug, but use
>> a timeout so that the migration can still be canceled if the card never
>> finishes unplugging.
>>
>> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1976852
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>>  migration/migration.c | 11 +++++++++++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 3e92c405a2b6..3b06d43a7f42 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -3679,6 +3679,17 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
>>                 qemu_savevm_state_guest_unplug_pending()) {
>>              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>>          }
>> +        if (s->state != MIGRATION_STATUS_WAIT_UNPLUG) {
>> +            int timeout = 120; /* 30 seconds */
>> +            /*
>> +             * migration has been canceled
>> +             * but as we have started an unplug we must wait the end
>> +             * to be able to plug back the card
>> +             */
>> +            while (timeout-- && qemu_savevm_state_guest_unplug_pending()) {
>> +                qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>> +            }
>> +        }
>>  
>>          migrate_set_state(&s->state, MIGRATION_STATUS_WAIT_UNPLUG, new_state);
>>      } else {
> I agree with the idea.  But if we are getting out due to timeout == 0,
> shouldn't we return some error, warning, whatever?

In that case, we keep the current behaviour: the guest kernel will report an
error on the attempt to plug back a card that has not been unplugged. This is
a corner case: if it happens, something is really wrong with the machine.
We could remove the timeout, but I don't like blocking the user; alternatively,
we could increase it to be safe.
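
For illustration only, here is a rough standalone sketch of how the bounded
wait could tell "unplug finished" apart from "timed out" and emit a warning in
the latter case. It is not QEMU code: wait_unplug_bounded(), the callback
types and the fake_card helpers are hypothetical names, with the callbacks
standing in for qemu_savevm_state_guest_unplug_pending() and
qemu_sem_timedwait().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hooks: in QEMU these roles are played by
 * qemu_savevm_state_guest_unplug_pending() and qemu_sem_timedwait(). */
typedef bool (*unplug_pending_fn)(void *opaque);
typedef void (*wait_step_fn)(void *opaque);

/*
 * Poll until the unplug is no longer pending, waiting at most max_polls
 * times (120 polls of 250 ms each would give the 30 seconds of the patch).
 * Returns true if the unplug completed, false if the wait timed out.
 */
static bool wait_unplug_bounded(unplug_pending_fn pending,
                                wait_step_fn wait_step,
                                void *opaque, int max_polls)
{
    while (pending(opaque)) {
        if (max_polls-- <= 0) {
            /* Timed out: report it instead of returning silently. */
            fprintf(stderr, "warning: card unplug still pending, giving up\n");
            return false;
        }
        if (wait_step != NULL) {
            wait_step(opaque);
        }
    }
    return true;
}

/* Fake device for demonstration: reports "pending" a fixed number of times. */
struct fake_card {
    int polls_until_done;
};

static bool fake_pending(void *opaque)
{
    struct fake_card *card = opaque;

    if (card->polls_until_done > 0) {
        card->polls_until_done--;
        return true;
    }
    return false;
}

/* Fake wait step: returns immediately instead of sleeping 250 ms. */
static void fake_wait(void *opaque)
{
    (void)opaque;
}
```

With a fake card that finishes after 3 polls the function returns true; with
one that needs 1000 polls and max_polls set to 120 it prints the warning and
returns false, so the caller can decide what to do before plugging the card
back.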

Thanks,

Laurent
