qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work


From: Juan Quintela
Subject: Re: [Qemu-devel] [PATCH v5 09/17] migration: Start of multiple fd work
Date: Tue, 08 Aug 2017 11:35:44 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

"Dr. David Alan Gilbert" <address@hidden> wrote:
> * Juan Quintela (address@hidden) wrote:
>> We create new channels for each new thread created. We only send through
>> them a character to be sure that we are creating the channels in the
>> right order.
>
> That text is out of date isn't it?

oops, fixed.


>> +gboolean multifd_new_channel(QIOChannel *ioc)
>> +{
>> +    int thread_count = migrate_multifd_threads();
>> +    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
>> +    MigrationState *s = migrate_get_current();
>> +    char string[MULTIFD_UUID_MSG];
>> +    char string_uuid[UUID_FMT_LEN];
>> +    char *uuid;
>> +    int id;
>> +
>> +    qio_channel_read(ioc, string, sizeof(string), &error_abort);
>> +    sscanf(string, "%s multifd %03d", string_uuid, &id);
>> +
>> +    if (qemu_uuid_set) {
>> +        uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
>> +    } else {
>> +        uuid = g_strdup(multifd_uuid);
>> +    }
>> +    if (strcmp(string_uuid, uuid)) {
>> +        error_report("multifd: received uuid '%s' and expected uuid '%s'",
>> +                     string_uuid, uuid);
>
> probably worth adding the channel id as well so we can see
> when it fails.

Done.

>> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>> +                          MIGRATION_STATUS_FAILED);
>> +        terminate_multifd_recv_threads();
>> +        return FALSE;
>> +    }
>> +    g_free(uuid);
>> +
>> +    if (multifd_recv_state->params[id] != NULL) {
>> +        error_report("multifd: received id '%d' is already setup'", id);
>> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>> +                          MIGRATION_STATUS_FAILED);
>> +        terminate_multifd_recv_threads();
>> +        return FALSE;
>> +    }
>> +    qemu_mutex_init(&p->mutex);
>> +    qemu_sem_init(&p->sem, 0);
>> +    p->quit = false;
>> +    p->id = id;
>> +    p->c = ioc;
>> +    atomic_set(&multifd_recv_state->params[id], p);
>
> Can you explain why this is quite so careful about ordering ? Is there
> something that could look at params or try and take the mutex before
> the count is incremented?

what happened to me in the middle stages of the patches (yes, doing
asynchronously was painful) was that:

I created the threads (at the beggining I did the
multifd_recv_state->params[id] == p inside the thread, that makes things
really, really racy.  I *think* that now we could probably do this
as you state.



> I think it's safe to do:
>  p->quit = false;
>  p->id = id;
>  p->c = ioc;
>  &multifd_recv_state->params[id] = p;
>  qemu_sem_init(&p->sem, 0);
>  qemu_mutex_init(&p->mutex);
>  qemu_thread_create(...)
>  atomic_inc(&multifd_recv_state->count);    <-- I'm not sure if this
>  needs to be atomic

We only change it on the main thread, so it should be enough.  The split
that I want to do is:

we do the listen asynchronously
when something arrives, we just read it (main thread)
we then read <uuid> <string> <arguments>
and then after checking that uuid is right, we call whatever function we
have for "string", in our case "multifd", with <arguments> as one string
parameters.

This should make it easier to create new "channels" for other purposes.
So far so good.

But then it appears what are the responsabilities, At the beggining, I
read the string on the reception thread for that channel, that created a
race because I received the 1st message for that channel before the
channel was fully created (yes, it only happened sometimes, easy to
understand after debugging).  This is the main reason that I changed to
an array of pointers to structs instead of one array of structs.

Then, I had to ve very careful to know when I had created all the
channels threads, because otherwise I ended having races left and right.

I will try to test the ordering that you suggested.

>> +    qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
>> +                       QEMU_THREAD_JOINABLE);
>
> You've lost the nice numbered thread names you had created in the
> previous version of this that you're removing.

I could get them back, but they really were not showing at gdb, where do
they show? ps?

>> +    multifd_recv_state->count++;
>> +
>> +    /* We need to return FALSE for the last channel */
>> +    if (multifd_recv_state->count == thread_count) {
>> +        return FALSE;
>> +    } else {
>> +        return TRUE;
>> +    }
>
> return multifd_recv_state->count != thread_count;   ?

For other reasons I change this functions and now they use a different
way of setting/checking if we have finished.  Look at the new series.

I didn't do as you said because I feel it weird that we return a bool
when we expert a gboolean, but .....

Thanks, Juan.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]