
Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue


From: Juan Quintela
Subject: Re: [Qemu-devel] [PATCH v5 17/17] migration: Flush receive queue
Date: Tue, 08 Aug 2017 13:40:58 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

Peter Xu <address@hidden> wrote:
> On Mon, Jul 17, 2017 at 03:42:38PM +0200, Juan Quintela wrote:
>> Each time that we sync the bitmap, there is a possibility that we receive
>> a page that is being processed by a different thread.  We fix this
>> problem by making sure that we wait for all receiving threads to
>> finish their work before we proceed with the next stage.
>> 
>> We are low on page flags, so we use a combination that is otherwise
>> invalid to emit that message:  MULTIFD_PAGE and COMPRESSED.
>> 
>> I tried to make a migration command for it, but it doesn't work because
>> we sometimes sync the bitmap when we have already sent the beginning
>> of the section, so I just added a new page flag.
>> 
>> Signed-off-by: Juan Quintela <address@hidden>

>> @@ -675,6 +686,10 @@ static void *multifd_recv_thread(void *opaque)
>>                  return NULL;
>>              }
>>              p->done = true;
>> +            if (p->sync) {
>> +                qemu_cond_signal(&p->cond_sync);
>> +                p->sync = false;
>> +            }
>
> Could we use the same p->ready for this purpose? They look similar:
> all we want to do is let the main thread know "the worker thread has
> finished receiving the last piece and becomes idle again", right?

We *could*, but "ready" is used for each page that we send, while sync is
only used once per round.  Notice that "ready" is a semaphore, and its
semantics are weird.  See the next comment.
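
In other words (the names come from the patch; everything else is elided):

    /* once per received page: the thread can take more work */
    qemu_sem_post(&p->ready);

    /* at most once per bitmap-sync round: only if multifd_flush() asked */
    qemu_cond_signal(&p->cond_sync);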


>> +static int multifd_flush(void)
>> +{
>> +    int i, thread_count;
>> +
>> +    if (!migrate_use_multifd()) {
>> +        return 0;
>> +    }
>> +    thread_count = migrate_multifd_threads();
>> +    for (i = 0; i < thread_count; i++) {
>> +        MultiFDRecvParams *p = multifd_recv_state->params[i];
>> +
>> +        qemu_mutex_lock(&p->mutex);
>> +        while (!p->done) {
>> +            p->sync = true;
>> +            qemu_cond_wait(&p->cond_sync, &p->mutex);
>
> (similar comment like above)

We need to look at the two pieces of code at the same time.  What we are
trying to do is:

- make sure that all threads have finished the current round;
  in this particular case, that each thread has either finished its
  current round OR is waiting for work.

So, the main thread is the one that does the sem_wait(ready) and the channel
thread is the one that does the sem_post(ready).

multifd_recv_thread()

    if (p->sync) {
        sem_post(ready);
        p->sync = false;
    }

multifd_flush()
   if (!p->done) {
       p->sync = true;
       sem_wait(ready);
   }

Ah, but done and sync can be changed from other threads, so the code
has to become:

multifd_recv_thread()

    mutex_lock(lock);
    if (p->sync) {
        sem_post(ready);
        p->sync = false;
    }
    mutex_unlock(lock);

multifd_flush()
    ...
    mutex_lock(lock);
    if (!p->done) {
        p->sync = true;
        mutex_unlock(lock);
        sem_wait(ready);
        mutex_lock(lock);
    }
    mutex_unlock(lock);

I would claim that this is more complicated to understand.  Mixing
locks and semaphores is ... interesting, to say the least.  With
condition variables it becomes easy.

Yes, we could change sync/done to atomic accesses, but I'm not sure
that makes things much simpler.
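
Just to show that the condition-variable version stands on its own,
here is a minimal, runnable sketch of the same pattern, using plain
pthreads instead of qemu_mutex/qemu_cond and stubbing out the actual
receiving work (the scaffolding is mine, not patch code):

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        pthread_mutex_t mutex;
        pthread_cond_t cond_sync;
        bool done;
        bool sync;
    } RecvParams;

    static void *recv_thread(void *opaque)
    {
        RecvParams *p = opaque;

        /* ... receive one round of pages ... */

        pthread_mutex_lock(&p->mutex);
        p->done = true;
        if (p->sync) {                        /* flusher is waiting on us */
            pthread_cond_signal(&p->cond_sync);
            p->sync = false;
        }
        pthread_mutex_unlock(&p->mutex);
        return NULL;
    }

    static void flush_one(RecvParams *p)
    {
        pthread_mutex_lock(&p->mutex);
        while (!p->done) {                    /* wait for the round to end */
            p->sync = true;
            pthread_cond_wait(&p->cond_sync, &p->mutex);
        }
        pthread_mutex_unlock(&p->mutex);
    }

    int main(void)
    {
        RecvParams p = { .done = false, .sync = false };
        pthread_t t;

        pthread_mutex_init(&p.mutex, NULL);
        pthread_cond_init(&p.cond_sync, NULL);
        pthread_create(&t, NULL, recv_thread, &p);
        flush_one(&p);
        pthread_join(&t, NULL);
        printf("flushed\n");
        return 0;
    }

Both flags are only ever touched with the mutex held, so there is no
lock/semaphore interleaving to reason about.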

>> +        }
>> +        qemu_mutex_unlock(&p->mutex);
>> +    }
>> +    return 0;
>> +}
>> +
>>  /**
>>   * save_page_header: write page header to wire
>>   *
>> @@ -809,6 +847,12 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f, RAMBlock *block,
>>  {
>>      size_t size, len;
>>  
>> +    if (rs->multifd_needs_flush &&
>> +        (offset & RAM_SAVE_FLAG_MULTIFD_PAGE)) {
>
> If multifd_needs_flush is only for multifd, then we may skip this
> check, but it looks more like an assertion:
>
>     if (rs->multifd_needs_flush) {
>         assert(offset & RAM_SAVE_FLAG_MULTIFD_PAGE);
>         offset |= RAM_SAVE_FLAG_ZERO;
>     }

No, it could be that this page is a _non_-multifd page, and then ZERO
means something different.  So, we can only send this combination for
multifd pages.
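
So on the receive side the decode looks something along these lines
(a sketch; the flag names are from the patch, but where exactly the
flush hook sits is my guess):

    /* ZERO alone still means "zero page".  ZERO together with
     * MULTIFD_PAGE cannot happen otherwise, so that combination is
     * repurposed as the "flush the receive threads" marker. */
    if ((flags & RAM_SAVE_FLAG_MULTIFD_PAGE) &&
        (flags & RAM_SAVE_FLAG_ZERO)) {
        multifd_flush();
        flags &= ~RAM_SAVE_FLAG_ZERO;
    }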

> (Dave mentioned the mismatched flag between the commit message and the
>  code here: ZERO is used, but COMPRESS is mentioned)

OK, I can change the message.

>> @@ -2496,6 +2540,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>>  
>>      if (!migration_in_postcopy()) {
>>          migration_bitmap_sync(rs);
>> +        if (migrate_use_multifd()) {
>> +            rs->multifd_needs_flush = true;
>> +        }
>
> Would it be good to move this block into the entry of
> migration_bitmap_sync(), instead of setting it up at the callers of
> migration_bitmap_sync()?

We can't have everything.

We call migration_bitmap_sync() in four places:
- We don't need to set the flag for the first synchronization.
- We don't need to set it on postcopy (yet).

So, we can either add code inside it to check whether we are in the
first round (and forget about postcopy, which we check elsewhere), or
we keep it this way.

So, change becomes:

modified   migration/ram.c
@@ -1131,6 +1131,9 @@ static void migration_bitmap_sync(RAMState *rs)
     if (migrate_use_events()) {
         qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
     }
+    if (rs->ram_bulk_stage && migrate_use_multifd()) {
+        rs->multifd_needs_flush = true;
+    }
 }
 
 /**
@@ -2533,9 +2536,6 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
     if (!migration_in_postcopy()) {
         migration_bitmap_sync(rs);
-        if (migrate_use_multifd()) {
-            rs->multifd_needs_flush = true;
-        }
     }
 
     ram_control_before_iterate(f, RAM_CONTROL_FINISH);
@@ -2578,9 +2578,6 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size,
         qemu_mutex_lock_iothread();
         rcu_read_lock();
         migration_bitmap_sync(rs);
-        if (migrate_use_multifd()) {
-            rs->multifd_needs_flush = true;
-        }
         rcu_read_unlock();
         qemu_mutex_unlock_iothread();
         remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;

Three fewer lines, you win.  We already need to check elsewhere that
postcopy and multifd are not enabled at the same time.

Thanks, Juan.


