qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Hang with migration multi-thread compression under high


From: Li, Liang Z
Subject: Re: [Qemu-devel] Hang with migration multi-thread compression under high load
Date: Thu, 28 Apr 2016 02:41:58 +0000

> On Wed, Apr 27, 2016 at 03:29:30PM +0100, Dr. David Alan Gilbert wrote:
> > ccing in Liang Li
> >
> > * Daniel P. Berrange (address@hidden) wrote:
> > > for some reason it isn't shown in the stack thrace for thread
> > > 1 above, when initially connecting GDB it says the main thread is
> > > at:
> > >
> > > decompress_data_with_multi_threads (len=702, host=0x7fd78fe06000,
> f=0x55901af09950) at /home/berrange/src/virt/qemu/migration/ram.c:2254
> > > 2254              for (idx = 0; idx < thread_count; idx++) {
> > >
> > >
> > > Looking at the target QEMU, we see  do_data_decompress method is
> > > waiting in a condition var:
> > >
> > >         while (!param->start && !quit_decomp_thread) {
> > >       qemu_cond_wait(&param->cond, &param->mutex);
> > >             ....do stuff..
> > >       param->start = false
> > >         }
> > >
> > >
> > > Now the decompress_data_with_multi_threads is checking param->start
> > > without holding the param->mutex lock.
> > >
> > > Changing decompress_data_with_multi_threads to acquire param-
> >mutex
> > > lock makes it work, but isn't ideal, since that now blocks the
> > > decompress_data_with_multi_threads() method on the completion of
> > > each thread, which defeats the point of having multiple threads.
> 
> FWIW, the following patch also appears to "fix" the issue, presumably by just
> making the race much less likely to hit:
> 
> diff --git a/migration/ram.c b/migration/ram.c index 3f05738..be0233f 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2271,6 +2271,7 @@ static void
> decompress_data_with_multi_threads(QEMUFile *f,
>          if (idx < thread_count) {
>              break;
>          }
> +        sched_yield();
>      }
>  }
> 
> 
> Incidentally IIUC, this decompress_data_with_multi_threads is just busy
> waiting for a thread to become free, which seems pretty wasteful of CPU
> resources. I wonder if there's a more effective way to structure this, so that
> instead of having decompress_data_with_multi_threads()
> choose which thread to pass the decompression job to, it just puts the job
> into a queue, and then let all the threads pull from that shared queue. IOW
> whichever thread the kerenl decides to wakeup would get the job, without
> us having to explicitly assign a thread to the job.
> 
> 

Thanks for reporting this issue, I will take a look and get back to you.

Liang

> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

reply via email to

[Prev in Thread] Current Thread [Next in Thread]