[Qemu-devel] [RFC PATCH 0/2] High downtime with 95+ throttle pct

From: Yury Kotov
Subject: [Qemu-devel] [RFC PATCH 0/2] High downtime with 95+ throttle pct
Date: Wed, 10 Jul 2019 12:23:36 +0300


I wrote a test for migration auto-converge and found something strange:
1. Enable auto-converge
2. Set max-bandwidth to 1Gb/s
3. Set downtime-limit to 1ms
4. Run the standard test (it just writes a byte per page)
5. Wait for convergence
6. It converges at a 99% throttle percentage
7. The resulting downtime is about 300-600ms   <<<<

That is much higher than the expected 1ms. I figured out that the
cpu_throttle_thread() function sleeps for 100ms+ in the VCPU thread at high
throttle percentages (>=95%), and it keeps sleeping even after a cpu kick.
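
For a sense of scale, here is a rough sketch of the arithmetic. It assumes the
sleep per timeslice is timeslice * pct / (100 - pct) with a 10ms timeslice; the
function name and the constant are illustrative and not copied from cpus.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed run timeslice of 10ms between throttle sleeps (illustrative). */
#define TIMESLICE_NS INT64_C(10000000)

/*
 * Sleep time needed so that the VCPU is asleep for `pct` percent of the
 * time and runs for one timeslice in between. Integer math is used so
 * the result is exact.
 */
static int64_t throttle_sleep_ns(int pct)
{
    assert(pct >= 0 && pct < 100);
    return TIMESLICE_NS * pct / (100 - pct);
}
```

Under these assumptions, throttle_sleep_ns(95) is 190ms and
throttle_sleep_ns(99) is 990ms, which is far above a 1ms downtime-limit if the
sleep cannot be interrupted.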

I tried to fix it by using timedwait for the ms part of the sleep,
e.g. timedwait(halt_cond, 1ms) + usleep(500).
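
The idea can be sketched with plain POSIX primitives. This is only an
illustration, not the code from the patches: lock, halt_cond, kicked,
throttle_wait() and kick() are hypothetical stand-ins, and the real code would
have to work with qemu_global_mutex and the VCPU's halt condition.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-ins for the global mutex, halt condition and kick flag. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t halt_cond = PTHREAD_COND_INITIALIZER;
static bool kicked;

/* Wake up a sleeping throttle_wait() early. */
static void kick(void)
{
    pthread_mutex_lock(&lock);
    kicked = true;
    pthread_cond_signal(&halt_cond);
    pthread_mutex_unlock(&lock);
}

/*
 * Sleep for sleep_ns: the whole-millisecond part is an interruptible timed
 * wait on the condition variable, the sub-millisecond rest is a plain sleep.
 * Returns true if the full time was slept, false if a kick cut it short.
 */
static bool throttle_wait(int64_t sleep_ns)
{
    int64_t ms = sleep_ns / 1000000;
    int64_t rest_ns = sleep_ns % 1000000;
    struct timespec deadline;
    bool was_kicked;

    /* Absolute deadline; condition variables use CLOCK_REALTIME by default. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += ms / 1000;
    deadline.tv_nsec += (long)(ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&lock);
    /* Loop to tolerate spurious wakeups; stop on a kick or on timeout. */
    while (!kicked && ms > 0) {
        if (pthread_cond_timedwait(&halt_cond, &lock, &deadline) == ETIMEDOUT) {
            break;
        }
    }
    was_kicked = kicked;
    kicked = false;
    pthread_mutex_unlock(&lock);

    if (!was_kicked && rest_ns > 0) {
        struct timespec rest = { 0, (long)rest_ns };
        nanosleep(&rest, NULL);
    }
    return !was_kicked;
}
```

With this shape, a kick that arrives during a 990ms sleep ends the wait right
away instead of after the full sleep; only the sub-millisecond tail stays
uninterruptible.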

But I'm not sure about using a timedwait function here with qemu_global_mutex.
The original function uses qemu_mutex_unlock_iothread + qemu_mutex_lock_iothread.
That differs from locking/unlocking qemu_global_mutex directly (as happens
inside timedwait), because the original path uses the qemu_bql_mutex_lock_func
function, which could be anything.
This is why the series is an RFC.

What do you think?

Yury Kotov (2):
  qemu-thread: Add qemu_cond_timedwait
  cpus: Fix throttling during vm_stop

 cpus.c                   | 27 +++++++++++++++++++--------
 include/qemu/thread.h    | 12 ++++++++++++
 util/qemu-thread-posix.c | 40 ++++++++++++++++++++++++++++------------
 util/qemu-thread-win32.c | 16 ++++++++++++++++
 4 files changed, 75 insertions(+), 20 deletions(-)


