qemu-devel

Re: Fio regression caused by f9fc8932b11f3bcf2a2626f567cb6fdd36a33a94


From: Paolo Bonzini
Subject: Re: Fio regression caused by f9fc8932b11f3bcf2a2626f567cb6fdd36a33a94
Date: Fri, 6 May 2022 10:42:05 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

On 5/6/22 06:30, Lukáš Doktor wrote:
> Also let me briefly share the details about the execution:

Thanks, this is super useful!

I got very similar results to yours:

QEMU 6.2                        bw=1132MiB/s
QEMU 7.0                        bw=1046MiB/s
QEMU 7.0 + patch                bw=1012MiB/s
QEMU 7.0 + tweaked patch        bw=1077MiB/s

"tweaked patch" is moving qemu_cond_signal after qemu_mutex_unlock.
It's better than QemuSemaphore in QEMU 7.0 but still not as good as
the original.  /me thinks
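
To put the figures above in relative terms, here is a minimal sketch that computes each result's drop against the QEMU 6.2 baseline. The bandwidth numbers are copied from the table; the helper name rel_drop is hypothetical and only for illustration.

```shell
#!/bin/sh
# Bandwidth figures (MiB/s) are copied from the table above; QEMU 6.2 is the baseline.
baseline=1132

# rel_drop is a hypothetical helper: $1 = label, $2 = bandwidth in MiB/s.
# It prints how far the result falls below the baseline, in percent.
rel_drop() {
    awk -v l="$1" -v b="$baseline" -v v="$2" \
        'BEGIN { printf "%s: %.1f%% below QEMU 6.2\n", l, (b - v) * 100 / b }'
}

rel_drop "QEMU 7.0" 1046
rel_drop "QEMU 7.0 + patch" 1012
rel_drop "QEMU 7.0 + tweaked patch" 1077
```

By this measure the tweaked patch recovers part of the regression but still sits a few percent under the 6.2 result.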

Paolo

---

mkdir -p /var/lib/runperf/runperf-nbd/
truncate -s 256M /var/lib/runperf/runperf-nbd//disk.img
nohup qemu-nbd -t -k /var/lib/runperf/runperf-nbd//socket -f raw /var/lib/runperf/runperf-nbd//disk.img &> $(mktemp /var/lib/runperf/runperf-nbd//qemu_nbd_XXXX.log) & echo $! >> /var/lib/runperf/runperf-nbd//kill_pids
for PID in $(cat /var/lib/runperf/runperf-nbd//kill_pids); do disown -h $PID; done
export TERM=xterm-256color
true
mkdir -p /var/lib/runperf/runperf-nbd/
cat > /var/lib/runperf/runperf-nbd/nbd.fio << \Gr1UaS
# To use fio to test nbdkit:
#
# nbdkit -U - memory size=256M --run 'export unixsocket; fio examples/nbd.fio'
#
# To use fio to test qemu-nbd:
#
# rm -f /tmp/disk.img /tmp/socket
# truncate -s 256M /tmp/disk.img
# export target=/tmp/socket
# qemu-nbd -t -k $target -f raw /tmp/disk.img &
# fio examples/nbd.fio
# killall qemu-nbd

[global]
bs = $@
runtime = 30
ioengine = nbd
iodepth = 32
direct = 1
sync = 0
time_based = 1
clocksource = gettimeofday
ramp_time = 5
write_bw_log = fio
write_iops_log = fio
write_lat_log = fio
log_avg_msec = 1000
write_hist_log = fio
log_hist_msec = 10000
# log_hist_coarseness = 4 # 76 bins

rw = $@
uri=nbd+unix:///?socket=/var/lib/runperf/runperf-nbd/socket
# Starting from nbdkit 1.14 the following will work:
#uri=${uri}

[job0]
offset=0

[job1]
offset=64m

[job2]
offset=128m

[job3]
offset=192m

Gr1UaS

benchmark_bin=/usr/local/bin/fio pbench-fio --block-sizes=4 \
    --job-file=/var/lib/runperf/runperf-nbd/nbd.fio --numjobs=4 --runtime=60 \
    --samples=5 --test-types=write --clients=$WORKER_IP

---

I am using pbench to run the benchmark, but you can simply replace the two "$@"
placeholders (bs and rw) in the generated "/var/lib/runperf/runperf-nbd/nbd.fio"
and run it directly with fio.
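
The manual substitution described above could look roughly like the sketch below. It assumes the job file still contains the literal lines "bs = $@" and "rw = $@"; the helper name render_job, the output path, and the values 4k and write (guessed from --block-sizes=4 and --test-types=write) are assumptions, not something pbench itself prints.

```shell
#!/bin/sh
# Sketch: fill in the two "$@" placeholders that pbench-fio would otherwise substitute.
# render_job is a hypothetical helper name, not part of fio or pbench.
render_job() {
    # $1 = job file template, $2 = value for bs, $3 = value for rw
    sed -e 's|^bs = [$]@$|bs = '"$2"'|' \
        -e 's|^rw = [$]@$|rw = '"$3"'|' "$1"
}

# Example invocation (assumed values: 4k blocks, sequential writes):
#   render_job /var/lib/runperf/runperf-nbd/nbd.fio 4k write > /tmp/nbd-4k-write.fio
#   fio /tmp/nbd-4k-write.fio
```

Writing the result to a separate file keeps the original template intact, so other block sizes or workloads can be rendered from it later.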

Regards,
Lukáš


On 05. 05. 22 at 15:27, Paolo Bonzini wrote:
On 5/5/22 14:44, Daniel P. Berrangé wrote:
>> util/thread-pool.c uses qemu_sem_*() to notify worker threads when work
>> becomes available. It makes sense that this operation is
>> performance-critical and that's why the benchmark regressed.
>
> Doh, I questioned whether the change would have a performance impact,
> and it wasn't thought to be used in perf critical places

The expectation was that there would be no contention and thus no overhead because 
of the pool->lock that exists anyway, but that was optimistic.

Lukáš, can you run a benchmark with this condvar implementation that was 
suggested by Stefan:

https://lore.kernel.org/qemu-devel/20220505131346.823941-1-pbonzini@redhat.com/raw

?

If it still regresses, we can either revert the patch or look at a different 
implementation (even getting rid of the global queue is an option).

Thanks,

Paolo



