On Aug 11, 2016, at 11:24 AM, address@hidden wrote:
Performance
===========
You can't do full workload testing on this tree due to the lack of
atomic support (but I will run some numbers on
mttcg/base-patches-v4-with-cmpxchg-atomics-v2). However, you certainly
see a run-time improvement with the kvm-unit-tests TCG group.
retry.py called with ['./run_tests.sh', '-t', '-g', 'tcg', '-o', '-accel tcg,thread=single']
run 1: ret=0 (PASS), time=1047.147924 (1/1)
run 2: ret=0 (PASS), time=1071.921204 (2/2)
run 3: ret=0 (PASS), time=1048.141600 (3/3)
Results summary:
0: 3 times (100.00%), avg time 1055.737 (196.70 varience/14.02 deviation)
Ran command 3 times, 3 passes
retry.py called with ['./run_tests.sh', '-t', '-g', 'tcg', '-o', '-accel tcg,thread=multi']
run 1: ret=0 (PASS), time=303.074210 (1/1)
run 2: ret=0 (PASS), time=304.574991 (2/2)
run 3: ret=0 (PASS), time=303.327408 (3/3)
Results summary:
0: 3 times (100.00%), avg time 303.659 (0.65 varience/0.80 deviation)
Ran command 3 times, 3 passes
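As a cross-check, the summary lines above can be recomputed from the per-run times. This is a minimal sketch, assuming retry.py reports the sample (n-1) variance and standard deviation; its source is not shown here, and `summarise` is a hypothetical helper name, not part of retry.py.

```python
import statistics

# Per-run wall-clock times (seconds) copied from the two retry.py logs above.
runs = {
    "thread=single": [1047.147924, 1071.921204, 1048.141600],
    "thread=multi": [303.074210, 304.574991, 303.327408],
}


def summarise(times):
    """Return (mean, sample variance, sample standard deviation)."""
    # Assumption: retry.py uses the n-1 estimator, which is what
    # statistics.variance and statistics.stdev compute.
    return (
        statistics.mean(times),
        statistics.variance(times),
        statistics.stdev(times),
    )


for name, times in runs.items():
    avg, var, dev = summarise(times)
    print(f"{name}: avg time {avg:.3f} ({var:.2f} variance/{dev:.2f} deviation)")
```

With three samples per configuration these figures only hint at the spread, but the multi-threaded runs are noticeably more tightly grouped than the single-threaded ones.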
The TCG tests run with -smp 4 on my system. While the TCG tests are
purely CPU bound, they do exercise the hot and cold paths of TCG
execution (especially when triggering SMC detection). However, there is
still a benefit, even with roughly 15% overhead compared to the ideal
elapsed time of about 264 seconds (a quarter of the single-threaded
average).
Alex
Your test results look very promising. It looks like you saw roughly a
3.5x speed improvement over single threading. Excellent. I wonder what
the numbers would be for a 22 core Xeon or 72 core Xeon Phi...
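
For reference, here is a rough sketch of the arithmetic behind that estimate, using only the two averages reported above and the -smp 4 setting. The function name `scaling` and its result keys are illustrative, not anything from QEMU or kvm-unit-tests.

```python
def scaling(single_avg, multi_avg, vcpus):
    """Compare a multi-threaded run against the single-threaded baseline."""
    speedup = single_avg / multi_avg   # how many times faster
    ideal = single_avg / vcpus         # perfect linear scaling
    overhead = multi_avg / ideal - 1   # fraction above the ideal time
    efficiency = speedup / vcpus       # share of linear scaling achieved
    return {
        "speedup": speedup,
        "ideal": ideal,
        "overhead": overhead,
        "efficiency": efficiency,
    }


# Averages (seconds) from the "Results summary" lines; 4 vCPUs from -smp 4.
result = scaling(single_avg=1055.737, multi_avg=303.659, vcpus=4)
print(f"speedup:    {result['speedup']:.2f}x")
print(f"ideal time: {result['ideal']:.1f} s")
print(f"overhead:   {result['overhead']:.1%}")
print(f"efficiency: {result['efficiency']:.1%}")
```

This only describes the four-vCPU case that was measured. It says nothing about how the numbers would scale on a larger host, since the tests start only four vCPUs.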