Hi,
I've hacked a bit on QEMU and added simulated memory using a
translation cache, as we discussed earlier. These tests are
mostly for my own interest, but you might find them interesting
as well.
Below are the results of the BYTEmark [1] benchmark (I do not have
access to any of the SPEC benchmarks) using simulated memory:
NUMERIC SORT: Iterations/sec.: 10.385957 Index: 0.268406
STRING SORT: Iterations/sec.: 1.807970 Index: 0.794712
BITFIELD: Iterations/sec.: 1913364.293577 Index: 0.328203
FP EMULATION: Iterations/sec.: 0.647669 Index: 0.311379
FOURIER: Iterations/sec.: 619.696756 Index: 0.701681
ASSIGNMENT: Iterations/sec.: 0.151103 Index: 0.575697
IDEA: Iterations/sec.: 21.734410 Index: 0.332534
HUFFMAN: Iterations/sec.: 13.068599 Index: 0.363168
Same test without the simulated memory (original QEMU):
NUMERIC SORT: Iterations/sec.: 20.327522 Index: 0.525327
STRING SORT: Iterations/sec.: 2.919430 Index: 1.283266
BITFIELD: Iterations/sec.: 3086647.786244 Index: 0.529458
FP EMULATION: Iterations/sec.: 1.112348 Index: 0.534783
FOURIER: Iterations/sec.: 717.791439 Index: 0.812754
ASSIGNMENT: Iterations/sec.: 0.208943 Index: 0.796063
IDEA: Iterations/sec.: 39.651108 Index: 0.606657
HUFFMAN: Iterations/sec.: 19.098677 Index: 0.530740
Slowdown factors (the Index from the original QEMU divided by
the Index from QEMU with simulated memory):
NUMERIC SORT: 1.96
STRING SORT: 1.61
BITFIELD: 1.61
FP EMULATION: 1.72
FOURIER: 1.16
ASSIGNMENT: 1.38
IDEA: 1.82
HUFFMAN: 1.46
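
The factors above can be re-derived from the two listings. Here is a
minimal Python sketch, with the Index values copied from this mail; the
dictionary and function names are only illustrative:

```python
# Index values copied from the two BYTEmark listings in this mail.
simulated = {
    "NUMERIC SORT": 0.268406,
    "STRING SORT": 0.794712,
    "BITFIELD": 0.328203,
    "FP EMULATION": 0.311379,
    "FOURIER": 0.701681,
    "ASSIGNMENT": 0.575697,
    "IDEA": 0.332534,
    "HUFFMAN": 0.363168,
}
original = {
    "NUMERIC SORT": 0.525327,
    "STRING SORT": 1.283266,
    "BITFIELD": 0.529458,
    "FP EMULATION": 0.534783,
    "FOURIER": 0.812754,
    "ASSIGNMENT": 0.796063,
    "IDEA": 0.606657,
    "HUFFMAN": 0.530740,
}


def slowdown(original_index, simulated_index):
    """Slowdown factor: original Index divided by simulated-memory Index."""
    return original_index / simulated_index


# Round to two decimals, as in the list above.
rates = {name: round(slowdown(original[name], simulated[name]), 2)
         for name in original}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}")
```

A higher Index means a faster run, so a factor above 1 means the
simulated-memory build is slower on that test.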
The slowdown would be greater if any processing had to be done
on each cache miss. The current hack just adds the page to the
cache, performs the memory transaction, and returns.
A slowdown between 1.16x and ~2x is pretty good, I think.
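
To illustrate the miss path described above, here is a toy Python model
of a page-granular translation cache. This is only a sketch and not the
actual QEMU change: the class name, the 4096-byte page size, and the
dictionary-based backing store are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size; the mail does not state one


class SimulatedMemory:
    """Toy model: a translation cache in front of a backing store."""

    def __init__(self):
        self.backing = {}  # page number -> bytearray (stands in for guest memory)
        self.cache = {}    # translation cache: page number -> bytearray
        self.hits = 0
        self.misses = 0

    def _lookup(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        data = self.cache.get(page)
        if data is None:
            # Miss: just add the page to the cache, with no extra processing.
            self.misses += 1
            data = self.backing.setdefault(page, bytearray(PAGE_SIZE))
            self.cache[page] = data
        else:
            self.hits += 1
        return data, offset

    def load8(self, addr):
        """Read one byte through the translation cache."""
        data, offset = self._lookup(addr)
        return data[offset]

    def store8(self, addr, value):
        """Write one byte through the translation cache."""
        data, offset = self._lookup(addr)
        data[offset] = value & 0xFF


mem = SimulatedMemory()
mem.store8(0x1234, 0xAB)   # first touch of the page: a miss
value = mem.load8(0x1234)  # same page again: a hit
```

Any bookkeeping added inside the miss branch would run once per newly
touched page, which is where the extra slowdown mentioned above would
come from.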
For reference, below are the results for Valgrind (CVS version,
running the none skin):
NUMERIC SORT: Iterations/sec.: 36.541455 Index: 0.944346
STRING SORT: Iterations/sec.: 2.181686 Index: 0.958983
BITFIELD: Iterations/sec.: 6294984.678336 Index: 1.079789
FP EMULATION: Iterations/sec.: 2.232148 Index: 1.073148
FOURIER: Iterations/sec.: 746.055296 Index: 0.844757
ASSIGNMENT: Iterations/sec.: 0.386720 Index: 1.473388
IDEA: Iterations/sec.: 80.463770 Index: 1.231086
HUFFMAN: Iterations/sec.: 43.592067 Index: 1.211396
[1] http://www.byte.com/bmark/bmark.htm
best regards,
johan