Re: [Qemu-devel] tests with simulated memory


From: Fabrice Bellard
Subject: Re: [Qemu-devel] tests with simulated memory
Date: Thu, 19 Jun 2003 18:33:47 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020828

A few days ago I looked for a benchmark with which to publish objective QEMU results (gzip is not enough!). The BYTEmark you mention seems like a good start. SPEC benchmarks are less interesting since their source code cannot be easily distributed.

I would have expected a larger slowdown. The code you submitted does not include alignment tests. Hopefully, just by ANDing the address with '0xffff0003', you can do both the address translation _and_ the unaligned access handling...
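One possible reading of the masking idea above, as a minimal C sketch. Everything here is an assumption for illustration: the entry layout, the names (`tlb_entry`, `fast_lookup32`), the cache size, and 4 KiB pages, which give the mask `0xfffff003` for a 32-bit access rather than the value quoted in the mail. The point is only that a single AND followed by a single compare can reject both a wrong page and an unaligned address, because the stored tag is page-aligned and therefore has its low bits clear.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS   12u                                  /* assumption: 4 KiB pages */
#define PAGE_MASK   (~((UINT32_C(1) << PAGE_BITS) - 1))  /* 0xfffff000 */
#define TLB_SIZE    256u                                 /* assumption: direct-mapped, 256 entries */
#define TLB_INVALID UINT32_C(0xffffffff)                 /* can never equal a masked address */

/* Hypothetical translation-cache entry, not QEMU's actual layout. */
typedef struct {
    uint32_t tag;   /* page-aligned guest address, or TLB_INVALID */
    uint8_t *host;  /* host pointer to the start of that page */
} tlb_entry;

/*
 * Fast path for a 32-bit access. Returns the host address on a hit,
 * or NULL when the caller must take the slow path (cache miss or
 * unaligned access).
 */
static uint8_t *fast_lookup32(const tlb_entry *tlb, uint32_t addr)
{
    const tlb_entry *e = &tlb[(addr >> PAGE_BITS) & (TLB_SIZE - 1)];

    /*
     * Keep the page number and the two low bits. If the address is
     * not 4-byte aligned, the low bits survive the AND and the result
     * cannot match the page-aligned tag.
     */
    if ((addr & (PAGE_MASK | 3u)) != e->tag)
        return NULL;

    return e->host + (addr & ~PAGE_MASK);
}
```

With an entry installed for guest page `0x1000`, a lookup of `0x1004` hits, while `0x1002` (unaligned) and `0x2000` (different page) both fall through to the slow path.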

Valgrind seems to be about twice as fast as QEMU on most tests. Some optimisations will be needed in qemu to correct that :-)

Fabrice.

Johan Rydberg wrote:
Hi,

I've hacked a bit on QEMU and added simulated memory using a translation cache, as we discussed earlier. These tests are mostly for my own interest, but you might find them interesting as well.

Below is the result of the BYTEmark [1] benchmark (I do not have access to any of the SPEC benchmarks) using simulated memory:

NUMERIC SORT:  Iterations/sec.: 10.385957       Index: 0.268406
STRING SORT:   Iterations/sec.: 1.807970        Index: 0.794712
BITFIELD:      Iterations/sec.: 1913364.293577  Index: 0.328203
FP EMULATION:  Iterations/sec.: 0.647669        Index: 0.311379
FOURIER:       Iterations/sec.: 619.696756      Index: 0.701681
ASSIGNMENT:    Iterations/sec.: 0.151103        Index: 0.575697
IDEA:          Iterations/sec.: 21.734410       Index: 0.332534
HUFFMAN:       Iterations/sec.: 13.068599       Index: 0.363168

Same test without the simulated memory (original QEMU):

NUMERIC SORT:  Iterations/sec.: 20.327522       Index: 0.525327
STRING SORT:   Iterations/sec.: 2.919430        Index: 1.283266
BITFIELD:      Iterations/sec.: 3086647.786244  Index: 0.529458
FP EMULATION:  Iterations/sec.: 1.112348        Index: 0.534783
FOURIER:       Iterations/sec.: 717.791439      Index: 0.812754
ASSIGNMENT:    Iterations/sec.: 0.208943        Index: 0.796063
IDEA:          Iterations/sec.: 39.651108       Index: 0.606657
HUFFMAN:       Iterations/sec.: 19.098677       Index: 0.530740

Slowdown factors (the Index from original QEMU divided by the Index from QEMU with simulated memory):

NUMERIC SORT:  1.96
STRING SORT:   1.61
BITFIELD:      1.61
FP EMULATION:  1.72
FOURIER:       1.16
ASSIGNMENT:    1.38
IDEA:          1.82
HUFFMAN:       1.46
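The factors above can be reproduced from the two Index columns. The following C sketch copies the Index values from the tables in this mail; the array and function names are made up for illustration.

```c
#include <stdio.h>

#define N_TESTS 8

static const char *const test_name[N_TESTS] = {
    "NUMERIC SORT", "STRING SORT", "BITFIELD", "FP EMULATION",
    "FOURIER", "ASSIGNMENT", "IDEA", "HUFFMAN"
};

/* Index values for original QEMU, copied from the table above. */
static const double index_original[N_TESTS] = {
    0.525327, 1.283266, 0.529458, 0.534783,
    0.812754, 0.796063, 0.606657, 0.530740
};

/* Index values for QEMU with simulated memory, copied from the table above. */
static const double index_simulated[N_TESTS] = {
    0.268406, 0.794712, 0.328203, 0.311379,
    0.701681, 0.575697, 0.332534, 0.363168
};

/* Slowdown factor: how many times lower the Index is with simulated memory. */
static double slowdown(double original, double simulated)
{
    return original / simulated;
}

/* Print one line per test, rounded to two decimals as in the list above. */
static void print_slowdowns(FILE *out)
{
    for (int i = 0; i < N_TESTS; i++)
        fprintf(out, "%-14s %.2f\n", test_name[i],
                slowdown(index_original[i], index_simulated[i]));
}
```

Calling `print_slowdowns(stdout)` prints one factor per benchmark; a higher Index means a faster run, so a factor above 1 is a slowdown.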

The slowdown would be greater if any processing had to be done
on every cache miss.  The current hack just adds the page to the
cache, performs the memory transaction, and returns.
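A minimal C sketch of such a miss path, for illustration only and not the submitted patch. The structure names, the direct-mapped layout, the 4 KiB page size, and the flat 64 KiB guest RAM array are all assumptions. On a miss the handler does nothing beyond installing the page in the cache; the access then completes through the same code as a hit.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_BITS      12u                       /* assumption: 4 KiB pages */
#define PAGE_SIZE      (UINT32_C(1) << PAGE_BITS)
#define CACHE_SLOTS    64u                       /* assumption: direct-mapped */
#define GUEST_RAM_SIZE (16u * PAGE_SIZE)         /* assumption: flat 64 KiB guest RAM */
#define NO_PAGE        UINT32_C(0xffffffff)      /* larger than any 20-bit page number */

typedef struct {
    uint32_t page;   /* guest page number, or NO_PAGE if the slot is empty */
    uint8_t *host;   /* host pointer to the start of that page */
} cache_slot;

typedef struct {
    cache_slot slots[CACHE_SLOTS];
    uint8_t ram[GUEST_RAM_SIZE];
    unsigned long misses;                        /* counts cache fills */
} guest_mem;

static void guest_mem_init(guest_mem *m)
{
    memset(m->ram, 0, sizeof m->ram);
    for (unsigned i = 0; i < CACHE_SLOTS; i++) {
        m->slots[i].page = NO_PAGE;
        m->slots[i].host = NULL;
    }
    m->misses = 0;
}

/* Load one guest byte. Returns 0 on success, -1 if addr is outside guest RAM. */
static int guest_load8(guest_mem *m, uint32_t addr, uint8_t *out)
{
    uint32_t page = addr >> PAGE_BITS;
    cache_slot *s = &m->slots[page % CACHE_SLOTS];

    if (s->page != page) {
        /* Miss: no extra processing, just add the page to the cache. */
        if (addr >= GUEST_RAM_SIZE)
            return -1;
        s->page = page;
        s->host = m->ram + ((size_t)page << PAGE_BITS);
        m->misses++;
    }

    /* Hit, or freshly filled slot: perform the memory transaction. */
    *out = s->host[addr & (PAGE_SIZE - 1)];
    return 0;
}
```

Two consecutive loads from the same page increment `misses` only once. Any permission check or device dispatch added to the miss branch would cost extra on every fill, which is the additional slowdown mentioned above.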

A slowdown between 1.16x and ~2x is pretty good, I think.

For reference, below are the results for Valgrind (CVS version,
running the none skin):

NUMERIC SORT:  Iterations/sec.: 36.541455       Index: 0.944346
STRING SORT:   Iterations/sec.: 2.181686        Index: 0.958983
BITFIELD:      Iterations/sec.: 6294984.678336  Index: 1.079789
FP EMULATION:  Iterations/sec.: 2.232148        Index: 1.073148
FOURIER:       Iterations/sec.: 746.055296      Index: 0.844757
ASSIGNMENT:    Iterations/sec.: 0.386720        Index: 1.473388
IDEA:          Iterations/sec.: 80.463770       Index: 1.231086
HUFFMAN:       Iterations/sec.: 43.592067       Index: 1.211396

 [1] http://www.byte.com/bmark/bmark.htm

best regards,
johan





