Re: [Qemu-devel] [PATCH] target-sparc: Store mmu index in TB flags


From: Dennis Luehring
Subject: Re: [Qemu-devel] [PATCH] target-sparc: Store mmu index in TB flags
Date: Tue, 25 Aug 2015 21:03:53 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0

On 25.08.2015 at 20:09, Richard Henderson wrote:
On 08/25/2015 07:37 AM, Dennis Luehring wrote:
> On 25.08.2015 at 16:25, Richard Henderson wrote:
>> Er, no, it should.  The primary vector by which I expect improvement is via
>> not encoding dmmu.mmu_primary_context into the TB flags.  I.e. ASI_DMMU, which
>> sun4u certainly uses.
>>
>> The fact that the patch _also_ fixes a sun4v problem is secondary.
>
> please, can you (or someone else) give me feedback about my tests/numbers
> and their relevance - the stream benchmark results seem to be worse than
> before and the compile speed is just a little bit better - so i don't
> understand (at all) what problems are fixed or what is improved now

The fact that stream degraded means that stream is unreliable as a benchmark.
I suspect that if you simply run it N times with the exact same setup you'll
see a very large variance in its runtime.

This particular patch cannot possibly have degraded performance, as it could
only result in a reduction, not expansion, of the number of TBs created.
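
The reasoning can be illustrated with a toy model (a sketch only, not QEMU code). It assumes translations are cached under a (pc, flags) key, that the old flags carried the context value on top of the MMU index, and that the new flags carry the MMU index alone. The names count_tbs, old_flags, new_flags and make_trace are made up for this illustration.

```python
from itertools import product


def count_tbs(trace, flags_of):
    """Count distinct (pc, flags) keys, i.e. how many translations
    a cache keyed this way would have to hold for the given trace."""
    return len({(pc, flags_of(cpu)) for pc, cpu in trace})


def old_flags(cpu):
    # Assumption: the old encoding included the context value.
    return (cpu["mmu_idx"], cpu["context"])


def new_flags(cpu):
    # Assumption: the new encoding keeps only the MMU index.
    return cpu["mmu_idx"]


def make_trace(pcs, contexts, mmu_idxs):
    """Hypothetical trace: every pc executed under every context/index pair."""
    return [
        (pc, {"context": ctx, "mmu_idx": idx})
        for pc, ctx, idx in product(pcs, contexts, mmu_idxs)
    ]


if __name__ == "__main__":
    trace = make_trace(pcs=[0x1000, 0x2000], contexts=range(8), mmu_idxs=[0, 1])
    print("TBs with context in flags:  ", count_tbs(trace, old_flags))
    print("TBs with mmu index in flags:", count_tbs(trace, new_flags))
```

Because the new key is a projection of the old one in this model, two executions that shared a translation before still share one afterwards, so the count can only stay equal or drop.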

As to why stream should be unreliable, I have no clue.

6 runs - 6 times nearly the same result (and the stream benchmark itself does not seem to be an unknown one: https://www.cs.virginia.edu/stream/ - it measures sustainable memory bandwidth vs. FPU performance)

run 1#
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             278.3     0.576045     0.574946     0.581186
Scale:            181.5     0.888582     0.881669     0.900648
Add:              217.6     1.109354     1.102955     1.123495
Triad:            167.7     1.440939     1.430755     1.463517
run 2#
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             277.8     0.577607     0.575970     0.582532
Scale:            181.4     0.909480     0.882134     1.058552
Add:              217.5     1.110417     1.103327     1.122539
Triad:            167.5     1.444383     1.432864     1.477904
run 3#
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             278.3     0.586721     0.574839     0.655187
Scale:            181.7     0.889060     0.880544     0.898155
Add:              217.3     1.115113     1.104248     1.146618
Triad:            167.6     1.480999     1.432066     1.748302
run 4#
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             276.7     0.580837     0.578262     0.585253
Scale:            180.6     0.891853     0.885707     0.895370
Add:              216.5     1.116623     1.108630     1.126520
Triad:            167.1     1.444834     1.435996     1.451557
run 5#
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:             278.3     0.593767     0.574839     0.689366
Scale:            182.0     0.897183     0.879005     0.938262
Add:              217.7     1.132244     1.102195     1.203082
Triad:            167.4     1.444530     1.434112     1.487601
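
To put a number on "nearly the same", the Best Rate columns of the runs listed above can be summarised with a few lines of Python. This is only a sketch: the values are copied by hand from the tables above, and the helper names spread_percent and summarize are made up for this illustration.

```python
import statistics

# Best Rate (MB/s) per kernel, copied from the runs listed above.
BEST_RATE = {
    "Copy": [278.3, 277.8, 278.3, 276.7, 278.3],
    "Scale": [181.5, 181.4, 181.7, 180.6, 182.0],
    "Add": [217.6, 217.5, 217.3, 216.5, 217.7],
    "Triad": [167.7, 167.5, 167.6, 167.1, 167.4],
}


def spread_percent(samples):
    """Return (max - min) as a percentage of the mean."""
    return 100.0 * (max(samples) - min(samples)) / statistics.mean(samples)


def summarize(rates):
    """Map each kernel name to (mean, sample stdev, spread in percent)."""
    return {
        name: (statistics.mean(v), statistics.stdev(v), spread_percent(v))
        for name, v in rates.items()
    }


if __name__ == "__main__":
    for name, (mean, stdev, spread) in summarize(BEST_RATE).items():
        print(f"{name:6s} mean={mean:7.2f} MB/s  stdev={stdev:5.2f}  spread={spread:4.2f}%")
```

For these numbers the spread between the best and the worst run stays below 1% of the mean for every kernel.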


> - the compilation test is still 180 times slower than on my host

I'll have to compare that test vs an Alpha guest and see what I get.  I only
remember one factor of 10, not two...

But you're right, it would be nice to put together a coherent set of
benchmarks.  Ideally, a guest kernel plus minimal ramdisk with the tests
pre-loaded so that we can boot and run ./benchmark at the prompt.  That's
the sort of thing we can easily upload to the wiki and share.

any idea what memory bandwidth benchmark i could use?
something on this list http://lbs.sourceforge.net/ ?




