From: Dennis Luehring
Subject: Re: [Qemu-devel] [PATCH] target-sparc: Store mmu index in TB flags
Date: Tue, 25 Aug 2015 21:03:53 +0200
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
On 25.08.2015 at 20:09, Richard Henderson wrote:
> On 08/25/2015 07:37 AM, Dennis Luehring wrote:
>> On 25.08.2015 at 16:25, Richard Henderson wrote:
>>> Er, no, it should. The primary vector by which I expect improvement is via not
>>> encoding dmmu.mmu_primary_context into the TB flags. I.e. ASI_DMMU, which
>>> sun4u certainly uses.
>>>
>>> The fact that the patch _also_ fixes a sun4v problem is secondary.
>>
>> please, can you (or someone else) give me feedback about my tests/numbers
>> and their relevance? The stream benchmark results seem to be worse
>> than before and the compile speed is just a little bit better, so I don't
>> understand (at all) what problems are fixed or what is improved now
>
> The fact that stream degraded means that stream is unreliable as a benchmark.
> I suspect that if you simply run it N times with the exact same setup you'll
> see a very large variance in its runtime.
>
> This particular patch cannot possibly have degraded performance, as it could
> only result in a reduction, not expansion, of the number of TBs created.
>
> As to why stream should be unreliable, I have no clue.
6 runs, 6 times nearly the same result (and the stream benchmark itself does not seem to be obscure: https://www.cs.virginia.edu/stream/ measures sustainable memory bandwidth vs. FPU performance)
run 1#
Function    Best Rate MB/s   Avg time   Min time   Max time
Copy:                278.3   0.576045   0.574946   0.581186
Scale:               181.5   0.888582   0.881669   0.900648
Add:                 217.6   1.109354   1.102955   1.123495
Triad:               167.7   1.440939   1.430755   1.463517

run 2#
Function    Best Rate MB/s   Avg time   Min time   Max time
Copy:                277.8   0.577607   0.575970   0.582532
Scale:               181.4   0.909480   0.882134   1.058552
Add:                 217.5   1.110417   1.103327   1.122539
Triad:               167.5   1.444383   1.432864   1.477904

run 3#
Function    Best Rate MB/s   Avg time   Min time   Max time
Copy:                278.3   0.586721   0.574839   0.655187
Scale:               181.7   0.889060   0.880544   0.898155
Add:                 217.3   1.115113   1.104248   1.146618
Triad:               167.6   1.480999   1.432066   1.748302

run 4#
Function    Best Rate MB/s   Avg time   Min time   Max time
Copy:                276.7   0.580837   0.578262   0.585253
Scale:               180.6   0.891853   0.885707   0.895370
Add:                 216.5   1.116623   1.108630   1.126520
Triad:               167.1   1.444834   1.435996   1.451557

run 5#
Function    Best Rate MB/s   Avg time   Min time   Max time
Copy:                278.3   0.593767   0.574839   0.689366
Scale:               182.0   0.897183   0.879005   0.938262
Add:                 217.7   1.132244   1.102195   1.203082
Triad:               167.4   1.444530   1.434112   1.487601
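
To put a number on "nearly the same result", here is a minimal sketch that computes the run-to-run spread of the Best Rate column. It uses only the five runs whose output is listed above; the helper name `relative_spread` and the choice of (max - min) / mean as the spread measure are assumptions made for illustration, not part of STREAM or QEMU.

```python
from statistics import mean

# Best Rate (MB/s) per STREAM kernel, copied from the five runs listed above.
best_rate = {
    "Copy":  [278.3, 277.8, 278.3, 276.7, 278.3],
    "Scale": [181.5, 181.4, 181.7, 180.6, 182.0],
    "Add":   [217.6, 217.5, 217.3, 216.5, 217.7],
    "Triad": [167.7, 167.5, 167.6, 167.1, 167.4],
}


def relative_spread(values):
    """Hypothetical helper: (max - min) / mean, expressed in percent."""
    return 100.0 * (max(values) - min(values)) / mean(values)


# Spread of the best rate across runs, one entry per kernel.
spreads = {name: relative_spread(rates) for name, rates in best_rate.items()}

for name, pct in spreads.items():
    print(f"{name:6s} mean={mean(best_rate[name]):7.2f} MB/s  spread={pct:.2f}%")
```

For these numbers the spread of each kernel stays below 1%, so a large run-to-run variance does not show up in the Best Rate column. The Avg time and Max time columns vary more and would need a separate look.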
>> - the compilation test is still 180 times slower than on my host
>
> I'll have to compare that test vs an Alpha guest and see what I get. I only
> remember one factor of 10, not two...
>
> But you're right, it would be nice to put together a coherent set of
> benchmarks. Ideally, a guest kernel plus minimal ramdisk with the tests
> pre-loaded so that we can boot and run ./benchmark at the prompt. That's the
> sort of thing we can easily upload to the wiki and share.
Any idea what memory bandwidth benchmark I could use? Something on this list: http://lbs.sourceforge.net/ ?