swarm-support

Re: Performance issues


From: Marcus G. Daniels
Subject: Re: Performance issues
Date: Fri, 31 Jan 2003 14:33:09 -0700
User-agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.3b) Gecko/20030117

Bill Northcott wrote:

Motherboards like the Tyan Trinity GC-SL (for Pentium 4, not even a Xeon)
have PCI-X slots that run 64 bits wide @ 133 MHz.

They are not common. The standard Intel chip sets only support 32bit/33MHz PCI on P4 computers.
If you want a server, buy a motherboard with a server chipset.
The performance of the Pentium 4 CPU _itself_ is a different issue.
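For scale, the gap between the two bus types mentioned above can be worked out directly. This is only a back-of-the-envelope sketch: it uses the nominal widths and clocks quoted in this thread, computes peak theoretical throughput (width in bytes times clock), and ignores protocol overhead and bus sharing. The class and method names are made up for illustration.

```java
class BusBandwidth {
    // Peak theoretical throughput in MB/s: (bus width in bytes) * (clock in MHz).
    // Real sustained throughput is lower; this is an upper bound only.
    static long peakMBps(int widthBits, int clockMHz) {
        return (long) (widthBits / 8) * clockMHz;
    }

    public static void main(String[] args) {
        long pcix = peakMBps(64, 133); // PCI-X slot as quoted above
        long pci = peakMBps(32, 33);   // standard PCI as quoted above
        System.out.println("PCI-X 64-bit/133MHz peak: " + pcix + " MB/s");
        System.out.println("PCI   32-bit/33MHz  peak: " + pci + " MB/s");
        System.out.println("Ratio: " + ((double) pcix / pci));
    }
}
```

With the nominal figures, that is roughly a factor of eight in peak bandwidth between the two slot types.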

Latency is much more important in most applications, which is why high-end Macs and powerful number crunchers use Level 3 cache memory.

Level 3 cache is a way to deal with the memory bus clock speed being limited and inherent RAM latency. On a new and mainstream Pentium 4-based system, the clock is four times as fast as on a Mac, so there is less benefit in a level 3 cache. New Pentium 4s have 512KB on the chip itself, and they use very fast RAM, both in terms of bandwidth and latency.

For a benchmark, let me propose a garbage collecting test program, such as the Java version of GCBench: http://www.hpl.hp.com/personal/Hans_Boehm/gc/gc_bench

Folks can compile the GCBench.java referenced here on different machines with minimum trouble and make some comparisons. I propose this benchmark because it will be traversing from pointer to pointer within a memory region (say 32MB) larger than any potential Level 2 or 3 cache. Data structures get built up and torn down in different ways, which should be a big stress on the caches. That it is Java just makes it easy for people to run, but it also has the benefit that the highly-tuned Sun JDK generational garbage collector has the opportunity to sort objects (and thus memory accesses) by age and physical region, possibly leveraging the available cache.
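To illustrate the kind of access pattern meant here, the following is a minimal pointer-chasing sketch. It is not GCBench (which allocates and discards trees and so also exercises the collector); it only shows dependent loads wandering through a region larger than the caches. Everything in it is an assumption for illustration: the class name, the 32MB array size, and the use of Sattolo's algorithm to build one random cycle. It also uses System.nanoTime(), so it needs a newer JDK than the 1.3.1 used for the timings in this message.

```java
import java.util.Random;

class PointerChase {
    // Build one random cycle over n slots (Sattolo's algorithm), so that
    // repeatedly following next[p] visits every slot before returning to 0.
    static int[] randomCycle(int n, long seed) {
        int[] next = new int[n];
        for (int i = 0; i < n; i++) next[i] = i;
        Random rng = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i); // 0 <= j < i, never i itself
            int t = next[i];
            next[i] = next[j];
            next[j] = t;
        }
        return next;
    }

    // Each load depends on the previous one, so the loop is bound by
    // memory latency rather than bandwidth once the array exceeds the cache.
    static int chase(int[] next, int steps) {
        int p = 0;
        for (int s = 0; s < steps; s++) p = next[p];
        return p;
    }

    public static void main(String[] args) {
        int n = 8 * 1024 * 1024; // 8M ints = 32MB, larger than the caches discussed
        int[] next = randomCycle(n, 42L);
        long t0 = System.nanoTime();
        int end = chase(next, n);
        long t1 = System.nanoTime();
        // Printing 'end' keeps the JIT from discarding the loop.
        System.out.println("end=" + end + " ns/step=" + ((double) (t1 - t0) / n));
    }
}
```

Shrinking n until the array fits in Level 2 cache should make the time per step drop sharply, which is a rough way to see the cache size of a given machine.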
Results:

On a 550 MHz Pentium 3 with 256K Level 2 cache and PC100 memory I get the following:

time jdk1.3.1_07/bin/java -Xms32m -Xmx32m GCBench
[output deleted]
real    0m8.565s
user    0m8.280s
sys     0m0.310s

This is roughly comparable to a Sun Blade 100 @ 500 MHz also with 256K Level 2 cache.

time /opt/jdk1.3.1/bin/java GCBench
[output deleted]
real    0m10.609s
user    0m9.330s
sys     0m0.680s

On a 400 MHz G4 with 1MB Level 2 cache and PC100 memory I get 6.7 user seconds with the built-in JDK 1.3.1 on Mac OS X 10.2.3.

time java -Xms32m -Xmx32m GCBench
[output deleted]
real    0m9.183s
user    0m7.000s
sys     0m0.760s

Very impressive. The G4 is only running at 400 MHz. But then it has four times as much level 2 cache. It _ought_ to.

Now how about a 1 GHz Pentium III with PC133 memory and 256K level 2 cache?
In other words, since the GCBench should be latency-limited to a considerable extent, the faster PC133 RAM should make a difference and the CPU won't help so much. That would appear to be the case:

time /usr/java/jdk1.3.1_07/bin/java GCBench
[output deleted]
real    0m7.186s
user    0m6.970s
sys     0m0.090s

How does this compare to a Pentium 4 @ 1.7 GHz with fast Rambus memory, a 400 MHz front side bus, and 256K level 2 cache?

real    0m2.799s
user    0m2.510s
sys     0m0.120s

Still a small level 2 cache, but fast RAM and a fast CPU trounces the G4 with four times the level two cache.
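To put the user times above side by side, here is a small sketch that expresses each one relative to the 550 MHz Pentium 3 run. The figures are copied from the timings in this message; the class and method names are invented for illustration. Treat the ratios as rough: the runs did not all use the same JDK, OS, or heap flags.

```java
class RelativeSpeed {
    // How many times faster a run is than the baseline (baseline / run time).
    static double speedup(double baselineSeconds, double seconds) {
        return baselineSeconds / seconds;
    }

    public static void main(String[] args) {
        String[] machines = {
            "Pentium 3 550 MHz, PC100",
            "Sun Blade 100 500 MHz",
            "G4 400 MHz, 1MB L2, PC100",
            "Pentium III 1 GHz, PC133",
            "Pentium 4 1.7 GHz, Rambus"
        };
        // "user" seconds as reported by time(1) in the runs above.
        double[] user = {8.280, 9.330, 7.000, 6.970, 2.510};

        double baseline = user[0];
        for (int i = 0; i < machines.length; i++) {
            System.out.printf("%-28s %6.3fs  %.2fx%n",
                    machines[i], user[i], speedup(baseline, user[i]));
        }
    }
}
```

On these numbers the Pentium 4 comes out a bit over three times faster than the baseline, while the 1 GHz Pentium III and the G4 land within a few percent of each other.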

Since the current generation Pentium 4s have 512KB level 2 cache, and DDR and Rambus are even faster now than on the test machine I used, it would be interesting to compare a 3.06 GHz Pentium 4 with 512KB level 2 cache and PC2700 DDR memory (or 1066 RDRAM) against the latest Apple offering (Xserve, new Quicksilver, etc.). (I don't have any of these at the moment.) I predict that the Mac will be treated to a big dose of whoop-ass.

and modern RAID controllers boast gigabytes of DRAM and very sophisticated caching algorithms.

Well, $150 Pentium 4 motherboards with RAID 0 and two to four ports are common these days. A 64 bit PCI (not even PCI-X) ATA RAID card will be a couple hundred dollars for a dual Pentium III motherboard. If you want to spend more money, there are server motherboards for Pentium 4s that have the latest-and-greatest SCSI controllers. Server motherboards seem to be four or five times as expensive as mainstream motherboards (e.g. $600).

Btw, another possible benchmark would be SQL server performance. SQL servers use a lot of RAM and have access patterns that are unpredictable (thus stressing the data cache). I tried to find some SQL numbers for the Xserve, but of course Apple advocates rarely bother to publish benchmarks. Reality distortion field and all that...


                 ==================================
  Swarm-Support is for discussion of the technical details of the day
  to day usage of Swarm.  For list administration needs (esp.
  [un]subscribing), please send a message to <address@hidden>
  with "help" in the body of the message.
