bug-coreutils

Re: Degraded performance in cat + patch


From: Jim Meyering
Subject: Re: Degraded performance in cat + patch
Date: Fri, 06 Mar 2009 16:57:52 +0100

Pádraig Brady wrote:
...
> Wow that's interesting. My results are with 400MHz DDR2.
> If I do a simpler test excluding file-system and page cache
> to just show the syscall overhead I can also see the doubling
> of throughput when going from 4KiB to 32KiB buffers:
>
> for i in $(seq 0 10); do
>   bs=$((1024*2**$i))
>   printf "%7s=" $bs
>   dd bs=$bs if=/dev/zero of=/dev/null count=$(((2*1024**3)/$bs)) 2>&1 |
>   sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
> done
>    1024=484 MB/s
>    2048=857 MB/s
>    4096=1.6 GB/s
>    8192=2.4 GB/s
>   16384=3.1 GB/s
>   32768=3.6 GB/s
>   65536=3.6 GB/s
>  131072=3.8 GB/s
>  262144=3.9 GB/s
>  524288=3.9 GB/s
> 1048576=3.9 GB/s
>
> Why I only see a small increase between 4 & 32K buffers when going
> through the file-system and page cache on my kernel, must be due to
> inefficiencies that have subsequently been addressed?

Interesting test.

On the 2-core AMD system (1MB cache per core):

  $ for i in $(seq 0 10); do
      bs=$((1024*2**$i))
      printf "%7s=" $bs
      dd bs=$bs if=/dev/zero of=/dev/null count=$(((2*1024**3)/$bs)) 2>&1 |
      sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
    done
     1024=578 MB/s
     2048=1.1 GB/s
     4096=1.8 GB/s
     8192=2.6 GB/s
    16384=3.2 GB/s
    32768=4.1 GB/s
    65536=4.8 GB/s
   131072=5.2 GB/s
   262144=5.7 GB/s
   524288=5.9 GB/s
  1048576=3.4 GB/s

On the 4-core Intel system (6MB cache per core, faster RAM):

     1024=1.5 GB/s
     2048=2.8 GB/s
     4096=5.0 GB/s
     8192=7.7 GB/s
    16384=10.4 GB/s
    32768=9.6 GB/s
    65536=9.9 GB/s
   131072=10.6 GB/s
   262144=10.7 GB/s
   524288=10.6 GB/s
  1048576=11.2 GB/s
  2097152=10.6 GB/s
  4194304=9.8 GB/s
  8388608=2.6 GB/s
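
The Intel table has 14 rows and ends at 8388608 (1024 * 2^13), so the loop bound for that run was presumably 13 rather than 10. A minimal sketch of the same dd measurement with the upper exponent and the total byte count as parameters follows. The function names block_sizes and dd_sweep are hypothetical and not from this thread; the sed expression is the one used above and assumes GNU dd's "MB/s" or "GB/s" summary line. It uses a shift instead of ** so it also runs under a plain POSIX sh.

```shell
#!/bin/sh
# Sketch only: block_sizes and dd_sweep are illustrative names.

# Print the block sizes 1024 * 2^0 .. 1024 * 2^$1, one per line.
block_sizes() {
  for bs_exp in $(seq 0 "$1"); do
    echo $((1024 << bs_exp))
  done
}

# For each block size, copy $2 bytes (default 2 GiB) from /dev/zero to
# /dev/null and print the throughput figure from dd's stderr summary.
dd_sweep() {
  sweep_total=${2:-$((2 * 1024 * 1024 * 1024))}
  for sweep_bs in $(block_sizes "$1"); do
    printf "%7s=" "$sweep_bs"
    dd bs="$sweep_bs" if=/dev/zero of=/dev/null \
       count=$((sweep_total / sweep_bs)) 2>&1 |
      sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
  done
}

# Example (not run here): 1 KiB through 8 MiB, 2 GiB per block size.
# dd_sweep 13
```

Keeping the total fixed at 2 GiB, as in the loop above, means each row moves the same amount of data and only the number of read/write calls changes.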
