bug-coreutils
Re: digest algorithm performance


From: P
Subject: Re: digest algorithm performance
Date: Tue, 21 Jun 2005 14:20:32 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124

As ever, I haven't had time to look at this yet. Comments below...

Jim Meyering wrote:
address@hidden wrote:
...

OK, I'll have a look. FreeBSD seems to be faster again,
(don't compare these results to the previous mail):


[I've just noticed you used `sha1' below.
 Is that a shell alias or function?
 The program from coreutils is called sha1sum. ]

I was referring to the separate FreeBSD sha1 utility.
Be careful when testing it, as reading from stdin
causes 16KiB reads while reading from a file causes 1KiB reads
(or whatever the block size of the filesystem is, I guess).
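
To illustrate why the read size matters when comparing tools, here is a
minimal Python sketch (not the FreeBSD sha1 code). It hashes the same data
with 1KiB and 16KiB reads and counts the read calls. The helper name
sha1_stream and the 64KiB test buffer are assumptions made for illustration.

```python
import hashlib
import io


def sha1_stream(stream, block_size):
    """Hash a binary stream, reading at most block_size bytes per call.

    Returns the hex digest and the number of non-empty reads performed.
    """
    digest = hashlib.sha1()
    reads = 0
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        digest.update(chunk)
        reads += 1
    return digest.hexdigest(), reads


# Hypothetical 64KiB input held in memory, so no disk I/O is involved.
data = b"x" * (64 * 1024)
small = sha1_stream(io.BytesIO(data), 1024)       # 1KiB reads
large = sha1_stream(io.BytesIO(data), 16 * 1024)  # 16KiB reads
print(small[1], large[1])
```

The digest is identical either way; only the number of read calls changes,
which is why timings taken via stdin and via a file name are not directly
comparable.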

FYI, here are a few more data points:

  On an AMD-64 system, using a 700MB file on a tmpfs file system
  (and enough RAM so that no actual disk reads are performed),
  GNU md5sum is slightly faster than `openssl md5', e.g.:

    2.38s user 0.38s system 100% cpu 2.756 total  (gnu md5sum)
    vs.
    2.52s user 0.34s system 100% cpu 2.869 total  (openssl md5)

  However, `openssl sha1' is about 5% faster than GNU sha1sum:

    3.32s user 0.33s system 99% cpu 3.653 total   (openssl sha1)
    3.45s user 0.39s system 99% cpu 3.843 total   (gnu sha1sum)
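
As a rough check on those percentages, the wall-clock totals quoted above
can be compared directly. This is only a sketch; the helper name
percent_slower is an assumption, and it measures the slower run relative
to the faster one.

```python
def percent_slower(fast_seconds, slow_seconds):
    """Return how much longer the slower run took, as a percentage of the faster run."""
    return (slow_seconds - fast_seconds) / fast_seconds * 100.0


# Wall-clock totals taken from the timings quoted above.
md5_gap = percent_slower(2.756, 2.869)    # gnu md5sum vs. openssl md5
sha1_gap = percent_slower(3.653, 3.843)   # openssl sha1 vs. gnu sha1sum
print(f"openssl md5 took {md5_gap:.1f}% longer than gnu md5sum")
print(f"gnu sha1sum took {sha1_gap:.1f}% longer than openssl sha1")
```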

The above are using the debian-sid (amd_64 alioth) binaries from
coreutils-5.2.1.  When I compile the latest (coreutils-cvs) with
gcc-4.0 -O3, I get slightly (2-3%) better sha1sum performance,
and a ~7% *decrease* in performance for md5sum.  I suspect that
with the right compiler options you can do much better.

Have a look at http://www.pixelbeat.org/scripts/gcccpuopt

Note the openssl sha1 performance is highly dependent on CPU.
The following is a comment in openssl 0.9.7:

"""
It was noted that Intel IA-32 C compiler generates code which
performs ~30% *faster* on P4 CPU than original *hand-coded*
SHA1 assembler implementation. To address this problem (and
prove that humans are still better than machines:-), the
original code was overhauled, which resulted in following
performance changes:

              compared with original  compared with Intel cc
              assembler impl.         generated code
Pentium       -16%                    +48%
PIII/AMD      +8%                     +16%
P4            +85%(!)                 +45%
"""

I did a little testing of various sha1 implementations
on my Fedora Core 3 1.6GHz P4 laptop and got:

openssl-0.9.7a-40               = 0.220s
sha1sum                         = 0.350s
SteveReid (-march=pentium4 -O3) = 0.334s
gladman   (-march=pentium4 -O3) = 0.310s
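
For anyone wanting to reproduce this kind of comparison without touching
the disk, a minimal in-memory timing harness might look like the sketch
below. It is not how the figures above were obtained. It uses Python's
hashlib, and the helper name time_digest, the 8MiB buffer and the repeat
count are all assumptions chosen for illustration.

```python
import hashlib
import time


def time_digest(name, data, repeats=3):
    """Return the best-of-N wall-clock time to hash data with the named algorithm."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).hexdigest()
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical 8MiB zero-filled buffer, so only hashing cost is measured.
buffer = bytes(8 * 1024 * 1024)
timings = {name: time_digest(name, buffer) for name in ("md5", "sha1")}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s for {len(buffer)} bytes")
```

Taking the best of several runs reduces noise from other processes; the
absolute numbers will depend on the CPU and on how the digest library was
built.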

Next to compare is the BSD sha1 code...


--
Pádraig Brady - http://www.pixelbeat.org