bug-coreutils

Re: Taking advantage of L1 and L2 cache in sort


From: Pádraig Brady
Subject: Re: Taking advantage of L1 and L2 cache in sort
Date: Wed, 03 Mar 2010 01:06:06 +0000
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100216 Thunderbird/3.0.2

On 02/03/10 18:20, Chen Guo wrote:
> This is exactly what that guy Shaun Jackman was talking about earlier.
> I'm actually really surprised this is faster. If I can dig up his e-mail I'll
> forward him this; I remember him saying something about experimenting
> with exactly this.

I missed that thread, but yes, he had pretty much the
same idea that I stumbled on when trying to perturb
the posix_fadvise() testing by changing the buffer size.
http://lists.gnu.org/archive/html/bug-coreutils/2010-02/msg00151.html
Spooky :)

Shaun, you can use `taskset` to set process affinity BTW.
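
For anyone following along, a minimal sketch of pinning a benchmark run to one CPU. The `run_on_cpu` wrapper is a hypothetical helper, not something from coreutils; it assumes a `taskset` that accepts a hex affinity mask as its first argument (0x1 selects CPU 0), and it falls back to an unpinned run when `taskset` is not installed.

```shell
# Hypothetical helper: run a command with its CPU affinity set to MASK.
# MASK is a hex bitmask as taken by taskset (0x1 = CPU 0, 0x2 = CPU 1, ...).
run_on_cpu() {
    mask=$1
    shift
    if command -v taskset >/dev/null 2>&1; then
        # Pin the command to the CPUs selected by the mask.
        taskset "$mask" "$@"
    else
        # No taskset available: run unpinned rather than fail.
        "$@"
    fi
}
```

Something like `run_on_cpu 0x1 ./src/sort -S1M sort.t/sort.1.test > /dev/null` would then keep the whole sort on CPU 0, so cache effects are not blurred by the scheduler migrating the process between cores.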

> Can you profile the difference in the number of I/O system calls?

$ TMPDIR=/ram LANG=C /usr/bin/time -v strace -c ./src/sort sort.t/sort.1.test > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 70.70    0.283077       21775        13           read
 28.97    0.115983       19331         6           munmap
  0.32    0.001268           0     21609           write
  0.01    0.000054           8         7           open
  0.00    0.000000           0         9           close
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1         1 ioctl
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         5           mprotect
  0.00    0.000000           0        25           rt_sigaction
  0.00    0.000000           0         1           rt_sigprocmask
  0.00    0.000000           0         4           getrlimit
  0.00    0.000000           0        16           mmap2
  0.00    0.000000           0         9           fstat64
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           fadvise64_64
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.400382                 21717         3 total
        Command being timed: "strace -c ./src/sort sort.t/sort.1.test"
        User time (seconds): 26.91
        System time (seconds): 2.01
        Percent of CPU this job got: 90%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.02
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 3
        Minor (reclaiming a frame) page faults: 181060
        Voluntary context switches: 87362
        Involuntary context switches: 2526
        Swaps: 0
        File system inputs: 173504
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

$ TMPDIR=/ram LANG=C /usr/bin/time -v strace -c ./src/sort -S1M sort.t/sort.1.test > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 38.95    0.035011           1     60991           read
 33.47    0.030081          90       334           unlink
 24.17    0.021721           0     81864           write
  2.07    0.001862           2      1006           munmap
  0.75    0.000670           1       673           open
  0.23    0.000209           0      1016           mmap2
  0.19    0.000167           0       675           fstat64
  0.09    0.000085           0       675           close
  0.07    0.000062           0       334           fcntl64
  0.02    0.000018           0      1337           rt_sigprocmask
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1         1 ioctl
  0.00    0.000000           0         1           gettimeofday
  0.00    0.000000           0         1           uname
  0.00    0.000000           0         5           mprotect
  0.00    0.000000           0       334           _llseek
  0.00    0.000000           0        25           rt_sigaction
  0.00    0.000000           0         1           getrlimit
  0.00    0.000000           0         2         1 futex
  0.00    0.000000           0         1           set_thread_area
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0       335           fadvise64_64
  0.00    0.000000           0         1           set_robust_list
------ ----------- ----------- --------- --------- ----------------
100.00    0.089886                149618         3 total
        Command being timed: "strace -c ./src/sort -S1M sort.t/sort.1.test"
        User time (seconds): 21.76
        System time (seconds): 4.51
        Percent of CPU this job got: 98%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:26.79
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 3
        Minor (reclaiming a frame) page faults: 23038
        Voluntary context switches: 598317
        Involuntary context switches: 2316
        Swaps: 0
        File system inputs: 173504
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
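
To line the two summaries up per syscall, here is a rough sketch. `compare_calls` is a hypothetical helper, and it assumes each `strace -c` summary has been saved to its own file in the column layout shown above (the syscall name is the last field and the call count is the fourth).

```shell
# Hypothetical helper: print "syscall  calls_in_file1  calls_in_file2"
# for two saved `strace -c` summaries, sorted by syscall name.
compare_calls() {
    awk '
        # Data rows start with a percentage such as 70.70; skip the header,
        # the dashed rules, the "total" row and any other output.
        $1 ~ /^[0-9]+\.[0-9]+$/ && $NF != "total" {
            if (FNR == NR) a[$NF] = $4; else b[$NF] = $4
            seen[$NF] = 1
        }
        # A syscall missing from one file prints as 0 for that file.
        END { for (s in seen) printf "%-16s %8d %8d\n", s, a[s], b[s] }
    ' "$1" "$2" | sort
}
```

With the summaries saved via strace's `-o` option (say to `default.txt` and `s1m.txt`, both names made up here), `compare_calls default.txt s1m.txt` puts the counts side by side: for the runs above, read goes from 13 to 60991 calls and write from 21609 to 81864, while unlink appears only in the -S1M run.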

cheers,
Pádraig.



