[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] sort: parallel external sort implementation
From: |
Pádraig Brady |
Subject: |
Re: [PATCH] sort: parallel external sort implementation |
Date: |
Thu, 04 Mar 2010 12:06:19 +0000 |
User-agent: |
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100216 Thunderbird/3.0.2 |
On 04/03/10 06:01, Joey Degges wrote:
Hello,
Matt (cc'd, not on list) and I have developed a patch that parallelizes sort
using pthreads. The threading approach taken here is to break up input files
into groups based on physical device and sort each of those groups in
parallel.
This allows for storage and processing resources to be more fully utilized
during the sort process.
multidisk sort with 4x 50M inputs
$ time -v sort_multidisk /q/0/50M /q/1/50M /q/2/50M /q/3/50M> /dev/null
User time (seconds): 10.89
System time (seconds): 0.83
Percent of CPU this job got: 180%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:06.50
Major (requiring I/O) page faults: 20
Minor (reclaiming a frame) page faults: 72843
Voluntary context switches: 1936
Involuntary context switches: 149
File system inputs: 393080
current sort with 4x 50M inputs
$ time -v sort_current /q/0/50M /q/1/50M /q/2/50M /q/3/50M> /dev/null
User time (seconds): 11.05
System time (seconds): 0.32
Percent of CPU this job got: 86%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:13.19
Major (requiring I/O) page faults: 18
Minor (reclaiming a frame) page faults: 72532
Voluntary context switches: 915
Involuntary context switches: 48
File system inputs: 198936
diff --git a/src/sort.c b/src/sort.c
+ --threads=N use no more than N threads to improve
parallelism\n\
Threads is a bit of an implementation detail?
Perhaps it would be better to use --parallel so one
could move to multiprocess in future if desired?
Since this is a new option, it requires documentation
in docs/coreutils.texi which would include examples
of how and why you would use this option.
+
+/* Sort NFILES FILES onto OUTPUT_FILE.
+
+ Threading approach: Break FILES up into several groups where each contains
+ only files that can be found on the same physical device (according to
+ device_cmp()). Spawn threads to execute do_sort() on each group of files in
+ parallel.
+
+ This allows for all concerned resources (storage devices and processors) to
+ be more fully utilized.
+*/
+
+static void
+sort_multidisk (char * const *files, size_t nfiles, char const *output_file,
+ unsigned long int nthreads)
Have you considered the seek characteristics of SSDs
and how they might affect things (with consideration
that mechanical disks will be replaced by SSDs quite quickly).
There still would be some benefit splitting per SSD,
but it would be worth measuring to see.
@@ -3691,7 +3996,21 @@ main (int argc, char **argv)
IF_LINT (free (sortfiles));
}
else
- sort (files, nfiles, outfile);
+ {
+ if (!nthreads)
+ {
+ /* The default NTHREADS is 2 ** floor (log2 (number of processors)).
+ If comparisons do not vary in cost and threads are
+ scheduled evenly, with the current algorithm there is no
+ performance advantage to using a number of threads that
+ is not a power of 2. */
+ unsigned long int np2 = num_processors (NPROC_CURRENT) / 2;
+ for (nthreads = 1; nthreads <= np2; nthreads *= 2)
+ continue;
+ }
You probably want NPROC_CURRENT_OVERRIDABLE ?
cheers,
Pádraig.