
Re: Job Processing Was RE: Parallel Merge


From: Ole Tange
Subject: Re: Job Processing Was RE: Parallel Merge
Date: Wed, 24 Aug 2011 11:10:37 +0200

On Wed, Aug 24, 2011 at 4:06 AM, Nathan Watson-Haigh
<nathan.watson-haigh@awri.com.au> wrote:
>> From: ole.tange@gmail.com
>> On Tue, Aug 23, 2011 at 8:38 AM, Nathan Watson-Haigh
>> <nathan.watson-haigh@awri.com.au> wrote:
:
>> I often face this problem as well. Starting more processes normally
>> solves the problem for me. I believe the issue is that while the disk
>> I/O is below maximum capacity on average, there are spikes where it is
>> over capacity (e.g. every time it has to seek). During these spikes
>> the CPU is waiting for data to process. By having more processes
>> running than processors the "extra" processes can buffer up some input
>> which can then be processed when there is idle CPU time.
>>
>> So: Try starting twice as many processes (-j 200%).
>
> I'm about to try this out, but your comments raised a couple of questions
> for me:
>
> Does this introduce the concept of a queue length vs the number of
> processes that can be run at any one time?

No. There is no such concept.

> Are there options for
> specifying these separately so I could specify only a subset of
> available cores to be used, but still have parallel buffer up the extra
> input in the way that you say?

It is not parallel that buffers up extra input: It is the processes
themselves. Parallel simply runs more processes in parallel.

Let us say we have 1 core and we run 2 processes (P1, P2).

If P1 is blocked because it is waiting for I/O, maybe P2 is not
waiting for I/O and can be run. If P1 is blocked, P2 will
automatically get 100% of the CPU time. UNIX does that for you - no
action needed on your part. If neither of them is waiting for I/O,
each will get 50%, and if both are waiting for I/O, neither of them
will run.

If you have even more processes, chances are that at least one of
them is not waiting for I/O and can thus use the CPU. Of course this
only works if there are periods where the I/O can keep up. If I/O is
the bottleneck all the time, starting even more processes will
usually not help.
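
For example (the command and file names here are only made up for
illustration - they are not from your setup), oversubscribing the
cores is just a matter of asking parallel for more jobs than you have
cores:

  # Run one job per CPU core (the default, same as -j 100%):
  parallel gzip ::: *.fastq

  # Run twice as many jobs as there are CPU cores, so a job that is
  # blocked on I/O can be replaced by one that is ready to run:
  parallel -j 200% gzip ::: *.fastq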

>> Low throughput on disks can also be due to disk seeks.  Until recently
>> I did not know of a tool that could detect disk seeks, but I have now
>> found iostat. The '%util' column is very useful to see how busy the
>> disk is:
>>
>>   iostat -xd 1

I forgot to mention that iostat has to be fairly new. Version 8.1.2 is OK.
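
If you are unsure which version you have, iostat can tell you (a
simple check; the exact output differs between systems):

  iostat -V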

Here is a (slightly edited) example of the output:

Linux 2.6.38-10-generic (ole-laptop)    2011-08-24      _x86_64_        (8 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,72     8,67    1,56    2,97   120,24   125,11   108,27     1,02  225,73   72,26  306,36   3,38   1,53

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     1,00    0,00   11,00     0,00    48,00     8,73     0,00    0,00    0,00    0,00   0,00   0,00
sda               0,00     2,00    0,00   22,00     0,00    92,00     8,36     0,25   11,36    0,00   11,36   5,00  11,00
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sda               0,00     0,00   73,00    0,00   292,00     0,00     8,00     0,22    3,01    3,01    0,00   2,33  17,00
sda               0,00    18,00    1,00  313,00     4,00  1408,00     8,99   146,87  497,42   50,00  498,85   3,18 100,00
sda               0,00    38,00    1,00  567,00     4,00  4352,00    15,34   141,49  254,52  240,00  254,55   1,76 100,00
sda               0,00     0,00  181,00  218,00   724,00  1024,00     8,76    45,50  161,05    9,89  286,56   2,46  98,00
sda               0,00     0,00  401,00    5,00  1604,00     5,00     7,93     0,66    1,63    1,65    0,00   1,63  66,00
sda               0,00   243,00    0,00  740,00     0,00  4076,00    11,02   130,94  189,28    0,00  189,28   1,35 100,00
sda               0,00     0,00    0,00   25,00     0,00   160,00    12,80     0,49  241,60    0,00  241,60   1,60   4,00
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00
sda               0,00     0,00    0,00    0,00     0,00     0,00     0,00     0,00    0,00    0,00    0,00   0,00   0,00

It starts out on an idle system. Then I run 'find /' for a few
seconds and then it is idle again. Notice that the amount of data
transferred is not very high, but %util is still 60-100%. That is an
indication of seeks.
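
If you do not want to eyeball the numbers, a rough sketch like the
following can flag the busy samples (it assumes the device is called
sda and that %util is the last column, which it is in the sysstat
versions I have seen; adjust the 90% threshold to taste):

  # Print only the samples where sda is more than 90% busy:
  iostat -xd 1 | awk '$1 == "sda" && $NF+0 > 90'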


/Ole


