Re: [Bug-gnubg] Re: Getting gnubg to use all available cores


From: Louis Zulli
Subject: Re: [Bug-gnubg] Re: Getting gnubg to use all available cores
Date: Thu, 6 Aug 2009 14:58:59 -0400 (EDT)

Hi Ingo,

I don't use gnubg for analysis. I just happened to notice that gnubg seems to use just one core on my machine, which has eight physical cores. No matter how many threads I configure, they all run on one core (or maybe two).

When I use the chess program crafty to analyze positions, and I have it use 8 threads for evaluation, the top command shows CPU%=800%. If I have it use 16 threads, the top command shows 1600% (which must mean that 16 virtual cores are being used, via hyperthreading).

My point is simply that threads are not being distributed among cores on my new Nehalem Mac Pro.

I'm hoping someone will help me understand why, just so I know.

Louis



----- Original Message -----
From: "Ingo Macherius" <address@hidden>
To: "Louis Zulli" <address@hidden>
Cc: "Michael Petch" <address@hidden>, address@hidden
Sent: Thursday, August 6, 2009 2:43:31 PM GMT -05:00 US/Canada Eastern
Subject: RE: [Bug-gnubg] Re: Getting gnubg to use all available cores

I've done two experiments on Debian 5.0.2 on my 2xXeon 5130 box (4 cores in
2 chips).

A) Run 4 instances of gnubg, each analyzing five 7-point matches (real FIBS
matches)
B) Run 1 instance of gnubg with 4 threads, analyzing the same matches 4
times each (with caches flushed in between)

=> The amount of work done is equal: each experiment analyzes 20 7-point
matches.

Run A takes about 10.3 secs real time (wallclock), Run B takes about 13.2
secs. These are values averaged over several runs; there were no significant
outliers.
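
As a quick check of what those two timings imply, here is a small sketch
that turns them into throughput and relative slowdown. The function name
compare_runs is made up for illustration; the numbers are the ones quoted
above.

```shell
#!/bin/bash
# Wall-clock times reported above (seconds) and the total number of matches.
run_a=10.3   # 4 independent instances, one per core
run_b=13.2   # 1 instance using 4 threads
matches=20

# compare_runs A_SECONDS B_SECONDS MATCHES  (hypothetical helper)
compare_runs() {
  awk -v a="$1" -v b="$2" -v n="$3" 'BEGIN {
    printf "A: %.2f matches/s\n", n / a
    printf "B: %.2f matches/s\n", n / b
    printf "B takes %.1f%% longer than A\n", (b - a) / a * 100
  }'
}

compare_runs "$run_a" "$run_b" "$matches"
```

So the threaded run needs roughly 28% more wall-clock time for the same
work, which is the penalty discussed further down.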

I had the "top" and "mpstat" commands running while both experiments took
place. I've noticed that both seem to smooth the % CPU usage they display
over a sliding window. When you start either A or B, the displayed % usage
stays low for a few seconds, gradually goes up, peaks shortly before the
run ends, and then falls to zero again. In other words: the % displayed
is an average over the last n seconds, not the current usage.

So gnubg is utilizing the cores well in both cases; what you see is the
"beautifying" effect of top. Run a longer batch, one that runs for, say,
several minutes, and you will see the 100% CPU usage (Case A) or near 90%
CPU usage (Case B) you expect.
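
To illustrate the effect, here is a toy sketch of a sliding-window average
applied to a short burst of full load. It assumes the display behaves like
a simple moving average, as described above; it is not a statement about
how top or mpstat are actually implemented, and the function name smooth,
the window size and the sample values are all made up.

```shell
#!/bin/bash
# smooth WINDOW: read one usage sample per line on stdin and print
# "sample_number actual_usage displayed_average" for each sample.
smooth() {
  awk -v w="$1" '{
    buf[NR % w] = $1                 # ring buffer of the last w samples
    n = (NR < w) ? NR : w            # samples seen so far, capped at w
    sum = 0
    for (i = 0; i < w; i++) sum += buf[i]
    printf "%d %d %.0f\n", NR, $1, sum / n
  }'
}

# Two idle samples, five samples at 100% load, then idle again.
printf '%s\n' 0 0 100 100 100 100 100 0 0 | smooth 5
```

The third column climbs gradually while the load is already at 100%, only
reaches 100 on the last busy sample, and is still well above zero after the
work has stopped, which matches the behaviour described above for short
runs.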

The fact that threaded gnubg cannot utilize the cores as effectively as 4
parallel non-threaded gnubgs can be explained by these assumptions of mine:
- The system scheduler does a better job than gnubg's
- There is overhead for threading even in multithreaded binaries: a gnubg
compiled without threading is some % faster than one compiled with threading
but running in one thread only.
- Thread synchronization causes some amount of idle time

So if you want to make the best use of your CPU, run as many gnubg instances
in parallel as there are cores. Compile them without threading support.
Split the work to be analyzed into batches manually. If you prefer the more
convenient option of letting gnubg do the scheduling of your batches, accept
the reasonable penalty for that.
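
Splitting the work by hand could look roughly like the sketch below. The
helper make_batches and its output file names are hypothetical; the gnubg
commands it writes are the same ones used in the batch in the P.S., and
the match files are simply dealt out round-robin, one command file per
instance.

```shell
#!/bin/bash
# make_batches OUTDIR N FILE...  (hypothetical helper)
# Writes N gnubg command files OUTDIR/batch0 .. OUTDIR/batch(N-1) and
# distributes the given match files over them round-robin.
make_batches() {
  local outdir=$1 n=$2
  shift 2
  local i f
  for ((i = 0; i < n; i++)); do
    echo "set cache 131072" > "${outdir}/batch${i}"
  done
  i=0
  for f in "$@"; do
    {
      echo "import mat ${f}"
      echo "analyze match"
    } >> "${outdir}/batch$((i % n))"
    i=$((i + 1))
  done
}

# Usage sketch (assumes a non-threaded binary ./gnubg-nt and match files
# in the current directory):
#   outdir=$(mktemp -d)
#   make_batches "$outdir" 4 *.mat
#   for b in "$outdir"/batch*; do ./gnubg-nt < "$b" > /dev/null & done
#   wait
```

Round-robin is the simplest split; if the matches differ a lot in length,
the instances will finish at different times, so sorting the files by size
first may balance the load better.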

Ingo

P.S. The batch I've used to run 4 in parallel is

#!/bin/bash

# Command file that every gnubg instance reads on stdin.
TMPBATCH=/tmp/gnubgbatch

cat > "${TMPBATCH}" <<'EOF'
set cache 131072
clear cache
clear hint
analysis clear
import mat You_vs_silent_greek_20090802162313580.mat
analyze match
import mat You_vs_silent_greek_20090521010117727.mat
analyze match
import mat You_vs_sale_20090803190848830.mat
analyze match
import mat You_vs_fortuna_20090802225906909.mat
analyze match
import mat You_vs_VaGrant_20090103173459437.mat
analyze match
EOF

# Start four non-threaded instances in parallel, then wait for all of
# them, so that timing the script covers the whole run.
for t in 1 2 3 4
do
  ./gnubg-nt < "${TMPBATCH}" > /dev/null &
done
wait

> -----Original Message-----
> From: Louis Zulli [mailto:address@hidden
> Sent: Thursday, August 06, 2009 6:29 PM
> To: Ingo Macherius
> Cc: 'Michael Petch'; address@hidden
> Subject: Re: [Bug-gnubg] Re: Getting gnubg to use all available cores
>
>
> Hi,
>
> I put
>
> #define MAX_NUMTHREADS 64
>
> in multithread.h and rebuilt.
>
> In Settings-->Options-->Other, I put Eval Threads to 64.
>
> I then let gnubg analyze a game using 4-ply analysis.
>
> According to my unix top command, gnubg had 69 threads and was using
> 188% CPU. So apparently all the threads were running (into each other!)
> in one physical core.
>
> In any case, increasing the max number of threads above 16 seems  
> trivial to do, unless I'm missing something.
>
> Louis
>
>
> On Aug 6, 2009, at 11:34 AM, Ingo Macherius wrote:
>
> > Do you use the calibrate command or a batch analysis of matchfiles?
> > The former was shown to be of no value for benchmarks, see here:
> > http://lists.gnu.org/archive/html/bug-gnubg/2009-08/msg00006.html
> >
> > With calibrate I had the very same effect of high idle times during
> > benchmarks, unless I used at least 8 threads per physical core.
> >
> > I am running a benchmark on a 4 core machine which iterates over the
> > number of threads (1..6) and the cache size (2^1 .. 2^27). It should
> > be posted in, say, 3 hours; it literally is still running :)
> >
> > Ingo
> >
> >> -----Original Message-----
> >> From: address@hidden
> >> [mailto:address@hidden On Behalf Of
> >> Louis Zulli
> >> Sent: Thursday, August 06, 2009 3:21 PM
> >> To: Michael Petch
> >> Cc: address@hidden
> >> Subject: [Bug-gnubg] Re: Getting gnubg to use all available cores
> >>
> >>
> >>
> >> On Aug 5, 2009, at 4:02 PM, Michael Petch wrote:
> >>
> >>> I'm unsure how the architecture is deployed and how OS/X handles the
> >>> physical cores, but it almost sounds like one physical core is being
> >>> used (using Hyperthreads to run 2 threads simultaneously). I wonder
> >>> if the memory is shared across all the cores? A friend of mine was
> >>> suggesting that people may have to wait for Snow Leopard to come out
> >>> before OS/X properly utilizes the Nehalem architecture (whether that
> >>> is true or not, I don't know).
> >>>
> >>> Anyway, as an experiment: if you run 2 copies of gnubg at the same
> >>> time (using multiple threads), do you get 400% CPU usage?
> >>>
> >>
> >>
> >> Hi Mike,
> >>
> >> Sorry for the delay. I just had two copies of gnubg analyze the same
> >> game, using 3 ply analysis. Each instance of gnubg used 200% CPU. Each
> >> copy was set to use 4 evaluation threads.
> >>
> >> So what's the verdict here? Is Leopard simply not directing threads
> >> correctly?
> >>
> >> Louis
> >>
> >
>