Re: Parallelisation
From: Rudolf Weeber
Subject: Re: Parallelisation
Date: Thu, 13 Jan 2022 11:18:25 +0100
Hi Ahmad,
On Wed, Jan 12, 2022 at 01:35:35PM +0000, Ahmad Reza Motezakker wrote:
> I have a suspension of polymers coupled with fluid. (LJ+LB)
>
> Here are the parameters:
>
> box_l = 300*sigma (box is a cube)
> number of polymers = 300
> beads per polymer = 26
> All the particles = 300*26 = 7800
> LJ cut = sigma*(2**(1/6))
> l_skin = 8.3*sigma (set to this to have 31 cells in each direction)
> LB cells = 50
> number of cells in each direction = 31
>
> Timing for 100 productive runs after setting up the system and warming up:
>
> 1 core   17.824 s
> 2 cores  16.22 s
> 4 cores  15.93 s
> 8 cores  17.83 s
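As a quick sanity check on the numbers quoted above, here is a minimal Python sketch. It assumes sigma = 1 (reduced units) and that a cell has to be at least as wide as the LJ cutoff plus the skin; both are my assumptions, not something taken from your script. It reproduces the cell count and turns the timings into speedup and parallel efficiency relative to the 1-core run.

```python
import math

sigma = 1.0  # assumption: reduced LJ units

# Parameters as quoted above
box_l = 300 * sigma
lj_cut = sigma * 2 ** (1 / 6)
skin = 8.3 * sigma

# Assumed rule: a cell must be at least (cutoff + skin) wide,
# so the number of cells per direction is the floor of box_l / range.
interaction_range = lj_cut + skin
cells_per_direction = math.floor(box_l / interaction_range)
print(f"cells per direction: {cells_per_direction}")

# Wall-clock timings in seconds, keyed by number of cores
timings = {1: 17.824, 2: 16.22, 4: 15.93, 8: 17.83}


def speedup_and_efficiency(timings):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run."""
    t1 = timings[1]
    return {n: (t1 / t, t1 / (t * n)) for n, t in timings.items()}


results = speedup_and_efficiency(timings)
for n, (speedup, efficiency) in results.items():
    print(f"{n} core(s): speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

A speedup close to 1 on every core count means the run is not limited by the part of the work that is being distributed.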
Can you please report the timings obtained via
  mpirun -np 4 ./pypresso ../maintainer/benchmarks/lb.py \
    --particles_per_core=20000 --lb_sites_per_particle 6
These are 80k particles with a 78^3 LB, so slightly bigger than your system. I
get about 80 ms per time step on an AMD Ryzen Threadripper 1920X with 12 cores.
You can also check on 8 cores by using -np 8 and --particles_per_core=10000.
On my system, going from 4 to 8 cores is not worth it.
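The system size quoted for the benchmark follows from the command-line values. The sketch below shows the arithmetic, assuming the total particle count is cores times particles per core and the LB grid is a cube holding particles times LB sites per particle; the function name is made up for illustration and is not part of the benchmark script.

```python
def benchmark_size(n_cores, particles_per_core, lb_sites_per_particle):
    """Return (total particles, LB grid points per direction).

    Hypothetical helper: assumes a cubic LB grid whose total number of
    sites is n_particles * lb_sites_per_particle.
    """
    n_particles = n_cores * particles_per_core
    n_lb_sites = n_particles * lb_sites_per_particle
    lb_side = n_lb_sites ** (1 / 3)
    return n_particles, lb_side


# -np 4 with --particles_per_core=20000
n4, side4 = benchmark_size(4, 20000, 6)
# -np 8 with --particles_per_core=10000 gives the same total system
n8, side8 = benchmark_size(8, 10000, 6)

print(f"4 cores: {n4} particles, LB side ~{side4:.1f}")
print(f"8 cores: {n8} particles, LB side ~{side8:.1f}")
```

Both invocations describe the same total system, so the two timings can be compared directly.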
You can get a significantly faster simulation by using the GPU LB. The speedup
relies partially on the fact that GPUs are very well suited for LB (e.g.
because of high memory bandwidth) but also on the fact that Espresso's GPU LB
uses single precision, whereas the CPU one uses double precision.
> If I want to get one node with 128 cores on cluster and only use 4 of them, the
> cluster support will not be happy.
On some clusters, it is possible to request just a part of a node (shared node
usage). Otherwise, it may be possible to run several instances of Espresso at
the same time on the cluster node.
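To illustrate the second option, here is a rough Python sketch that builds one mpirun command per instance, so that a 128-core node would hold 32 independent 4-core runs. The script name my_simulation.py and the instance index passed as its last argument are hypothetical; how you distinguish the runs (seed, output directory) depends on your script.

```python
import subprocess


def build_commands(n_instances, cores_per_instance, script,
                   pypresso="./pypresso", mpi_args=()):
    """Build one mpirun command line per independent instance.

    The trailing instance index is a hypothetical argument that the
    simulation script would have to parse itself.
    """
    commands = []
    for i in range(n_instances):
        commands.append(
            ["mpirun", "-np", str(cores_per_instance), *mpi_args,
             pypresso, script, str(i)]
        )
    return commands


def launch_all(commands):
    """Start all instances at once and wait; returns their exit codes."""
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in processes]


# 128 cores / 4 cores per instance = 32 instances
commands = build_commands(128 // 4, 4, "my_simulation.py")
print(len(commands), "instances, first one:", " ".join(commands[0]))
# launch_all(commands)  # uncomment inside the actual batch job
```

Depending on the MPI implementation and the batch system, independent mpirun calls may end up bound to the same cores, so it is worth checking the core binding with your cluster's documentation before relying on this.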
Regards, Rudolf