espressomd-users

Re: Discussion: Switching Espresso to shared memory parallelization


From: Ivan Cimrak
Subject: Re: Discussion: Switching Espresso to shared memory parallelization
Date: Fri, 6 Aug 2021 08:59:42 +0200

Dear Rudolf, fellow Espresso developers,


There is a module, PyOIF, that enables simulation of the biomechanics of cells in
ESPResSo. For single-cell simulations the code runs perfectly on a single CPU,
and for dilute cell suspensions it is possible to run the code on a dozen cores.
We also do many-cell simulations, and for these we use clusters involving
hundreds (possibly thousands) of cores.

So for the use case of simulating many-cell systems, the transition to shared
memory parallelization would make such simulations impossible.


Best regards,
Ivan

> On 5 Jul 2021, at 17:55, Rudolf Weeber <weeber@icp.uni-stuttgart.de> wrote:
> 
> Dear Espresso users,
> 
> We are currently discussing switching Espresso's parallelization from
> MPI-based to shared-memory-based. This should result in better parallel
> performance and much simpler code. However, it would mean that a single
> instance of Espresso would only run on a single machine, which on current HPC
> systems typically means between 20 and 64 cores.
> 
> To help with the decision, we would like to know if anyone runs simulations
> using Espresso on more than 64 cores, and what kind of simulations those are.
> Please see the technical details below and let us know what you think.
> 
> Regards, Rudolf
> 
> 
> # Technical details
> 
> ## Parallelization paradigms
> 
> With MPI-based parallelization, information is passed between processes by
> explicitly sending messages. This means that the data (such as particles at
> the boundary of a processor's domain) has to be packed, sent, and unpacked.
> In shared memory parallelization, all processes have access to the same data.
> It is only necessary to ensure that no two processes write to the same data at
> the same time. So, delays for packing, unpacking, and sending the data can
> mostly be avoided.
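>
> As an illustration only (not Espresso code), here is a minimal sketch of the
> two paradigms using mpi4py and NumPy; the array sizes and the "boundary
> layer" are made up for the example:
>
> ```python
> # Distributed memory (MPI): each rank owns its particles; boundary data
> # must be packed into a buffer, sent, and unpacked on the receiving side.
> # Run with e.g. `mpirun -n 2 python sketch.py` (requires mpi4py, NumPy).
> import numpy as np
> from mpi4py import MPI
>
> comm = MPI.COMM_WORLD
> rank = comm.Get_rank()
>
> local_positions = np.random.rand(1000, 3)  # private to this rank
>
> if rank == 0:
>     # Pack the boundary particles into a contiguous buffer and send it.
>     boundary = np.ascontiguousarray(local_positions[:100])
>     comm.Send(boundary, dest=1, tag=0)
> elif rank == 1:
>     # Allocate a receive buffer and unpack the ghost particles.
>     ghosts = np.empty((100, 3))
>     comm.Recv(ghosts, source=0, tag=0)
> ```
>
> ```python
> # Shared memory: all workers see the same arrays, so no packing or message
> # passing is needed; it suffices to make sure that no two workers write to
> # the same elements at once (here each thread gets a disjoint slice).
> import numpy as np
> from concurrent.futures import ThreadPoolExecutor
>
> positions = np.random.rand(4000, 3)   # one array, visible to every thread
> velocities = np.random.rand(4000, 3)
>
> def integrate(bounds, dt=0.01):
>     lo, hi = bounds
>     positions[lo:hi] += dt * velocities[lo:hi]
>
> chunks = [(i, i + 1000) for i in range(0, 4000, 1000)]
> with ThreadPoolExecutor(max_workers=4) as pool:
>     list(pool.map(integrate, chunks))
> ```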
> 
> ## Reasons for not using MPI
> 
> * Adding new features to Espresso will be easier, because a lot of non-trivial
> communication code does not have to be written.
> * The mix of controller-agent and synchronous parallelization used by Espresso
> is hard for new developers to understand, which raises the barrier to getting
> started with Espresso coding. This parallelization scheme is a result of
> Espresso being controlled by a (Python) scripting interface.
> * The MPI and Boost::MPI dependencies complicate Espresso's installation and
> make it virtually impossible to run Espresso on public Python platforms such
> as Azure Notebooks or Google Colab, or to build Espresso natively on Windows.
> * The core team had to spend considerable time handling bugs in the MPI and
> Boost::MPI dependencies that affected Espresso.
> * Writing and validating MPI-parallel code is difficult. We had a few
> instances of data not being correctly synchronized across MPI processes which
> went unnoticed. In one instance, we were, after a lot of effort, not able to
> solve the issue and had to disable a feature for MPI-parallel simulations.
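>
> A hypothetical sketch of this class of bug (not taken from the actual code):
> in a controller-agent setup the Python interface talks to rank 0 only, so
> every parameter change has to be propagated explicitly; if a broadcast is
> forgotten, the other ranks silently keep a stale value.
>
> ```python
> # Run with e.g. `mpirun -n 4 python sketch.py` (requires mpi4py).
> from mpi4py import MPI
>
> comm = MPI.COMM_WORLD
> rank = comm.Get_rank()
>
> time_step = 0.01          # initial value, identical on all ranks
>
> if rank == 0:
>     time_step = 0.001     # the controller (rank 0) changes the parameter ...
>
> # ... and without this broadcast the other ranks would keep using 0.01,
> # exactly the kind of silent desynchronization described above.
> time_step = comm.bcast(time_step, root=0)
>
> print(f"rank {rank}: time_step = {time_step}")
> ```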
> 
> ## Advantages of supporting MPI
> 
> Simulations can run on more than a single node, i.e., on more than the 20-64
> cores present in typical HPC nodes.
> 
> ## Performance estimates
> 
> Assuming that one million time steps per day is acceptable, this corresponds
> to slightly less than 10k particles per core in a charged soft sphere system
> (LJ+P3M) at 10% volume fraction. So, approximately 300k particles would be
> possible on an HPC node.
> For a soft sphere + LB on a GPU, several million particles should be possible.
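>
> A quick back-of-the-envelope check of these numbers (the per-core figure is
> the estimate above; the core count is an assumed value in the 20-64 range,
> not a benchmark result):
>
> ```python
> particles_per_core = 10_000   # charged soft spheres (LJ+P3M), ~1e6 steps/day
> cores_per_node = 32           # assumed typical HPC node
> print(particles_per_core * cores_per_node)  # ~320,000 particles per node
> ```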
> 
> --
> Dr. Rudolf Weeber
> Institute for Computational Physics
> Universität Stuttgart
> Allmandring 3
> 70569 Stuttgart
> Germany
> Phone: +49(0)711/685-67717
> Email: weeber@icp.uni-stuttgart.de
> http://www.icp.uni-stuttgart.de/~icp/Rudolf_Weeber
> 
> 



