
Discussion: Switching Espresso to shared memory parallelization

From: Rudolf Weeber
Subject: Discussion: Switching Espresso to shared memory parallelization
Date: Mon, 5 Jul 2021 17:55:10 +0200

Dear Espresso users,

We are currently discussing switching Espresso's parallelization from
MPI-based to shared-memory-based. This should result in better parallel
performance and much simpler code. However, it would mean that a single
instance of Espresso could only run on a single machine. On current HPC
systems, that is typically between 20 and 64 cores.

To help with the decision, we would like to know if anyone runs simulations
using Espresso on more than 64 cores, and what kind of simulations those are.
Please see the technical details below and let us know what you think.

Regards, Rudolf

# Technical details

## Parallelization paradigms

With MPI-based parallelization, information is exchanged between processes by
explicitly sending messages. This means that data (such as particles at
the boundary of a processor's domain) has to be packed, sent, and unpacked.
In shared memory parallelization, all processes have access to the same data.
It is only necessary to ensure that no two processes write to the same data at
the same time. So, the overhead of packing, sending, and unpacking data can
mostly be avoided.
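To make the difference concrete, here is a minimal sketch (plain Python, not
Espresso code) of the two paradigms: a message-passing version that serializes
data through a pipe, and a shared-memory version where threads update the same
list directly, guarded by a lock. All names are illustrative.

```python
import threading
import multiprocessing as mp

# --- Message passing (MPI-style): data is packed, sent, and unpacked ---
def worker(conn):
    particles = conn.recv()                    # unpack (deserialize) the data
    particles = [p + 1.0 for p in particles]   # do some work on it
    conn.send(particles)                       # pack (serialize) and send back

def message_passing_demo():
    parent, child = mp.Pipe()
    proc = mp.Process(target=worker, args=(child,))
    proc.start()
    parent.send([0.0, 1.0, 2.0])               # serialization cost on every exchange
    result = parent.recv()
    proc.join()
    return result

# --- Shared memory: all threads see the same data; only writes are guarded ---
def shared_memory_demo():
    particles = [0.0, 1.0, 2.0]
    lock = threading.Lock()

    def work(i):
        with lock:                             # no two threads write at once
            particles[i] += 1.0                # no packing or sending needed

    threads = [threading.Thread(target=work, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return particles

if __name__ == "__main__":
    print(message_passing_demo())              # [1.0, 2.0, 3.0]
    print(shared_memory_demo())                # [1.0, 2.0, 3.0]
```

Both versions compute the same result; the difference is that the
message-passing version pays the pack/send/unpack cost on every exchange,
while the shared-memory version only pays for synchronization on writes.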

## Reasons for not using MPI

* Adding new features to Espresso will be easier, because a lot of non-trivial
communication code does not have to be written.
* The mix of controller-agent and synchronous parallelization used by Espresso
is hard for new developers to understand, which raises the barrier to getting
started with Espresso coding. This parallelization scheme is a consequence of
Espresso being controlled by a (Python) scripting interface.
* The MPI and Boost::MPI dependencies complicate Espresso's installation, make
it virtually impossible to run Espresso on public Python platforms such as
Azure Notebooks or Google Colab, and make it hard to build Espresso natively
on Windows.
* The core team had to spend considerable time handling bugs in the MPI and
Boost::MPI dependencies that affected Espresso.
* Writing and validating MPI-parallel code is difficult. We had a few
instances of data not being correctly synchronized across MPI processes which
went unnoticed. In one instance, we were, after a lot of effort, not able to
solve the issue and had to disable a feature for MPI-parallel simulations.

## Advantages of supporting MPI

Simulations can run on more than a single node, i.e., on more than the 20-64
cores present in a typical HPC node.

## Performance estimates

Assuming that one million time steps per day is acceptable, this corresponds
to slightly less than 10k particles per core in a charged soft sphere system
(LJ+P3M) at 10% volume fraction. So, approximately 300k particles would be
possible on an HPC node.
For a soft sphere + LB on a GPU, several million particles should be possible.
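The estimate above can be checked with back-of-the-envelope arithmetic. The
32-core node below is a hypothetical round number within the 20-64 range
mentioned earlier; the per-core figure is the one quoted above, not a new
benchmark.

```python
# Rough capacity estimate for a single HPC node.
# Assumption: ~10k particles per core sustain ~1M time steps/day
# for a charged soft-sphere (LJ+P3M) system at 10% volume fraction.
particles_per_core = 10_000
cores_per_node = 32        # hypothetical node within the 20-64 core range

particles_per_node = particles_per_core * cores_per_node
print(particles_per_node)  # ~300k particles on one node
```

So a 20-core node would handle roughly 200k particles and a 64-core node
roughly 640k, bracketing the ~300k figure quoted above.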

Dr. Rudolf Weeber
Institute for Computational Physics
Universität Stuttgart
Allmandring 3
70569 Stuttgart
Phone: +49(0)711/685-67717
