Issues with optimizing parallel computing

Dear all,

I run my calculations on a local computer with an i5-8500 processor (6 threads) and on a server with an E5-2690 v4 processor (64 threads).

In both cases, part of the calculation for each batch of particles runs on only one core/thread. On the server this is especially noticeable, both because of the much larger number of threads and (probably) the lower processor frequency.

Is it possible to avoid this situation?
At the moment I have not found anything better than running up to 9 tasks with different parameters simultaneously (and limiting the number of threads each one uses). This significantly reduces the average time per task.
For example, a test task takes 100 seconds on 63 threads, but only 182 seconds on 7 threads, and I can run 9 such calculations at once. On average that is almost 5 times faster per task.
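For reference, I launch these independent runs roughly like this (the directory names, case count, and thread count are just illustrative, and each case directory is assumed to already contain its own model XML files):

```python
import openmc
from concurrent.futures import ThreadPoolExecutor

# Illustrative: one directory per parameter set, each with its own XML inputs.
cases = [f'case_{i}' for i in range(9)]

def run_case(case_dir):
    # Limit each independent run to 7 OpenMP threads.
    openmc.run(threads=7, cwd=case_dir)

# Launch all 9 runs concurrently; each openmc.run call starts its own process.
with ThreadPoolExecutor(max_workers=9) as pool:
    list(pool.map(run_case, cases))
```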

Starting more than 9 tasks is already inconvenient due to the management overhead :).

Or am I doing something wrong?

Thank you for your time. :slight_smile:

Hi Illia,

I would say the most important thing is to ensure that each thread has enough “work” between synchronization points (at the end of a generation/batch) so that the cost of the synchronization (some of which is serial) becomes insignificant compared to the time spent performing transport. In practical terms, this just means increasing the number of particles per generation (settings.particles).
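For example, a minimal sketch using the Python API (the batch and particle counts here are purely illustrative; pick values appropriate for your problem):

```python
import openmc

settings = openmc.Settings()
settings.batches = 150
settings.inactive = 50
# Raise the particles per generation so each thread has plenty of work
# between end-of-batch synchronization points.
settings.particles = 1_000_000
settings.export_to_xml()
```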

On a multi-socket system, there is some further optimization you can do. Have a look at our user’s guide section on parallelism, but I’ll also mention one important point here. When you have a system with multiple sockets, it’s generally best to compile with MPI and OpenMP, and then use one MPI process per socket and OpenMP threads within each socket. On your server, if it really is an E5-2690 v4, it likely has two CPUs, each with 14 cores (28 hardware threads). If you build with MPI, I would recommend running as:

mpiexec -n 2 -bind-to socket openmc

which ensures that the OpenMP threads stay bound to the socket where they were created.
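If you drive the run from Python instead of the command line, a rough equivalent (a sketch, assuming the XML model files are already exported and that your MPI launcher accepts these arguments) would be:

```python
import openmc

# Two MPI ranks (one per socket), 14 OpenMP threads per rank; the
# '-bind-to socket' option keeps each rank's threads on its own socket.
openmc.run(threads=14, mpi_args=['mpiexec', '-n', '2', '-bind-to', 'socket'])
```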

Best regards,
Paul