Issues with deplete using MPI (it only runs single thread on each process)

Greetings,

I’m running a depletion simulation:

...
with change_directory(saveDir):
    with openmc.lib.quiet_dll(True):
        dep.pool.USE_MULTIPROCESSING = True
        depletion_operator = dep.CoupledOperator(
            model=model,
            chain_file=chainLoc,
            normalization_mode="fission-q",
            fission_yield_mode="average",
            reduce_chain_level=5,
            reaction_rate_mode="flux",
            reaction_rate_opts={
                'energies': np.logspace(-5, 7 + np.log10(2), 3000, base=10),
                'reactions': ['fission', '(n,gamma)', '(n,2n)', '(n,3n)', '(n,4n)', '(n,a)', '(n,p)'],
                'nuclides': ['Th232', 'Pa231', 'U232', 'U233', 'U234', 'U235'],
            })
    dep.pool.USE_MULTIPROCESSING = True

    depletion_operator.cleanup_when_done = True
    
    integrator_class = dep.integrators.integrator_by_name[burnup_algo]
    integrator = integrator_class(depletion_operator, time_steps, power_density=powerDensity, timestep_units='d')
       
    with openmc.lib.quiet_dll(True):
        dep.pool.USE_MULTIPROCESSING = True
        integrator.integrate(True)

    depletion_operator.finalize()

I know that pool.USE_MULTIPROCESSING = True by default, and that it will use all available threads because NUM_PROCESSES = None by default. However, when I run the simulation using mpiexec -n 2 --map-by node:PE=12 --bind-to hwthread -x OMP_NUM_THREADS=12 python main-noreflector2.py, the output reports MPI Processes | 2 and OpenMP Threads | 12, yet the depletion part only uses one thread per process.

Why is that? Is it because my system lacks memory?

Thanks!

The information reported by the OpenMC header is usually reliable, so it’s surprising to me that the cores aren’t being used. If you provide the same arguments to mpiexec for a single OpenMC execution, do you see the same behavior?
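As a quick check alongside that test, each MPI rank can report how many CPUs it is actually allowed to run on. The following is only a diagnostic sketch using the Python standard library; the helper name visible_cpus is hypothetical and is not part of OpenMC. With --bind-to, the affinity mask of a rank can be smaller than the machine-wide CPU count, and depending on the Python version the default size of a multiprocessing pool may be derived from either number.

```python
import os


def visible_cpus():
    """Return (logical CPUs on the machine, CPUs this process may run on).

    Hypothetical helper for diagnosis only; not an OpenMC function.
    """
    total = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):
        # Linux only: the set of CPUs this process is bound to,
        # which reflects any binding applied by the MPI launcher.
        allowed = len(os.sched_getaffinity(0))
    else:
        # Other platforms do not expose the affinity mask this way.
        allowed = total
    return total, allowed


total, allowed = visible_cpus()
print(f"pid {os.getpid()}: {total} logical CPUs, {allowed} usable by this process")
```

Running this under the same mpiexec arguments as the depletion script shows, per rank, whether the binding leaves 12 hardware threads available or fewer.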

Thanks for replying,

When OpenMC does the neutron transport calculation, it uses all the available threads (the C++ part of the program behaves as expected). Only when it starts calculating the transmutation does it drop to a single thread per MPI process.

I think it’s because of how the multiprocessing is implemented; perhaps there is a step before or after the multiprocessing part that only uses a single thread, a bottleneck after the transport calculation and before the transmutation calculation, but I’m not sure…

Yes, the number of threads reported by OpenMC applies only to transport. The number of threads used in transmutation is determined by settings in NumPy/SciPy, which can be influenced by a number of environment variables depending on the system. You can try setting some of these additional environment variables as well to see if they make a difference.

But the reality is that transmutation accounts for a very small fraction of the computational time compared to transport, so adding threading to those calculations isn’t likely to make a big difference in the overall depletion calculation.
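For anyone who wants to experiment with the environment variables mentioned above, here is a minimal sketch. It assumes the NumPy/SciPy build is linked against a backend that honors one of OMP_NUM_THREADS, OPENBLAS_NUM_THREADS, or MKL_NUM_THREADS; which of them (if any) has an effect depends on the installation, and whether the depletion solver calls threaded routines at all is not established in this thread. The helper name set_thread_limits is hypothetical.

```python
import os

# Commonly used thread-count variables. Which one matters depends on the
# BLAS/LAPACK backend NumPy/SciPy were built against (an assumption here).
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def set_thread_limits(n_threads):
    """Set the thread-count variables and return the values now in effect."""
    for name in THREAD_VARS:
        os.environ[name] = str(n_threads)
    return {name: os.environ[name] for name in THREAD_VARS}


# Call this at the very top of the script: most backends read these
# variables once, when the library is first loaded.
limits = set_thread_limits(12)
print(limits)

# Import numpy, scipy, and openmc.deplete only after this point.
```

The same variables can instead be exported through the launcher, in the same way the original command already passes -x OMP_NUM_THREADS=12.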
