HDF5 errors when using openmc_init(some_arbitrary_comm)

I’m developing a driver that calls openmc_init() with an arbitrary communicator. When I do so, however, I get some errors from HDF5 that I don’t understand.

Here’s a modified version of main.F90 that demonstrates the problematic parts of my driver. It creates a new communicator from a subset of procs of MPI_COMM_WORLD. Then it passes that communicator to openmc_init().

program main

  use constants
  use finalize,         only: openmc_finalize
  use global
  use initialize,       only: openmc_init
  use message_passing
  use particle_restart, only: run_particle_restart
  use plot,             only: run_plot
  use simulation,       only: run_simulation
  use volume_calc,      only: run_volume_calculations

  use mpi

  implicit none

  integer :: world_group               ! Group belonging to MPI_COMM_WORLD
  integer :: openmc_comm, openmc_group ! A new group/comm for running OpenMC

  ! Initialize run -- when run with MPI, pass communicator
#ifdef MPI
  call MPI_Init(mpi_err)

  ! Make a new communicator from a subset of procs. For demonstrative
  ! purposes, we make a communicator from just one proc.
  call MPI_Comm_group(MPI_COMM_WORLD, world_group, mpi_err)
  call MPI_Group_incl( &
       world_group, &
       1, &
       (/ 0 /), &
       openmc_group, &
       mpi_err)
  call MPI_Comm_create(MPI_COMM_WORLD, openmc_group, openmc_comm, mpi_err)

  if (openmc_comm /= MPI_COMM_NULL) then
    call openmc_init(openmc_comm)
  end if
#else
  call openmc_init()
#endif

  ! start problem based on mode
  select case (run_mode)
  case (MODE_FIXEDSOURCE, MODE_EIGENVALUE)
    call run_simulation()
  case (MODE_PLOTTING)
    call run_plot()
  case (MODE_PARTICLE)
    if (master) call run_particle_restart()
  case (MODE_VOLUME)
    call run_volume_calculations()
  end select

  ! finalize run
  call openmc_finalize()

end program main

After I build and run it, however, I get the following errors. These are the same errors I get with the real driver I’m developing.

HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) MPI-process 1:
  #000: H5T.c line 1723 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.10.0-patch1) MPI-process 1:
  #000: H5T.c line 1723 in H5Tclose(): not a datatype
    major: Invalid arguments to routine
    minor: Inappropriate type
Fatal error in PMPI_Type_free: Invalid datatype, error stack:
PMPI_Type_free(157): MPI_Type_free(datatype_p=0xce4ba8) failed
PMPI_Type_free(89).: Invalid datatype

Any suggestions?

Ron,

I’m not able to reproduce this error on my system. Can you give some more details? What OS are you using? How did you install HDF5? Does the error happen in openmc_finalize? (It would seem so, based on that error message.)

Thanks,
Paul

I’ve attached the stdout/stderr from configuring, making, and running the example on my system. I’ve also pushed my working copy to a new branch (https://github.com/RonRahaman/openmc/tree/hdf5-err); it doesn’t have all the upstream changes from ‘develop’.

I’m using:

  • Ubuntu 16.04 LTS
  • HDF5 1.10.0-patch1

I installed HDF5 from source using the following configuration:

./configure --prefix=/home/rahaman/install/hdf5-1.10.0-intel-17/ --enable-build-mode=debug --enable-build-all --enable-fortran --enable-fortran2003 --enable-parallel CC=/homes/rahaman/install/mpich-3.2-intel-17/bin/mpicc FC=/homes/rahaman/install/mpich-3.2-intel-17/bin/mpif90

I assume the error occurs during openmc_init(), as the title and header are not printed before OpenMC aborts. How would I obtain a backtrace?

Also, OpenMC only aborts when I use more than one process (mpiexec -n 2). When I use only one process (mpiexec -n 1), OpenMC runs to completion.

run.out (541 Bytes)

make.out (31.2 KB)

cmake.out (1.39 KB)

Ok, when I run with two processes I can reproduce the error. So if I understand wrap.F90 correctly, you are creating a communicator with just the first process, but then you are calling openmc_init from all processes. That subroutine should only be called from the processes that are in openmc_comm. Does that make sense?

I’m fairly certain that I’m only calling openmc_init from the procs in openmc_comm. I call openmc_init after running this setup:

  call MPI_Comm_group(MPI_COMM_WORLD, world_group, mpi_err)
  call MPI_Group_incl( &
       world_group, &
       1, &
       (/ 0 /), &
       openmc_group, &
       mpi_err)
  call MPI_Comm_create(MPI_COMM_WORLD, openmc_group, openmc_comm, mpi_err)

  if (openmc_comm /= MPI_COMM_NULL) then
    call openmc_init(openmc_comm)
  end if

I’m relying on the fact that, when MPI_Comm_create is called from a process that isn’t in openmc_group, openmc_comm is returned as MPI_COMM_NULL. I’ve confirmed this behavior in the standard (see the description of MPI_Comm_create here), and I’ve used it successfully in the past.

With some print statements, I’ve also confirmed that openmc_init is only called from processes in openmc_comm.
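Roughly, that check looked like the following sketch (the world_rank variable and the exact messages are illustrative, not what’s in my real driver):

  ! Report which ranks actually enter openmc_init
  integer :: world_rank

  call MPI_Comm_rank(MPI_COMM_WORLD, world_rank, mpi_err)
  if (openmc_comm /= MPI_COMM_NULL) then
    print *, 'rank', world_rank, ': in openmc_comm, calling openmc_init'
    call openmc_init(openmc_comm)
  else
    print *, 'rank', world_rank, ': not in openmc_comm, skipping openmc_init'
  end if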

Paul, it seems like the error comes from openmc_finalize, as you originally suspected. In my driver, openmc_finalize is called by processes outside of openmc_comm, which is problematic.

For example, if a process is not in openmc_comm, it skips openmc_init and the initialization of some HDF5 datatypes. This is as I intended. However, that process still runs openmc_finalize and attempts to free those datatypes. I believe that causes the error I saw.

Furthermore, the process outside openmc_comm reaches openmc_finalize while the process in openmc_comm is still running openmc_init. So when the former process aborted in openmc_finalize, I mistakenly guessed that the latter process aborted in openmc_init.

I should be able to fix the problem in my driver pretty easily.
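For anyone who runs into the same thing, the fix I have in mind is sketched below: the same MPI_COMM_NULL guard used around openmc_init is extended to cover the rest of the run, so ranks outside openmc_comm never call openmc_finalize (or anything else that touches the OpenMC HDF5/MPI datatypes). Details of my real driver will differ.

  if (openmc_comm /= MPI_COMM_NULL) then
    call openmc_init(openmc_comm)

    ! Only ranks that initialized OpenMC run the problem and finalize it.
    select case (run_mode)
    case (MODE_FIXEDSOURCE, MODE_EIGENVALUE)
      call run_simulation()
    case (MODE_PLOTTING)
      call run_plot()
    case (MODE_PARTICLE)
      if (master) call run_particle_restart()
    case (MODE_VOLUME)
      call run_volume_calculations()
    end select

    call openmc_finalize()
  end if
  ! Ranks outside openmc_comm fall through and do no OpenMC work. Whether they
  ! still need to call MPI_Finalize themselves depends on what openmc_finalize
  ! does with MPI, so that part is left out of this sketch.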

Thanks for your help!