Hello everyone,
I ran into this problem today: I tried to restart a job, but it failed with an error. For reference, the Git SHA1 of the develop branch version I am using is given below.
Git SHA1: ba959372f4f6245614ec2002f93694e31a659306
The error message is shown below:
HDF5-DIAG: Error detected in HDF5 (1.8.15-patch1) MPI-process 0:
#000: H5Dio.c line 173 in H5Dread(): can’t read data
major: Dataset
minor: Read failed
#001: H5Dio.c line 425 in H5D__read(): unable to set up type info
major: Dataset
minor: Unable to initialize object
#002: H5Dio.c line 958 in H5D__typeinfo_init(): unable to convert between src and dest datatype
major: Dataset
HDF5-DIAG: Error detected in HDF5 (1.8.15-patch1) MPI-process 1:
#000: H5Dio.c line 173 in H5Dread(): can’t read data
major: Dataset
minor: Read failed
#001: H5Dio.c line 425 in H5D__read(): unable to set up type info
major: Dataset
minor: Unable to initialize object
#002: H5Dio.c line 958 in H5D__typeinfo_init(): unable to convert between src and dest datatype
major: Dataset
minor: Feature is unsupported
#003: H5T.c line 4492 in H5T_path_find(): no appropriate function for conversion path
major: Datatype
minor: Unable to initialize object
HDF5-DIAG: Error detected in HDF5 (1.8.15-patch1) HDF5-DIAG: Error detected in HDF5 (1.8.15-patch1) MPI-process 3:
#000: H5Dio.c line 173 in H5Dread(): can’t read data
major: Dataset
minor: Read failed
#001: H5Dio.c line 425 in H5D__read(): unable to set up type info
major: Dataset
minor: Unable to initialize object
#002: H5Dio.c line 958 in H5D__typeinfo_init(): unable to convert between src and dest datatype
major: Dataset
minor: Feature is unsupported
#003: H5T.c line 4492 in H5T_path_find(): no appropriate function for conversion path
major: Datatype
minor: Unable to initialize object
minor: Feature is unsupported
#003: H5T.c line 4492 in H5T_path_find(): no appropriate function for conversion path
major: Datatype
minor: Unable to initialize object
MPI-process 2:
#000: H5Dio.c line 173 in H5Dread(): can’t read data
major: Dataset
minor: Read failed
#001: H5Dio.c line 425 in H5D__read(): unable to set up type info
major: Dataset
minor: Unable to initialize object
#002: H5Dio.c line 958 in H5D__typeinfo_init(): unable to convert between src and dest datatype
major: Dataset
minor: Feature is unsupported
#003: H5T.c line 4492 in H5T_path_find(): no appropriate function for conversion path
major: Datatype
minor: Unable to initialize object
HDF5-DIAG: Error detected in HDF5 (1.8.15-patch1) MPI-process 0:
Fatal error in MPI_Allreduce: Message truncated, error stack:
MPI_Allreduce(912)…: MPI_Allreduce(sbuf=0x7ffebefa0860, rbuf=0x7ffebefa0870, count=1, MPI_INT, MPI_BOR, comm=0x84000006) failed
MPIR_Allreduce_impl(769)…:
MPIR_Allreduce_intra(270)…:
MPIR_Bcast_impl(1462)…:
MPIR_Bcast(1486)…:
MPIR_Bcast_intra(1295)…:
MPIR_Bcast_binomial(241)…:
MPIDI_CH3U_Receive_data_found(131): Message from rank 0 and tag 2 truncated; 260 bytes received but buffer size is 4
Fatal error in MPI_Allreduce: Message truncated, error stack:
MPI_Allreduce(912)…: MPI_Allreduce(sbuf=0x7ffc5edaf2a0, rbuf=0x7ffc5edaf2b0, count=1, MPI_INT, MPI_BOR, comm=0x84000006) failed
MPIR_Allreduce_impl(769)…:
MPIR_Allreduce_intra(270)…:
MPIR_Bcast_impl(1462)…:
MPIR_Bcast(1486)…:
MPIR_Bcast_intra(1295)…:
MPIR_Bcast_binomial(241)…:
MPIDI_CH3U_Receive_data_found(131): Message from rank 0 and tag 2 truncated; 260 bytes received but buffer size is 4
#000: H5D.c line 358 in H5Dopen2(): not found
major: Dataset
minor: Object not found
#001: H5Gloc.c line 430 in H5G_loc_find(): can’t find object
major: Symbol table
minor: Object not found
#002: H5Gtraverse.c line 861 in H5G_traverse(): internal path traversal failed
major: Symbol table
minor: Object not found
#003: H5Gtraverse.c line 641 in H5G_traverse_real(): traversal operator failed
major: Symbol table
minor: Callback failed
Fatal error in MPI_Allreduce: Other MPI error, error stack:
MPI_Allreduce(912)…: MPI_Allreduce(sbuf=0x7ffcb0460810, rbuf=0x7ffcb0460820, count=1, MPI_INT, MPI_BOR, comm=0x84000006) failed
MPIR_Allreduce_impl(769).:
MPIR_Allreduce_intra(270):
MPIR_Bcast_impl(1462)…:
MPIR_Bcast(1486)…:
MPIR_Bcast_intra(1295)…:
MPIR_Bcast_binomial(312).: Failure during collective
#004: H5Gloc.c line 385 in H5G_loc_find_cb(): object 'tallies_present' doesn't exist
major: Symbol table
minor: Object not found
Thank you for your help.