Segmentation fault on simple model after many particles

Hi all

I’m running a very simple spherical geometry and tallying current on a surface and getting a fatal error. I wonder if anyone could give ideas on how to avoid this error.

The only odd part of the script is the density of the material, changing the density solves the problem but I am hoping to keep the density high to simulate the correct inertial confinement fusion case (Revolver)

python mve.py


 Simulating batch 89
 Simulating batch 90
 Simulating batch 91
Traceback (most recent call last):
  File "mve.py", line 43, in <module>
    sp_filename = model.run()
  File "/home/jshimwell/openmc/openmc/model/model.py", line 712, in run
    openmc.run(particles, threads, geometry_debug, restart_file,
  File "/home/jshimwell/openmc/openmc/executor.py", line 314, in run
    _run(args, output, cwd)
  File "/home/jshimwell/openmc/openmc/executor.py", line 125, in _run
    raise RuntimeError(error_msg)
RuntimeError: OpenMC aborted unexpectedly.

So I’ve tried running the openmc executable in openmc with gdb

gdb --exec /usr/local/bin/openmc
>>> run

 Simulating batch 89
 Simulating batch 90
 Simulating batch 91

Thread 21 "openmc" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffebc2a700 (LWP 4242)]
0x00007ffff7a6c8f9 in std::__uniq_ptr_impl<openmc::Surface, std::default_delete<openmc::Surface> >::_M_ptr (
    this=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:147
147	      pointer    _M_ptr() const { return std::get<0>(_M_t); }

I’m running Ubuntu 18.04, with the very latest version of openmc from the develop branch commit tag
cd13d3471da30213bf68b91739994b73bb07cb90 and I’m using ENDF-B-VIII.0-NNDC data I attach the python script if anyone is able to reproduce this and let me know if it works for others?

import openmc
shell_material = openmc.Material()
shell_material.add_element("Au", 1.0, percent_type="ao")
shell_material.set_density("g/cm3", 1930) # revolver density
materials = openmc.Materials([shell_material])
surface_inner_shell = openmc.Sphere(r=0.0035)
surface_outer_shell = openmc.Sphere(r=0.0095)
sphere_surface_detector_1 = openmc.Sphere(r=20000, boundary_type="vacuum")
fuel_region = -surface_inner_shell
shell_region = +surface_inner_shell & -surface_outer_shell
void_region_1 = +surface_outer_shell & -sphere_surface_detector_1
fuel_cell = openmc.Cell(region=fuel_region)
shell_cell = openmc.Cell(region=shell_region, fill=shell_material)
void_cell_1 = openmc.Cell(region=void_region_1)
geometry = openmc.Geometry([fuel_cell, shell_cell, void_cell_1])
source = openmc.Source()
source.angle = openmc.stats.Isotropic()
settings = openmc.Settings()
settings.batches = 100
settings.particles = 80000
settings.run_mode = "fixed source"
settings.source = source
settings.photon_transport = False
surface_filter_1 = openmc.SurfaceFilter(sphere_surface_detector_1)
time_tally_1 = openmc.Tally(name="time_tally_detector_1")
time_tally_1.scores = ["current"]
time_tally_1.filters = [surface_filter_1]
tallies = openmc.Tallies([time_tally_1])
model = openmc.model.Model(geometry, materials, settings, tallies)
sp_filename = model.run()

Hello,
last week I’ve faced a similar issue: different model but spherical surface for vacuum boundary conditions too. So, I decided to compile openmc with debug options to get more info. Here the output error of your input (identical to what I received last week):

backtrace (tid: 210201)
0 0x0000000000012dd0 __funlockfile() :0
1 0x0000000000158d90 openmc::Particle::cross_surface() /opt/openmc/src/particle.cpp:466
2 0x0000000000158d90 std::unique_ptr<openmc::Surface, std::default_delete<openmc::Surface> >::get() /opt/GCCcore/12.2.0/include/c++/12.2.0/bits/unique_ptr.h:462
3 0x0000000000158d90 openmc::Particle::cross_surface() /opt/openmc/src/particle.cpp:466
4 0x0000000000159982 openmc::Particle::event_cross_surface() /opt/openmc/src/particle.cpp:248
5 0x000000000019bd95 openmc::transport_history_based_single_particle() /opt/openmc/src/simulation.cpp:741
6 0x000000000019bfdb openmc::transport_history_based() /opt/openmc/src/simulation.cpp:758
7 0x000000000019bfdb std::vector<double, std::allocator >::~vector() /opt/GCCcore/12.2.0/include/c++/12.2.0/bits/stl_vector.h:733
8 0x000000000019bfdb openmc::ParticleData::~ParticleData() /opt/openmc/include/openmc/particle_data.h:220
9 0x000000000019bfdb openmc::Particle::~Particle() /opt/openmc/include/openmc/particle.h:31
10 0x000000000019bfdb openmc::transport_history_based() /opt/openmc/src/simulation.cpp:759
11 0x000000000001e3be gomp_thread_start() /tmp/GCCcore/12.2.0/system-system/gcc-12.2.0/stage3_obj/x86_64-pc-linux-gnu/libgomp/…/…/…/libgomp/team.c:129
12 0x00000000000082de start_thread() pthread_create.c:0
13 0x00000000000fbe83 __GI___clone() :0

[giova@localhost] *** Process received signal ***
[giova@localhost] Signal: Segmentation fault (11)
[giova@localhost] Signal code: (-6)
[giova@localhost] Failing at address: 0x10e71000333ce
[giova@localhost] [ 0] /lib64/libpthread.so.0(+0x12dd0)[0x7f36dc899dd0]
[giova@localhost] [ 1] /opt/binfiles/lib64/libopenmc.so(_ZN6openmc8Particle13cross_surfaceEv+0x30)[0x7f36dd03ad90]
[giova@localhost] [ 2] /opt/binfiles/lib64/libopenmc.so(_ZN6openmc8Particle19event_cross_surfaceEv+0xc2)[0x7f36dd03b982]
[giova@localhost] [ 3] /opt/binfiles/lib64/libopenmc.so(_ZN6openmc39transport_history_based_single_particleERNS_8ParticleE+0x15)[0x7f36dd07dd95]
[giova@localhost] [ 4] /opt/binfiles/lib64/libopenmc.so(+0x19bfdb)[0x7f36dd07dfdb]
[giova@localhost] [ 5] /opt/GCCcore/12.2.0/lib64/libgomp.so.1(+0x1e3be)[0x7f36dc0af3be]
[giova@localhost] [ 6] /lib64/libpthread.so.0(+0x82de)[0x7f36dc88f2de]
[giova@localhost] [ 7] /lib64/libc.so.6(clone+0x43)[0x7f36dbda9e83]
[giova@localhost] *** End of error message ***
Segmentation fault (core dumped)

Looking at line 466 of particle.cpp we have

const auto& surf {model::surfaces[i_surface - 1].get()};

Like every bad python programmer, I’ve put a “print i_surface” and recompiled xD: I get 2147483647 which is the maximum integer number for 32 bit (obviously there is no surface with this index). Going back, through event_cross_surface, boundary_info etc. etc. I arrive to the distance method of region. Here the i_surf variable is initialized as

int32_t i_surf {std::numeric_limits<int32_t>::max()};

for some reason i_surf is not updated in the function and is returned as it has been initialized.
I am not able to go deep in this investigation now.

Bye
Giovanni

1 Like

Continue…
Looking at distance method of Region, the variable i_source is not updated if for each surface of region the variable “d” is never lower than min_dist (which is initialized to INFTY) (see line 803 of cell.cpp).

In your example, the cell 3 is defined by two spherical surfaces and the variable “d” for both surfaces is calculated by the distance method of class SurfaceSphere: in some cases, d could be INFTY (in particular, if particle position is coincident with surface and variable “k” is positive).

In the code we have two different “definition” of infinite: one is INFTY=std::numeric_limits::max() and the other INFINITY from cmath and INFINITY is higher than INFTY.

In simulation.cpp we have:

 void transport_history_based_single_particle(Particle& p)
{
  while (true) {
    p.event_calculate_xs();
    if (!p.alive())
      break;
    p.event_advance();
    if (p.collision_distance() > p.boundary().distance) {
      p.event_cross_surface();
    } else {
      p.event_collide();
    }
    p.event_revive_from_secondary();
    if (!p.alive())
      break;
  }
  p.event_death();
}

In your example, p.boundary().distance is INFTY and p.collision_distance() is INFINITY (as defined here). Should it be treated as lost particle? Updating the Region distance method as below, I receive the warning “WARNING: After particle 58375 crossed surface 1 it could not be located in any cell and it did not leak.”:

std::pair<double, int32_t> Region::distance(
  Position r, Direction u, int32_t on_surface, Particle* p) const
{
  double min_dist {INFTY};
  int32_t i_surf {std::numeric_limits<int32_t>::max()};

  for (int32_t token : expression_) {
    // Ignore this token if it corresponds to an operator rather than a region.
    if (token >= OP_UNION)
      continue;

    // Calculate the distance to this surface.
    // Note the off-by-one indexing
    bool coincident {std::abs(token) == std::abs(on_surface)};
    double d {model::surfaces[abs(token) - 1]->distance(r, u, coincident)};

    if (coincident) {   //added by G. Mariano
        i_surf = -token;
      }

    // Check if this distance is the new minimum.
    if (d < min_dist) {
      if (min_dist - d >= FP_PRECISION * min_dist) {
        min_dist = d;
        i_surf = -token;
      }
    }
  }

  return {min_dist, i_surf};
}
1 Like

Great deep dive into the code, many thanks for posting your findings.

Should it be treated as lost particle?

Yes I guess so, perhaps it should be moved to the next surface and in my case that is a vacuum surface so I guess it would be leaked and lost yes. I suspect this doesn’t apply for all geometry.

1 Like

Hello!
Some updates on this topic. During transport a particle changes its cell when cross a surface (which is managed through neighbor_list_find_cell function here): however when a particle stands on a surface and a collision event happens, the direction of particle could change. According to Region::contains method, a particle is contained in a region on the basis of position, direction and on_surface: assuming the direction changes during a collision event (scattering), the cell containing the particle could also change. However, this possibility is not managed in event collision method (and I don’t know if it should be done).

In your case, a collision event happens on the surface 1: adding neighbor_list_find_cell(*this) at line 280 of particle.cpp solves the problem: indeed, through find_inner_cell it changes the particle cell.

Bye,
Giovanni

Giovanni, Jonathan,
has any progress been made to remedy this issue? I’ve run across it (exact same seg fault error identified using gdb) with a model of the FNG SiC benchmark I’m working on.
A brute force approach which I have been taking is to change the random seed trying to avoid the unfortunate history that causes this seg fault. Which I recognize is a dangerous game to play and not really a solution. Any and all insight would be welcome.
Cheers, Bor

Hello Bor,
I didn’t have time to further investigate this error. However, it would be useful to check if your error is caused by the same “bug”. To check it, add:

neighbor_list_find_cell(*this);

to the line 310 of particle.cpp
(inside Particle::event_collide) and recompile.

Basically, my opinion is that the neighbor_list_find_cell function should be called not only when the particle cross a surface but also after collision events: neighbor_list_find_cell is a function that updates the list of the neighboring cells of the cell containing the particle.

To better clarify, tet’s consider the situation in the picture: the particle (on the border between two cells) is in the same position before and after the collision but it changes the flight direction. Before collision the particle is in cell 2 and the neighbor list contains cell 1, while after collision the particle is in cell 1 and the neighbor list contains cell 2.

My solution, however, has a negative impact on the performaces: the situation is very rare and call each time this function would be a waste of resources. Maybe a better solution is to manage it through exceptions.

Giovanni

Giovanni,
thank you again for another deep dive. Between me posting the question and you responding I also found another solution to the issue, which I can’t say is full proof, but it definitely works without any performance difference for my case.
All I did was change the double min_dist {INFTY}; entry in the Region::distance function of cell.cpp at line 789 to double min_dist {INFINITY};
After some additional testing I will push this commit and I hope this post helps someone else.
Thanks,
Bor

Bor,
when I was dealing with this problem I did the same at certain point (see my post of 5th August): however, in my opinion this solution could just hide the real problem which is (in my opinion) the one described above. Moreover, this solution can create issues here:

if (p.collision_distance() > p.boundary().distance) {
p.event_cross_surface();
} else {
p.event_collide();
}

p.collision_distance() can be INFINITY (see here) and if both are INFINITY the “if” statement would fail. Indeed, in my opinion, this is the reason why we have two “infinity” values in the code.

The following for loop in Region::distance fails in finding i_surf so it remains at the initializated value (std::numeric_limits<int32_t>::max()): it fails because it’s considering the wrong cell (from here).

for (int32_t token : expression_) {
// Ignore this token if it corresponds to an operator rather than a region.
if (token >= OP_UNION)
continue;

// Calculate the distance to this surface.
// Note the off-by-one indexing
bool coincident {std::abs(token) == std::abs(on_surface)};
double d {model::surfaces[abs(token) - 1]->distance(r, u, coincident)};

// Check if this distance is the new minimum.
if (d < min_dist) {
  if (min_dist - d >= FP_PRECISION * min_dist) {
    min_dist = d;
    i_surf = -token;
  }
}

Best,
Giovanni

Hi all,

I’ve been taking a look at this recently and I’ll give my take on what’s happening here.

In the example case that @Shimwell posted originally, the model contains two small spheres which are known to cause problems with the way we perform point containment checks in OpenMC. I’m failing to find the original reporter of this issue, but I wrote up the following document not long after. I’ll add the post here later if I can find it.

The basis of this issue is that we use a squared distance to evaluate floating-point coincidence in our point containment and distance methods for spheres (and cylinders I believe). This sometimes triggers the coincidence An absolute distance would make OpenMC more robust to this problem, but it would be more expensive to do so.

Sphere_Nerd_Out.pdf (212.6 KB)

If I increase the size of the spheres a bit in @Shimwell’s example and run 100x as many particles I don’t encounter this issue, and if I make them smaller it happens even earlier.

In the case of the FNG SiC benchmark, I’m not familiar with the geometry details of that particular version of the model so maybe @BorKos can chime in on whether or not there are surfaces that would fit this criterion.

Regarding what I’ll dub the “coincident collision” issue described by @g.mariano, we try to handle that by breaking ties with a collision. The line here will result in a collision when the distances are equal and, in the case that a collision occurs at a location that is equal to the boundary distance, the particle will collide without crossing the surface and will remain logically in Cell 1 (the zero-distance intersection will be ignored b/c the value of k is negative in SurfaceSphere::distance) The change in direction will then result in an intersection in an intersection on the far side of Cell 1.

In the case that the distance to boundary is slightly smaller than the collision, then the particle will logically cross the surface into cell 2 and a collision distance will be resampled.

There are other problematic scenarios, but they are too many to list here. It’s interesting that changing the initialization value from INFTY to INFINITY fixed this problem though. My thinking is that
b/c INFINITY is a built-in value and so even the floating point maximum we assign to INFTY is less than INFINITY when performing comparisons.

1 Like