Problem 1945

Summary: Event biasing creates a data race which leads to erratic behaviour in MT mode
Product: Geant4
Reporter: alexander.howard
Component: geometry/biasing
Assignee: alexander.howard
Status: CLOSED FIXED
Severity: normal
CC: jverburg
Priority: P4
Version: 10.3
Hardware: All
OS: All

Description alexander.howard 2017-03-07 15:04:17 CET
Event biasing creates a data race in MT mode: multiple threads retrieve the importance values simultaneously, which results in erratic behaviour and can also lead to out-of-bounds vector access. With bounds checking enabled, long runs will crash at some point. Reproducibility is also lost.

This problem is caused by the communication between the importance biasing process and the geometry store of the importance values, so it's split between the process and geometry categories.
Comment 1 alexander.howard 2017-03-07 15:56:58 CET
A fix has been proposed: geombias-10-03-00

This involves putting mutex locks around the methods that are accessed by the biasing process, thus preventing erratic behaviour caused by the returned importance values changing during the request.

The member functions of G4IStore and G4ImportanceAlgorithm are affected.
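
For illustration, the locking approach can be sketched in standard C++ roughly as follows. This is a minimal sketch and not the actual Geant4 patch: `LockedImportanceStore`, `fCursor`, `Add` and `CountMismatches` are hypothetical names, and `std::mutex` stands in for whatever locking primitive the real fix uses. The assumption is that the lookup goes through a cursor stored as a member, so the whole lookup has to run under the lock.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an importance store shared by worker threads.
// Class and member names are illustrative, not the Geant4 API.
class LockedImportanceStore {
public:
    void Add(int cellId, double importance) {
        std::lock_guard<std::mutex> lock(fMutex);
        fImportance[cellId] = importance;
    }

    // The whole lookup runs under the lock, so no other thread can move the
    // shared cursor between find() and the dereference.
    double GetImportance(int cellId) const {
        std::lock_guard<std::mutex> lock(fMutex);
        fCursor = fImportance.find(cellId);
        assert(fCursor != fImportance.end() && "cell not registered");
        return fCursor->second;
    }

private:
    std::map<int, double> fImportance;
    mutable std::map<int, double>::const_iterator fCursor;  // shared state
    mutable std::mutex fMutex;                               // guards both members
};

// Query the store from several threads and count wrong answers.
// Expects cell c to have been registered with importance c + 1.
int CountMismatches(const LockedImportanceStore& store, int nThreads, int nCells) {
    std::vector<int> errors(nThreads, 0);  // one slot per thread, no sharing
    std::vector<std::thread> workers;
    for (int t = 0; t < nThreads; ++t) {
        workers.emplace_back([&store, &errors, t, nCells] {
            for (int i = 0; i < 10000; ++i) {
                const int cell = (i + t) % nCells;
                if (store.GetImportance(cell) != static_cast<double>(cell + 1)) {
                    ++errors[t];
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    int total = 0;
    for (int e : errors) total += e;
    return total;
}
```

The cost of this design is that every importance lookup is serialized across threads, which is why it reads as a "brute force" first step rather than a final design.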
Comment 2 Joost Verburg 2017-04-23 00:01:51 CEST
I have the same problem with my simulations. Incorrect importance values get returned because of the race condition in G4IStore::GetImportance, which leads to random simulation behavior.

A mutex may not be needed. The problem is the temporary member variable used for iteration (fCurrentIterator). Using a local iterator instead should solve it.
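
A minimal sketch of the local-iterator idea, in standard C++ and not taken from the Geant4 source: `LocalIteratorStore` and `Add` are hypothetical names, and the sketch assumes the map is filled once during setup and only read afterwards. Under that assumption, concurrent calls to a const lookup need no lock, because each call keeps its iterator on its own stack.

```cpp
#include <cassert>
#include <map>

// Hypothetical simplified store; names are illustrative, not the Geant4 API.
class LocalIteratorStore {
public:
    // Assumed to be called only during setup, before worker threads start.
    void Add(int cellId, double importance) { fImportance[cellId] = importance; }

    // The iterator is a local variable, so the method never writes to shared
    // state and each calling thread works on its own copy.
    double GetImportance(int cellId) const {
        auto it = fImportance.find(cellId);  // local, replaces a member cursor
        assert(it != fImportance.end() && "cell not registered");
        return it->second;
    }

private:
    std::map<int, double> fImportance;  // read-only once setup is complete
};
```

If the store can still be modified while workers are running, this alone is not enough and some synchronization around the writes remains necessary.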
Comment 3 alexander.howard 2017-04-24 09:35:41 CEST
Thanks Joost! 

You are correct. However, I first wanted to check that the problem could be resolved with this "brute force" method, as the problem was not easy to reproduce, which is one reason why the bug remained for such a long time.

The local iterator solution will be tested next.
Comment 4 Joost Verburg 2017-04-25 20:09:39 CEST
Thanks. FYI, using a local iterator did solve the problem for me:

https://github.com/joostverburg/geant4/commit/8f8b31bcff9c673546d179d664d983c73a4502e6
Comment 5 alexander.howard 2018-03-07 15:08:23 CET
A fix should now be included in the recent Geant4 patch. There was an additional problem with a thread race in setting the parallel worlds. I have left the fixes as an Autolock; however, I would like to change the design so that the configuration can be handled better. Nonetheless, the bug will be closed for now.