Problem 2546

Summary: Race conditions in EM physics processes
Product: Geant4 Reporter: v hewes <vhewes>
Component: processes/electromagnetic/standard Assignee: Vladimir.Ivantchenko
Status: RESOLVED FIXED    
Severity: minor    
Priority: P4    
Version: 11.0   
Hardware: All   
OS: All   
Attachments: A patch file that resolves the issue described in this bug report

Description v hewes 2023-05-24 19:49:59 CEST
Created attachment 814
A patch file that resolves the issue described in this bug report

The NOvA experiment recently upgraded to Geant4 v11.0.3, and discovered a bug introduced in several EM physics processes, in which the destructor of these processes deletes the global configuration.

NOvA utilises some custom physics algorithms which instantiate additional temporary instances of these classes, and when these instances are destroyed, any existing or future instances of these processes are left with garbage values in their physics configuration, which leads to undefined behaviour.

The affected processes are:
- G4eBremsstrahlungRelModel
- G4PairProductionRelModel
- G4BetheHeitlerModel

The NOvA experiment has developed a custom patch that adds an instance counter to these physics processes, so the destructor will only remove the global configuration when the instance counter drops down to zero. I've attached the patch we used to this bug report. We've confirmed that this fixes the issue in our simulation chain, but the procedure of patching Geant4 on our end is awkward, so we'd appreciate if this fix could be incorporated into the next Geant4 patch release.
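The attached patch is not reproduced here. As a rough illustration of the instance-counter idea described above, here is a minimal standalone sketch. It assumes nothing about the real Geant4 classes: CountedModel, SharedConfig and their members are hypothetical names, and the mutex is an assumption added so the counter is safe if instances are created or destroyed on several threads.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the configuration shared by all instances.
struct SharedConfig {
  double parameter = 1.0;
};

// Hypothetical model class; NOT the real G4eBremsstrahlungRelModel.
class CountedModel {
public:
  CountedModel() {
    std::lock_guard<std::mutex> lock(sMutex);
    // The first live instance allocates the shared configuration.
    if (sCount++ == 0) { sConfig = new SharedConfig(); }
  }

  ~CountedModel() {
    std::lock_guard<std::mutex> lock(sMutex);
    assert(sCount > 0);
    // Only the last live instance releases the shared configuration.
    if (--sCount == 0) {
      delete sConfig;
      sConfig = nullptr;
    }
  }

  // Copying would bypass the counter, so it is disabled.
  CountedModel(const CountedModel&) = delete;
  CountedModel& operator=(const CountedModel&) = delete;

  static bool ConfigAlive() {
    std::lock_guard<std::mutex> lock(sMutex);
    return sConfig != nullptr;
  }

  static int InstanceCount() {
    std::lock_guard<std::mutex> lock(sMutex);
    return sCount;
  }

private:
  static std::mutex sMutex;
  static int sCount;
  static SharedConfig* sConfig;
};

std::mutex CountedModel::sMutex;
int CountedModel::sCount = 0;
SharedConfig* CountedModel::sConfig = nullptr;
```

With this pattern, destroying a temporary CountedModel while another instance is still alive leaves the shared configuration intact; it is released only when the last instance goes away.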

Thank you!
Comment 1 Vladimir.Ivantchenko 2023-06-03 10:59:31 CEST
Hello,

thank you for the report. A similar situation may happen in other cases; we have already fixed some of them. We will try to address the problem as soon as possible.

VI
Comment 2 Vladimir.Ivantchenko 2023-07-09 00:11:57 CEST
Hello,

Two public versions of Geant4 were released recently: 11.1p02 and 11.2beta. In 11.2beta the race at initialisation is fixed. Unfortunately, the initialisation code was significantly modified, and it was not possible to make such modifications in a patch release.

However, even in previous versions of the code it is possible to reduce the probability of a data race. To do so, the following simple rules should be applied:

   1) extra model objects in user code should be created only in worker threads;

   2) extra models should be created via "new" and should never be deleted in the user code;

   3) smart pointers should not be used for model objects.

These rules are also relevant to 11.2beta, where the problem is solved. All model objects follow the Geant4 register mechanism and will be destroyed at the end of the job. A user could potentially create and destroy such objects, but there is no guarantee that this will be done coherently.
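As an illustration of the ownership pattern these rules imply, here is a minimal standalone sketch. It does not use the Geant4 API: ToyRegistry, ToyModel and MakeExtraModel are hypothetical names, and the thread-local registry is only an assumption standing in for per-worker-thread bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class ToyModel;

// Hypothetical registry that owns every model created on this thread.
class ToyRegistry {
public:
  static ToyRegistry& Instance() {
    static thread_local ToyRegistry registry;  // one registry per thread
    return registry;
  }
  void Register(ToyModel* model) {
    assert(model != nullptr);
    models_.push_back(model);
  }
  std::size_t Size() const { return models_.size(); }
  void Clear();                // deletes all registered models ("end of job")
  ~ToyRegistry() { Clear(); }

private:
  std::vector<ToyModel*> models_;
};

// Hypothetical model that registers itself on construction.
class ToyModel {
public:
  explicit ToyModel(std::string name) : name_(std::move(name)) {
    ToyRegistry::Instance().Register(this);
  }
  const std::string& Name() const { return name_; }

private:
  std::string name_;
};

void ToyRegistry::Clear() {
  for (ToyModel* model : models_) { delete model; }
  models_.clear();
}

// User code following rules 2 and 3: a raw "new", no delete and no smart
// pointer, because the registry already owns the object.
ToyModel* MakeExtraModel() {
  return new ToyModel("extraModel");
}
```

In this sketch, wrapping the returned pointer in a std::unique_ptr or deleting it by hand would lead to a double delete when the registry is cleared, which is the kind of incoherent destruction the rules are meant to avoid.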

So, the recommendation is to follow these rules. If possible, please check whether the data race is still there in 11.2beta.

VI
Comment 3 Vladimir.Ivantchenko 2023-09-13 12:16:40 CEST
Hello,

Many thanks for this bug report.

The problem is fixed for Geant4 version 11.2, due at the end of the year. The preliminary fix is included in the public 11.2beta, which should work for your environment.

VI