Problem 1247

Summary: Occasionally occurring infinite loop in optical photon tracking
Product: Geant4
Reporter: Matthias Nagl <geantbugs>
Component: processes/optical
Assignee: gum
Status: RESOLVED WORKSFORME
Severity: normal
CC: John.Apostolakis
Priority: P5
Version: 9.4
Hardware: All
OS: All
URL: https://code.google.com/p/scintillate/

Description Matthias Nagl 2011-09-02 17:21:00 CEST
While using an extended Geant 4.9.4_p02 example to simulate the scintillation process in small scintillation detectors, many simulation runs get stuck. This appears to be caused by a bug in the optical photon tracking.

Execution gets stuck in the loop in G4TrackingManager.cc between lines 122 and 134. However, a lot happens in each iteration of this loop, so I do not know exactly where the source of the problem is located. There seem to be many "transport" steps with zero path length, but these also occur during normal operation.
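
One way to narrow down a hang like this is to count consecutive zero-length steps per track and flag the track once a limit is exceeded. The following is a minimal standalone sketch of that idea, not Geant4 code: the class ZeroStepWatchdog, the function firstStuckStep, the limit and the tolerance are all hypothetical names and values chosen for illustration. In a real application the same counter would presumably be fed with the step length from a user stepping action.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Geant4): counts consecutive steps whose
// length is at or below a tolerance and reports when a limit is exceeded.
class ZeroStepWatchdog {
public:
    ZeroStepWatchdog(std::size_t maxConsecutive, double tolerance)
        : maxConsecutive_(maxConsecutive), tolerance_(tolerance) {}

    // Call once per step; returns true when the track looks stuck.
    bool stepLooksStuck(double stepLength) {
        if (stepLength <= tolerance_) {
            ++consecutive_;
        } else {
            consecutive_ = 0;  // real progress resets the counter
        }
        return consecutive_ > maxConsecutive_;
    }

    // Call at the start of every new track.
    void reset() { consecutive_ = 0; }

private:
    std::size_t maxConsecutive_;
    double tolerance_;
    std::size_t consecutive_ = 0;
};

// Returns the index of the first step at which the watchdog fires,
// or -1 if the sequence of step lengths never triggers it.
int firstStuckStep(const std::vector<double>& stepLengths,
                   std::size_t maxConsecutive) {
    ZeroStepWatchdog watchdog(maxConsecutive, 1e-12);  // tolerance is an assumption
    for (std::size_t i = 0; i < stepLengths.size(); ++i) {
        if (watchdog.stepLooksStuck(stepLengths[i])) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Because occasional zero-length steps also occur during normal operation, the limit has to be generous; the watchdog only distinguishes "a few" from "suspiciously many in a row" and does not explain why the steps have zero length.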

The simulation code that triggers the problem can be downloaded from https://code.google.com/p/scintillate
The problem occurs almost always after execution of the macro runs/PerfectLYSO_20x40_whiteground_1275keV_10deg.mac
(https://code.google.com/p/scintillate/source/browse/trunk/runs/PerfectLYSO_20x40_whiteground_1275keV_10deg.mac)
which uses the detector definition from here:
https://code.google.com/p/scintillate/source/browse/trunk/scintillators/PerfectLYSO_20x40_whiteground.mac
Comment 1 John Apostolakis 2011-09-21 03:45:46 CEST
Accepting the report - and trying to change it to "Geant4" product.
Comment 2 gum 2012-02-07 03:07:32 CET
I was able to download your sources, compile, link and run. However, I see that each event is rather lengthy and you require 10000 events. Do you have the random number seed of the event that loops, assuming that you continue to see infinite loops with 9.5 as well? Ideally, we would know the origin, energy, polarization and direction of the optical photon that loops in your geometry and launch it alone. Would you be able to extract this information, and are you able to launch individual photons in your simulation?
Comment 3 Matthias Nagl 2012-05-20 00:20:25 CEST
Sorry for the long delay!

There are several cases where the problem occurs almost every time I start the particular macro file - even using the most recent version of Geant4.
Two examples which you can find in the google-code repository are:
runs/LaBr_conical40x20x10_whiteground_1275keV_iso.mac
runs/PerfectLYSO_20x40_whiteground_1275keV_0deg.mac

Despite many attempts I was not able to get any of these to pass all 10000 runs without hanging.

Is there an easy way to get the information you requested from a hanging instance?
Comment 4 gum 2012-05-23 01:39:37 CEST
Hi Matthias,

To know the random number seed at the beginning of the event that is looping (equivalent to knowing the seed at the end of the previous event), you need code equivalent to what is in /examples/extended/field/field04/src/F04EventAction.cc:

http://www-geant4.kek.jp/lxr/source/examples/extended/field/field04/src/F04EventAction.cc#L86

and the rest in that example that's related to saveEngineStatus (F04RunActionMessenger.cc)

Once you know the seed, and it is written to file, you can launch only that one event by seeding the program from the file and then you can debug where exactly the looping happens and maybe why. If you have access to the source code you can dig into it either with gdb or your own coded G4cout statements until you find where the loop is.

Thanks, Peter
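
The save-and-replay idea described above can be illustrated without Geant4. The sketch below uses std::mt19937 from the C++ standard library as a stand-in for the CLHEP engine; simulateEvent, saveEngineState, restoreEngineState, runAndRecord, replayEvent and the file naming are hypothetical and only serve the illustration. In a Geant4 application the saving would be done by CLHEP::HepRandom::saveEngineStatus, with the corresponding restore call used for the replay.

```cpp
#include <fstream>
#include <random>
#include <string>
#include <vector>

// Stand-in for one simulated event: it consumes a varying number of
// random values, as a real event would.
std::vector<double> simulateEvent(std::mt19937& engine) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    const int nDraws = 1 + static_cast<int>(engine() % 5u);
    std::vector<double> draws;
    for (int i = 0; i < nDraws; ++i) {
        draws.push_back(uniform(engine));
    }
    return draws;
}

// Write the complete engine state to a text file.
void saveEngineState(const std::mt19937& engine, const std::string& path) {
    std::ofstream out(path);
    out << engine;
}

// Read the engine state back from a file written by saveEngineState.
void restoreEngineState(std::mt19937& engine, const std::string& path) {
    std::ifstream in(path);
    in >> engine;
}

// Run nEvents events, saving the engine state before each one.
// Returns the random values consumed by event number targetEvent.
std::vector<double> runAndRecord(unsigned seed, int nEvents, int targetEvent,
                                 const std::string& prefix) {
    std::mt19937 engine(seed);
    std::vector<double> target;
    for (int ev = 0; ev < nEvents; ++ev) {
        saveEngineState(engine, prefix + std::to_string(ev) + ".rndm");
        std::vector<double> result = simulateEvent(engine);
        if (ev == targetEvent) {
            target = result;
        }
    }
    return target;
}

// Replay a single event from its saved state, skipping all earlier events.
std::vector<double> replayEvent(int targetEvent, const std::string& prefix) {
    std::mt19937 engine;  // initial seed is irrelevant, the state is overwritten
    restoreEngineState(engine, prefix + std::to_string(targetEvent) + ".rndm");
    return simulateEvent(engine);
}
```

The replay only matches the original event if nothing draws random numbers between restoring the state and starting the event, and if the replayed program consumes random numbers in exactly the same order as the original run.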
Comment 5 Matthias Nagl 2012-05-27 10:14:39 CEST
Hello Peter,

I added a UserEventAction that calls CLHEP::HepRandom::saveEngineStatus in its BeginOfEventAction method. However, I am not able to reproduce the infinite loop by reloading the last engine status written before the hang, either at the beginning of a subsequent run or in its UserEventAction. Is there anything I need to pay attention to?

Cheers, Matthias
Comment 6 gum 2012-08-31 02:26:19 CEST
I cannot reproduce the bug. 10000 events run through without an infinite loop when I install, compile, link and run the code myself.

I'll relegate this bug report to 'works-for-me'. I suggest that, if the user finds the time, he compiles, links and runs his code against 9.4.p03. Maybe it has 'resolved itself' since. In any case, I cannot trace the problem without more help from the user. If the event cannot be isolated and the infinite loop cannot be reliably reproduced, then the cause must be a memory issue caused by a previous event. In that case, the problem may not be reproducible except on the very platform where it appears. No other users have reported similar problems in their applications. So, in all likelihood, the cause of the problem lies with the user's code.