Problem 917 - ATLAS simulation jobs stuck
Summary: ATLAS simulation jobs stuck
Status: RESOLVED WONTFIX
Alias: None
Product: Geant4
Classification: Unclassified
Component: processes/transportation
Version: 7.1
Hardware: PC Linux
Importance: P2 normal
Assignee: John Apostolakis
Reported: 2006-12-14 08:14 CET by andrea.di.simone
Modified: 2010-09-13 11:42 CEST
Description andrea.di.simone 2006-12-14 08:14:45 CET
Hi,

we are receiving reports of simulation jobs that run forever, consuming all the
CPU resources of the machines they run on and never exiting.
The problem is rare but reproducible. Running with the most verbose
tracking, one gets:

#Step#    X(mm)    Y(mm)    Z(mm) KinE(MeV)  dE(MeV) StepLeng TrackLeng NextVolume ProcName
  764 1.12e+03 -1.15e+03     -120    0.0273     1.01  0.00968  1.65e+03 LArMgr::LAr::EMB::ThickAbs::Straight AntiProtonInelastic

 >>DefinePhysicalStepLength (List of proposed StepLengths):
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = XTR (Forced)
    ++ProposedStep(PostStep ) = 3.130561811917341 : ProcName = LElastic (No ForceCondition)
    ++ProposedStep(PostStep ) = 4.260258727918588 : ProcName = AntiProtonInelastic (No ForceCondition)
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = hIoni (No ForceCondition)
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = msc (Forced)
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = Transportation (Forced)
    ++ProposedStep(AlongStep) = 0.0005158306156931578 : ProcName = hIoni (CandidateForSelection)
    ++ProposedStep(AlongStep) = 0.0002225454306019344 : ProcName = msc (NotCandidateForSelection)

and then nothing else is printed. As I said, the job keeps using the CPU.
Another output, from a different job, shows the same effect in a different
subdetector:

#Step#    X(mm)    Y(mm)    Z(mm) KinE(MeV)  dE(MeV) StepLeng TrackLeng NextVolume ProcName
  285      151      598      298   0.00557     1.16     0.45       702 TRT::Radiator AntiProtonInelastic

 >>DefinePhysicalStepLength (List of proposed StepLengths):
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = XTR (Forced)
    ++ProposedStep(PostStep ) = 1479.088703660263 : ProcName = LElastic (No ForceCondition)
    ++ProposedStep(PostStep ) = 189.8539681874653 : ProcName = AntiProtonInelastic (No ForceCondition)
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = hIoni (No ForceCondition)
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = msc (Forced)
    ++ProposedStep(PostStep ) = 1.797693134862316e+308 : ProcName = Transportation (Forced)
    ++ProposedStep(AlongStep) = 0.004088077124961289 : ProcName = hIoni (CandidateForSelection)
    ++ProposedStep(AlongStep) = 0.003531294354111748 : ProcName = msc (NotCandidateForSelection)

I would be able to easily test any patch you may provide.

Thanks for your help,

Andrea.
Comment 1 John Apostolakis 2007-01-19 10:17:59 CET
It appears to me that the problem is in the Transportation process rather than
in the tracking.  The tracking code's behaviour cannot differ from track to
track, whereas the propagation in a magnetic field (or potentially other
modules) called by the Transportation could be iterating to move a track.
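
To illustrate the kind of failure meant here, the following is a minimal,
self-contained sketch of an iterative boundary search with an iteration cap.
It is not the Geant4 algorithm: locateBoundary, LocateResult and all of their
parameters are hypothetical names invented for this example. It only shows
how a search loop whose convergence criterion is never met would spin
forever unless something bounds it.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Hypothetical illustration only: none of these names are Geant4 API.
struct LocateResult {
    double position;   // best estimate of the boundary crossing
    int    iterations; // bisection steps actually performed
    bool   converged;  // false if the iteration cap was reached first
};

// Bisect between a point known to be inside a volume and a point known to
// be outside it, until the bracket is narrower than `tolerance`.
// The loop gives up after `maxIterations` steps instead of running forever.
inline LocateResult locateBoundary(const std::function<bool(double)>& isInside,
                                   double inside, double outside,
                                   double tolerance, int maxIterations)
{
    assert(maxIterations >= 0);
    int n = 0;
    while (std::fabs(outside - inside) > tolerance) {
        if (n >= maxIterations) {
            // Convergence criterion not met: report failure to the caller.
            return {0.5 * (inside + outside), n, false};
        }
        const double mid = 0.5 * (inside + outside);
        if (isInside(mid)) {
            inside = mid;   // crossing lies beyond mid
        } else {
            outside = mid;  // crossing lies before mid
        }
        ++n;
    }
    return {0.5 * (inside + outside), n, true};
}
```

With a sensible tolerance the search converges in a few tens of steps. With a
tolerance the bracket can never reach (0.0 in this toy case) the loop would
not terminate without the cap, and the caller instead receives converged ==
false and can decide what to do with the track.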

A couple of questions:
  - Can you confirm that this is a charged track? (I see AntiProtonInelastic.)
  - Can you try, for just this event (and also for similar runs), the new
version of G4PropagatorInField that is provided with Geant4 8.2? It has a
revised algorithm for locating the boundary intersection.

If neither of these helps, then I will try to suggest different things to try,
in order to help identify what is happening.

Best regards,  John Apostolakis
Comment 2 John Apostolakis 2010-09-13 11:42:44 CEST
This problem appears to concern an outdated version of Geant4 and, as such, to be no longer relevant.