Problem 2505 - Crash in CMS production with Geant4 10.7p02
Summary: Crash in CMS production with Geant4 10.7p02
Status: RESOLVED FIXED
Alias: None
Product: Geant4
Classification: Unclassified
Component: geometry/magneticfield
Version: 10.7
Hardware: All All
Importance: P4 critical
Assignee: John Apostolakis
Reported: 2022-08-11 16:49 CEST by Vladimir.Ivantchenko
Modified: 2023-03-22 23:41 CET
Description Vladimir.Ivantchenko 2022-08-11 16:49:26 CEST
Hi John,

A new problem was seen in CMS MinBias event production.

It happens in Air with a ~5 keV e- in the area of the CMS tracker. This particle should not affect any result. Having the first JustWarning is fine, but there is a strange thing: it shows that within this step the energy of the electron varies from 2 to 4 keV at different points. This is difficult to understand.

The FatalException happens after the warning. From my point of view there is no reason to raise a fatal exception for such a low-energy particle, whatever numerical problem happens. I would suggest issuing a JustWarning and returning "true".
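The suggestion above could be sketched roughly as follows. This is a standalone illustration, not the Geant4 code in G4IntegrationDriver::AccurateAdvance() and not the protection implemented in CMSSW; the names CheckProposedStep, HandleProposedStep, and the 10 keV threshold kLowEnergyLimitMeV are all hypothetical.

```cpp
#include <iostream>

// Hypothetical outcome of checking a proposed integration step.
enum class StepCheck { Ok, WarnAndContinue, Fatal };

// Assumption: tracks below this kinetic energy (MeV) are treated as negligible.
constexpr double kLowEnergyLimitMeV = 0.01;  // 10 keV, illustrative value only

// Classify a proposed step length. Non-negative steps pass; a negative step
// is downgraded to a warning when the track energy is below the limit.
StepCheck CheckProposedStep(double hstep, double ekinMeV) {
    if (hstep >= 0.0) return StepCheck::Ok;
    if (ekinMeV < kLowEnergyLimitMeV) return StepCheck::WarnAndContinue;
    return StepCheck::Fatal;
}

// Returns true when tracking may continue, false when the event must abort.
bool HandleProposedStep(double hstep, double ekinMeV) {
    switch (CheckProposedStep(hstep, ekinMeV)) {
        case StepCheck::Ok:
            return true;
        case StepCheck::WarnAndContinue:
            std::cerr << "Warning: negative proposed step hstep = " << hstep
                      << " for low-energy track, continuing\n";
            return true;
        case StepCheck::Fatal:
            std::cerr << "Fatal: negative proposed step hstep = " << hstep << "\n";
            return false;
    }
    return false;  // unreachable, keeps compilers quiet
}
```

With the values from the output in this report, HandleProposedStep(-1.50904e-08, 0.00420221) would print a warning and return true instead of aborting the event.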

............................................

The detailed output follows:

Begin processing the 12305th record. Run 1, Event 573520305, LumiSection 573521 on stream 7 at 04-Aug-2022 20:51:17.502 -03
%MSG-w SimG4CoreApplication:  (NoModuleName) 04-Aug-2022 20:51:18 -03 pre-events
 
-------- WWWW ------- G4Exception-START -------- WWWW -------
*** G4Exception : GeomNav0003
      issued by : G4MultiLevelLocator::EstimateIntersectionPoint()
       Current Position  and  Direction 
Step#       s        X(mm)      Y(mm)      Z(mm)    N_x     N_y     N_z   Delta|N|   StepLen  StartSafety   PhsStep 
Step taken was -2.08443e-08 out of PhysicalStep= -1
Final safety is: 0.0160097
Chord length = 0.0298999
 
Error in advancing propagation.
   The final curve point is NOT further along  than the original!
   Going *backwards* from len(A) = 0.177098  to len(B) = 0.177098
      Curve distance is -2.08443e-08 mm 
      Point A' (start) is  (  X= 126.959311 629.745713 700.289106  P= 0.0311682907 0.018383016 -0.037952622  Pmag= 0.0524385273 Ekin= 0.00268356504 l= 0.177097702356 m0= 0.510999 (Pdir-1)= 0 t_lab= 0 t_proper= 0 PolV= (0,0,0)  ) 
      Point B' (end)   is  (  X= 126.950834 629.724863 700.308789  P= 0.0550081272 0.0087014188 -0.0379411875  Pmag= 0.0673879993 Ekin= 0.00442424477 l= 0.177097681511 m0= 0.510999 (Pdir-1)= 0 t_lab= 0 t_proper= 0 PolV= (0,0,0)  ) 
      fEpsStep= 0.01


 In full precision, the position, momentum, E_kin, length, rest mass  ... are: 
 Point A[0] (Curve   start) is  (  X= 126.934231 629.81948 700.409556  P= -0.0526687685 -0.010295613 -0.0378458583  Pmag= 0.0656681644 Ekin= 0.0042022098 l= 0 m0= 0.510999 (Pdir-1)= 0 t_lab= 6.09523 t_proper= 0 PolV= (0,0,0)  ) 
 Point S    (Sub     start) is  (  X= 126.94159 629.74136 700.308497  P= 0.0361863519 -0.00184533727 -0.0379075923  Pmag= 0.0524389444 Ekin= 0.00268360762 l= 0.15029113595 m0= 0.510999 (Pdir-1)= -1.11022e-16 t_lab= 0 t_proper= 0 PolV= (0,0,0)  ) 
 Point A'   (Current start) is  (  X= 126.959311 629.745713 700.289106  P= 0.0311682907 0.018383016 -0.037952622  Pmag= 0.0524385273 Ekin= 0.00268356504 l= 0.177097702356 m0= 0.510999 (Pdir-1)= 0 t_lab= 0 t_proper= 0 PolV= (0,0,0)  ) 
 Point E    (Trial Point)   is (126.95317432736638352,629.73061822354270589,700.30335614091939078)
 Point F    (Intersection)  is  (  X= 126.959311 629.745713 700.289106  P= 0.0311682907 0.018383016 -0.037952622  Pmag= 0.0524385273 Ekin= 0.00268356504 l= 0.177097702356 m0= 0.510999 (Pdir-1)= 0 t_lab= 0 t_proper= 0 PolV= (0,0,0)  ) 
 Point B'   (Current end)   is  (  X= 126.950834 629.724863 700.308789  P= 0.0550081272 0.0087014188 -0.0379411875  Pmag= 0.0673879993 Ekin= 0.00442424477 l= 0.177097681511 m0= 0.510999 (Pdir-1)= 0 t_lab= 0 t_proper= 0 PolV= (0,0,0)  ) 
 Point B[0] (Curve   end)   is  (  X= 126.950834 629.724863 700.308789  P= 0.0550081272 0.0087014188 -0.0379411875  Pmag= 0.0673879993 Ekin= 0.00442424477 l= 0.177097681511 m0= 0.510999 (Pdir-1)= 0 t_lab= 0 t_proper= 0 PolV= (0,0,0)  ) 
 
 LocateIntersection parameters are : 
      Substep no (total) = 4
      Substep no         = 4 at depth= 0
 * Location: G4MultiLevelLocator::EstimateIntersectionPoint()- After EndIf(Intersects_AF)
 * Bool flags:  Recalculated = 0   Intersects_AF = 0   Intersects_FB = 1
 * Number of calls to MLL:EIP= 92871641
 
TrackID=198806 ParentID=198629  e-; Ekin(MeV)=0.00420221; time(ns)=6.09523; status=0
   position(mm): (126.934,629.819,700.41); direction: (-0.802044,-0.156782,-0.57632)
   PhysicalVolume: tob:TOB_1; material: materials:Air
   stepNumber=7; stepLength(mm)=0.236909; weight=1; creatorProcess: eIoni; modelID=-1 
*** This is just a warning message. ***
-------- WWWW -------- G4Exception-END --------- WWWW -------

%MSG
%MSG-w SimG4CoreApplication:  (NoModuleName) 04-Aug-2022 20:51:18 -03 pre-events
 
-------- EEEE ------- G4Exception-START -------- EEEE -------
*** G4Exception : GeomField0003
      issued by : G4IntegrationDriver::AccurateAdvance()
Invalid run condition.
Proposed step is negative; hstep = -1.50904e-08.
Requested step cannot be negative! Aborting event.
TrackID=198806 ParentID=198629  e-; Ekin(MeV)=0.00420221; time(ns)=6.09523; status=2
   position(mm): (126.934,629.819,700.41); direction: (-0.802044,-0.156782,-0.57632)
   PhysicalVolume: tob:TOB_1; material: materials:Air
   stepNumber=7; stepLength(mm)=0.236909; weight=1; creatorProcess: eIoni; modelID=-1 
 
-------- EEEE -------- G4Exception-END --------- EEEE -------
 
%MSG
----- Begin Fatal Exception 04-Aug-2022 20:51:18 -03-----------------------
An exception of category 'Geant4 fatal exception' occurred while
   [0] Processing  Event run: 1 lumi: 573521 event: 573520280 stream: 3
   [1] Running path 'RAWSIMoutput_step'
   [2] Prefetching for module PoolOutputModule/'RAWSIMoutput'
   [3] Calling method for module OscarMTProducer/'g4SimHits'
----- End Fatal Exception -------------------------------------------------
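
As a cross-check of the numbers printed in the warning above, the small sketch below recomputes the kinetic energy from the printed Pmag and m0, and the curve distance from the printed l values of points A' and B'. The helper names KineticEnergy and CurveAdvance are hypothetical; the formula is the standard relativistic relation Ekin = sqrt(p^2 + m0^2) - m0 in MeV.

```cpp
#include <cmath>

// Relativistic kinetic energy (MeV) from momentum magnitude and rest mass,
// both in MeV (with c = 1).
double KineticEnergy(double pmag, double m0) {
    return std::sqrt(pmag * pmag + m0 * m0) - m0;
}

// Signed advance along the curve (mm) when going from point A to point B.
// A negative value means B lies behind A.
double CurveAdvance(double lenA, double lenB) {
    return lenB - lenA;
}
```

KineticEnergy(0.0524385273, 0.510999) gives about 0.002684 MeV for point A' and KineticEnergy(0.0673879993, 0.510999) gives about 0.004424 MeV for point B', in agreement with the printed Ekin values. So the 2.7 keV versus 4.4 keV difference appears to sit in the momentum magnitudes of the two states themselves, not in how the energy is printed. CurveAdvance(0.177097702356, 0.177097681511) is about -2.08e-08 mm, matching the reported curve distance.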
Comment 1 Vladimir.Ivantchenko 2022-08-14 14:39:06 CEST
Hi John,

The frequency of this problem for MinBias events is 1/500000.

In CMSSW we have implemented a protection, which seems to work. It would be good to have a real fix that can be applied on top of Geant4 10.7p02 and in the new Geant4 11.1.

Cheers,
Vladimir
Comment 2 John Apostolakis 2022-08-22 13:01:53 CEST
I believe that I have found the origin of this error in G4ChordFinder.

I will submit a fix and, if it works, append it to this message.
Comment 3 John Apostolakis 2022-08-26 18:33:51 CEST
Deeper investigation shows that integration with DormandPrince is quite fragile if the required accuracy is worse than 0.001.

It causes integration errors for potential surface-crossing points that are hard to cope with.

I am continuing the investigation in order to try to craft a robust solution.
Comment 4 John Apostolakis 2022-12-16 11:53:19 CET
A protection has been added in release 11.1 to request application developers to use a larger maximum bound for the relative integration error. 

The limit was set to 0.01 for the time being, with the ability for the developer to increase to 0.02. 

But the intent is to eliminate the use of epsilon_max larger than 0.001 by finding a solution to potential performance issues, e.g. by using the so-called 'B-field' integration driver or its future refinement to cope with lower-energy tracks (typically electrons/positrons or muons) looping in strong fields.
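
The bounds described above can be illustrated with a small standalone check. This is only a sketch of the policy as stated in this comment, not the code added to Geant4 11.1; the names ClassifyEpsilonMax, EpsVerdict, and the constants are hypothetical, and it assumes that "increase to 0.02" means the developer can raise the accepted ceiling from 0.01 up to 0.02.

```cpp
// Bounds on the relative integration error quoted in this comment.
constexpr double kPreferredEpsMax = 0.001;  // intended long-term ceiling
constexpr double kDefaultLimit    = 0.01;   // limit set for the time being
constexpr double kRaisedLimit     = 0.02;   // assumed ceiling after developer opt-in

enum class EpsVerdict { Preferred, Accepted, Rejected };

// Classify a requested epsilon_max. 'developerRaisedLimit' models the
// developer explicitly opting into the larger 0.02 ceiling.
EpsVerdict ClassifyEpsilonMax(double epsMax, bool developerRaisedLimit = false) {
    if (epsMax <= 0.0) return EpsVerdict::Rejected;  // not a valid tolerance
    const double limit = developerRaisedLimit ? kRaisedLimit : kDefaultLimit;
    if (epsMax > limit) return EpsVerdict::Rejected;
    return (epsMax <= kPreferredEpsMax) ? EpsVerdict::Preferred
                                        : EpsVerdict::Accepted;
}
```

Under these assumptions the fEpsStep= 0.01 seen in the reported output would be classified as Accepted, 0.015 would be Rejected unless the limit has been raised, and anything up to 0.001 would be Preferred.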

We will continue to investigate and try to improve the interaction between the accuracy of integration and the correspondence with the physical properties of the integrated track - which still appear to cause problems in the intersection logic, in particular in G4MultiLevelLocator.

As the immediate problem appears to be addressed, I propose that we close it.
Comment 5 John Apostolakis 2023-03-22 23:41:40 CET
Closed as proposed in December.