| Summary: | Fatal Exception due to bad Normal in field propagation when using Error propagation | ||
|---|---|---|---|
| Product: | Geant4 | Reporter: | John Apostolakis <John.Apostolakis> |
| Component: | geometry/navigation | Assignee: | John Apostolakis <John.Apostolakis> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | Gabriele.Cosmo, lorenzo.viliani, Thomas.Hauth, vincenzo.innocente, Vladimir.Ivantchenko |
| Priority: | P5 | ||
| Version: | 9.6 | ||
| Hardware: | PC | ||
| OS: | Linux | ||
| Attachments: | Log file for run showing crash - from Geant4 9.6-patch2; Log file for clean run - from Geant4 9.5-patch02; Log file showing warnings - from Geant4 9.6-patch3; Stack trace of crash for run with Geant4 9.6 patch 02; Analysis of the problem and possible patch; Draft patch file for G4Navigator and G4ErrorPropagationNavigator classes | ||
||
Created attachment 284 [details]
Log file for clean run - from Geant4 9.5-patch02
Added log file output from clean run with Geant4 9.5 (from Thomas and Lorenzo)
Additional information from Thomas, replying to questions:
> Could you provide a stack trace of your program when the Exception is called? (Running in gdb should give you this quickly.)
I added the stack trace at the bottom. The last stack frame within CMSSW is #16 (Geant4ePropagator::propagateGeneric).
> Are you able to reproduce the problem with the debug installation of Geant4?
Do you mean a specific debug compile of G4? I am not sure if this can be used from within CMSSW. But I can give it a try using only the G4 example and see if something shows up.
> How long does it take for the program to crash?
It is hard to say. The startup takes a couple of seconds, but after that the crash happens quite fast (< 1 s). I counted the number of successful propagations, and it is 2. The crash happens on the third.
Created attachment 285 [details]
Log file showing warnings - from Geant4 9.6-patch3
Log file using Geant4 9.6 patch 03. Shows warnings, but no crash.
Created attachment 286 [details]
Stack trace of crash for run with Geant4 9.6 patch 02
Cleaned up stack trace (without the full AFS paths), showing the functions called before the crash.
diff -rupN geant4.10.00.p02.orig/source/geometry/navigation/src/G4Navigator.cc geant4.10.00.p02/source/geometry/navigation/src/G4Navigator.cc
--- geant4.10.00.p02.orig/source/geometry/navigation/src/G4Navigator.cc 2014-09-05 14:46:30.000000000 +0200
+++ geant4.10.00.p02/source/geometry/navigation/src/G4Navigator.cc 2014-09-05 15:37:33.000000000 +0200
@@ -1266,6 +1266,28 @@ void G4Navigator::SetupHierarchy()
}
}
+G4ThreeVector NavigateDaughtersForNormal(const G4LogicalVolume* v, const G4ThreeVector& point){
+ G4ThreeVector ExitNormal(0.,0.,0.);
+ for (int i = 0; i < v->GetNoDaughters(); ++i){
+ const G4VPhysicalVolume* cur_phys = v->GetDaughter(i);
+ const G4LogicalVolume* cur_logi = cur_phys->GetLogicalVolume();
+ G4AffineTransform transform = G4AffineTransform(cur_phys->GetRotation(), cur_phys->GetTranslation()).Invert();
+ G4ThreeVector daughterPointOwnLocal=transform.TransformPoint( point );
+ if (cur_logi->GetNoDaughters()>0){
+ ExitNormal = NavigateDaughtersForNormal(cur_logi, daughterPointOwnLocal);
+ if( std::fabs(ExitNormal.mag2()-1.0 ) < CLHEP::perMillion)
+ return ExitNormal;
+ }
+ }
+
+ G4VSolid* solid = v->GetSolid();
+ ExitNormal= -(solid->SurfaceNormal(point));
+ if( std::fabs(ExitNormal.mag2()-1.0 ) < CLHEP::perMillion )
+ return ExitNormal;
+ return ExitNormal;
+
+}
+
// ********************************************************************
// GetLocalExitNormal
//
@@ -1392,18 +1414,21 @@ G4ThreeVector G4Navigator::GetLocalExitN
{
G4VSolid* daughterSolid =fHistory.GetTopVolume()->GetLogicalVolume()
->GetSolid();
- ExitNormal= -(daughterSolid->SurfaceNormal(fLastLocatedPointLocal));
+ ExitNormal= NavigateDaughtersForNormal(fHistory.GetTopVolume()->GetLogicalVolume(), fLastLocatedPointLocal);
if( std::fabs(ExitNormal.mag2()-1.0 ) > CLHEP::perMillion )
{
G4ExceptionDescription desc;
desc << " Parameters of solid: " << *daughterSolid
<< " Point for surface = " << fLastLocatedPointLocal << std::endl;
G4Exception("G4Navigator::GetLocalExitNormal()",
- "GeomNav0003", FatalException, desc,
+ "GeomNav0003", JustWarning, desc,
"Surface Normal returned by Solid is not a Unit Vector." );
- }
- fCalculatedExitNormal= true;
- *valid = true;
+ fCalculatedExitNormal= false;
+ *valid = false;
+ } else {
+ fCalculatedExitNormal=true;
+ *valid = true;
+ }
}
else
{
@@ -1417,10 +1442,10 @@ G4ThreeVector G4Navigator::GetLocalExitN
{
*valid = false;
fCalculatedExitNormal= false;
- G4ExceptionDescription message;
- message << "Function called when *NOT* at a Boundary." << G4endl;
- G4Exception("G4Navigator::GetLocalExitNormal()",
- "GeomNav0003", JustWarning, message);
+ //G4ExceptionDescription message;
+ //message << "Function called when *NOT* at a Boundary." << G4endl;
+ //G4Exception("G4Navigator::GetLocalExitNormal()",
+ // "GeomNav0003", JustWarning, message);
}
}
}
Hi John,
in the previous comment I have copied the patch proposed by Piergiulio Lenzi. Alternatively, the patch can be accessed at this URL: https://github.com/lenzip/cmssw/blob/geant4e_test_giulio/TrackPropagation/Geant4e/test/geant4.10.p2_exitNormal.patch
Cheers,
Vladimir
Created attachment 288 [details]
Analysis of the problem and possible patch
Comment on attachment 288 [details]
Analysis of the problem and possible patch
Hi,
this presentation contains a description of the problem and the analysis of the cause.
Cheers
Giulio
I have identified missing functionality in G4ErrorPropagationNavigator which is responsible for errors similar to the ones reported below. In the simpler case of the error propagation example, patches in this class and in G4Navigator now give reliable results. I have shared the improved versions of these files with those reporting the issue. Once it is checked that the same corrections also fix the issue for the CMS use case, we can mark this as resolved.
Created attachment 296 [details]
Draft patch file for G4Navigator and G4ErrorPropagationNavigator classes
Proposed corrections in tar format file.
Identified an issue in G4ErrorPropagationNavigator::ComputeStep: it calls G4Navigator::ComputeSafety, which has a side effect regarding the location. Created a revised version of G4ErrorPropagationNavigator which addresses this. |
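The comment above does not say how the revised class handles the side effect. As a generic illustration only, one common way to protect a caller from a query that changes the located point is to save that state and restore it afterwards. The sketch below is a toy model under that assumption: `ToyNavigator`, `LocationGuard` and `SafetyWithoutRelocation` are hypothetical names, not Geant4 classes or functions, and the safety value is a dummy.

```cpp
// Toy model only: none of these types are Geant4 classes.
struct Point {
  double x, y, z;
};

class ToyNavigator {
 public:
  void Locate(const Point& p) { located_ = p; }
  const Point& LocatedPoint() const { return located_; }

  // Stand-in for a safety query that, as a side effect,
  // also overwrites the navigator's located point.
  double ComputeSafety(const Point& p) {
    located_ = p;  // the unwanted side effect
    return 1.0;    // dummy safety distance
  }

 private:
  Point located_{0.0, 0.0, 0.0};
};

// Hypothetical RAII guard: remembers the located point on construction
// and puts it back when the guard goes out of scope.
class LocationGuard {
 public:
  explicit LocationGuard(ToyNavigator& nav)
      : nav_(nav), saved_(nav.LocatedPoint()) {}
  ~LocationGuard() { nav_.Locate(saved_); }
  LocationGuard(const LocationGuard&) = delete;
  LocationGuard& operator=(const LocationGuard&) = delete;

 private:
  ToyNavigator& nav_;
  Point saved_;
};

// Query the safety at a probe point while leaving the navigator's
// located point as the caller set it.
double SafetyWithoutRelocation(ToyNavigator& nav, const Point& probe) {
  LocationGuard guard(nav);
  return nav.ComputeSafety(probe);
}
```

With this guard, a step computation that asks for the safety at another point still sees its original located point afterwards, whereas a direct call to the toy `ComputeSafety` leaves the navigator relocated.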
Created attachment 283 [details]
Log file for run showing crash - from Geant4 9.6-patch2
Thomas Hauth has reported a problem in CMS software when using G4ErrorPropagator: the G4Exception "Surface Normal returned by Solid is not a Unit Vector." is reported in G4Navigator. This exception is fatal and processing stops.
This problem occurs with Geant4 9.6 patch 02 and 03, and with Geant4 release 10.0, but not with version 9.5 patch 2.
Further details from his email (25 March 2014):
'I now did some further studies to understand a problem we see when using the G4 error propagation package in the G4 version 4.9.6 and above. I discovered the problem when I tried to use the G4 version which is shipped with our new CMSSW releases 7.0 / 7.1. The G4 version here is geant4.9.6.p02-cms4.
The attached log file "geant4.9.6.p02-cms4_errorprop_crash.txt" contains some lines
-------- WWWW ------- G4Exception-START -------- WWWW ------
which report "Function called when *NOT* at a Boundary." inside of G4Navigator. This seems to be considered a warning and the G4 processing continues, but after some time the G4Exception "Surface Normal returned by Solid is not a Unit Vector." is also reported in G4Navigator. This exception is fatal and processing stops.
Using the G4 9.5 p2 version, I never saw these G4Exceptions when using the error propagation in CMS.
Lorenzo (in CC) helped me trace whether this might be a problem which was introduced in some recent G4 releases. He had a look at G4 10 first and ran the example in examples/extended/errorpropagation, and the G4Exception "Function called when *NOT* at a Boundary." (the non-fatal one) shows up a couple of times when running the example code. However, it does not crash and continues the example to the end.
I did some further tests:
in g4.9.6.p03 -> same as in G4.10
in g4.9.5.p02 the exceptions do not show up.
This seems to indicate that some strange effect was introduced in between 4.9.5p2 and 4.9.6p3. I added the log files from running the example in g4.9.6.p03 and g4.9.5.p02 to this mail.
Another interesting discovery we made is that the problem might be related to the location of the propagation target. The G4Exception for the example is always reported in the very last step before hitting the target. Furthermore, Lorenzo discovered that when the propagation target is moved outside of the muon detector in the example, the G4Exception disappears.
Can you think of any changes g4.9.5 -> g4.9.6 which might provoke this behavior?'
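For reference, the fatal exception quoted in this report comes from a unit-length test on the surface normal, which the patch in this thread writes as `std::fabs(ExitNormal.mag2()-1.0) > CLHEP::perMillion`. The following is a minimal, self-contained sketch of that test. `Vec3`, `IsUnitNormal` and `ExitNormalOrInvalid` are illustrative names that do not exist in Geant4 (which uses G4ThreeVector), and `kPerMillion` simply repeats the numerical value of `CLHEP::perMillion`.

```cpp
#include <cmath>

// Minimal stand-in for a 3-vector; Geant4 itself uses G4ThreeVector.
struct Vec3 {
  double x, y, z;
  double mag2() const { return x * x + y * y + z * z; }
};

// Same numerical value as CLHEP::perMillion.
constexpr double kPerMillion = 1.0e-6;

// True when |n|^2 differs from 1 by no more than kPerMillion,
// i.e. the condition under which the navigator check does not fire.
bool IsUnitNormal(const Vec3& n) {
  return std::fabs(n.mag2() - 1.0) <= kPerMillion;
}

// Hypothetical helper: negate the solid's surface normal, as the patch
// does, and report validity through *valid instead of aborting.
Vec3 ExitNormalOrInvalid(const Vec3& surfaceNormal, bool* valid) {
  Vec3 exitNormal{-surfaceNormal.x, -surfaceNormal.y, -surfaceNormal.z};
  *valid = IsUnitNormal(exitNormal);
  return exitNormal;
}
```

A zero vector, such as the `G4ThreeVector ExitNormal(0.,0.,0.)` initial value in the patch, fails this test, so a caller of such a helper would see `*valid == false` rather than a fatal stop.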