Problem 1679 - Fatal Exception due to bad Normal in field propagation when using Error propagation
Summary: Fatal Exception due to bad Normal in field propagation when using Error propa...
Status: RESOLVED FIXED
Alias: None
Product: Geant4
Classification: Unclassified
Component: geometry/navigation
Version: 9.6
Hardware: PC Linux
Importance: P5 normal
Assignee: John Apostolakis
Reported: 2014-10-10 19:24 CEST by John Apostolakis
Modified: 2014-12-15 20:14 CET

Attachments
Log file for run showing crash - from Geant4 9.6-patch2 (536.90 KB, text/plain)
2014-10-10 19:24 CEST, John Apostolakis
Log file for clean run - from Geant4 9.5-patch02 (21.66 KB, text/plain)
2014-10-10 19:28 CEST, John Apostolakis
Log file showing warnings - from Geant4 9.6-patch3 (25.37 KB, text/plain)
2014-10-10 19:42 CEST, John Apostolakis
Stack trace of crash for run with Geant4 9.6 patch 02 (10.61 KB, text/plain)
2014-10-10 19:46 CEST, John Apostolakis
Analysis of the problem and possible patch (991.54 KB, application/x-download)
2014-10-13 12:28 CEST, Piergiulio Lenzi
Draft patch file for G4Navigator and G4ErrorPropagationNavigator classes (27.94 KB, application/octet-stream)
2014-11-19 16:20 CET, John Apostolakis

Description John Apostolakis 2014-10-10 19:24:54 CEST
Created attachment 283
Log file for run showing crash - from Geant4 9.6-patch2

Thomas Haupt has reported a problem in CMS software when using G4ErrorPropagator: the exception
  "Surface Normal returned by Solid is not a Unit Vector."
is reported in G4Navigator. This exception is fatal and processing
stops.

This problem occurs with Geant4 9.6 patch 02 and 03, and with Geant4 release 10.0, but not with version 9.5 patch 2.

Further details from his email (25 March 2014):

'I now did some further studies to understand a problem we see when using
the G4 error propagation package in the G4 version 4.9.6 and above. I
discovered the problem when I tried to use the G4 version which is
shipped with our new CMSSW releases 7.0 / 7.1. The G4 version here is
geant4.9.6.p02-cms4. The attached log file
"geant4.9.6.p02-cms4_errorprop_crash.txt" contains some lines

-------- WWWW ------- G4Exception-START -------- WWWW ------

which report

"Function called when *NOT* at a Boundary." inside of G4Navigator. This
seems to be considered a warning and the G4 processing continues, but
after some time, the G4Exception

"Surface Normal returned by Solid is not a Unit Vector."

is reported also in G4Navigator. This exception is fatal and processing
stops.

Using the G4 9.5 p2 version, I never saw these G4Exceptions when using
the error propagation in CMS.

Lorenzo (in CC) helped me trace whether this might be a problem, which
was introduced in some recent G4 releases. He had a look at G4 10 first
and ran the example in examples/extended/errorpropagation and the
G4Exception "Function called when *NOT* at a Boundary." ( the non-fatal
one) shows up a couple of times when running the example code. However
it does not crash and continues the example to the end.

I did some further tests:

in g4.9.6.p03
	-> same as in G4.10
in g4.9.5.p02
	the exceptions do not show up.

This seems to indicate that some strange effect was introduced
between 4.9.5p2 and 4.9.6p3.
I added the log files from running the example in g4.9.6.p03 and
g4.9.5.p02 to this mail.

Another interesting discovery we made is that the problem might be related to the location of the propagation target. The G4Exception for the example is always reported in the very last step before hitting the target. Furthermore, Lorenzo discovered that when the propagation target is moved outside of the muon detector in the example, the G4Exception disappears.

Can you think of any changes g4.9.5 -> g4.9.6 which might provoke this behavior?'
Comment 1 John Apostolakis 2014-10-10 19:28:06 CEST
Created attachment 284
Log file for clean run - from Geant4 9.5-patch02

Added log file output from clean run with Geant4 9.5 (from Thomas and Lorenzo)
Comment 2 John Apostolakis 2014-10-10 19:39:50 CEST
Additional information from Thomas, replying to questions:

> Could you provide a stack trace of your program when the Exception is called ?  ( Running in gdb should give you this quickly. )

I added the stacktrace at the bottom. The last stack which is within CMSSW is #16 (Geant4ePropagator::propagateGeneric).

> Are you able to reproduce the problem with the debug installation of Geant4 ?

Do you mean a specific debug compile of G4? I am not sure if this can be used from within CMSSW. But I can give it a try using only the G4 example and see if something shows up.

> How long does it take for the program to crash ?
It is hard to say. The startup takes a couple of seconds, but after that, the crash happens quite fast ( < 1s ). I counted the number of successful propagations and it is 2. The crash happens on the third.
Comment 3 John Apostolakis 2014-10-10 19:42:34 CEST
Created attachment 285
Log file showing warnings - from Geant4 9.6-patch3

Log file using Geant4 9.6 patch 03.  Shows warnings, but no crash.
Comment 4 John Apostolakis 2014-10-10 19:46:59 CEST
Created attachment 286
Stack trace of crash for run with Geant4 9.6 patch 02

Cleaned up stack trace (without the full AFS paths), showing the functions called before the crash.
Comment 5 Vladimir.Ivantchenko 2014-10-10 20:29:26 CEST
diff -rupN geant4.10.00.p02.orig/source/geometry/navigation/src/G4Navigator.cc geant4.10.00.p02/source/geometry/navigation/src/G4Navigator.cc
--- geant4.10.00.p02.orig/source/geometry/navigation/src/G4Navigator.cc 2014-09-05 14:46:30.000000000 +0200
+++ geant4.10.00.p02/source/geometry/navigation/src/G4Navigator.cc 2014-09-05 15:37:33.000000000 +0200
@@ -1266,6 +1266,28 @@ void G4Navigator::SetupHierarchy()
}
}
+G4ThreeVector NavigateDaughtersForNormal(const G4LogicalVolume* v, const G4ThreeVector& point){
+ G4ThreeVector ExitNormal(0.,0.,0.);
+ for (int i = 0; i < v->GetNoDaughters(); ++i){
+ const G4VPhysicalVolume* cur_phys = v->GetDaughter(i);
+ const G4LogicalVolume* cur_logi = cur_phys->GetLogicalVolume();
+ G4AffineTransform transform = G4AffineTransform(cur_phys->GetRotation(), cur_phys->GetTranslation()).Invert();
+ G4ThreeVector daughterPointOwnLocal=transform.TransformPoint( point );
+ if (cur_logi->GetNoDaughters()>0){
+ ExitNormal = NavigateDaughtersForNormal(cur_logi, daughterPointOwnLocal);
+ if( std::fabs(ExitNormal.mag2()-1.0 ) < CLHEP::perMillion)
+ return ExitNormal;
+ }
+ }
+
+ G4VSolid* solid = v->GetSolid();
+ ExitNormal= -(solid->SurfaceNormal(point));
+ if( std::fabs(ExitNormal.mag2()-1.0 ) < CLHEP::perMillion )
+ return ExitNormal;
+ return ExitNormal;
+
+}
+
// ********************************************************************
// GetLocalExitNormal
//
@@ -1392,18 +1414,21 @@ G4ThreeVector G4Navigator::GetLocalExitN
{
G4VSolid* daughterSolid =fHistory.GetTopVolume()->GetLogicalVolume()
->GetSolid();
- ExitNormal= -(daughterSolid->SurfaceNormal(fLastLocatedPointLocal));
+ ExitNormal= NavigateDaughtersForNormal(fHistory.GetTopVolume()->GetLogicalVolume(), fLastLocatedPointLocal);
if( std::fabs(ExitNormal.mag2()-1.0 ) > CLHEP::perMillion )
{
G4ExceptionDescription desc;
desc << " Parameters of solid: " << *daughterSolid
<< " Point for surface = " << fLastLocatedPointLocal << std::endl;
G4Exception("G4Navigator::GetLocalExitNormal()",
- "GeomNav0003", FatalException, desc,
+ "GeomNav0003", JustWarning, desc,
"Surface Normal returned by Solid is not a Unit Vector." );
- }
- fCalculatedExitNormal= true;
- *valid = true;
+ fCalculatedExitNormal= false;
+ *valid = false;
+ } else {
+ fCalculatedExitNormal=true;
+ *valid = true;
+ }
}
else
{
@@ -1417,10 +1442,10 @@ G4ThreeVector G4Navigator::GetLocalExitN
{
*valid = false;
fCalculatedExitNormal= false;
- G4ExceptionDescription message;
- message << "Function called when *NOT* at a Boundary." << G4endl;
- G4Exception("G4Navigator::GetLocalExitNormal()",
- "GeomNav0003", JustWarning, message);
+ //G4ExceptionDescription message;
+ //message << "Function called when *NOT* at a Boundary." << G4endl;
+ //G4Exception("G4Navigator::GetLocalExitNormal()",
+ // "GeomNav0003", JustWarning, message);
}
}
}
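The last two hunks of the patch above change how a bad normal is handled: instead of raising a fatal exception, the navigator warns and reports the result as invalid through the `valid` flag, leaving the decision to the caller. The following is only a minimal, self-contained sketch of that pattern. `Vec3`, `kPerMillion` and `CheckedExitNormal` are hypothetical stand-ins introduced for illustration and are not Geant4 API; the tolerance has the same value as `CLHEP::perMillion`.

```cpp
#include <cassert>
#include <cmath>
#include <iostream>

// Hypothetical stand-in for G4ThreeVector; only what the check needs.
struct Vec3 {
  double x, y, z;
  double mag2() const { return x * x + y * y + z * z; }
};

// Same numerical value as CLHEP::perMillion.
constexpr double kPerMillion = 1.0e-6;

// Hypothetical helper: returns the candidate normal unchanged and sets
// *valid to true only if its squared magnitude is within tolerance of 1.
// A non-unit normal produces a warning on stderr instead of aborting.
Vec3 CheckedExitNormal(const Vec3& candidate, bool* valid) {
  const bool isUnit = std::fabs(candidate.mag2() - 1.0) <= kPerMillion;
  if (!isUnit) {
    std::cerr << "Warning: surface normal is not a unit vector\n";
  }
  *valid = isUnit;
  return candidate;
}
```

A caller would test `valid` before using the returned vector. With this sketch, a zero-length or unnormalised vector is flagged as invalid rather than stopping the run.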
Comment 6 Vladimir.Ivantchenko 2014-10-10 20:35:48 CEST
Hi John,

In the previous comment I copied the patch proposed by Piergiulio Lenzi.

Alternatively the patch can be accessed at url:
https://github.com/lenzip/cmssw/blob/geant4e_test_giulio/TrackPropagation/Geant4e/test/geant4.10.p2_exitNormal.patch 

Cheers,
Vladimir
Comment 7 Piergiulio Lenzi 2014-10-13 12:28:51 CEST
Created attachment 288
Analysis of the problem and possible patch
Comment 8 Piergiulio Lenzi 2014-10-13 12:30:13 CEST
Comment on attachment 288
Analysis of the problem and possible patch

Hi,
this presentation contains a description of the problem and the analysis of the cause.
Cheers
Giulio
Comment 9 John Apostolakis 2014-11-19 16:15:39 CET
I have identified missing functionality in G4ErrorPropagationNavigator which is responsible for errors similar to the ones reported below.

In the simpler case of the error propagation example, patches in this class and in G4Navigator now give reliable results.

I have shared the improved versions of these files with those reporting the issue.

Once it is checked that the same corrections also fix the issue for the CMS use case we can mark this as resolved.
Comment 10 John Apostolakis 2014-11-19 16:20:23 CET
Created attachment 296
Draft patch file for G4Navigator and G4ErrorPropagationNavigator classes

Proposed corrections in tar format file.
Comment 11 John Apostolakis 2014-12-15 20:14:04 CET
Identified an issue in G4ErrorPropagationNavigator::ComputeStep: it calls G4Navigator::ComputeSafety, which has a side effect regarding the location.

Created a revised version of G4ErrorPropagationNavigator which addresses this.
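
The kind of side effect described in this comment can be illustrated with a toy model. The classes below are hypothetical and are not the Geant4 implementation: `ToyNavigator::ComputeSafety` overwrites a cached location as a by-product of the query, and `ToyErrorNavigator::SafetyWithoutRelocating` saves and restores that state around the call. This is one possible way to contain such a side effect; the report does not say how the revised G4ErrorPropagationNavigator handles it.

```cpp
#include <cassert>

// Hypothetical toy navigator in one dimension; NOT the Geant4 classes.
struct ToyNavigator {
  double lastLocatedPoint = 0.0;  // cached location state

  void LocateAt(double x) { lastLocatedPoint = x; }

  // Distance to the nearest wall of a slab spanning [0, 10].
  // Side effect: the cached location is overwritten with the query point.
  double ComputeSafety(double x) {
    lastLocatedPoint = x;
    const double toLow = x - 0.0;
    const double toHigh = 10.0 - x;
    return toLow < toHigh ? toLow : toHigh;
  }
};

// Hypothetical wrapper that needs the safety at a trial point but must
// not disturb where the navigator believes the track is located.
struct ToyErrorNavigator {
  ToyNavigator nav;

  double SafetyWithoutRelocating(double trial) {
    const double saved = nav.lastLocatedPoint;  // remember the state
    const double safety = nav.ComputeSafety(trial);
    nav.lastLocatedPoint = saved;               // undo the side effect
    return safety;
  }
};
```

In this toy, calling `ComputeSafety` directly moves the cached location to the trial point, while the wrapper returns the same safety value and leaves the location where `LocateAt` put it.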