Problem 2520 - EMZ lists crash when base materials themselves have base materials
Summary: EMZ lists crash when base materials themselves have base materials
Status: RESOLVED FIXED
Alias: None
Product: Geant4
Classification: Unclassified
Component: processes/electromagnetic
Version: 11.0
Hardware: All All
Importance: P4 normal
Assignee: Vladimir.Ivantchenko
Reported: 2022-12-06 23:45 CET by Thomas Kittelmann
Modified: 2022-12-21 16:44 CET

Attachments
Standalone example showing the issue (3.11 KB, text/x-c++src)
2022-12-06 23:45 CET, Thomas Kittelmann

Description Thomas Kittelmann 2022-12-06 23:45:05 CET
Created attachment 796
Standalone example showing the issue

I encountered this issue when trying to migrate our G4-based simulation code from v10.x.y to v11.0.x. I finally tracked the issue down to our occasional usage of materials which have more than one level of base materials (this is rarely used of course, but happens occasionally when chaining two of our material-creation factories).

The issue is very easy to reproduce: Simply initialise an EMZ physics list for a geometry which contains a material whose base material itself has a base material. I have created and attached a small standalone code example which does just that.

More details:

Using an _EMZ physics list with such a multi-level-based material triggers a segfault in G4 11.0.0 and later (the issue was confirmed with both Geant4 11.0.0 & 11.0.3 and was confirmed to be absent in 10.7.4). It happens in G4GSMottCorrection::GetMottCorrectionFactors in the file source/processes/electromagnetic/standard/src/G4GSMottCorrection.cc in line 108 in the expression:

  fMCDataPerMaterial[matindx]->fDataPerEkin[ekinIndxLow]

because fMCDataPerMaterial[matindx] is actually a null pointer. I am assuming that these Mott correction factors are only included in _EMZ and not the other EM lists, and a naive guess might be that the special logic for base materials here somehow misses the fact that the base material might itself have a base material. Looking at the code changes for the related EM source files in G4 between 10.7.4 and 11.0.0, I do indeed see several changes which seem to deal with base materials, but I do not have the expertise to hunt this down further.
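The guess above can be pictured with a small standalone sketch. It is purely illustrative and is not the actual G4GSMottCorrection logic: `Material`, `PerMaterialData`, `BuildTable` and `LookupOneLevel` are hypothetical stand-ins, and the assumption that data are prepared only for materials without a base is exactly the unverified guess from the description.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a material; NOT the real G4Material.
struct Material {
  std::string name;
  std::size_t index;     // position in the material table
  const Material* base;  // nullptr if the material has no base material
};

// Hypothetical stand-in for the per-material correction data.
struct PerMaterialData {
  double factor;
};

// Assumption: data are prepared only for materials WITHOUT a base material;
// every other slot in the table stays a null pointer.
std::vector<const PerMaterialData*> BuildTable(
    const std::vector<const Material*>& materials,
    const PerMaterialData& shared) {
  std::vector<const PerMaterialData*> table(materials.size(), nullptr);
  for (const Material* m : materials) {
    if (m->base == nullptr) table[m->index] = &shared;
  }
  return table;
}

// A lookup that redirects to the base material exactly once.
// For a material whose base has a base, this lands on an empty slot.
const PerMaterialData* LookupOneLevel(
    const std::vector<const PerMaterialData*>& table, const Material& mat) {
  const Material* m = (mat.base != nullptr) ? mat.base : &mat;
  return table[m->index];
}
```

With a chain mat2 -> mat1 -> mat0, such a lookup finds data for mat0 and mat1 but returns a null pointer for mat2, which is the kind of pointer that would then be dereferenced.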
Comment 1 Vladimir.Ivantchenko 2022-12-07 00:13:39 CET
Hello, Thomas,

it is not assumed that base materials are built from base materials. So I understand your post not as a bug report but as a feature request. We may consider this request as a work item for next year, but it is not clear whether such a sequence of base materials, with two or more layers, is needed.

VI
Comment 2 Thomas Kittelmann 2022-12-07 08:56:43 CET
Hi Vladimir,

Thanks a lot for your prompt reply.

If base materials with base materials are not supported, could we have a clear error thrown when one tries to use a base material which itself has a base material? (thrown from either the G4Material constructor or the EM physics code, depending on whether it is simply not supported by EMZ or by G4Materials in general). The fact that there is no such error, and that there were issues only for one specific physics list and only in some releases, made this issue rather confusing and time-consuming to hunt down. Even in the setup where the segfault happens, it is hard to track down the cause, since the segfault happens deep inside some code (in this case the Mott correction code) which is unrelated to the point where the user made an error (in this case, apparently where I was composing materials in what you are now telling me is an unsupported manner). It took a lot of time hunting through debug builds of Geant4 in gdb and looking at Geant4 source code diffs with git, plus a few lucky guesses in the end, to figure out the cause of these segfaults.

I can personally find workarounds for my code now that I understand the issue, but if anything should go on the work plan from this, it might simply be to have a clear policy on base materials with base materials. The fact that they work most of the time but not always is a bit confusing. In the meantime it would be great if code that does not support base materials with base materials would emit clear exceptions rather than dereference null pointers and cause confusing segfaults.
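A minimal sketch of the kind of early check requested here, written against a hypothetical `Material` struct rather than the real G4Material. The function name `RequireSingleLevelBase` and the use of `std::invalid_argument` are assumptions for illustration; inside Geant4 the natural mechanism would presumably be G4Exception.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a material; NOT the real G4Material.
struct Material {
  std::string name;
  const Material* base;  // nullptr if the material has no base material
};

// Fail at the point where the unsupported composition is made, with a
// message naming all three materials, instead of crashing later on.
void RequireSingleLevelBase(const Material& mat) {
  if (mat.base != nullptr && mat.base->base != nullptr) {
    throw std::invalid_argument(
        "Material '" + mat.name + "' uses base material '" + mat.base->name +
        "', which itself has base material '" + mat.base->base->name +
        "'; nested base materials are not supported.");
  }
}
```

Called when a material is created (or when a physics list first sees it), a check like this would report the offending material by name rather than leaving the user to debug a segfault in unrelated code.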

Cheers,
Thomas
Comment 3 Vladimir.Ivantchenko 2022-12-07 09:06:46 CET
Hello Thomas,

you are right: we first need to add a G4Exception if we cannot handle this problem. We should do this as a patch to previous releases. It may turn out that recursively finding the "very basic" material is also possible.

VI
Comment 4 Thomas Kittelmann 2022-12-07 16:16:28 CET
Thanks!

I was wondering whether it might (in the longer term) be possible to simply handle this in G4Material.cc. So if, for instance, "mat1" has a base material "matbase", then trying to create a new material "mat2" with base material "mat1" would actually result in "mat2" having "matbase" as its base material, but with the defaults for density/state/temperature/... also affected by the mat1 values.

If that would indeed be possible, then users would be able to use any material as a base material, but all the rest of the code could safely assume that a material returned by G4Material::GetBaseMaterial() would never itself have a base material.
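The idea can be sketched as follows, assuming a simplified `Material` class that is only a stand-in for G4Material (the handling of density/state/temperature defaults is left out here). The helper `Resolve` is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for G4Material, reduced to name and base pointer.
class Material {
 public:
  explicit Material(std::string name) : name_(std::move(name)) {}

  // The user may pass ANY material as base; the stored pointer is always
  // the material at the bottom of the chain.
  Material(std::string name, const Material* base)
      : name_(std::move(name)), base_(Resolve(base)) {}

  const std::string& GetName() const { return name_; }

  // Invariant: the returned material, if any, has no base material itself.
  const Material* GetBaseMaterial() const { return base_; }

 private:
  static const Material* Resolve(const Material* m) {
    assert(m != nullptr);
    while (m->base_ != nullptr) m = m->base_;
    return m;
  }

  std::string name_;
  const Material* base_ = nullptr;
};
```

With this invariant established in one place, code that calls GetBaseMaterial() would only ever need to follow a single level.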

Anyway, that was just a stray thought :-)

Cheers,
Thomas
Comment 5 Vladimir.Ivantchenko 2022-12-11 19:27:26 CET
Hello Thomas,

the fix is integrated into the development version of Geant4; it will be publicly available in the patches to Geant4 11.1 and maybe other releases.

If you want to check, I can send the file privately.

VI
Comment 6 Thomas Kittelmann 2022-12-12 14:36:39 CET
Hi Vladimir,

Thanks a lot!

I can see your fix in the geant4-dev repo. I was wondering whether the new code which you added to the constructor (in line 208 of G4Material.cc), in order to make sure that fBaseMaterial ends up as a material with no base, should instead be moved to the end of the constructor, more specifically so that it happens after the call to CopyPointersOfBaseMaterial().

Because imagine for instance that we have the following scenario:

mat0 : a normal material with no base material and fState = kSolid
mat1 : a material which has mat0 as the base material but fState = kGas

If we now try to create a third material named mat2 with mat1 as the base material, and we do not specify a state for mat2, then I think the code as it is now would end up having mat2's fState being kSolid. But I guess it should really be kGas since the user provided a gaseous base material and did not specify a state.
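The ordering question can be made concrete with a small standalone sketch. It does not reproduce the real G4Material constructor: `State`, `Material`, `RootOf` and the two factory functions are hypothetical, and they only model "take the default state from a base material" under the two possible orderings.

```cpp
#include <cassert>

enum class State { Undefined, Solid, Liquid, Gas };

// Hypothetical stand-in for a material; NOT the real G4Material.
struct Material {
  State state;
  const Material* base;  // nullptr for a material without base material
};

// Walk down the chain to the material that has no base.
const Material* RootOf(const Material* m) {
  while (m != nullptr && m->base != nullptr) m = m->base;
  return m;
}

// Ordering A: collapse the base pointer first, then take defaults from it.
Material MakeResolveFirst(const Material* requestedBase,
                          State requested = State::Undefined) {
  assert(requestedBase != nullptr);
  const Material* root = RootOf(requestedBase);
  State s = (requested == State::Undefined) ? root->state : requested;
  return Material{s, root};
}

// Ordering B: take defaults from the material the user named, then collapse.
Material MakeDefaultsFirst(const Material* requestedBase,
                           State requested = State::Undefined) {
  assert(requestedBase != nullptr);
  State s = (requested == State::Undefined) ? requestedBase->state : requested;
  return Material{s, RootOf(requestedBase)};
}
```

For the scenario above (mat0 solid, mat1 gas with base mat0, mat2 with base mat1 and no explicit state), ordering A gives mat2 the solid state of mat0, while ordering B gives it the gas state of mat1; both end up with mat0 as the stored base.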

(sorry if I am simply missing something obvious).

Cheers,
Thomas
Comment 7 Vladimir.Ivantchenko 2022-12-21 16:44:43 CET
Hi Thomas,

it is not possible to move this addition to the end of the constructor, because the intermediate lines of the constructor would then use the wrong pointers.

It seems that the maximum currently possible has been done for this problem. The fix will be available in 11.1.1, likely soon.

VI