Problem 2448

Summary: G4RunManager::ReinitializeGeometry() accesses stale pointers in G4Regions
Product: Geant4 Reporter: Michael Kelsey <kelsey>
Component: runAssignee: asai
Status: RESOLVED WONTFIX    
Severity: minor CC: birgit.zatschler
Priority: P4    
Version: 10.6   
Hardware: Other   
OS: Linux   

Description Michael Kelsey 2021-11-28 22:02:58 CET
In our simulation framework, we allow users to reconfigure and rebuild the geometry via macro commands.  The relevant code only does calls into the RunManager:

    void CDMSGeomConstructor::UpdateGeometry() {
      G4RunManager* rm = G4RunManager::GetRunManager();
    #if G4VERSION_NUMBER >= 1000
      rm->ReinitializeGeometry(true);
    #endif
      rm->GeometryHasBeenModified();
      rm->InitializeGeometry();
    }

I have found that when we call this function more than once, we eventually get a seg fault or other memory corruption.  Using Valgrind, I got the following report:

==54358== Invalid write of size 1
==54358==    at 0xA5C0EEB: G4Region::RemoveRootLogicalVolume(G4LogicalVolume*, bool) (in /scratch/group/mitchcomp/eb/x86_64/sw/Geant4/10.6.2-foss-2018b-debug/lib64/libG4geometry.so)
==54358==    by 0x5A707FE: G4RunManager::ReinitializeGeometry(bool, bool) (in /scratch/group/mitchcomp/eb/x86_64/sw/Geant4/10.6.2-foss-2018b-debug/lib64/libG4run.so)
==54358==    by 0x545B82: CDMSGeomConstructor::UpdateGeometry() (CDMSGeomConstructor.cc:301)
[...]
==54358==  Address 0x16afe1d9 is 89 bytes inside a block of size 168 free'd
==54358==    at 0x4C2B51D: operator delete(void*) (vg_replace_malloc.c:586)
==54358==    by 0xA5BB785: G4LogicalVolumeStore::Clean() (in /scratch/group/mitchcomp/eb/x86_64/sw/Geant4/10.6.2-foss-2018b-debug/lib64/libG4geometry.so)
==54358==    by 0x5A7079B: G4RunManager::ReinitializeGeometry(bool, bool) (in /scratch/group/mitchcomp/eb/x86_64/sw/Geant4/10.6.2-foss-2018b-debug/lib64/libG4run.so)
[...]
==54358==  Block was alloc'd at
==54358==    at 0x4C2A593: operator new(unsigned long) (vg_replace_malloc.c:344)
==54358==    by 0x5554EB: CDMSVStackBase::BuildEnvelope() (CDMSVStackBase.cc:163)
==54358==    by 0x5D0A2A: CDMSStackConstruction::BuildGeometry() (CDMSStackConstruction.cc:322)
[...]

The "Invalid write" report occurs in G4Region::RemoveRootLogicalVolume(), at the line

    lv->SetRegionRootFlag(false);

Because G4RunManager::ReinitializeGeometry(true) clears the logical volume store first, before cleaning up the regions, the "lv" pointer is stale at the time time the line above is called.

I *think* the appropriate behaviour is to do the region clearing first, then clean up the geometry stores.
Comment 1 Michael Kelsey 2021-11-30 19:57:28 CET
My hypothesis at the end of of my post was not correct.  If G4RegionStore is cleared first (using G4RegionStore::Clean()), the result is a series of invalid reads from G4RunManagerKernel::SetupDefaultRegion().  

It seems as though the region-handling loop in ReinitializeGeometry() shouldn't be used at all, since the "lv" pointer inside that loop is necessarily stale once the geometry stores have been cleared.

I also tried copying the implementation of that loop and calling it myself prior to calling ReinitializeGeometry(), but that also led to invalid reads.
Comment 2 asai 2021-12-14 23:32:05 CET
This seems to be a very special use-case, and I'm afraid we do not have enough resource to investigate this.
Makoto