| Summary: | error messages when calling ::exit(0) | ||
|---|---|---|---|
| Product: | Geant4 | Reporter: | Tom Roberts <tjrob> |
| Component: | processes/hadronic/cross_sections | Assignee: | Vladimir.Ivantchenko |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | Gunter.Folger |
| Priority: | P5 | ||
| Version: | 9.2 | ||
| Hardware: | Apple | ||
| OS: | Mac OS X | ||
| Attachments: | Stack traces of malloc failures inside exit(0). | ||
|
Description
Tom Roberts
2009-03-23 18:32:01 CET
The problem reported does not seem to be related to Geant4. There are no changes in 9.2.p01 versus 9.2 that could justify such behavior (which, by the way, we cannot reproduce). Moving to a new version of the underlying libraries may have exposed already existing problems in memory management (improper deletion or manipulation of geometry objects in the user application, indicated by the call to the run-manager deletion, is a sign of this in my opinion). I can only suggest using a debugger to investigate the problem further, compiling the user application (and possibly the Geant4 libraries as well) in debug mode for more detailed information.

Created attachment 43
Stack traces of malloc failures inside exit(0).
This came from a ddd session debugging g4beamline 1.16 linked with a debug version of geant4.9.2.p01.
The problem is most definitely inside Geant4. It is related to the way objects are registered in G4CrossSectionDataSetRegistry. The malloc errors occur inside G4CrossSectionDataSetRegistry::Clean() as it deletes them. The code clearly registered misaligned pointers. None of my code is involved -- my program has no code related to any hadronic processes. All it did was instantiate the QGSP_BIC physics list.

I ran with the following malloc debugging environment:

export MallocLogFile=$PWD/malloc.log
export MallocPreScribble=1
export MallocScribble=1
export MallocCheckHeapStart=100
export MallocCheckHeapEach=100
export MallocCheckHeapAbort=1
export MallocBadFreeAbort=1

The only errors in malloc.log are the same 8 non-aligned pointers being freed (there are more than 40,000 "test passed" messages). This would have found most wild-pointer writes in my code (or in Geant4 code).

As a test, I put debug printf()-s into G4CrossSectionDataSetRegistry: Clean(), Register(), and DeRegister(). The 8 non-aligned pointers were registered. Combined with the above test, I don't think this is any sort of heap corruption.

Beyond this it would be terribly inefficient for me to attempt to debug this further -- there are too many types of Register() and I have no notion of the code structure.

This sure looks like a general problem -- malloc never gives non-aligned pointers, so some code somewhere is calling G4CrossSectionDataSetRegistry::Register() with a pointer that did not come from malloc() -- that guarantees a problem in Clean(). Perhaps other OSs (or their C++ compilers) don't detect this, so debugging on Mac OS X might be most efficient. But an expert could probably find all calls to that function and find the ones that were not from malloc.

This is probably subtle, so here's a guess: Register() is called with a G4VCrossSectionDataSet*. If the actual argument inherits from multiple classes, then this problem could occur as the compiler casts the object to that type.
Then, even though the object was allocated via new, the pointer passed to Register() could be non-aligned. This is likely to be compiler-specific.

When I first opened this bug report, I did not know which part of Geant4 it applied to, so I used "global". Perhaps it should be changed to processes/hadronic/cross-sections.

Thanks for the detailed information. I believe we now have all the elements to investigate the problem further, as it seems to be due to the recently introduced de-registration mechanism for the hadronic processes. The problem is being assigned to the responsible developer.

Thanks to the bug report, the problem was fixed. The fix will be included in the next Geant4 release in June and in a new patch for 9.2, if such a patch is created.