With geant4.9.3 and geant4.9.2.p02 we experience random crashes during startup when loading reflectivity data.

#5 0x0000000000e4bdfb in G4MaterialPropertyVector::GetProperty(double) const ()
#6 0x000000000084451c in G4OpBoundaryProcess::CalculateReflectivity() ()
#7 0x00000000008470a8 in G4OpBoundaryProcess::DielectricMetal() ()
#8 0x00000000008487f8 in G4OpBoundaryProcess::PostStepDoIt(G4Track const&, G4Step const&) ()
#9 0x0000000000ce580b in G4SteppingManager::InvokePSDIP(unsigned long) ()
#10 0x0000000000ce5bf3 in G4SteppingManager::InvokePostStepDoItProcs() ()
#11 0x0000000000ce3ed1 in G4SteppingManager::Stepping() ()

The randomness seems to be connected to the value of the first event being read from an input file. By editing the input file we can get Geant to work. The problem is reproducible: running the job again on the same input file gives the same crash. Moving the first event in the ASCII input file further down the file can permit the job to start. The problem is not found on 32 bit versions of Geant compiled on the same release of the same operating system.
I traced this to a problem in the setting of the optical surface "material properties" of two boundary surfaces. The material properties table pointers for the G4OpticalSurfaces were null (due to our own code), but other surface properties such as "type" and "polish" were set. The "type" is set to dielectric_metal for these surfaces. I've studied the logic of G4OpBoundaryProcess::PostStepDoIt(), and identified the problem.

Here is the technical description of the problem: if there is a boundary surface with an optical surface, but the optical surface has no material properties table, the values of several variables in G4OpBoundaryProcess are never properly set. These variables include three pointers to property vectors and a double which stores the value of the last calculated reflectivity: PropertyPointer, PropertyPointer1, PropertyPointer2, and theReflectivity. The pointer variables are never assigned a value _except_ in PostStepDoIt(), so if this condition happens on the first call, the values are garbage. If this is a later call, values from a previous call are reused, which means they may also be incorrect, but the code does not crash, and even valgrind sees no problem. (Technical detail: these are member variables instead of local variables.)

In geant4.9.1.p03, the pointer variables were not used in the dielectric-metal reflection subroutine DielectricMetal(), and there was no crash. However, the reflection value used in this case would most likely have been 1.0, or possibly a reflectivity previously calculated for an interaction with some other surface.

In geant4.9.2.p02, extra logic was added to DielectricMetal() which calls CalculateReflectivity(), which does use the pointer variables. This would cause an access to invalid memory, provoking a crash, if the first optical photon interaction with a boundary was with one of these two surfaces. Otherwise, the results would have been very similar to those obtained with geant4.9.1.p03.
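To illustrate the stale-value behaviour described above, here is a minimal sketch that does not use Geant4 at all. SurfaceSketch, StaleStateSketch, Step() and cached_ are hypothetical names invented for this example; they are not the actual G4OpBoundaryProcess code. Unlike the real members, the cached pointer here is initialised to NULL in the constructor so that the example has defined behaviour; in the situation described above the first-call value would be garbage instead.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an optical surface. A NULL pointer models a
// surface that has no material properties table.
struct SurfaceSketch {
    const double* reflectivity;
};

// Hypothetical stand-in for the boundary process. The cached pointer is a
// member, so it survives from one call of Step() to the next.
class StaleStateSketch {
public:
    // Assumption for the sketch only: start from NULL to avoid reading
    // an uninitialised pointer.
    StaleStateSketch() : cached_(NULL) {}

    // Mimics the flawed pattern: the member is overwritten only when the
    // surface has a table, and is never reset otherwise.
    double Step(const SurfaceSketch& surface) {
        if (surface.reflectivity) {
            cached_ = surface.reflectivity;
        }
        // Fall back to perfect reflectivity only if nothing was ever cached.
        double r = cached_ ? *cached_ : 1.0;
        assert(r >= 0.0 && r <= 1.0);  // sanity check on the result
        return r;
    }

private:
    const double* cached_;
};
```

With this sketch, a surface without a table gives 1.0 if it is hit first, but gives the reflectivity of whatever surface was hit before it on any later call. That is the same order dependence as in our job, minus the crash.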
Properly specifying the Material Properties tables for these two surfaces seems to have eliminated the crashing problem and the unfortunate dependence on which surface our optical photons hit first. I think it's clear that the variables PropertyPointer, PropertyPointer1, PropertyPointer2, and theReflectivity should always be assigned appropriate values rather than reusing values from a previous interaction. You might also consider a warning in case no material property table is found for an optical surface, unless it was really intended for that case to be a supported user choice.
Thank you very much for your detailed technical description of what went wrong in your situation when you did not define a material properties table for a surface you defined as type dielectric_metal. I really appreciate you delving into the logic so deeply! You also uncovered an error that is there regardless. I agree with all of your analysis (except for a minor point) and I agree with your suggested remedy.

The logic in PostStepDoIt assumes that the pointers are either properly defined or NULL. I shall therefore specifically assign them to be NULL at the start of the logic in PostStepDoIt (as is already done for a host of other such variables/pointers, among them theReflectivity = 1.) They are not local variables of the DoIt method because the code is compartmentalized into several helper methods which also need these pointers. The program has defaults, and the default for a dielectric_metal surface is perfect reflectivity, hence no warning message.

The error was, as you found out the hard way, that PropertyPointer1 and PropertyPointer2 could be anything if the condition assigning them was not met. As you also pointed out, the bigger issue is that these pointers keep their value between calls to PostStepDoIt, which is NO GOOD when the surface has changed. This will be fixed in the next patch for 9.3. In the meantime, please add:

    PropertyPointer = NULL;
    PropertyPointer1 = NULL;
    PropertyPointer2 = NULL;

where defaults and prob_ss etc. are set to zero in PostStepDoIt.
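For readers who want to see the shape of this remedy in isolation, here is a minimal, self-contained sketch. It is not the Geant4 source: PropertyVectorSketch, SurfaceSketch and BoundaryProcessSketch are hypothetical stand-ins, the PostStepDoIt signature is simplified, and only the member names are borrowed from this thread. It assumes, as stated above, that the default reflectivity is 1.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a property vector; not the Geant4 class.
struct PropertyVectorSketch {
    double value;
};

// Hypothetical stand-in for an optical surface. A NULL member models a
// surface whose material properties table is missing.
struct SurfaceSketch {
    const PropertyVectorSketch* reflectivity;
};

// Hypothetical stand-in for the boundary process.
class BoundaryProcessSketch {
public:
    BoundaryProcessSketch()
        : PropertyPointer(NULL), PropertyPointer1(NULL),
          PropertyPointer2(NULL), theReflectivity(1.0) {}

    double PostStepDoIt(const SurfaceSketch& surface) {
        // Reset all per-call state first, so that nothing carries over
        // from the previously hit surface.
        PropertyPointer = NULL;
        PropertyPointer1 = NULL;
        PropertyPointer2 = NULL;
        theReflectivity = 1.0;  // default: perfect reflectivity

        if (surface.reflectivity) {
            PropertyPointer = surface.reflectivity;
        }
        CalculateReflectivity();
        return theReflectivity;
    }

private:
    // Helper that shares state with PostStepDoIt through the members,
    // which is why they cannot simply be local variables.
    void CalculateReflectivity() {
        if (PropertyPointer) {
            theReflectivity = PropertyPointer->value;
        }
        assert(theReflectivity >= 0.0 && theReflectivity <= 1.0);
    }

    const PropertyVectorSketch* PropertyPointer;
    const PropertyVectorSketch* PropertyPointer1;
    const PropertyVectorSketch* PropertyPointer2;
    double theReflectivity;
};
```

Because the reset happens on every call, a surface without a table always falls back to the default, regardless of which surface was hit before it.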