During deployment of Geant4 for CMS TBB, Helgrind analysis points to a possible thread-unsafe situation. The full report:

==17243== Possible data race during read of size 8 at 0x2FFA59E8 by thread #3
==17243== Locks held: none
==17243==    at 0x28BAE87C: G4PhaseSpaceDecayChannel::DecayIt(double) (G4PhaseSpaceDecayChannel.cc:89)
==17243==    by 0x2924C8EC: G4Decay::DecayIt(G4Track const&, G4Step const&) (G4Decay.cc:252)
==17243==    by 0x2A0A4FDA: G4SteppingManager::InvokePSDIP(unsigned long) (G4SteppingManager2.cc:530)
==17243==    by 0x2A0A548E: G4SteppingManager::InvokePostStepDoItProcs() (G4SteppingManager2.cc:502)
==17243==    by 0x2A0A2765: G4SteppingManager::Stepping() (G4SteppingManager.cc:209)
==17243==    by 0x2A0AD17C: G4TrackingManager::ProcessOneTrack(G4Track*) (G4TrackingManager.cc:126)
==17243==    by 0x2861D064: G4EventManager::DoProcessing(G4Event*) (G4EventManager.cc:185)
==17243==    by 0x28462D9D: RunManagerMTWorker::produce(edm::Event const&, edm::EventSetup const&, RunManagerMT const&) (RunManagerMTWorker.cc:377)
==17243==    by 0x2840485B: OscarMTProducer::produce(edm::Event&, edm::EventSetup const&) (OscarMTProducer.cc:171)
==17243==
==17243== This conflicts with a previous write of size 8 by thread #1
==17243== Locks held: none
==17243==    at 0x28BB451A: G4VDecayChannel::FillDaughters() (G4VDecayChannel.cc:346)
==17243==    by 0x28BAEAB5: G4PhaseSpaceDecayChannel::DecayIt(double) (G4PhaseSpaceDecayChannel.cc:89)
==17243==    by 0x2924C8EC: G4Decay::DecayIt(G4Track const&, G4Step const&) (G4Decay.cc:252)
==17243==    by 0x2A0A4FDA: G4SteppingManager::InvokePSDIP(unsigned long) (G4SteppingManager2.cc:530)
==17243==    by 0x2A0A548E: G4SteppingManager::InvokePostStepDoItProcs() (G4SteppingManager2.cc:502)
==17243==    by 0x2A0A2765: G4SteppingManager::Stepping() (G4SteppingManager.cc:209)
==17243==    by 0x2A0AD17C: G4TrackingManager::ProcessOneTrack(G4Track*) (G4TrackingManager.cc:126)
==17243==    by 0x2861D064: G4EventManager::DoProcessing(G4Event*) (G4EventManager.cc:185)
==17243==
==17243== Address 0x2FFA59E8 is 72 bytes inside a block of size 96 alloc'd
==17243==    at 0x4806A85: operator new(unsigned long) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol0/slc6_amd64_gcc481/external/valgrind/3.9.0-cms3/lib/valgrind/vgpreload_helgrind-amd64-linux
==17243==    by 0x28B7EB41: G4PionZero::Definition() (G4PionZero.cc:87)
==17243==    by 0x28B7DBF2: G4MesonConstructor::ConstructLightMesons() (G4MesonConstructor.cc:91)
==17243==    by 0x28B7DCB8: G4MesonConstructor::ConstructParticle() (G4MesonConstructor.cc:82)
==17243==    by 0x28E41953: G4DecayPhysics::ConstructParticle() (G4DecayPhysics.cc:88)
==17243==    by 0x2A056E1C: G4VModularPhysicsList::ConstructParticle() (G4VModularPhysicsList.cc:115)
==17243==    by 0x2A047B09: G4RunManagerKernel::SetupPhysics() (G4RunManagerKernel.cc:457)
==17243==    by 0x2A047CA9: G4RunManagerKernel::SetPhysics(G4VUserPhysicsList*) (G4RunManagerKernel.cc:431)
==17243==    by 0x2845E5CC: RunManagerMT::initG4(DDCompactView const*, MagneticField const*, HepPDT::ParticleDataTable const*) (RunManagerMT.cc:144)
==17243==    by 0x2844D8D7: OscarMTMasterThread::OscarMTMasterThread(edm::ParameterSet const

More comments from CMS expert:

Code in G4PhaseSpaceDecayChannel::DecayIt(double) (G4PhaseSpaceDecayChannel.cc:89)

89   if (G4MT_parent == 0) FillParent();
90   if (G4MT_daughters == 0) FillDaughters();

Member declaration in G4VDecayChannel.hh (G4PhaseSpaceDecayChannel derives from G4VDecayChannel)

230   G4ParticleDefinition* G4MT_parent;
231   G4ParticleDefinition** G4MT_daughters;

Code in G4VDecayChannel::FillDaughters() (G4VDecayChannel.cc:346)

346   G4MT_daughters = new G4ParticleDefinition*[numberOfDaughters];

I think the race is that thread 3 reads G4MT_daughters in G4PhaseSpaceDecayChannel::DecayIt() while thread 1 writes to it in G4VDecayChannel::FillDaughters(), and there is no synchronization between the two. I verified with gdb that, within OscarMTProducer, at least some G4VDecayChannel-derived objects are indeed shared between worker threads.
There are some comments in G4VDecayChannel.hh that I interpret as saying these variables should be thread-local (and there is some discussion of the implementation), but at the moment they definitely are not.
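For illustration only, here is a minimal sketch of one way such a lazily filled member could be protected when the object is shared between worker threads. It is not the Geant4 code and not necessarily the fix that was adopted: SharedDecayChannel, ParticleDef, fillCount() and run_workers() are hypothetical names, and std::call_once is just one possible synchronization choice (a mutex around the check and the fill, or truly thread-local members, would be alternatives).

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for G4ParticleDefinition.
struct ParticleDef {
    std::string name;
};

// Hypothetical stand-in for a decay channel object shared between workers.
class SharedDecayChannel {
public:
    explicit SharedDecayChannel(std::vector<std::string> names)
        : daughterNames_(std::move(names)) {}

    // Analogue of DecayIt(): ensure the daughters table exists, then read it.
    std::size_t DecayIt() {
        // call_once runs FillDaughters() in exactly one thread; every other
        // caller blocks until it has finished and then sees the filled table.
        std::call_once(daughtersOnce_, [this] { FillDaughters(); });
        return daughters_.size();
    }

    // Only meaningful after all worker threads have been joined.
    int fillCount() const { return fillCount_; }

private:
    void FillDaughters() {
        for (const std::string& n : daughterNames_) {
            daughters_.push_back(ParticleDef{n});
        }
        ++fillCount_;
    }

    std::vector<std::string> daughterNames_;
    std::vector<ParticleDef> daughters_;
    std::once_flag daughtersOnce_;
    int fillCount_ = 0;
};

// Runs DecayIt() on one shared channel from nThreads workers and returns how
// often the lazy initialisation actually executed.
int run_workers(int nThreads) {
    SharedDecayChannel channel(std::vector<std::string>{"gamma", "gamma"});
    std::vector<std::size_t> seen(static_cast<std::size_t>(nThreads), 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < nThreads; ++i) {
        // Each worker writes only to its own slot of `seen`.
        workers.emplace_back([&channel, &seen, i] {
            seen[static_cast<std::size_t>(i)] = channel.DecayIt();
        });
    }
    for (std::thread& t : workers) t.join();
    for (std::size_t n : seen) assert(n == 2);
    return channel.fillCount();
}
```

Calling run_workers(4) should report that the fill ran once, however the threads interleave. The unsynchronized pattern quoted above (test the pointer, then allocate) gives no such guarantee when two workers reach the check at the same time.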
The issue is confirmed and the fix is tagged and proposed.