| Summary: | Multithreaded random engine seeding yields duplicate events | ||
|---|---|---|---|
| Product: | Geant4 | Reporter: | mstoeckl |
| Component: | run | Assignee: | Andrea Dotti <andrea.dotti> |
| Status: | RESOLVED WONTFIX | ||
| Severity: | major | ||
| Priority: | P4 | ||
| Version: | 10.3 | ||
| Hardware: | All | ||
| OS: | All | ||
Hello, thank you for your detailed report. We need to think a moment about that. Changes in CLHEP cannot go in a fast manner in G4. Andrea Hello, I am really sorry for the late answer to this, but it required some investigation and thinking on our side. A new engine MixMax is included in G4. This engine is the most promising we have and it should become the default. Unfortunately CLHEP, to keep compatibility with older engines, has an interface to 32bits, that is clearly a drawback. The issue is with CLHEP and we can only adapt what CLHEP authors provide. The decision to abandon old engines (that would solve the problem at its root), is on them and we can only suggest (we did). Clearly with Geant4-MT we are using a default solution that aims at being 100% compatible with CLEHP. As you point out, this is not adequate for all use-cases. For that reason we provide interfaces to change the RNG behavior. As a starting point please refer to this web-page: https://geant4.web.cern.ch/geant4/UserDocumentation/UsersGuides/ForToolkitDeveloper/html/OOAnalysisDesign/Multithreading/mt.html#random-number-generation-seeding-in-mt I do realize I am not offering a complete solution to your problem, but I hope this can be a starting point. Andrea |
With the MTRunManager configured to set the per-thread RNG seed just before the beginning of each event: By the Birthday Problem, a duplicate event is probable once the space of possible event seeds (size N) is greater than the number of events run (K) squared. There are many use cases for simulations which have K=2^32, (basically any complex geometry or multidimensional energy deposition recording array handling <1MeV photons) and K=2^36 is conceivable. To avoid duplicate events in a simulation, we require a seed space of a least 2^64; moreover, to have meaningful simulations a seed space of 2^32 is required. Geant4 currently has two problems related to this: 1) the G4MTRunManager seeding procedure limits the seed space 2) Several random engines are not so random when reseeded 1. G4MTRunManager sets RNG seeds by passing in an array of two longs. The longs are generated by multiplying a float in (0..1] by 10^8. This restricts the space per long to the minimum of the space of the generate floats and of the range 0..10^8. The correct solution is to use the operator unsigned int() which all RNGs have, twice per long, to turn the seed array into a set of random bits. As it stands, some RNGs have 2^24 bit float spaces (i.e. JamesRandom), in which case the maximum possible seed count is 2^48; those generating a more fine grained range of floats attain 2^53 possible seeds, which still leads to duplicate events at reasonable K. 2. Various different RNGs have different problems. Some of these problems would go away if G4MTRunManager was fixed so the RNGs do not seed themselves with the output of another instance's flat(). 2a. JamesRandom has limited (24 bit) output floats, and only uses 1 long as seed material. Due to self-seeding, the seed space has size 2^24 which limits simulation accuracy for any greater event count. 2b. DualRand uses only 32 bits of the first given seed long and thus has a seed space limited to 2^32. 2c. Ranshi has acceptable flat() output space and can use a large amount of longs as the seed. However, when applying a new event seed, because only two longs are provided the code pads the remainder of the internal buffer with ... a constant value, putting the seed into an unusual part of the state space. The first few random numbers generated immediately after seeding with two longs are heavily biased, which warps the distribution of the particles introduced in the PrimaryGeneratorAction. Using an LCG instead of constant filling will likely resolve the problem. 2d. Many of the remaining number generators only use the bottom 32 bits of the provided long seed array. If G4MTRunManager is fixed, then these should be adjusted to split all the longs and use both halves. MixMax, Ranecu, Ranlux, MTwist in particular would require this change to avoid wasting random bits from the G4MTRunManager seeder. My recommendation is to: At next patch level (10.3): Adjust G4MTRunManager as specified in 1. Increase seed count to 4 longs. At next minor release (10.4): Fix the CLHEP::Random API to include a method to generate random integers or longs. Deprecate JamesRandom and DualRand on the grounds that they accept <= 2^32 bits worth of seeding and thus aren't reliable when multithreading or when aiming for a moderate number of reproducible events. Remove these RNGs eventually. Make the default RNG something like MixMax, Ranlux(64), MTwist, or Ranecu Adjust RNG engines to use all of the provided bits. I am willing to provide patches to resolve these problems, although it might take a few weeks for me to have time to write and test any of them.