This section amplifies the Usage Guide with material on how to choose a random generator for your simulation, what default generators have been implemented for the distribution objects, and information on the set of test programs used.
In this version of Random for Swarm there are a total of 36 different generators implemented. If you are a serious simulationist you need to select which one(s) to use for your model.
Here are some factors to consider:
We want a generator to have as good statistical properties as possible. We measure this by subjecting the generators to various tests. I have subjected the implemented generators to two sets of tests, Diehard and ENT. (Look in directory /testR6 of the test package.) (The Generators quality table summarizes the test results. The highlights are:
The LCG and SCG generators are of very poor quality (they fail many of the tests), and should never be used. [They are likely to disappear in the next release.]
The lagged-Fibonacci generators (ACG, SWBi, PSWB) all fail the Diehard `Birthday Spacings Test', and are therefore only conditionally recommended for use.
The other 32-bit generators pass all tests, and I therefore have no reason not to recommend them all for use.
The 31-bit generators all fail certain tests because one bit has a constant value. Beyond that they all seem to be ok.
We want a generator to have a long enough period for our purpose, and in general the longer the period the better. (However, note that to generate 264 random numbers from a fast generator like MT19937 would take 2.1 million years on a 486/66, so in practice any generator with a period close to 260 or larger will be acceptable.) The PMMLCG generators, although they are of acceptable quality otherwise, have a period less than 231 which we can exhaust in under 3 hours. So use those only for quick `toy' applications.
We want a generator to execute efficiently, the faster the better. The Generators data table indicates the execution times for the generators as implemented.
We want a generator to take as little resources (memory) as possible. The Generators data table indicates the size of each generator's state in bits.
Other things being equal, we want our generator to have a single full period rather than a number of shorter periods, since we may not know whether a particular starting seed will put us into a long or a short subcycle. This disqualifies SWB, and possibly the new MWC and RWC generators (Marsaglia does not say how many periods these generators have, but only gives the period length.)
There are 3 possible strategies for using random generators:
Use one single generator for your whole simulation, and have all `users' of randomness (distributions and other objects) call this single generator in an interleaved fashion. For this purpose, generators such as MT19937, TT800, TT775, TT443 (and possibly PSWB and SWB) seem particularly well suited since they have immense periods. The code in random/random.m shows how to connect several distribution objects to a single random generator. The generator randomGenerator supplied there is of class MT19937.
Use a single generator of long period, but divide this long period up into a number of non-overlapping (hence statistically independent) segments, and let each `user' of randomness draw their random numbers from separate segments. Doing this requires that we be able to quickly access these separate segments. The `split' generators C2LCGX and C4LCGX implement such a scheme. You can specify at creation how you want the period of these generators subdivided. (See the source code for details.)
Use a separate random generator for each `user' of randomness. In this case, you need to make sure that two or more generators of the same type are not started with the same seed, since in this case their output will be highly correlated. Provide your own seeds, or use RANDOMSEED or NEXTSEED.
The distribution objects, if created with the `createWithDefault: aZone' method, will create for themselves a fresh generator, with each class of distribution using a different class of generator. All the generators in this case are initialized with NEXTSEED (or RANDOMSEED, if you start the program with --varyseed).
These tables shows the results of testing the generator objects statistically:
Table C-1. Random Library: Generator Statistical Tests
Generator | timing uS (unsigned) | bits state | period (s) # length | tests failed (x) |
---|---|---|---|---|
abcdefghijklmnopqrstuvwx | ||||
30-bit output | ||||
SCG | 3.328 | 1650 | 1 255 | x.xx2xxxxxxx..x..xx.xx.. |
32-bit output | ||||
LCG1 | 2.564 | 32 | 1 232 | x...xxxxxxx.........x... |
LCG2 | 2.564 | 32 | 1 232 | x...xxxxxxx.........x... |
LCG3 | 2.564 | 32 | 1 232 | x...xxxxxxx.........x... |
ACG | 2.702 | 1760 | 1 255 | x....................... |
SWB1 | 3.285 | 1185 | 64 21178 | x....................... |
SWB2 | 3.285 | 769 | 1536 2757 | x....................... |
SWB3 | 3.285 | 673 | 192 2664 | x....................... |
PSWB | 3.452 | 1377 | 1 21376 | x....................... |
MWCA | 3.399 | 64 | ?? 259 | ........................ |
MWCB | 3.420 | 64 | ?? 259 | ........................ |
MT19937 | 3.698 | 19968 | 1 219937 | ........................ |
TT800 | 4.654 | 800 | 1 2800 | ........................ |
C3MWC | 6.387 | 192 | ?? 2118 | ........................ |
RWC2 | 7.445 | 96 | ?? 292 | ........................ |
RWC8 | 14.649 | 288 | ?? 2250 | ........................ |
31-bit output | ||||
C2TAUS1 | 4.078 | 62 | 1 260 | 1..-1x111x1.........x... |
C2TAUS2 | 4.078 | 62 | 1 260 | 1..-1x111x1.........x... |
C2TAUS3 | 4.078 | 62 | 1 260 | 1..-1x111x1.........x... |
TT775 | 4.654 | 775 | 1 2775 | 1..-1x111x1.........x... |
TT403 | 4.670 | 403 | 1 2403 | 1..-1x111x1.........x... |
PMMLCG1 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG2 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG3 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG4 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG5 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG6 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG7 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG8 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
PMMLCG9 | 4.715 | 31 | 1 231 | 1..-1x111x1.........x... |
C2LCGX | 7.029 | 62 | 1 260 (Split) | 1..-1x111x1.........x... |
MRG5 | 9.674 | 155 | 1 2155 | 1..-1x111x1.........x... |
MRG6 | 10.449 | 186 | 1 2186 | 1..-1x111x1.........x... |
MRG7 | 10.913 | 217 | 1 2217 | 1..-1x111x1.........x... |
C4LCGX | 11.688 | 124 | 1 2210 (Split) | 1..-1x111x1.........x... |
C2MRG3 | 13.459 | 186 | 1 2185 | 1..-1x111x1.........x... |
Test code | Test suite | Explanation of test codes |
---|---|---|
a | Diehard | Birthday Spacings Test |
b | Diehard | Overlapping 5-permutation Test |
c | Diehard | Binary Rank Test (31x31) |
d | Diehard | Binary Rank Test (32x32) |
e | Diehard | Binary Rank Test (6x8) |
f | Diehard | Bitstream Test (overlapping 20-tuples) |
g | Diehard | OPSO (Overlapping Pairs, Sparse Occupancy) |
h | Diehard | OQSO (Overlapping Quadruples, Sparse Occupancy) |
i | Diehard | DNA Test |
j | Diehard | Count-the-1's Test (integers) |
k | Diehard | Count-the-1's Test (specific bytes) |
l | Diehard | Parking Lot Test |
m | Diehard | Minimum Distance Test |
n | Diehard | 3D Spheres Test |
o | Diehard | Squeeze Test |
p | Diehard | Overlapping Sums Test |
q | Diehard | Runs Test |
r | Diehard | Craps Test |
s | ENT | Entropy Test |
t | ENT | Compression test |
u | ENT | Chi-Square Test |
v | ENT | Arithmetic Mean |
w | ENT | Monte Carlo value for Pi |
x | ENT | Serial Correlation Coefficient |
Indication | Explanation of indications |
---|---|
. | The generator passed this test |
1 | The generator passed this test, except for the lowest bit |
2 | The generator passed this test, except for the 2 lowest bits |
x | The generator failed this test completely |
- | This test cannot be passed by this generator (too few bits) |
Notes.
For 31-bit generators, it is normal to fail 1 part of these tests:a,e,g,h,i,k. This is because we left-shift the output of these generators by 1 bit, so that the lowest bit is always 0.
For 31-bit generators, it is normal to fail test d (32x32 rank test).
All 31-bit generators also fail tests f, j, and u (and most others don't.) This is *likely* due to the constant low bit.
Choosing A Generator.
Clearly unacceptable generators: LCG1, LCG2, LCG3, SCG.
Conditionally recommended generators: ACG, SWB1, SWB2, SWB3, PSWB.
Use with caution: the PMMLCGx generators (due to their very short period 231).
All other generators are recommended at this time.
And this table gives further data about the generators:
Table C-2. Random Library: Generator Data
Generator | Seeds | Modulus | Cycles (length) | Bits | Speed | State |
---|---|---|---|---|---|---|
(a) Simple Short Generators | ||||||
LCG1 | 1 | 232 | 1 m | 32 | 1.442 | 32 |
LCG2 | 1 | 232 | 1 m | 32 | 1.442 | 32 |
LCG3 | 1 | 232 | 1 m | 32 | 1.442 | 32 |
PMMLCG1 | 1 | 231-1 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG2 | 1 | 231-1 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG3 | 1 | 231-1 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG4 | 1 | 231-1 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG5 | 1 | 231-105 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG6 | 1 | 231-225 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG7 | 1 | 231-325 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG8 | 1 | 231-85 | 1 m-1 | 31 | 0.784 | 31 |
PMMLCG9 | 1 | 231-249 | 1 m-1 | 31 | 0.784 | 31 |
(b) Simple Long Generators | ||||||
ACG | 55 | 232 | 1 255 | 32 | 1.369 | 760 |
SCG | 55 | 109 | 1 255 | 30 | 1.111 | 650 |
SWB1 | 37+c | 232 | 64 21178 | 32 | 1.126 | 185 |
SWB2 | 24+c | 232 | 1536 2757 | 32 | 1.126 | 769 |
SWB3 | 21+c | 232 | 192 2664 | 32 | 1.126 | 673 |
PSWB | 43+c | 232-5 | 1 21376 | 32 | 1.070 | 1377 |
TT403 | 13 | 231 | 1 2403 | 31 | 0.792 | 403 |
TT775 | 25 | 231 | 1 2775 | 31 | 0.795 | 775 |
TT800 | 25 | 232 | 1 2800 | 32 | 0.795 | 800 |
MT19937 | 624 | 232 | 1 219937 | 32 | 1.000 | 19937 |
MRG5 | 5 | 231-1 | 1 2155 | 31 | 0.382 | 155 |
MRG6 | 6 | 231-1 | 1 2186 | 31 | 0.354 | 186 |
MRG7 | 7 | 231-1 | 1 2217 | 31 | 0.339 | 217 |
MWCA | 2 | 232 | ? 259 | 32 | 1.088 | 64 |
MWCB | 2 | 232 | ? 259 | 32 | 1.088 | 64 |
RWC2 | 3 | 232 | ? 292 | 32 | 0.497 | 96 |
RWC8 | 18s | 232 | ? 2250 | 32 | 0.252 | 288 |
(c) Long Generators with Splitting Facilities: (none) | ||||||
(d) Combined Generators | ||||||
C2TAUS1 (M) | 2 | 231-1 | 1 260 | 31 | 0.907 | 62 |
C2TAUS2 (M) | 2 | 231-1 | 1 260 | 31 | 0.907 | 62 |
C2TAUS3 (M) | 2 | 231-1 | 1 260 | 31 | 0.907 | 62 |
C2MRG3 (M) | 6 | 231-1 | 1 2185 | 31 | 0.275 | 186 |
C3MWC (M) | 6 | 232 | ? 2118 | 32 | 0.570 | 192 |
(e) Combined Generators with Splitting Facilities | ||||||
C2LCG (M,S) | 2 | 231-85 | 1 260 | 31 | 0.526 | 62 |
C4LCG (M,S) | 4 | 231-1 | 1 2120 | 31 | 0.316 | 124 |
Generator | Calculating cycle lengths |
---|---|
LCGi | cycle = m = 232 |
PMMLCGi | cycle = m-1 < 231 |
ACG | cycle = 2r |
SCG | cycle = 2r |
SWBi | cycle = ( 232r - 232s ) / #cycles |
PSWB | cycle = mr - ms |
TGFSRi | cycle = 2w*N - 1 |
MRGi | cycle = mi - 1 |
C2MRG3 | cycle = (m13 - 1)*(m23 - 1)/2 |
C2TAUSi | cycle = Mask1 * Mask2 |
C2LCG | cycle = (m1 * m2) / 2 |
C4LCG | cycle = (m1 * m2 * m3 * m4) / 8 |
MWCA, MWCB | cycle = ? (not specified by Marsaglia) |
C3MWC | cycle = ? " " |
RWC2 | cycle = ? " " |
RWC8 | cycle = ? " " |
When distributions are created using the createWithDefaults: aZone method, they create their own generator and initialize it with NEXTSEED (or with RANDOMSEED, if you started the program with the --varyseed switch).
The generators used are as follows:
Table C-3. Random Library: Default Generators
Distribution | Generator | |
---|---|---|
RandomBitDist | uses | C2TAUS1gen |
BernoulliDist | uses | C2TAUS2gen |
UniformIntegerDist | uses | TT403gen |
UniformUnsignedDist | uses | TT775gen |
UniformDoubleDist | uses | TT800gen |
NormalDist | uses | MWCAgen |
LogNormalDist | uses | MWCBgen |
ExponentialDist | uses | C2TAUS3gen |
GammaDist | uses | PSWBgen |
These generators were chosen on the basis of quality and execution speed.
There are 4 default random objects defined in random/random.m. These are:
id <MT19937gen> randomGenerator; id <UniformIntegerDist> uniformIntRand; id <UniformUnsignedDist> uniformUnsRand; id <UniformDoubleDist> uniformDblRand; |
These objects may be called from anywhere in your program. Note (a): the generator is initialized with NEXTSEED or RANDOMSEED depending on the use of the --varyseed command line option. Note (b): the distribution objects are created without default statistical parameters.
In a separate tar file, available at the SFI ftp site, there are a set of programs which exercise aspects of the generator objects' functionality. The following (very utilitarian) programs are available:
testR0. a program which exercises every generator and distribution, verifying correct operation and comparing output to that obtained on the author's system.
testR1. a program which prints out diagnostic output for code in random.m.
testR2. a program which asks each distribution and generator to describe itself using the Swarm xprint method.
testR3. a program which asks each distribution and generator to describe itself using the objects' 'describe' method.
testR4. a program that performs timing tests on each generator and distribution, computing the time it takes to call each object 10,000,000 times.
testR6. a program that generates a binary file containing 2.5M variates from a specified generator, for purposes of statistical testing (e.g. with ENT or Diehard.)
testR7. a program which records, for each generator, the value of unsignedMax, the number of output bits, and the value of lengthOfSeedVector.
testR9. a program which records, for each generator and distribution, the object's 'magic number' and the buffer size needed for getState/setState.
Statistical testing: the generators have been put through the tests in the /testR6/ENT and /testR6/Diehard directories. The raw data can be found there. The results are summarized in the test log files found in the same test distribution. The distribution objects have not been tested statistically yet.
The code here represents an effort to implement several efficient, reasonably safe generators. The algorithms come from reading the literature (Bibliography). These algorithms have been implemented as accurately as possible and run through some simple tests. Some generators, included here for historical reasons only, are known to have bad statistical properties, and their use is deprecated. See Advanced Usage Guidefor information on the quality of the included generators.
While the objects in this library are believed to function correctly, the prudent and paranoid modeller would do well testing them in some domain-specific way. One easy way to do this is to run an experiment twice: once with one class of generator (say, PMMLCG), and once with another (say, SWB). If the results differ radically, then you can suspect one of the generators. If they don't, well, the generators still might be faulty!
The generators supplied with this release have been subjected to statistical testing using George Marsaglia's Diehard tests as well as John Walker's entropy tests (ENT). The results of these tests are summarized in Generator quality table. Other data on the properties of the generators are found there as well. These data support a discussion of how to choose one or more generators for your simulation.
The ENT test is included in the tarball of test programs mentioned above. The Diehard tests are copyright and hence are not, but they can be downloaded from the web at: ftp://ftp.csis.hku.hk/pub/random .
The distribution objects have not been statistically tested.