A note on `simple' vs. `split' generators

Most of the generators supplied are of the `simple' variety. Think of these as supplying a single, long stream of pseudorandom numbers. The description in the introduction of how to obtain generator output assumed a simple generator. There are also two generators, C2LCGXgen and C4LCGXgen, which are of the `split' variety. Think of them as consisting of a number of virtual generators, each supplying an independent stream of numbers which we can divide up into a number of segments of a given length. To obtain output from such a generator, we need to specify which virtual generator to draw from.
I shall first discuss `simple' generators, and then discuss how `split' generators differ from `simple' ones. In the following text, wherever I use PSWBgen as an example you may substitute any other generator.
Note: any name that starts with my is meant to designate a variable of the appropriate type which you have defined in your own program.
Note: on defining variables that hold (pointers to) generators: it is now a convention in the rest of Swarm that if you want to specify what type of object a pointer should point to, you say:
    id <protocolname> varname;
    varname = [classname create: aZone];
rather than:
    classname *varname;
    varname = [classname create: aZone];
Although the protocol name is usually the same as the class name, in some cases it is not. Publishing the protocols also allows the programmers to keep unpublished those class methods that should remain internal and private.
The generators are different from other Swarm objects, in that they all perform the same function; they are drop-in replacements for each other. The 'split' generators (C2LCGXgen, C4LCGXgen) all conform to the same protocol, <SplitRandomGenerator>. The 'simple' (non-split) generators all conform to the same protocol, <SimpleRandomGenerator>.
Thus, when defining generators in your own program, you should say
    id <SimpleRandomGenerator> varname;
    varname = [classname create: aZone];
For backward compatibility, protocols <LCG1gen>, <TT800gen> etc. are still defined, but their use is deprecated and they may disappear later.
You create a generator in one of 3 ways:
    id <SimpleRandomGenerator> myGenerator;
    myGenerator = [ PSWBgen createWithDefaults: [self getZone] ];

    id <SimpleRandomGenerator> myGenerator;
    myGenerator = [ PSWBgen create: [self getZone] setStateFromSeed: mySeed ];
This allocates the object and initializes it with your seed value. If the object actually requires a vector of seed values to fill the state, this method generates the rest of the values needed using an inline PMMLCG generator.
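To make the seed expansion concrete, here is a minimal sketch in plain C (not Swarm code) of how one seed could be stretched into a seed vector by a PMMLCG. It assumes the Park-Miller `minimal standard' constants (multiplier 16807, modulus 2^31-1); the source does not say which constants Swarm's inline PMMLCG uses, and the helper names pmmlcg_next and expand_seed are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Park-Miller "minimal standard" PMMLCG: x' = 16807 * x mod (2^31 - 1).
   ASSUMPTION: Swarm's inline PMMLCG may use different constants. */
#define PMMLCG_A 16807ULL
#define PMMLCG_M 2147483647ULL

/* Advance the PMMLCG by one step (hypothetical helper). */
static uint32_t pmmlcg_next(uint32_t x)
{
    assert(x != 0 && x < PMMLCG_M);          /* valid states are 1 .. m-1 */
    return (uint32_t)((PMMLCG_A * x) % PMMLCG_M);
}

/* Fill seeds[0..n-1] from a single seed (hypothetical helper). */
static void expand_seed(uint32_t seed, uint32_t *seeds, size_t n)
{
    uint32_t x = (uint32_t)(seed % PMMLCG_M); /* fold the seed into 0 .. m-1 */
    if (x == 0)
        x = 1;                                /* 0 is not a valid state */
    for (size_t i = 0; i < n; i++) {
        x = pmmlcg_next(x);
        seeds[i] = x;                         /* never 0, as seeds require */
    }
}
```

Because a PMMLCG with a prime modulus never produces 0 from a nonzero state, every element of the expanded vector is a usable (nonzero) seed.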
You can find out later what seed value was used to initialize the generator:
    myUnsigned = [ myGenerator getInitialSeed ];
And you can find out what the largest valid seed value is by calling
    myUnsigned = [ myGenerator getMaxSeedValue ];
(In the current version of the library, the largest valid seed value is 2^32-1 for all the generators. The seed may not be 0.)
You may reset the generator's state at any time using this method:
    [ myGenerator setStateFromSeed: mySeedValue ];
Alternatively, you may use the new -reset method, [myGenerator reset], which resets the generator to its state at startup, or to its state at the point when -setStateFromSeed(s) was last used. Counters are zeroed.
Assume we have defined a fixed array at compile time:
    unsigned int mySeedVector[vectorLength];

    id <SimpleRandomGenerator> myGenerator;
    myGenerator = [ PSWBgen create: [self getZone] setStateFromSeeds: mySeedVector ];
The required vector length can be obtained from an existing generator:
    myUnsigned = [ myGenerator lengthOfSeedVector ];
(Obviously, you must first successfully have created the object to do this, for example using createWithDefaults! Or, see data in Generator Data Table)
And we allocate the seed vector dynamically this way:
    unsigned int *mySeedVector;
    // alloc: takes a size in bytes, so scale by the element size:
    mySeedVector = [[self getZone] alloc: [ myGenerator lengthOfSeedVector ] * sizeof(unsigned int)];
You can find out what vector of seed values was used to initialize the object:
    unsigned int *myVector;
    myVector = [ myGenerator getInitialSeeds ];
And you can find out the largest seed values that are allowed for the particular generator:
    unsigned int *myVectorToo;
    myVectorToo = [ myGenerator getMaxSeedValues ];
(These values vary from generator to generator, and they may not be the same for all elements of the vector for a given generator. Valid seeds never take the value 0.)
NOTE: in the above two calls, the variable myVector is set to point to an array internal to the generator. If you want to preserve the array's values outside the generator, you need to allocate space in your program either statically or dynamically, and use a for-loop to copy data from myVector[i] to myAllocatedVector[i].
You may reset the generator's state at any time by using the method
    [ myGenerator setStateFromSeeds: (unsigned *) mySeedVector ];
NOTE: if you set a generator's state from a vector of seeds, the call:
    myUnsignedValue = [ myGenerator getInitialSeed ];

    mySeedVector = [ myGenerator getInitialSeeds ];
You can make the generator serve up antithetic values by setting:
    [ myGenerator setAntithetic: YES ];
You can ascertain if this flag is set by calling
    myBooleanValue = [ myGenerator getAntithetic ];
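For readers unfamiliar with the term, the following plain C sketch shows the textbook meaning of an antithetic value: the sample reflected about the middle of the output range. This is an assumption about what the flag does; the source does not say how Swarm computes antithetic output, and antithetic_unsigned is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Antithetic counterpart of a sample x drawn from [0, max].
   ASSUMPTION: the reflection max - x is the textbook definition,
   not necessarily Swarm's internal formula. */
static uint32_t antithetic_unsigned(uint32_t x, uint32_t max)
{
    assert(x <= max);
    return max - x;   /* small values become large ones and vice versa */
}
```

A sample and its antithetic partner always sum to max, which is why running a simulation once with the flag off and once with it on (same seed) can reduce the variance of an averaged result.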
You obtain successive pseudorandom numbers from a generator by calling:
    myUnsignedValue = [ myGenerator getUnsignedSample ];
The largest value the generator can return is given by:
    myUnsignedValue = [ myGenerator getUnsignedMax ];
If you would rather have floating point output in the range [0.0,1.0), you call one of these:
    // Using 1 unsigned value to fill the mantissa:
    myFloatValue = [ myGenerator getFloatSample ];
    myDoubleValue = [ myGenerator getThinDoubleSample ];

    // Using 2 unsigned values to fill the mantissa:
    myDoubleValue = [ myGenerator getDoubleSample ];
    myLongDoubleValue = [ myGenerator getLongDoubleSample ];
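The difference between filling the mantissa from one or from two unsigned values can be sketched in plain C as follows. This assumes 32-bit generator output and uses the well-known 53-bit construction for the two-value case; Swarm's own scaling may differ, and the names thin_double and fat_double are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit value: 32 random bits, result in [0,1). */
static double thin_double(uint32_t x)
{
    return x / 4294967296.0;                  /* x / 2^32, exact in a double */
}

/* Two 32-bit values: keep 27 + 26 = 53 bits, a full double mantissa.
   ASSUMPTION: a common construction, not necessarily Swarm's. */
static double fat_double(uint32_t x1, uint32_t x2)
{
    uint32_t hi = x1 >> 5;                    /* top 27 bits of x1 */
    uint32_t lo = x2 >> 6;                    /* top 26 bits of x2 */
    double r = (hi * 67108864.0 + lo) / 9007199254740992.0; /* (hi*2^26+lo)/2^53 */
    assert(r >= 0.0 && r < 1.0);
    return r;
}
```

Discarding the low bits before combining matters: naively scaling two full 32-bit values would give 64 bits, more than a double can hold, and the largest inputs would round up to exactly 1.0.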
Finally, you can obtain a count of how many variates have been generated:
    myLongLongInt = [ myGenerator getCurrentCount ];
A split generator is a generator for which we are able to split the output stream into arbitrary non-overlapping segments, which we can access directly and easily. Such segments are statistically independent streams of (pseudo)random numbers.
We configure a split generator as consisting of a number (A) of "virtual generators", each of which has access to a number (2^v) of segments of length 2^w. The parameters A, v, w are specified when the generator is created. For example, for the C4LCGXgen generator, the default creation values are A=128, v=31, w=41. The only limitation is that A * 2^v * 2^w must not exceed the generator's cycle length, which is 2^60 for C2LCGXgen and 2^120 for C4LCGXgen.
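The limit on (A,v,w) is easy to check before creating a generator. The plain C sketch below works with exponents rather than the full products, since 2^120 does not fit in a machine integer. The function name split_config_fits is hypothetical and not part of Swarm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if A * 2^v * 2^w <= 2^cycle_log2 (hypothetical helper).
   cycle_log2 is 60 for C2LCGXgen and 120 for C4LCGXgen. */
static bool split_config_fits(uint64_t A, unsigned v, unsigned w,
                              unsigned cycle_log2)
{
    assert(A > 0);
    if (v + w > cycle_log2)
        return false;                      /* even A = 1 would not fit */
    unsigned room = cycle_log2 - (v + w);  /* need A <= 2^room */
    if (room >= 64)
        return true;                       /* any 64-bit A fits */
    return A <= (UINT64_C(1) << room);
}
```

For the C4LCGXgen defaults (A=128, v=31, w=41) the product is 2^79, well inside 2^120; the same values would not fit C2LCGXgen, whose cycle length is only 2^60.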
We specify the configuration (A,v,w) at create time this way:
    id <SplitRandomGenerator> myGenerator;
    myGenerator = [ C4LCGXgen create: [self getZone]
                                setA: 64 setv: 20 setw: 76
                    setStateFromSeed: mySeedValue ];

    id <SplitRandomGenerator> myGenerator;
    myGenerator = [ C4LCGXgen create: [self getZone]
                                setA: 32 setv: 25 setw: 60
                   setStateFromSeeds: (unsigned *) mySeedVector ];
For obtaining output, we need to specify which of the A 'virtual' generators we want to draw from:
    myUnsignedValue = [ myGenerator getUnsignedSample: 12 ];
    myFloatValue = [ myGenerator getFloatSample: myVirtualGenerator ];
    myDoubleValue = [ myGenerator getThinDoubleSample: someUnsignedValue ];
    myDoubleValue = [ myGenerator getDoubleSample: 32 ];
    myLongDoubleValue = [ myGenerator getLongDoubleSample: 0 ];
The current count of variates generated and the current segment number are likewise obtained per virtual generator:
    myLongLongInt = [ myGenerator getCurrentCount: myVirtualGenerator ];
    myLongLongInt = [ myGenerator getCurrentSegment: myVirtualGenerator ];
Other than these methods, the methods discussed above under 'simple' generators are the same for 'split' generators.
In *addition* to this, 'split' generators have the following methods to manage the virtual generators:
    // Place all virtual generators at the start of the first segment:
    [ myGenerator initAll ];   // done automatically at creation

    // Place all virtual generators back to the start of the current segment:
    [ myGenerator restartAll ];

    // Place all virtual generators at the start of the next segment:
    [ myGenerator advanceAll ];

    // Place all virtual generators at the start of the indicated segment:
    [ myGenerator jumpAllToSegment: myLongLongIntValue ];
You may also address individual virtual generators:
    [ myGenerator initGenerator: myVgen ];
    [ myGenerator restartGenerator: myVgen ];
    [ myGenerator advanceGenerator: myVgen ];
    [ myGenerator jumpGenerator: myVgen toSegment: myLongLongIntValue ];
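The effect of the four positioning operations can be pictured with a toy bookkeeping model in plain C. It tracks only where one virtual generator stands (segment number and draws since the segment start) and produces no random numbers. The struct and function names are hypothetical, and whether Swarm's own counter restarts with each segment is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Position of one virtual generator (toy model, not Swarm's data layout). */
typedef struct {
    uint64_t segment;   /* current segment number, 0 .. 2^v - 1 */
    uint64_t count;     /* draws made since the start of this segment */
} VGenPos;

/* Start of the first segment (cf. initGenerator:). */
static void vgen_init(VGenPos *g)    { g->segment = 0; g->count = 0; }

/* Back to the start of the current segment (cf. restartGenerator:). */
static void vgen_restart(VGenPos *g) { g->count = 0; }

/* Start of the next segment (cf. advanceGenerator:). */
static void vgen_advance(VGenPos *g) { g->segment += 1; g->count = 0; }

/* Start of the indicated segment (cf. jumpGenerator:toSegment:). */
static void vgen_jump(VGenPos *g, uint64_t segment)
{
    g->segment = segment;
    g->count = 0;
}

/* Record one draw; a segment holds 2^w values. */
static void vgen_draw(VGenPos *g, unsigned w)
{
    assert(w < 64);
    assert(g->count < (UINT64_C(1) << w));  /* do not run past the segment */
    g->count += 1;
}
```

A typical use of segments is to restart a virtual generator to replay exactly the same numbers, or to advance it so that each experimental run draws from a fresh, non-overlapping segment.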
InternalState methods common to simple and split generators:
    // Print (most of) the object's state data to a stream:
    [ myGenerator describe: myStream ];
The stream myStream may be created thus:
    id myStream = [ OutStream create: [self getZone] setFileStream: stdout ];
or
    id myStream = [ OutStream create: [self getZone] setFileStream: stderr ];

    // Get the (class) name of the object:
    myString = [ myGenerator getName ];
    // Get the object's 'magic number', used by putStateInto / setStateFrom:
    myUnsigned = [ myGenerator getMagic ];
You may save, and later restore, the internal state of a generator using these methods:
    // Get the size of the memory buffer needed by putStateInto / setStateFrom:
    myUnsigned = [ myGenerator getStateSize ];
    // Extract the generator's state data into your memory buffer:
    [ myGenerator putStateInto: myBuffer ];
    // Set the generator's state from data in a memory buffer:
    [ myGenerator setStateFrom: myBuffer ];
To illustrate, assume the following data definitions:
    FILE * myFile;
    const char * myFileName = "MyGenFile.bin";   // or whatever
    int stateSizeG;
    id stateBufG;
    int status;
The following code shows how to save an object's state to disk: (You should add your own code to deal with disk file errors, either aborting or printing out error messages.)
    // Ask how big a buffer we need:
    stateSizeG = [ myGenerator getStateSize ];
    // Allocate memory for the buffer:
    stateBufG = [[self getZone] alloc: stateSizeG];
    // Ask the generator to put state data into the buffer:
    [ myGenerator putStateInto: (void *) stateBufG ];
    // Open a disk file for binary output:
    myFile = fopen(myFileName, "wb");
    if (myFile == NULL) { };   // error on open: disk full, or no permissions
    // Write the state buffer to disk in binary form:
    status = fwrite(stateBufG, stateSizeG, 1, myFile);
    if (status < 1) { };       // error on write: disk full?
    // Close the file:
    status = fclose(myFile);
    if (status) { };           // error on close?
    // Free the memory allocated to the buffer:
    [[self getZone] free: stateBufG];
    // Or, for test purposes, just zero the buffer data instead:
    // memset(stateBufG, 0, stateSizeG);
This code shows how to set an object's state from a disk file:
    // Ask how big a buffer we need:
    stateSizeG = [ myGenerator getStateSize ];
    // Allocate memory for the buffer:
    stateBufG = [[self getZone] alloc: stateSizeG];
    // Open a disk file for binary input:
    myFile = fopen(myFileName, "rb");
    if (myFile == NULL) { };   // error on open: file not found
    // Read state data into the memory buffer:
    status = fread(stateBufG, stateSizeG, 1, myFile);
    if (status < 1) { };       // error on read
    // Close the file:
    status = fclose(myFile);
    if (status) { };           // error on close
    // Ask the generator to set its state from the buffer data:
    [ myGenerator setStateFrom: (void *) stateBufG ];
    // Free the memory allocated to the buffer:
    [[self getZone] free: stateBufG];
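The error handling left open in the listings above might be filled in along the following lines. This is a plain C sketch that does not touch Swarm at all: it saves and reloads an arbitrary memory buffer and reports failure through a return code. The function names save_state and load_state are hypothetical; you would pass them the buffer filled by putStateInto, or the one to be handed to setStateFrom.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write size bytes from buf to path. Returns 0 on success, -1 on error. */
static int save_state(const char *path, const void *buf, size_t size)
{
    assert(path != NULL && buf != NULL && size > 0);
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;                            /* no permission, bad path, ... */
    int ok = (fwrite(buf, size, 1, f) == 1);  /* one item of 'size' bytes */
    if (fclose(f) != 0)
        ok = 0;                               /* buffered data may be lost */
    return ok ? 0 : -1;
}

/* Read size bytes from path into buf. Returns 0 on success, -1 on error. */
static int load_state(const char *path, void *buf, size_t size)
{
    assert(path != NULL && buf != NULL && size > 0);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;                            /* file not found */
    int ok = (fread(buf, size, 1, f) == 1);   /* fails on a short file */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Checking the result of fclose on the write path is deliberate: with buffered output, a full disk is often only reported when the file is closed.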
Where I use NormalDist in examples below, substitute any other distribution and its parameters as needed.
NOTE: any name that starts with my is meant to designate a variable of the appropriate type which you have defined in your own program.
You create a distribution in one of several ways:
    id <NormalDist> myNormalDist;
    myNormalDist = [ NormalDist createWithDefaults: [self getZone] ];
This method will create a distribution object with no default statistical parameters set, as well as a fresh generator object connected to it. The generator object is initialized with STARTSEED (see the discussion above). Different distribution classes use different generators for this purpose.
    id <NormalDist> myNormalDist;
    myNormalDist = [ NormalDist create: [self getZone] setGenerator: mySimpleGenerator ];
mySimpleGenerator must of course first have been set to point to a random generator of the `simple' type. Note that you cannot assign a different generator to a distribution after it has been created.
You can create the generator at the same time as the distribution:
    id <NormalDist> myNormalDist;
    myNormalDist = [ NormalDist create: [self getZone]
                          setGenerator: [TT800gen create: [self getZone] setStateFromSeed: 34453] ];
With a split generator, you also name the virtual generator to draw from:
    id <NormalDist> myNormalDist;
    myNormalDist = [ NormalDist create: [self getZone]
                          setGenerator: mySplitGenerator
                   setVirtualGenerator: 7 ];

    id <NormalDist> myNormalDist;
    myNormalDist = [ NormalDist create: [self getZone]
                          setGenerator: [C4LCGXgen createWithDefaults: [self getZone]]
                   setVirtualGenerator: 99 ];
A split generator can be thought of as comprising a set of virtual generators (streams of random numbers), and a distribution object must be `connected' to one of these streams. You cannot re-assign the generator or the virtual generator after a distribution object has been created.
In all these cases, when we want to obtain a random variate from this distribution object we need to specify the statistical parameters:
    myDouble = [ myNormalDist getSampleWithMean: 3.3 withVariance: 1.7 ];
Alternatively, default parameters can be set when the object is created:
    id <NormalDist> myNormalDist;
    myNormalDist = [ NormalDist create: [self getZone]
                          setGenerator: mySimpleGenerator
                               setMean: 7.6 setVariance: 1.2 ];

    id <NormalDist> myNormalDist;
    myNormalDist = [ NormalDist create: [self getZone]
                          setGenerator: mySplitGenerator
                   setVirtualGenerator: 33
                               setMean: 3.2 setVariance: 2.1 ];
In these cases, we do not need to specify parameters to get a random number:
    myDouble = [ myNormalDist getDoubleSample ];
(Of course, different distributions have different parameters: RandomBitDist has none, the Uniform objects have minimum and maximum limit values, NormalDist and LogNormalDist use Mean and Variance, ExponentialDist uses only Mean, and GammaDist uses alpha and beta. See the individual distribution protocols or the file random/distributions.h for the specific methods available.)
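The two ways of drawing from a distribution object, with parameters given per call or with default parameters set beforehand, can be modelled in plain C as below. The sketch uses a uniform distribution because it needs no special functions. Every name in it (UniformDist, uniform_sample, demo_source and so on) is hypothetical, and demo_source is a fixed stand-in sequence, not a random generator.

```c
#include <assert.h>
#include <stdbool.h>

/* A source of uniform values u in [0,1); stands in for a generator. */
typedef double (*UniformSource)(void *state);

typedef struct {
    UniformSource next;       /* generator, fixed at creation */
    void *state;
    bool optionsInitialized;  /* have default parameters been set? */
    double min, max;          /* default parameters */
} UniformDist;

/* Create a distribution with no default parameters set. */
static UniformDist uniform_create(UniformSource next, void *state)
{
    UniformDist d = { next, state, false, 0.0, 0.0 };
    return d;
}

/* Set (or change) the default parameters. */
static void uniform_set_range(UniformDist *d, double min, double max)
{
    assert(min < max);
    d->min = min;
    d->max = max;
    d->optionsInitialized = true;
}

/* Per-call parameters: always allowed. */
static double uniform_sample_with(UniformDist *d, double min, double max)
{
    assert(min < max);
    return min + (max - min) * d->next(d->state);
}

/* Default parameters: only valid once they have been set. */
static double uniform_sample(UniformDist *d)
{
    assert(d->optionsInitialized);
    return uniform_sample_with(d, d->min, d->max);
}

/* Stand-in source cycling through 0.0, 0.25, 0.5, 0.75 (NOT random). */
static double demo_source(void *state)
{
    unsigned *k = state;
    double u = (*k % 4) / 4.0;
    (*k)++;
    return u;
}
```

Note that the generator is passed in once at creation and never replaced, mirroring the rule that a distribution's generator cannot be reassigned afterwards.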
    [ myNormalDist setMean: 3.3 setVariance: 2.2 ];

    // Default parameters:
    myDouble1 = [ myNormalDist getMean ];
    myDouble2 = [ myNormalDist getVariance ];
    myDouble3 = [ myNormalDist getStdDev ];
    // Get a pointer to the generator object:
    myOtherGenerator = [ myNormalDist getGenerator ];
    // Get the number of the virtual generator (if a split generator is used):
    myUnsignedValue = [ myNormalDist getVirtualGenerator ];
    // Find out if default parameters have been set:
    myBoolean = [ myNormalDist getOptionsInitialized ];
    // Find out how many variates the object has delivered so far:
    // (The counter is an unsigned long long int, which goes up to 2^64-1.)
    myLongLongInt = [ myNormalDist getCurrentCount ];
    [ myNormalDist reset ];

    [ myGenerator setStateFromSeed: mySeedValue ];

    [ myGenerator reset ];
    // Print (most of) the object's state data to a stream:
    [ myNormalDist describe: myStream ];
The stream myStream may be created thus:
    id myStream = [ OutStream create: [self getZone] setFileStream: stdout ];
or
    id myStream = [ OutStream create: [self getZone] setFileStream: stderr ];

    // Get the (class) name of the object:
    myString = [ myNormalDist getName ];
    // Get the object's `magic number', used by putStateInto / setStateFrom:
    myUnsigned = [ myNormalDist getMagic ];
You may save, and later restore, the internal state of a distribution object using InternalState methods.
See the Generator Usage Guide, which describes how to do this. The code for saving/restoring distributions would be similar.
Note that saving the state of a distribution object will NOT automatically save the state of the attached generator; you are responsible for doing so. (Since it is possible, even encouraged, to use a single generator to feed several distribution objects, this is the only sane way of doing it.)