C.2. Usage Guide

C.2.1. Usage Guide for Generators

I shall first discuss `simple' generators, and then discuss how `split' generators differ from `simple' ones. In the following text, wherever I use PSWBgen as an example you may substitute any other generator.

Note: any name that starts with my is meant to designate a variable of the appropriate type which you have defined in your own program.

Note on defining variables that hold (pointers to) generators: the convention in the rest of Swarm is now that, to specify what type of object a pointer should point to, you write:
	id <protocolname> varname;
	varname = [classname create: aZone];
instead of
	classname *varname;
	varname = [classname create: aZone];

The protocol name is usually the same as the class name, but in some cases it is not. Publishing the protocols also allows the programmers to keep unpublished those class methods that should remain internal and private.

The generators are different from other Swarm objects, in that they all perform the same function; they are drop-in replacements for each other. The 'split' generators (C2LCGXgen, C4LCGXgen) all conform to the same protocol, <SplitRandomGenerator>. The 'simple' (non-split) generators all conform to the same protocol, <SimpleRandomGenerator>.

Thus, when defining generators in your own program, you should say
	id <SimpleRandomGenerator> varname;
	varname = [classname create: aZone];
(Though see below for the different create methods available.)

For backward compatibility, protocols <LCG1gen>, <TT800gen> etc. are still defined, but their use is deprecated and they may disappear later.

C.2.1.1. Simple generators

You create a generator in one of 3 ways:

C.2.1.1.1. the lazy way

   id <SimpleRandomGenerator> myGenerator;
   myGenerator = [ PSWBgen createWithDefaults: [self getZone] ];
This allocates the object and initializes it with STARTSEED, which equals NEXTSEED if --varyseed was not specified, or RANDOMSEED if it was. (These macros are defined in the file 'randomdefs.h' in the source directory.)

C.2.1.1.2. using a single seed value

   id <SimpleRandomGenerator> myGenerator;
   myGenerator = [ PSWBgen create: [self getZone] 
			setStateFromSeed: mySeed ];

This allocates the object and initializes it with your seed value. If the object actually requires a vector of seed values to fill the state, this method generates the rest of the values needed using an inline PMMLCG generator.
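
The seed expansion described above can be sketched in plain C. This is only an illustration of the technique (a prime modulus multiplicative LCG filling a seed vector from one seed). The multiplier and modulus are the well-known Park-Miller "minimal standard" constants, chosen here as an assumption; the text does not say which constants the library's inline PMMLCG uses. The names pmmlcg_next and expand_seed are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Park-Miller "minimal standard" constants. ASSUMPTION: used for
   illustration only, not necessarily what the library uses. */
#define PMMLCG_MULTIPLIER 16807ULL
#define PMMLCG_MODULUS    2147483647ULL   /* 2^31 - 1, a prime */

/* One step of x -> (a * x) mod m, computed in 64 bits to avoid overflow. */
uint32_t pmmlcg_next(uint32_t x)
{
    return (uint32_t)((PMMLCG_MULTIPLIER * x) % PMMLCG_MODULUS);
}

/* Fill seeds[0..n-1] with successive PMMLCG outputs started from `seed`. */
void expand_seed(uint32_t seed, uint32_t *seeds, size_t n)
{
    uint32_t x = (uint32_t)(seed % PMMLCG_MODULUS);
    size_t i;

    assert(x != 0);          /* 0 is a fixed point of the recurrence */
    for (i = 0; i < n; i++) {
        x = pmmlcg_next(x);
        seeds[i] = x;        /* never 0, since m is prime and x != 0 */
    }
}
```

Because the modulus is prime and the starting value is nonzero, no element of the expanded vector is 0, which matches the requirement that seeds may not be 0.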

You can find out later what seed value was used to initialize the generator:
   myUnsigned = [ myGenerator getInitialSeed ];

And you can find out what the largest valid seed value is by calling
   myUnsigned = [ myGenerator getMaxSeedValue ];

(In the current version of the library, the largest valid seed value is 2^32-1 for all the generators. The seed may not be 0.)

You may reset the generator's state at any time using this method:
   [ myGenerator setStateFromSeed: mySeedValue ];
This will also reset to 0 the currentCount variable.

Alternatively, you may use the new -reset method [myGenerator reset], which resets the generator to its state at startup, or to its state at the point when -setStateFromSeed(s) was last used. Counters are zeroed.

C.2.1.1.3. using a vector of seed values

Assume we have defined a fixed array at compile time:
 unsigned  int mySeedVector [vectorLength];
Then we can do this:
   id <SimpleRandomGenerator> myGenerator;
   myGenerator = [ PSWBgen create: [self getZone]
			setStateFromSeeds: mySeedVector ];
You can find out how many seed values are required by asking
   myUnsigned = [ myGenerator lengthOfSeedVector ];

(Obviously, you must first have created the object successfully to do this, for example using createWithDefaults. Alternatively, see the data in the Generator Data Table.)

Alternatively, we can allocate the seed vector dynamically (note that alloc: takes a size in bytes, not a number of elements):
   unsigned int *mySeedVector;
   mySeedVector = [[self getZone] alloc:
			[ myGenerator lengthOfSeedVector ] * sizeof (unsigned int)];

You can find out what vector of seed values was used to initialize the object:
   unsigned int *myVector;
   myVector = [ myGenerator getInitialSeeds ];

And you can find out the largest seed values that are allowed for the particular generator:
   unsigned int *myVectorToo;
   myVectorToo = [ myGenerator getMaxSeedValues ];

(These values vary from generator to generator, and they may not be the same for all elements of the vector for a given generator. Valid seeds never take the value 0.)

NOTE: in the above two calls, the variable myVector is set to point to an array internal to the generator. If you want to preserve the array's values outside the generator, you need to allocate space in your program either statically or dynamically, and use a for-loop to copy data from myVector[i] to myAllocatedVector[i].
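
The copying step in the note above might look like the following plain C sketch. The function name copy_seed_vector is hypothetical, and malloc stands in for a Swarm zone allocation; in a Swarm program the `internal` pointer would be the value returned by getInitialSeeds or getMaxSeedValues, and `length` the value returned by lengthOfSeedVector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return a heap copy of `length` seed values, or NULL if allocation fails.
   The caller owns the copy and must free() it. */
unsigned int *copy_seed_vector(const unsigned int *internal, size_t length)
{
    unsigned int *copy;
    size_t i;

    assert(internal != NULL && length > 0);
    copy = malloc(length * sizeof *copy);
    if (copy == NULL)
        return NULL;
    for (i = 0; i < length; i++)   /* element-by-element copy */
        copy[i] = internal[i];
    return copy;
}
```

The copy stays valid even if the generator later overwrites its internal array.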

You may reset the generator's state at any time by using the method
   [ myGenerator setStateFromSeeds: (unsigned *) mySeedVector ];
This will also reset to 0 the currentCount variable.

NOTE: if you set a generator's state from a vector of seeds, the call:
   myUnsignedValue = [ myGenerator getInitialSeed ];
will return a value of 0 (an invalid seed). On the other hand, if you initialize the generator with a single seed value, the call
   mySeedVector = [ myGenerator getInitialSeeds ];
will return the seed vector that would produce identical output to what you obtained using the single seed.

C.2.1.1.4. antithetic values

You can make the generator serve up antithetic values by setting:
   [ myGenerator setAntithetic: YES ];
If thus set, this makes -getUnsignedSample return (unsignedMax-x) instead of x, and the floating point methods return (1.0 - y) instead of y. The default for this parameter is that it is not set.

You can ascertain if this flag is set by calling
   myBooleanValue = [ myGenerator getAntithetic ];
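
As an illustration of the arithmetic only, the antithetic transformation described above can be written in plain C as follows. The function names are hypothetical and this is not the library's code; unsignedMax is the value that getUnsignedMax would report.

```c
#include <assert.h>

/* Antithetic counterpart of an unsigned variate x in [0, unsignedMax]. */
unsigned int antithetic_unsigned(unsigned int x, unsigned int unsignedMax)
{
    assert(x <= unsignedMax);
    return unsignedMax - x;
}

/* Antithetic counterpart of a floating point variate y in [0.0, 1.0).
   Note that y == 0.0 maps to exactly 1.0 under this formula. */
double antithetic_double(double y)
{
    assert(y >= 0.0 && y < 1.0);
    return 1.0 - y;
}
```

Small values map to large ones and vice versa, so a run with the flag set is negatively correlated with the same run without it.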

C.2.1.1.5. generator output

You obtain successive pseudorandom numbers from a generator by calling:
   myUnsignedValue = [ myGenerator getUnsignedSample ];
The largest value that may be returned can be found by asking
   myUnsignedValue = [ myGenerator getUnsignedMax ];
(The smallest value returned is always 0.)

If you would rather have floating point output in the range [0.0,1.0), you call one of these:
   // Using 1 unsigned value to fill the mantissa:
   myFloatValue  = [ myGenerator getFloatSample ];
   myDoubleValue = [ myGenerator getThinDoubleSample ];
   // Using 2 unsigned values to fill the mantissa:
   myDoubleValue     = [ myGenerator getDoubleSample ];
   myLongDoubleValue = [ myGenerator getLongDoubleSample ];
NOTE that the last method is not portable, because the size of a long double varies and hence the precision varies between architectures.
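
To make the difference between "thin" and "fat" doubles concrete, here is one common way to build such values in plain C. It is a sketch under two assumptions that the text does not confirm: that each unsigned sample carries 32 random bits (an unsignedMax of 2^32-1), and that a double has a 53-bit mantissa. The function names are hypothetical and the library's own conversion may differ.

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit sample -> double in [0.0, 1.0) with 32 random bits. */
double thin_double(uint32_t x)
{
    return (double)x / 4294967296.0;             /* x / 2^32 */
}

/* Two 32-bit samples -> double in [0.0, 1.0) with 53 random bits:
   all 32 bits of hi, followed by the top 21 bits of lo. */
double fat_double(uint32_t hi, uint32_t lo)
{
    uint64_t bits = ((uint64_t)hi << 21) | ((uint64_t)lo >> 11);

    assert(bits < ((uint64_t)1 << 53));          /* fits the mantissa exactly */
    return (double)bits / 9007199254740992.0;    /* bits / 2^53 */
}
```

Both results are strictly below 1.0 because the integer numerator is always smaller than the power of two it is divided by.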

Finally, you can obtain a count of how many variates have been generated:
   myLongLongInt = [ myGenerator getCurrentCount ];
(currentCount is an unsigned long long int, which counts up to 2^64.)

C.2.1.2. Split generators

A split generator is a generator for which we are able to split the output stream into arbitrary non-overlapping segments, which we can access directly and easily. Such segments are statistically independent streams of (pseudo)random numbers.

We configure a split generator as consisting of a number (A) of "virtual generators", each of which has access to a number (2^v) of segments of length 2^w. The parameters A, v, w are specified when the generator is created. For example, for the C4LCGXgen generator, the default creation values are A=128, v=31, w=41. The only limitation is that A*(2^v)*(2^w) must not exceed the generator's cycle length, which is 2^60 for C2LCGXgen and 2^120 for C4LCGXgen.

We specify the configuration (A,v,w) at create time this way:
   id <SplitRandomGenerator> myGenerator;
   myGenerator = [ C4LCGXgen create: [self getZone]
			setA: 64 setv: 20 setw: 76 
			setStateFromSeed: mySeedValue ];
or, starting from a vector of seeds:
   id <SplitRandomGenerator> myGenerator;
   myGenerator = [ C4LCGXgen create: [self getZone]
			setA: 32 setv: 25 setw: 60
			setStateFromSeeds: (unsigned *) mySeedVector ];
(In both cases, the only limitation is that A * 2^v * 2^w must be less than the generator's period, 2^60 for C2LCGX and 2^120 for C4LCGX.)
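
Before choosing (A,v,w) it can be handy to check the size limitation without computing numbers as large as 2^120. The plain C sketch below does this with exponent arithmetic. The function name split_config_fits is hypothetical, and the check uses the "must not exceed" reading of the limitation; cycleBits is the base-2 logarithm of the cycle length (60 or 120 for the two generators named in the text).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if A * 2^v * 2^w <= 2^cycleBits, i.e. A <= 2^(cycleBits - v - w). */
bool split_config_fits(uint32_t A, unsigned v, unsigned w, unsigned cycleBits)
{
    unsigned shift;

    assert(A > 0);
    if (v + w > cycleBits)
        return false;               /* even A == 1 would be too large */
    shift = cycleBits - (v + w);
    if (shift >= 32)
        return true;                /* A < 2^32 <= 2^shift */
    return (uint64_t)A <= ((uint64_t)1 << shift);
}
```

For example, the default C4LCGXgen configuration (A=128, v=31, w=41) needs 7+31+41 = 79 bits and fits easily in 120, while the same configuration would not fit a cycle length of 2^60.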

For obtaining output, we need to specify which of the A 'virtual' generators we want to draw from:
   myUnsignedValue   = [ myGenerator getUnsignedSample: 12 ];
   myFloatValue      = [ myGenerator getFloatSample: myVirtualGenerator ];
   myDoubleValue     = [ myGenerator getThinDoubleSample: someUnsignedValue ];
   myDoubleValue     = [ myGenerator getDoubleSample: 32 ];
   myLongDoubleValue = [ myGenerator getLongDoubleSample: 0 ];
Virtual generators are numbered from 0 to (A-1).

Obtaining the current count of variates generated likewise requires that you name the virtual generator:
   myLongLongInt = [ myGenerator getCurrentCount:   myVirtualGenerator ];
   myLongLongInt = [ myGenerator getCurrentSegment: myVirtualGenerator ];
The latter call indicates what segment number the specified virtual generator is currently drawing numbers from.

Other than these methods, the methods discussed above under 'simple' generators are the same for 'split' generators.

In *addition* to this, 'split' generators have the following methods to manage the virtual generators:
   // Place all virtual generators at the start of the first segment:
   [ myGenerator initAll ];	// done automatically at creation
   // Place all virtual generators back to the start of the current segment:
   [ myGenerator restartAll ];
   // Place all virtual generators at the start of the next segment:
   [ myGenerator advanceAll ];
   // Place all virtual generators at the start of the indicated segment:
   [ myGenerator jumpAllToSegment: myLongLongIntValue ];

You may also address individual virtual generators:
   [ myGenerator initGenerator:    myVgen ];
   [ myGenerator restartGenerator: myVgen ];
   [ myGenerator advanceGenerator: myVgen ];
   [ myGenerator jumpGenerator:    myVgen    toSegment: myLongLongIntValue ];

InternalState methods common to simple and split generators:
   // Print (most of) the object's state data to a stream:
   [ myGenerator describe: myStream ];

The stream myStream may be created thus:
   id myStream = [ OutStream create: [self getZone] setFileStream: stdout ];
or
   id myStream = [ OutStream create: [self getZone] setFileStream: stderr ];

   // Get the (class) name of the object:
   myString = [ myGenerator getName ];

   // Get the object's 'magic number', used by putStateInto / setStateFrom:
   myUnsigned = [ myGenerator getMagic ];

C.2.1.3. Saving and Restoring State

You may save, and later restore, the internal state of a generator using these methods:
   // Get the size of the memory buffer needed by putStateInto / setStateFrom:
   myUnsigned = [ myGenerator getStateSize ];

   // Extract the generator's state data into your memory buffer:
   [ myGenerator putStateInto: myBuffer ];

   // Set the generator's state from data in a memory buffer:
   [ myGenerator setStateFrom: myBuffer ];

To illustrate, assume the following data definitions:
   FILE * myFile;
   const char * myFileName = "MyGenFile.bin"; // or whatever
   int stateSizeG;
   id stateBufG;
   int status;

The following code shows how to save an object's state to disk: (You should add your own code to deal with disk file errors, either aborting or printing out error messages.)
   // Ask how big a buffer we need:
   stateSizeG = [ myGenerator getStateSize ];

   // Allocate memory for the buffer:
   stateBufG  = [[self getZone] alloc: stateSizeG];

   // Ask the generator to put state data into the buffer:
   [ myGenerator putStateInto: (void *) stateBufG ];

   // Open a disk file for output:
   myFile = fopen(myFileName, "wb");	// "b": the state data is binary
   if (myFile == NULL) { };	// error on open: disk full, or no permissions

   // Write the state buffer to disk in binary form:
   status = fwrite(stateBufG, stateSizeG, 1, myFile);
   if (status < 1) { };		// error on write: disk full?

   // Close the file
   status = fclose(myFile);
   if (status) { };		// error on close ?

   // Free the memory allocated to the buffer:
   [[self getZone] free: stateBufG];

   // Or, for test purposes, just zero the buffer data instead:
   // memset(stateBufG, 0, stateSizeG);

This code shows how to set an object's state from a disk file:
   // Ask how big a buffer we need:
   stateSizeG = [ myGenerator getStateSize ];

   // Allocate memory for the buffer:
   stateBufG  = [[self getZone] alloc: stateSizeG];

   // Open a disk file for input:
   myFile = fopen(myFileName, "rb");	// "b": the state data is binary
   if (myFile == NULL) { };	// error on open: file not found

   // Read state data into the memory buffer:
   status = fread(stateBufG, stateSizeG, 1, myFile);
   if (status < 1) { };		// error on read

   // Close the file:
   status = fclose(myFile);
   if (status) { };		// error on close

   // Ask the generator to set its state from the buffer data:
   [ myGenerator setStateFrom: (void *) stateBufG ];

   // Free the memory allocated to the buffer:
   [[self getZone] free: stateBufG];
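
The file handling in the two listings above can be factored into a pair of small helpers with the error checks filled in. This is a plain C sketch under stated assumptions: the names save_state and load_state are hypothetical, and the buffer is treated as an opaque block of bytes, as it would be after a call to putStateInto.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write `size` bytes from buf to path. Returns 0 on success, -1 on error. */
int save_state(const char *path, const void *buf, size_t size)
{
    FILE *f;
    int ok;

    assert(path != NULL && buf != NULL && size > 0);
    f = fopen(path, "wb");              /* binary mode */
    if (f == NULL)
        return -1;                      /* open failed: no permissions? */
    ok = (fwrite(buf, size, 1, f) == 1);
    if (fclose(f) != 0)                 /* close may report a deferred error */
        ok = 0;
    return ok ? 0 : -1;
}

/* Read exactly `size` bytes from path into buf. Returns 0 or -1. */
int load_state(const char *path, void *buf, size_t size)
{
    FILE *f;
    int ok;

    assert(path != NULL && buf != NULL && size > 0);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;                      /* open failed: file not found? */
    ok = (fread(buf, size, 1, f) == 1); /* a short read counts as an error */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

In a Swarm program the size argument would come from getStateSize, and a successful load_state would be followed by setStateFrom on the same buffer.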

C.2.2. Usage Guide for Distributions

Where I use NormalDist in examples below, substitute any other distribution and its parameters as needed.

NOTE: any name that starts with my is meant to designate a variable of the appropriate type which you have defined in your own program.

C.2.2.1. Creating distributions

You create a distribution in one of several ways:

C.2.2.1.1. the lazy way:

   id <NormalDist> myNormalDist;
   myNormalDist = [ NormalDist createWithDefaults: [self getZone]];

This method will create a distribution object with no default statistical parameters set, as well as a fresh generator object connected to it. The generator object is initialized with STARTSEED (see the discussion above). Different distribution classes use different generators for this purpose.

C.2.2.1.2. Without default parameters, using a simple generator

   id <NormalDist> myNormalDist;
   myNormalDist = [ NormalDist create: [self getZone]
			setGenerator: mySimpleGenerator ];

mySimpleGenerator must of course first have been set to point to a random generator of the `simple' type. Note that you cannot assign a different generator to a distribution after it has been created.

You can create the generator at the same time as the distribution:
   id <NormalDist> myNormalDist;
   myNormalDist = [ NormalDist create: [self getZone]
		setGenerator: [TT800gen create: [self getZone] 
				setStateFromSeed: 34453]         ];

C.2.2.1.3. Without default parameters, using a split generator

   id <NormalDist> myNormalDist;
   myNormalDist = [ NormalDist create: [self getZone]
			setGenerator: mySplitGenerator
			setVirtualGenerator: 7 ];
or perhaps
   id <NormalDist> myNormalDist;
   myNormalDist = [ NormalDist create: [self getZone]
		setGenerator: [C4LCGXgen createWithDefaults: [self getZone]]
			setVirtualGenerator: 99 ];

A split generator can be thought of as comprising a set of virtual generators (streams of random numbers), and a distribution object must be `connected' to one of these streams. You cannot re-assign the generator or the virtual generator after a distribution object has been created.

In all these cases, when we want to obtain a random variate from this distribution object we need to specify the statistical parameters:
   myDouble = [ myNormalDist getSampleWithMean: 3.3 withVariance: 1.7];
You can use different parameters for every call. (And you can use this call even if default parameters have been set.)

C.2.2.1.4. With default parameters, using a simple generator

   id <NormalDist> myNormalDist;
   myNormalDist = [ NormalDist create: [self getZone]
			setGenerator: mySimpleGenerator
			setMean: 7.6 setVariance: 1.2 ];

C.2.2.1.5. With default parameters, using a split generator

   id <NormalDist> myNormalDist;
   myNormalDist = [ NormalDist create: [self getZone]
			setGenerator: mySplitGenerator
			setVirtualGenerator: 33
			setMean: 3.2 setVariance: 2.1 ];

In these cases, we do not need to specify parameters to get a random number:
   myDouble = [ myNormalDist getDoubleSample ];
However, you are allowed to specify parameters even if default parameters have been set.

(Of course, different distributions have different parameters: RandomBitDist has none, the Uniform objects have minimum and maximum limit values, NormalDist and LogNormalDist use Mean and Variance, ExponentialDist uses only Mean, and GammaDist uses alpha and beta. See the individual distribution protocols or the file random/distributions.h for the specific methods available.)

C.2.2.1.7. You can obtain the current values of parameters

   // Default parameters:
   myDouble1 = [ myNormalDist getMean ];
   myDouble2 = [ myNormalDist getVariance ];
   myDouble3 = [ myNormalDist getStdDev ];

   // Get a pointer to the generator object:
   myOtherGenerator = [ myNormalDist getGenerator ];

   // Get the number of the virtual generator (if a split generator is used):
   myUnsignedValue  = [ myNormalDist getVirtualGenerator];

   // Find out if default parameters have been set:
   myBoolean        = [ myNormalDist getOptionsInitialized ];

   // Find out how many variates the object has delivered so far:
   // (The counter is an unsigned long long int, which goes up to 2^64.)
   myLongLongInt    = [ myNormalDist getCurrentCount ];

C.2.2.1.8. You can reset the variate counter and other state variables this way

   [ myNormalDist reset ];
This is most likely done in conjunction with resetting the connected generator, using
[ myGenerator setStateFromSeed: mySeedValue ]
or simply
[ myGenerator reset ];

C.2.2.1.9. Finally, we have the InternalState protocol methods

   // Print (most of) the object's state data to a stream:
   [ myNormalDist describe: myStream ];

The stream myStream may be created thus:
   id myStream = [ OutStream create: [self getZone] setFileStream: stdout ];
or
   id myStream = [ OutStream create: [self getZone] setFileStream: stderr ];

   // Get the (class) name of the object:
   myString = [ myNormalDist getName ];

   // Get the object's `magic number', used by putStateInto / setStateFrom:
   myUnsigned = [ myNormalDist getMagic ];

C.2.2.2. Saving And Restoring State

You may save, and later restore, the internal state of a distribution object using InternalState methods.

  • See the Generator Usage Guide, which describes how to do this. The code for saving/restoring distributions would be similar.

  • Note that saving the state of a distribution object will NOT automatically save the state of the attached generator; you are responsible for doing so. (Since it is possible, even encouraged, to use a single generator to feed several distribution objects, this is the only sane way of doing it.)