At the start of the nineties, there were no perfect algorithms to generate pseudo-random numbers. The article of Park & Miller (1988), entitled "Random number generators: good ones are hard to find", is clear proof: most users thought the rand function they used was good, when in fact it had a short period and term-to-term dependence. The Japanese mathematicians Matsumoto and Nishimura invented in 1998 the Mersenne Twister, the first algorithm whose period (2^19937 − 1, roughly 10^6000) exceeds the number of electron spin changes since the creation of the Universe (about 10^120). It was a big breakthrough. As described in L'Ecuyer (1990), a (pseudo) random number generator (RNG) is defined by
a structure (S, μ, f, U, g), where S is a finite set of states, μ is a probability distribution on S called the initial distribution, f : S → S is a transition function, U is the output space (typically a subset of (0, 1)), and g : S → U is an output function.
Then the generation of random numbers is as follows:
1. generate the initial state (called the seed) s_0 according to μ and compute u_0 = g(s_0);
2. for i = 1, 2, ..., set s_i = f(s_{i−1}) and u_i = g(s_i).
Generally, the seed s_0 is determined using the machine clock, so that the random variates u_0, u_1, ..., u_n seem to be i.i.d. (independent and identically distributed) uniform random variates. The period of an RNG, a key characteristic, is the smallest integer p ∈ N such that for all n ∈ N, s_{p+n} = s_n.
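The two steps above can be sketched directly as code. This is a minimal illustration of the structure (S, μ, f, U, g), where the particular choices of f and g below (a toy linear congruential transition) are assumptions for the sake of the example, not a recommended generator.

```python
# A minimal sketch of an RNG as the structure (S, mu, f, U, g).
# The transition f and output g are illustrative assumptions only.

def f(s):
    """Transition function f : S -> S (toy linear congruential choice)."""
    return (1103515245 * s + 12345) % 2**31

def g(s):
    """Output function g : S -> U, here mapping states into [0, 1)."""
    return s / 2**31

def generate(seed, n):
    """Compute u_0 = g(s_0), then iterate s_i = f(s_{i-1}), u_i = g(s_i)."""
    s = seed
    out = [g(s)]
    for _ in range(n - 1):
        s = f(s)
        out.append(g(s))
    return out

us = generate(seed=12345, n=5)
assert len(us) == 5
assert all(0.0 <= u < 1.0 for u in us)
```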
A good generator should have the following properties.
Speed: the rate of generating random numbers should be high enough to meet the requirements of the problem.
Repeatability: the sequence should eventually repeat itself.
Long period: that repetition should only occur after a long period.
Replicability: the algorithm should allow the variability among generated sequences to be studied by changing seeds.
A random number generator should be able to pass certain tests in order to be called a good generator. These tests can be broadly divided into three groups. The first requirement is that of a large period: the modulus M must be as large as possible, because a small set of numbers makes the outcome easier to predict, in contrast to randomness. This leads to selecting M close to the largest integer machine number.
A second group of requirements are the statistical tests that check whether the numbers are distributed as intended. The simplest of such tests evaluates the sample mean and the sample variance s^2 of the calculated random variates, and compares them to the desired values μ and σ^2. (Recall μ = 1/2 and σ^2 = 1/12 for the uniform distribution on (0, 1).) Another simple test is to check correlations: for example, it would not be desirable if small numbers were likely to be followed by small numbers.
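These simplest checks can be sketched as follows, using Python's standard generator purely for illustration; the tolerances are ad hoc choices for a sample of this size.

```python
# Simplest statistical checks: compare the sample mean and variance with
# mu = 1/2 and sigma^2 = 1/12, and estimate the lag-1 correlation.
import random

random.seed(42)
n = 100_000
u = [random.random() for _ in range(n)]

mean = sum(u) / n
var = sum((x - mean) ** 2 for x in u) / (n - 1)

# Lag-1 sample autocorrelation: do small numbers tend to follow small numbers?
cov1 = sum((u[i] - mean) * (u[i + 1] - mean) for i in range(n - 1)) / (n - 1)
corr1 = cov1 / var

assert abs(mean - 0.5) < 0.01       # mu = 1/2
assert abs(var - 1 / 12) < 0.01     # sigma^2 = 1/12
assert abs(corr1) < 0.05            # no strong term-to-term dependence
```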
The third group of tests checks how well the random numbers distribute in higher-dimensional spaces.

Generating uniform (0, 1) random numbers using a linear congruential generator (LCG):
X_{n+1} = (a X_n + c) mod m
m = modulus, e.g. 2^32 − 1
a = multiplier, to be chosen carefully
c = increment, possibly 0
X_0 = seed
X_n ∈ {0, 1, 2, ..., m − 1}; U_n = X_n / m.
Use an odd number as the seed. Algebra/group theory helps with the choice of a. We want the cycle of the generator (the number of steps before it begins repeating) to be large, and we should not generate more than about m/1000 numbers.
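The LCG recursion above is a few lines of code. The constants below (a = 1664525, c = 1013904223, m = 2^32, the Numerical Recipes choice) are one assumed "carefully chosen" set; the text leaves a and c open.

```python
# Linear congruential generator: X_{n+1} = (a X_n + c) mod m, U_n = X_n / m.
M = 2**32
A = 1664525        # multiplier (assumed example constants)
C = 1013904223     # increment

def lcg(seed, n):
    """Return n uniforms U_k = X_k / m from the LCG started at X_0 = seed."""
    x = seed
    us = []
    for _ in range(n):
        x = (A * x + C) % M
        us.append(x / M)
    return us

sample = lcg(seed=1, n=1000)
assert len(sample) == 1000
assert all(0.0 <= u < 1.0 for u in sample)
```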
A second sequence can be combined with the first:
Y_{n+1} = (a2 Y_n + c2) mod m
W_{n+1} = (X_n + Y_n) mod m

Shuffling a random number generator.
Initialization: generate an array R of n (= 100, say) random numbers from the sequence X_k, and generate an additional number X to start the process.
Each time the generator is called:
use X to find an index j into the array R (e.g. j = ⌊n X / m⌋);
set X = R[j];
set R[j] = a new random number;
return X as the random number for this call.
Shuffling with two generators.
Initialization: generate an array R of n (= 100, say) random numbers from the sequence X_k.
Each time the generator is called: generate X from the sequence X_k and Y from the sequence Y_k; use Y to find an index j into the array R; return R[j] as the random number for this call, and replace R[j] by X.
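The two-generator shuffle above can be sketched as follows. The LCG constants and seeds are assumptions for illustration; the indexing rule j = ⌊n Y / m⌋ is likewise one reasonable reading of "use Y to find an index".

```python
# Shuffling with two generators: X_k fills and refills the table R,
# Y_k picks which entry to output.
M = 2**31

def make_lcg(a, c, seed):
    """Return a stepping function for X_{n+1} = (a X_n + c) mod M."""
    x = seed
    def step():
        nonlocal x
        x = (a * x + c) % M
        return x
    return step

x_gen = make_lcg(1103515245, 12345, seed=1)   # sequence X_k (assumed constants)
y_gen = make_lcg(65539, 0, seed=7)            # sequence Y_k (assumed constants)

N = 100
R = [x_gen() for _ in range(N)]   # initialization: fill the table from X_k

def shuffled():
    """Y picks an index j; output R[j]; refill R[j] from X_k."""
    j = (y_gen() * N) // M        # index in {0, ..., N-1}
    out = R[j]
    R[j] = x_gen()
    return out / M                # return a uniform in [0, 1)

vals = [shuffled() for _ in range(1000)]
assert all(0.0 <= v < 1.0 for v in vals)
```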
Fibonacci Generators
Fibonacci and Additive Congruential Generators. Take X_i = (X_{i−1} + X_{i−2}) mod m, i = 2, 3, ..., and R_i = X_i/m, where m is the modulus, X_0, X_1 are seeds, and a = b mod m if a is the remainder of b/m, e.g., 6 = 13 mod 7. Problem: small numbers follow small numbers. Also, it is not possible to get X_{i−1} < X_{i+1} < X_i or X_i < X_{i+1} < X_{i−1}.
Algorithm for a typical Fibonacci generator.
Repeat:
ζ := U_i − U_j
if ζ < 0, set ζ := ζ + 1
U_i := ζ
i := i − 1
j := j − 1
if i = 0, set i := 17
if j = 0, set j := 17
Initialization: set i = 17, j = 5, and calculate U_1, ..., U_17 with a congruential generator, for instance with M = 714025, a = 1366, b = 150889. Set the seed N_0 = your favorite dream number, possibly inspired by the system clock of your computer.
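The lagged Fibonacci algorithm above translates almost line for line into code; the seed value here is an arbitrary "dream number".

```python
# Lagged Fibonacci generator U_i := U_i - U_j (mod 1) with lags 17 and 5,
# seeded via the congruential generator N_k = (a N_{k-1} + b) mod M.
M, A, B = 714025, 1366, 150889

def init(seed):
    """Fill U_1, ..., U_17 with the congruential generator."""
    n = seed
    u = [0.0] * 18            # u[1..17]; u[0] unused
    for k in range(1, 18):
        n = (A * n + B) % M
        u[k] = n / M
    return u

u = init(seed=12345)          # arbitrary seed
i, j = 17, 5

def fib():
    """One step of the lagged Fibonacci recurrence."""
    global i, j
    z = u[i] - u[j]
    if z < 0:
        z += 1
    u[i] = z
    i -= 1
    j -= 1
    if i == 0:
        i = 17
    if j == 0:
        j = 17
    return z

vals = [fib() for _ in range(1000)]
assert all(0.0 <= v < 1.0 for v in vals)
```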
X has expectation 0 and variance 1, and the Central Limit Theorem assures that X is approximately normally distributed. But this crude attempt is not satisfying. Better methods calculate non-uniformly distributed random variables, for example by a suitable transformation of a uniformly distributed random variable. The most obvious approach inverts the distribution function.
Generate a continuous random variable X ~ F as follows: generate a uniform random variable U, and set X = F^{-1}(U).
The assumption is that F^{-1} exists; since the method uses the inverse, it is called the Inverse Transform Method. Let X be a random variable with c.d.f. F_X(x). Since F_X(x) is a nondecreasing function, the inverse function F_X^{-1}(y) may be defined for any value of y between 0 and 1 as
F_X^{-1}(y) = inf{ x : F_X(x) ≥ y }, 0 ≤ y ≤ 1.
F_X^{-1}(y) is defined to equal that value x for which F_X(x) = y. Let us prove that if U is uniformly distributed over the interval (0, 1), then X = F_X^{-1}(U) has c.d.f. F_X(x). Proof:
P(X ≤ x) = P(F_X^{-1}(U) ≤ x) = P(U ≤ F_X(x)) = F_X(x).
So to get a value x of a random variable X, obtain a value u of a uniform random variable U, compute F_X^{-1}(u), and set it equal to x. To generate a random variable from the uniform distribution U(a, b):
The c.d.f. is F_X(x) = (x − a)/(b − a) for a ≤ x ≤ b. Setting U = (X − a)/(b − a) and solving for X gives
X = F_X^{-1}(U) = a + (b − a)U.
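The inversion for U(a, b) can be sketched in two lines:

```python
# Inverse transform for U(a, b): invert F(x) = (x - a)/(b - a) to get
# X = a + (b - a) U.
import random

def uniform_ab(a, b):
    """Generate X ~ U(a, b) from a single uniform U ~ U(0, 1)."""
    u = random.random()
    return a + (b - a) * u

random.seed(0)
xs = [uniform_ab(2.0, 5.0) for _ in range(1000)]
assert all(2.0 <= x < 5.0 for x in xs)
```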
Example: let U be a random variable distributed U(0, 1), and define X = U^{1/2}, so that U = X^2. The c.d.f. of X is
F_X(x) = P(X ≤ x) = P(U ≤ x^2) = x^2, 0 ≤ x ≤ 1.
Thus generating U and setting X = U^{1/2} produces a random variable with c.d.f. x^2, in line with the inverse transform X = F_X^{-1}(U).
In the particular case where F_X(x) = x (the U(0, 1) distribution itself), we simply have X = F_X^{-1}(U) = U. To apply this method, F_X(x) must exist in a form for which the corresponding inverse transform can be found analytically. Distributions in this group include the exponential, uniform, Weibull, logistic, and Cauchy distributions. Unfortunately, for many probability distributions it is either impossible or extremely difficult to find the inverse transform, that is, to solve
F_X(x) = u
with respect to x.

Inverse Transform Method for simulating discrete random variables. The Inverse Transform Method for simulating from continuous random variables has an analog in the discrete case. For instance, suppose we want to simulate a random variable X having p.m.f.
P(X = x_j) = p_j, j = 0, 1, ...
To simulate X for which P(X = x_j) = p_j, let U be uniformly distributed over (0, 1), and set
X = x_j if p_0 + ... + p_{j−1} ≤ U < p_0 + ... + p_j.
As
P(X = x_j) = P(p_0 + ... + p_{j−1} ≤ U < p_0 + ... + p_j) = p_j,
we see that X has the desired distribution.

To generate a Bernoulli random variable with P(X = 1) = p and P(X = 0) = 1 − p, generate U and set X = 1 if U < p, X = 0 otherwise.
Let X be a random variable distributed as P(X = 1) = 0.2, P(X = 2) = 0.3, P(X = 3) = 0.5. Generate a random variable with the above distribution.
Algorithm 1:
1) generate U ~ U(0,1)
2) if U < 0.2 --> X = 1
3) else if U < 0.5 --> X = 2
4) else --> X = 3

Algorithm 2:
1) generate U ~ U(0,1)
2) if U < 0.5 --> X = 3
3) else if U < 0.8 --> X = 2
4) else --> X = 1

Algorithm 2 checks the most probable value first, so on average it needs fewer comparisons.
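Algorithm 2 above can be sketched and sanity-checked against the target probabilities:

```python
# Discrete inverse transform for P(X=1)=0.2, P(X=2)=0.3, P(X=3)=0.5,
# checking the most probable value first (Algorithm 2).
import random

def discrete():
    u = random.random()
    if u < 0.5:
        return 3
    elif u < 0.8:
        return 2
    else:
        return 1

random.seed(1)
n = 100_000
counts = {1: 0, 2: 0, 3: 0}
for _ in range(n):
    counts[discrete()] += 1

# Empirical frequencies should be close to 0.2, 0.3, 0.5.
assert abs(counts[1] / n - 0.2) < 0.01
assert abs(counts[2] / n - 0.3) < 0.01
assert abs(counts[3] / n - 0.5) < 0.01
```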
To generate a geometric random variable: suppose we want to simulate X such that P(X = i) = p(1 − p)^(i−1), i ≥ 1.
X can be thought of as representing the time of the first success when independent trials, each of which is a success with probability p, are performed. As
P(X > j − 1) = (1 − p)^(j−1)
(since the first j − 1 trials are all failures), we can simulate such a random variable by generating a random number U and then setting X equal to that value j for which
1 − (1 − p)^(j−1) < U < 1 − (1 − p)^j
or, equivalently, for which
(1 − p)^j < 1 − U < (1 − p)^(j−1).
Since 1 − U is itself uniform on (0, 1), this is equivalent to
(1 − p)^j < U < (1 − p)^(j−1).
We can thus define X by
X = min{ j : (1 − p)^j < U } = min{ j : j > log(U)/log(1 − p) } = 1 + ⌊log(U)/log(1 − p)⌋.

To simulate a binomial random variable. A binomial (n, p) random variable can be most easily simulated as the sum of n independent Bernoulli random variables. That is, if U_1, ..., U_n are independent uniform U(0, 1) variables, then setting X equal to the number of i ∈ {1, ..., n} for which U_i < p, it follows that X is binomial (n, p). One difficulty with this procedure is that it requires the generation of n random numbers. To show how to reduce the number of random numbers needed, note first that the procedure does not use the actual value of a random number U_i but only whether or not it exceeds p. Recall that
P(X = i) = C(n, i) p^i (1 − p)^(n−i), i = 0, 1, ..., n.
We employ the inverse transform method by making use of the recursive identity
P(X = i + 1) = [(n − i)/(i + 1)] [p/(1 − p)] P(X = i).
With i denoting the value currently under consideration, pr = P(X = i) the probability that X equals i, and F = F(i) the probability that X is less than or equal to i, the algorithm can be expressed as follows:
1. Generate a random number U.
2. c = p/(1 − p), i = 0, pr = (1 − p)^n, F = pr.
3. If U < F, set X = i and stop.
4. pr = [c(n − i)/(i + 1)] pr, F = F + pr, i = i + 1.
5. Go to 3.
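The closed-form geometric draw derived earlier and the binomial search algorithm above can be sketched as:

```python
# Geometric: X = 1 + floor(log U / log(1 - p)).
# Binomial: sequential inverse transform search using the recursion
# P(X = i+1) = [(n - i)/(i + 1)] [p/(1 - p)] P(X = i).
import math
import random

def geometric(p):
    u = random.random()
    return 1 + math.floor(math.log(u) / math.log(1 - p))

def binomial(n, p):
    u = random.random()
    c = p / (1 - p)
    i = 0
    pr = (1 - p) ** n      # P(X = 0)
    F = pr
    while u >= F:          # step 3/4: advance until U < F(i)
        pr = c * (n - i) / (i + 1) * pr
        F += pr
        i += 1
    return i

random.seed(3)
m = 50_000
gs = [geometric(0.3) for _ in range(m)]
bs = [binomial(10, 0.3) for _ in range(m)]
assert abs(sum(gs) / m - 1 / 0.3) < 0.1     # E[X] = 1/p
assert abs(sum(bs) / m - 10 * 0.3) < 0.05   # E[X] = np
```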
The preceding algorithm first checks whether X = 0, then X = 1, and so on. Hence, the number of searches it makes is 1 more than the value of X; therefore, on average, it will take 1 + np searches to generate X. Since a binomial (n, p) random variable represents the number of successes in n independent trials when each is a success with probability p, such a random variable can also be generated by subtracting from n the value of a binomial (n, 1 − p). Hence, when p > 1/2, we can generate a binomial (n, 1 − p) random variable by the above method and subtract its value from n to obtain the desired generation.

To simulate a Poisson random variable. The random variable X is Poisson with mean λ if
P(X = i) = e^(−λ) λ^i / i!, i = 0, 1, ...
The key to using the Inverse Transform Method to generate such a random variable is the following identity:
P(X = i + 1) = [λ/(i + 1)] P(X = i), i ≥ 0.
Upon using the above recursion to compute the Poisson probabilities as they become needed, the inverse transform algorithm for generating a Poisson random variable with mean λ can be expressed as follows (the quantity i refers to the value presently under consideration, p = P(X = i) is the probability that X equals i, and F = F(i) is the probability that X is less than or equal to i):
1. Generate a random number U.
2. i = 0, p = e^(−λ), F = p.
3. If U < F, set X = i and stop.
4. p = λp/(i + 1), F = F + p, i = i + 1.
5. Go to 3.
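The algorithm above can be sketched as:

```python
# Poisson inverse transform: start at p_0 = e^{-lambda}, advance with the
# recursion p_{i+1} = lambda p_i / (i + 1) until U < F(i).
import math
import random

def poisson(lam):
    u = random.random()
    i = 0
    p = math.exp(-lam)    # P(X = 0)
    F = p
    while u >= F:
        p = lam * p / (i + 1)
        F += p
        i += 1
    return i

random.seed(7)
m = 50_000
xs = [poisson(4.0) for _ in range(m)]
assert abs(sum(xs) / m - 4.0) < 0.1   # E[X] = lambda
```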
To see that the above algorithm does indeed generate a Poisson random variable with mean λ, note that it first generates a random number U and then checks whether or not U < e^(−λ) = P(X = 0). If so, it sets X = 0. If not, it computes P(X = 1) = λe^(−λ) by using the recursion, checks whether U < F(1), and so on.
The above algorithm successively checks whether the Poisson value is 0, then whether it is 1, then 2, and so on. Thus, the number of comparisons needed will be 1 greater than the generated value of the Poisson; hence, on average, the above will need to make 1 + λ searches. Whereas this is fine when λ is small, it can be greatly improved upon when λ is large. Indeed, since a Poisson random variable with mean λ is most likely to take on one of the two integral values closest to λ, a more efficient algorithm would first check one of these values, rather than starting at 0 and working upward. The algorithm lets I = int(λ) and computes P(X = I) = e^(−λ) λ^I / I! (by first taking logarithms and then raising e to the result). It then uses the above recursion to determine F(I). It now generates a Poisson random variable with mean λ by generating a random number U, noting whether or not X ≤ I by seeing whether or not U ≤ F(I), and then searching downward starting from X = I in the case where X ≤ I, and upward starting from X = I + 1 otherwise. The number of searches needed by the algorithm is roughly 1 more than the absolute difference between the random variable X and its mean λ. Since, for large λ, a Poisson is (by the central limit theorem) approximately normal with mean and variance both equal to λ, it follows that the average number of searches is approximately
1 + E|X − λ|, where X ~ N(λ, λ)
= 1 + √λ E|Z|, where Z ~ N(0, 1)
= 1 + √λ √(2/π)
≈ 1 + 0.798 √λ.
That is, the average number of searches grows with the square root of λ, rather than with λ, as λ becomes larger and larger.
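The improved search can be sketched as follows; the details of how the table of probabilities is held are implementation choices not fixed by the text.

```python
# Improved Poisson search: start at I = int(lambda), precompute F(I) via the
# recursion, then search downward from I or upward from I + 1.
import math
import random

def poisson_centered(lam):
    I = int(lam)
    # P(X = I) via logarithms: log p_I = -lam + I log(lam) - log(I!)
    log_p = -lam + I * math.log(lam) - math.lgamma(I + 1)
    p = [0.0] * (I + 1)
    p[I] = math.exp(log_p)
    for k in range(I, 0, -1):          # downward recursion: p_{k-1} = k p_k / lam
        p[k - 1] = k * p[k] / lam
    F = sum(p)                          # F(I) = P(X <= I)
    u = random.random()
    if u <= F:                          # X <= I: search downward from I
        i, acc = I, F
        while i > 0 and u <= acc - p[i]:
            acc -= p[i]
            i -= 1
        return i
    i, pr, acc = I, p[I], F             # X > I: search upward from I + 1
    while u > acc:
        pr = lam * pr / (i + 1)
        acc += pr
        i += 1
    return i

random.seed(11)
m = 50_000
ys = [poisson_centered(20.0) for _ in range(m)]
assert abs(sum(ys) / m - 20.0) < 0.2   # E[X] = lambda
```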