
Generation of Uniform (0,1) Random Numbers

Introduction to random numbers


A random number is a number generated by a process whose outcome is unpredictable and which cannot subsequently be reproduced. This works fine provided that one has some kind of black box that fulfils this task; such a black box is called a random number generator. A sequence of numbers is random if the quantity of information it contains, in the sense of Shannon's information theory, is infinite; in other words, it must not be possible for a computer program of finite length to produce the sequence. A concept present in both of these definitions, and one that must be emphasized, is that the numbers in a random sequence must not be correlated.

Random simulation has long been a very popular and well-researched field of mathematics, with a wide range of applications in biology, insurance, physics and many other fields of applied science, so the simulation of random numbers is crucial. Recall that the only things that are truly random are measurements of physical phenomena, such as the thermal noise of semiconductor chips or radioactive sources. On a computer, randomness can only be simulated by deterministic algorithms, so true randomness does not exist within a computer algorithm. Excluding true randomness, there are two kinds of random number generation: pseudo random and quasi random number generators.

Overview of random number generation


Firstly, we present pseudo random number generation, and then quasi random number generation. By random numbers we mean random variates of the uniform U(0, 1) distribution; more complex distributions can be generated from uniform variates by rejection or inversion methods. Pseudo random number generation aims to seem random, whereas quasi random number generation aims to be deterministic but well equidistributed. Statistical randomness tests aim at determining whether a particular sequence of numbers was produced by a random number generator. The approach is to calculate certain statistical quantities and compare them with the average values that would be obtained in the case of a random sequence; these average values come from calculations performed on the model of an ideal random number generator. Testing randomness is an empirical task: there exist numerous tests, each one revealing a particular type of imperfection in a sequence.

Pseudo random generation

At the start of the nineties, there were no perfect algorithms to generate pseudo random numbers; the article of Park & Miller (1988), entitled "Random number generators: good ones are hard to find", is clear evidence of this. Most users believed the rand function they used was good, even though it typically had a short period and term-to-term dependence. In 1998 the Japanese mathematicians Matsumoto and Nishimura invented the first algorithm, the Mersenne Twister, whose period (2^19937 − 1) exceeds the number of electron spin changes since the creation of the Universe (10^6000 against 10^120). It was a big breakthrough. As described in L'Ecuyer (1990), a (pseudo) random number generator (RNG) is defined by

a structure (S, μ, f, U, g), where S is a finite set of states, μ is a probability distribution on S called the initial distribution, f : S → S is a transition function, U is a finite set of output symbols, and g : S → U is an output function.

Then the generation of random numbers proceeds as follows: 1. generate the initial state s_0 (called the seed) according to μ and compute u_0 = g(s_0); 2. iterate, for i = 1, 2, ..., s_i = f(s_{i−1}) and u_i = g(s_i). Generally, the seed s_0 is determined from the machine clock, and hence so are the random variates; u_0, ..., u_n should then appear to be independent and identically distributed (i.i.d.) uniform random variates. The period of an RNG, a key characteristic, is the smallest integer p ∈ N such that s_{p+n} = s_n for all n ∈ N.
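To make this definition concrete, here is a minimal sketch in Python; the class name RNG and the particular choices of f and g below are ours, for illustration only, not part of L'Ecuyer's formalism.

    # A minimal sketch of the (S, mu, f, U, g) structure.
    class RNG:
        def __init__(self, seed, f, g):
            self.state = seed      # s_0, drawn according to mu (here user-supplied)
            self.f = f             # transition function f : S -> S
            self.g = g             # output function g : S -> U

        def next(self):
            self.state = self.f(self.state)   # s_i = f(s_{i-1})
            return self.g(self.state)         # u_i = g(s_i)

    # Example: a tiny congruential generator expressed in this framework.
    m = 2**31 - 1
    rng = RNG(seed=12345,
              f=lambda s: (16807 * s) % m,    # Park-Miller style transition
              g=lambda s: s / m)              # map state into (0, 1)
    print([round(rng.next(), 6) for _ in range(5)])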

Qualities of random generators


Random numbers can be generated in a variety of ways. But can every such algorithm be accepted as a random number generator? Which algorithms produce good random numbers, and how do we decide that a particular generator is acceptable? There are certain qualities on the basis of which we can call a generator good: it should meet all its aims, or rather pass as many tests as possible. The desirable properties of pseudo random number generators are:

Low storage requirement: the generator and its algorithm should not require extensive memory to run; the algorithm should be logical and follow a sequential methodology.

Speed: the rate of generating random numbers should be high enough to meet the requirements of the problem.

Repeatability: the sequence repeats after a period of time.

Long period: this repetition should occur only after a long period of time.

Replicability: the same seed should reproduce the same sequence, while changing seeds allows one to study the variability among generated streams.

A random number generator should pass certain tests in order to be called a good generator. These tests fall broadly into three groups. The first requirement is that of a large period: the modulus M must be as large as possible, because a small set of numbers makes the outcome easier to predict, in contrast to randomness. This leads one to select M close to the largest machine-representable integer.

A second group of requirements are the statistical tests that check whether the numbers are distributed as intended. The simplest such test evaluates the sample mean and the sample variance s^2 of the calculated random variates, and compares them to the desired values μ and σ^2. (Recall μ = 1/2 and σ^2 = 1/12 for the uniform distribution.) Another simple test is to check correlations: for example, it would not be desirable if small numbers were likely to be followed by small numbers.
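For illustration, here is a short script of ours that applies these two simple checks — sample mean and variance against μ = 1/2 and σ^2 = 1/12, and the lag-1 correlation against 0 — to a batch of uniforms:

    import random

    n = 100_000
    u = [random.random() for _ in range(n)]

    mean = sum(u) / n
    var = sum((x - mean) ** 2 for x in u) / (n - 1)

    # Lag-1 sample autocorrelation: close to 0 for independent draws.
    num = sum((u[i] - mean) * (u[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in u)
    rho1 = num / den

    print(f"mean  = {mean:.4f}  (target 0.5)")
    print(f"var   = {var:.4f}  (target {1/12:.4f})")
    print(f"rho_1 = {rho1:.4f}  (target 0)")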

The third group of tests checks how well the random numbers distribute in higher-dimensional spaces.

Generating uniform (0,1) random numbers using a linear congruential generator (LCG):

X_{n+1} = a X_n + c (mod m)

where m is the modulus (e.g. 2^32 − 1), a is the multiplier (choose carefully), c is the increment (maybe 0), and X_0 is the seed. Then X_n ∈ {0, 1, 2, ..., m − 1} and U_n = X_n / m. Practical advice: use an odd number as the seed; algebra/group theory helps with the choice of a; the cycle of the generator (the number of steps before it begins repeating) should be large; and do not generate more than m/1000 numbers.
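A direct Python implementation of the LCG recurrence; the specific constants used below (a = 16807, c = 0, m = 2^31 − 1) are the Park & Miller "minimal standard" choice from the article cited above, taken here only as one careful choice of multiplier.

    def lcg(seed, a=16807, c=0, m=2**31 - 1):
        """Yield U_n = X_n / m from X_{n+1} = (a * X_n + c) mod m."""
        x = seed
        while True:
            x = (a * x + c) % m
            yield x / m

    gen = lcg(seed=123457)                  # odd seed, as suggested above
    print([next(gen) for _ in range(5)])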

Composite generator:

X_{n+1} = a_1 X_n + c_1 (mod m)
Y_{n+1} = a_2 Y_n + c_2 (mod m)
W_{n+1} = X_n + Y_n (mod m)

Shuffling a random number generator. Initialization: generate an array R of n (= 100) random numbers from the sequence X_k, and generate an additional number X to start the process. Each time the generator is called: use X to find an index into the array R, j = floor(X * n); set X = R[j]; set R[j] = a new random number; return X as the random number for this call.

Shuffling with two generators. Initialization: generate an array R of n (= 100) random numbers from the sequence X_k. Each time the generator is called: generate X from X_k and Y from Y_k; use Y to find an index into the array R, j = floor(Y * n); set Z = R[j]; set R[j] = X; return Z as the random number for this call.
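The two-generator shuffle in runnable form (a Bays-Durham-style shuffle, sketched by us in Python; the small lcg helper and its multipliers are illustrative choices, with 48271 being another well-known multiplier for m = 2^31 − 1):

    def lcg(seed, a, m=2**31 - 1):
        x = seed
        while True:
            x = (a * x) % m
            yield x / m

    def shuffled(gen_x, gen_y, n=100):
        """Shuffle gen_x's output, using gen_y only to pick table indices."""
        R = [next(gen_x) for _ in range(n)]   # initialization: fill table from X_k
        while True:
            X, Y = next(gen_x), next(gen_y)
            j = int(Y * n)                    # j = floor(Y * n)
            Z, R[j] = R[j], X                 # Z = R[j]; R[j] = X
            yield Z                           # return Z as the random number

    gen = shuffled(lcg(seed=12345, a=16807), lcg(seed=67890, a=48271))
    print([round(next(gen), 6) for _ in range(5)])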

Fibonacci Generators
Fibonacci and additive congruential generators take X_i = (X_{i−1} + X_{i−2}) mod m, i = 2, 3, ..., where R_i = X_i / m, m is the modulus, X_0, X_1 are seeds, and a = b mod m if a is the remainder of b/m, e.g., 6 = 13 mod 7. Problem: small numbers follow small numbers; also, it is not possible to get X_{i−1} < X_{i+1} < X_i or X_i < X_{i+1} < X_{i−1}.

Algorithm for a typical Fibonacci generator.
Initialization: set i = 17, j = 5, and calculate U_1, ..., U_17 with a congruential generator, for instance with M = 714025, a = 1366, b = 150889. Set the seed N_0 = your favorite dream number, possibly inspired by the system clock of your computer.
Repeat:
  ζ := U_i − U_j
  if ζ < 0, set ζ := ζ + 1
  U_i := ζ
  i := i − 1
  j := j − 1
  if i = 0, set i := 17
  if j = 0, set j := 17
Each pass returns ζ as the next random number.
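In runnable form (a Python sketch of ours, using exactly the lags and congruential constants given above):

    def fibonacci_generator(seed, M=714025, a=1366, b=150889):
        """Lagged Fibonacci U_i = (U_{i-17} - U_{i-5}) mod 1, seeded by an LCG."""
        # Initialization: U_1, ..., U_17 from the congruential generator.
        N = seed
        U = [0.0] * 18                 # 1-based indexing; U[0] unused
        for k in range(1, 18):
            N = (a * N + b) % M
            U[k] = N / M
        i, j = 17, 5
        while True:
            zeta = U[i] - U[j]
            if zeta < 0:
                zeta += 1
            U[i] = zeta
            i -= 1
            j -= 1
            if i == 0: i = 17
            if j == 0: j = 17
            yield zeta

    gen = fibonacci_generator(seed=20240101)   # N_0: your favorite dream number
    print([round(next(gen), 6) for _ in range(5)])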

Extending to random variables from other distributions


X := U_1 + U_2 + ... + U_12 − 6, for U_i ~ U[0, 1]

X has expectation 0 and variance 1, and the Central Limit Theorem assures that X is approximately normally distributed. But this crude attempt is not satisfying. Better methods calculate non-uniformly distributed random variables, for example by a suitable transformation of a uniformly distributed random variable; the most obvious approach inverts the distribution function.
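The crude CLT construction in code (a sketch of ours):

    import random

    def crude_normal():
        """Sum of 12 uniforms minus 6: mean 0, variance 1, approximately N(0, 1)."""
        return sum(random.random() for _ in range(12)) - 6

    sample = [crude_normal() for _ in range(100_000)]
    m = sum(sample) / len(sample)
    v = sum((x - m) ** 2 for x in sample) / (len(sample) - 1)
    print(f"mean = {m:.3f}, variance = {v:.3f}")   # close to 0 and 1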

Inverse Transform Method

Generate a continuous random variable X ~ F as follows: generate a uniform random variable U, and set X = F^{-1}(U).

The assumption is that F^{-1} exists; using the inverse gives the method its name. Let X be a random variable with c.d.f. F_X(x). Since F_X(x) is a nondecreasing function, the inverse function F_X^{-1}(y) may be defined for any value of y between 0 and 1 as

F_X^{-1}(y) = inf{ x : F_X(x) ≥ y }, 0 ≤ y ≤ 1.

F_X^{-1}(y) is defined to equal that value x for which F_X(x) = y. Let us prove that if U is uniformly distributed over the interval (0, 1), then X = F_X^{-1}(U) has c.d.f. F_X(x). The proof:

P(X ≤ x) = P(F_X^{-1}(U) ≤ x) = P(U ≤ F_X(x)) = F_X(x).

So to get a value x of a random variable X, obtain a value u of a random variable U, compute F_X^{-1}(u), and set it equal to x.

To generate a random variable from the uniform distribution U(a, b):

The c.d.f. is

F(x) = (x − a) / (b − a), a ≤ x ≤ b,

so setting U = F(X) gives

X = F^{-1}(U) = a + (b − a) U.

To generate a random variable from the exponential distribution Exp(λ): the c.d.f. is F(x) = 1 − e^{−λx}, x ≥ 0, so U = 1 − e^{−λX} gives X = −ln(1 − U)/λ; since 1 − U is also uniform on (0, 1), one may equivalently set X = −ln(U)/λ.
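Both inversions in Python (a sketch of ours; the function names are illustrative, and 1 − U is used in the exponential case to guard against log(0)):

    import math
    import random

    def uniform_ab(a, b):
        """X = a + (b - a) * U inverts the U(a, b) c.d.f. F(x) = (x - a)/(b - a)."""
        return a + (b - a) * random.random()

    def exponential(lam):
        """X = -ln(1 - U) / lam inverts F(x) = 1 - exp(-lam * x)."""
        return -math.log(1.0 - random.random()) / lam

    print(uniform_ab(2.0, 5.0), exponential(0.5))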

To generate a random variable with p.d.f. f(x) = 2x, 0 ≤ x ≤ 1: the c.d.f. is F(x) = x^2, so setting U = X^2 gives X = U^{1/2}.

Let X_1, ..., X_n be i.i.d. random variables distributed with c.d.f. F_X. Define M = max(X_1, ..., X_n) and m = min(X_1, ..., X_n), and generate M and m. The distributions of M and m are respectively F_M(x) = [F_X(x)]^n and F_m(x) = 1 − [1 − F_X(x)]^n. In the particular case where X = U we have M = U^{1/n} and m = 1 − (1 − U)^{1/n}.

To apply this method, F_X(x) must exist in a form for which the corresponding inverse transform can be found analytically. Distributions in this group include the exponential, uniform, Weibull, logistic, and Cauchy. Unfortunately, for many probability distributions it is either impossible or extremely difficult to find the inverse transform, that is, to solve

F_X(x) = u

with respect to x.

Inverse Transform Method for simulating discrete random variables. The inverse transform method for simulating continuous random variables has an analog in the discrete case. For instance, suppose we want to simulate a random variable X having probability mass function P(X = x_j) = P_j, j = 0, 1, ...

To simulate X for which P(X = x_j) = P_j, let U be uniformly distributed over (0, 1), and set

X = x_j if P_0 + ... + P_{j−1} ≤ U < P_0 + ... + P_j.

As

P(X = x_j) = P(P_0 + ... + P_{j−1} ≤ U < P_0 + ... + P_j) = P_j,

we see that X has the desired distribution.

To generate a Bernoulli random variable: P(X = 1) = p, P(X = 0) = 1 − p.

The algorithm: 1) generate U ~ U(0, 1); 2) if U < 1 − p (equivalently, p < 1 − U, or, since 1 − U is also uniform, U > p), then X = 0; otherwise X = 1.

Let X be a random variable distributed P(X = 1) = 0.2, P(X = 2) = 0.3, P(X = 3) = 0.5. Generate a random variable with the above distribution.

Algorithm 1:
1) generate U ~ U(0,1)
2) if U < 0.2 --> X = 1
3) if U < 0.5 --> X = 2
4) if U < 1 --> X = 3

Algorithm 2:
1) generate U ~ U(0,1)
2) if U < 0.5 --> X = 3
3) if U < 0.8 --> X = 2
4) if U < 1 --> X = 1

Algorithm 2 checks the most probable value first, so on average it needs fewer comparisons than Algorithm 1 (1.7 versus 2.3).
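Both algorithms transcribed into Python (our sketch); the empirical frequencies confirm the target distribution:

    import random
    from collections import Counter

    def algorithm1():
        U = random.random()
        if U < 0.2: return 1        # P(X = 1) = 0.2
        if U < 0.5: return 2        # P(X = 2) = 0.3
        return 3                    # P(X = 3) = 0.5

    def algorithm2():
        U = random.random()
        if U < 0.5: return 3        # most probable value checked first
        if U < 0.8: return 2
        return 1

    print(Counter(algorithm1() for _ in range(10_000)))
    print(Counter(algorithm2() for _ in range(10_000)))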

To generate a geometric random variable: suppose we want to simulate X such that P(X = i) = p(1 − p)^{i−1}, i ≥ 1.

X can be thought of as representing the time of the first success when independent trials, each of which is a success with probability p, are performed. As

P(X = 1) + ... + P(X = j − 1) = 1 − P(X > j − 1) = 1 − (1 − p)^{j−1},

since X > j − 1 exactly when the first j − 1 trials are all failures, we can simulate such a random variable by generating a random variable U and then setting X equal to that value j for which

1 − (1 − p)^{j−1} < U < 1 − (1 − p)^j

or, equivalently, for which

(1 − p)^j < 1 − U < (1 − p)^{j−1}

or, equivalently (since 1 − U is also uniform on (0, 1)), for which

(1 − p)^j < U < (1 − p)^{j−1}.

We can thus define X by

X = min{ j : (1 − p)^j < U } = min{ j : j > log(U)/log(1 − p) } = 1 + ⌊log(U)/log(1 − p)⌋.

To simulate a binomial random variable: a binomial (n, p) random variable can be most easily simulated as the sum of n independent Bernoulli random variables. That is, if U_1, ..., U_n are independent uniform U(0, 1) variables, then setting

X_i = 1 if U_i < p, and X_i = 0 otherwise,

it follows that

X = X_1 + ... + X_n

is a binomial random variable with parameters n and p.
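Both constructions in a short Python sketch of ours — the geometric closed form just derived, and the binomial as a sum of Bernoulli indicators:

    import math
    import random

    def geometric(p):
        """X = 1 + floor(log(U) / log(1 - p)): time of first success."""
        U = 1.0 - random.random()   # in (0, 1], avoids log(0)
        return 1 + int(math.log(U) / math.log(1 - p))

    def binomial_sum(n, p):
        """Sum of n independent Bernoulli(p) indicators; needs n uniforms."""
        return sum(random.random() < p for _ in range(n))

    print(geometric(0.3), binomial_sum(20, 0.3))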

One difficulty with the above procedure is that it requires the generation of n random numbers. To show how to reduce the number of random numbers needed, note first that the above procedure does not use the actual value of a random number U but only whether or not it exceeds p. Recall that

P(X = i) = C(n, i) p^i (1 − p)^{n−i}, i = 0, 1, ..., n.

We employ the inverse transform method by making use of the recursive identity

P(X = i + 1) = [p/(1 − p)] · [(n − i)/(i + 1)] · P(X = i).

With i denoting the value currently under consideration, pr = P(X = i) the probability that X is equal to i, and F = F(i) the probability that X is less than or equal to i, the algorithm can be expressed as follows:
1. Generate a random number U.
2. c = p/(1 − p), i = 0, pr = (1 − p)^n, F = pr.
3. If U < F, set X = i and stop.
4. pr = [c(n − i)/(i + 1)] pr, F = F + pr, i = i + 1.
5. Go to 3.
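A direct Python transcription of steps 1-5 (our sketch; note that pr starts at (1 − p)^n = P(X = 0)):

    import random

    def binomial_inverse(n, p):
        """Inverse transform for Binomial(n, p) via the recursion
        P(X = i + 1) = [p/(1 - p)] * (n - i)/(i + 1) * P(X = i)."""
        U = random.random()
        c = p / (1 - p)
        i, pr = 0, (1 - p) ** n       # pr = P(X = 0)
        F = pr
        while U >= F:                 # step 3: stop as soon as U < F
            pr = c * (n - i) / (i + 1) * pr
            F += pr
            i += 1
        return i

    print(binomial_inverse(20, 0.3))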

The preceding algorithm first checks whether X = 0, then X = 1, and so on. Hence, the number of searches it makes is 1 more than the value of X, so on average it takes 1 + np searches to generate X. Since a binomial (n, p) random variable represents the number of successes in n independent trials when each is a success with probability p, such a random variable can also be generated by subtracting from n the value of a binomial (n, 1 − p). Hence, when p > 1/2, we can generate a binomial (n, 1 − p) random variable by the above method and subtract its value from n to obtain the desired generation.

To simulate a Poisson random variable: the random variable X is Poisson with mean λ if

P(X = i) = e^{−λ} λ^i / i!, i = 0, 1, ...

The key to using the inverse transform method to generate such a random variable is the following identity:

P(X = i + 1) = λ/(i + 1) · P(X = i), i ≥ 0.

Using the above recursion to compute the Poisson probabilities as they become needed, the inverse transform algorithm for generating a Poisson random variable with mean λ can be expressed as follows (the quantity i refers to the value presently under consideration, p = P(X = i) is the probability that X equals i, and F = F(i) is the probability that X is less than or equal to i):
1. Generate a random number U.
2. i = 0, p = e^{−λ}, F = p.
3. If U < F, set X = i and stop.
4. p = λp/(i + 1), F = F + p, i = i + 1.
5. Go to 3.
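A direct Python transcription of this algorithm (our sketch):

    import math
    import random

    def poisson_inverse(lam):
        """Inverse transform for Poisson(lam) via p_{i+1} = lam/(i + 1) * p_i."""
        U = random.random()
        i, p = 0, math.exp(-lam)      # p = P(X = 0) = e^{-lam}
        F = p
        while U >= F:
            p = lam * p / (i + 1)
            F += p
            i += 1
        return i

    print(poisson_inverse(4.2))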

To see that the above algorithm does indeed generate a Poisson random variable with mean λ, note that it first generates a random number U and then checks whether or not U < e^{−λ} = P(X = 0) = F(0). If so, it sets X = 0. If not, it computes P(X = 1) = λe^{−λ} using the recursion. It then checks whether U < F(0) + P(X = 1) (where the right-hand side is the new value of F), and if so it sets X = 1; and so on.

The above algorithm successively checks whether the Poisson value is 0, then whether it is 1, then 2, and so on. Thus, the number of comparisons needed is 1 greater than the generated value, so on average the algorithm makes 1 + λ searches. Whereas this is fine when λ is small, it can be greatly improved upon when λ is large. Indeed, since a Poisson random variable with mean λ is most likely to take on one of the two integral values closest to λ, a more efficient algorithm would first check one of these values, rather than starting at 0 and working upward. The algorithm lets I = int(λ) and computes p_I = P(X = I) = e^{−λ} λ^I / I! (by first taking logarithms and then exponentiating the result, to avoid overflow). It then uses the recursion to determine F(I). It now generates the Poisson random variable with mean λ by generating a random number U and noting whether or not X ≤ I by seeing whether or not U ≤ F(I). It then searches downward starting from X = I in the case where X ≤ I, and upward starting from X = I + 1 otherwise. The number of searches needed by the algorithm is roughly 1 more than the absolute difference between the random variable X and its mean λ. Since, for large λ, a Poisson is (by the central limit theorem) approximately normal with mean and variance both equal to λ, the average number of searches is approximately

1 + E|X − λ| ≈ 1 + √λ E|Z| = 1 + 0.798 √λ, where X ~ N(λ, λ) and Z ~ N(0, 1).

That is, the average number of searches grows with √λ, rather than with λ, as λ becomes larger and larger.
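A sketch of this improved search in Python (our transcription of the description above: the recursion is run downward from I to accumulate F(I), then the search proceeds down or up from the mean):

    import math
    import random

    def poisson_from_mean(lam):
        """Search for X starting near I = int(lam) instead of at 0."""
        I = int(lam)
        # p_I = e^{-lam} lam^I / I!, computed via logarithms to avoid overflow.
        p = math.exp(-lam + I * math.log(lam) - math.lgamma(I + 1)) if I > 0 \
            else math.exp(-lam)
        # Probabilities p_0, ..., p_I via the downward recursion p_{i-1} = (i/lam) p_i.
        probs = {I: p}
        q = p
        for i in range(I, 0, -1):
            q = q * i / lam
            probs[i - 1] = q
        F = sum(probs[i] for i in range(I + 1))   # F(I) = P(X <= I)
        U = random.random()
        if U <= F:                                # X <= I: search downward from I
            i = I
            while i > 0 and U <= F - probs[i]:    # F - probs[i] = F(i - 1)
                F -= probs[i]
                i -= 1
            return i
        i = I                                     # X > I: search upward from I + 1
        while U > F:                              # p_{i+1} = lam/(i + 1) * p_i
            p = lam * p / (i + 1)
            F += p
            i += 1
        return i

    print(poisson_from_mean(100.0))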
