
Variance Reduction Techniques

Content
1. Antithetic Variables
2. Control Variates
3. Conditioning Sampling
4. Stratified Sampling (optional)
5. Importance Sampling
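Introduction
Recall that we estimate the unknown quantity $\theta = E(X)$ by generating random numbers $X_1, \ldots, X_n$ and using
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
to estimate $\theta$.
The mean square error is
$\mathrm{MSE} = E[(\bar{X} - \theta)^2] = \mathrm{Var}(\bar{X}) = \mathrm{Var}(X)/n$
Hence, if we can obtain a different unbiased estimate of $\theta$ having a smaller variance than $\bar{X}$ does, we obtain an improved estimator of $\theta = E(X)$.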

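Antithetic Variables
The Use of Antithetic Variables
Suppose $X_1$ and $X_2$ are identically distributed random variables having mean $\theta$. Then
$\mathrm{Var}\left(\frac{X_1 + X_2}{2}\right) = \frac{1}{4}\left[\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + 2\,\mathrm{Cov}(X_1, X_2)\right]$
If $X_1$ and $X_2$, rather than being independent, were negatively correlated, then $\mathrm{Cov}(X_1, X_2) < 0$ and the variance of the average would be smaller than in the independent case.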
How do we generate negatively correlated random numbers?
Suppose we simulate $U_1, U_2, \ldots, U_m$, which are uniform random numbers. Then $V_1 = 1 - U_1, \ldots, V_m = 1 - U_m$ are also uniform random numbers.

Therefore, each pair $(U_i, V_i)$ is negatively correlated.

In fact, it can be proven that if $X_1 = h(U_1, \ldots, U_m)$ and $X_2 = h(V_1, \ldots, V_m)$, where $h$ is a monotone function (either increasing or decreasing), then $X_1$ and $X_2$ have the same distribution and are negatively correlated. (Proof: Appendix 8.10, page 210)
How to arrange for $X_1$ and $X_2$ to be negatively correlated?

Step 1: Set
$X_1 = h(U_1, \ldots, U_m)$
where $U_1, \ldots, U_m$ are i.i.d. ~ U(0,1), and $h$ is a monotone function of each of its coordinates.

Step 2: Set
$X_2 = h(1 - U_1, \ldots, 1 - U_m)$
which has the same distribution as $X_1$.
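The two steps above can be sketched in a few lines. This is a Python illustration rather than the slides' Matlab, and the helper names `antithetic_pairs` and `independent_pairs` are hypothetical; the choice $h(u) = e^{u}$ with $m = 1$ is only an assumption made for the demonstration.

```python
import math
import random
import statistics

def antithetic_pairs(h, n, m=1, rng=None):
    """Return n observations (h(U) + h(1 - U)) / 2, each built from m uniforms."""
    rng = rng or random.Random()
    out = []
    for _ in range(n):
        u = [rng.random() for _ in range(m)]
        v = [1.0 - ui for ui in u]            # antithetic uniforms V_i = 1 - U_i
        out.append(0.5 * (h(u) + h(v)))       # the pair average is one observation
    return out

def independent_pairs(h, n, m=1, rng=None):
    """Same as above, but the second value uses fresh independent uniforms."""
    rng = rng or random.Random()
    out = []
    for _ in range(n):
        u1 = [rng.random() for _ in range(m)]
        u2 = [rng.random() for _ in range(m)]
        out.append(0.5 * (h(u1) + h(u2)))
    return out

# Illustrative monotone h (an assumption for this demo): h(u) = exp(u_1)
h = lambda u: math.exp(u[0])
rng = random.Random(0)
anti = antithetic_pairs(h, 10_000, rng=rng)
indep = independent_pairs(h, 10_000, rng=rng)
print("antithetic :", statistics.mean(anti), statistics.variance(anti))
print("independent:", statistics.mean(indep), statistics.variance(indep))
```

Both estimators target the same mean; the sample variance of the antithetic pair averages should come out much smaller whenever $h$ is monotone.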
What Does Antithetic Mean?

Antithetic means:
Opposed to, or
The opposite of, or
Negatively correlated with
The idea is to determine the value of an output variable at random, then determine its antithetic value (which comes from the opposite part of its distribution), then form the average of these two values, using the average as a single observation.

Advantage
The estimator has a smaller variance (at least when h is a monotone function).

We save the time of generating a second set of random numbers.
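Example 1
Suppose we were interested in using simulation to estimate
$\theta = E[e^{U}] = \int_0^1 e^{x}\,dx$

If $X_1 = e^{U_1}$, $X_2 = e^{U_2}$, where $U_1, U_2$ are i.i.d. ~ U(0,1), we have
$\mathrm{Var}\left(\frac{X_1 + X_2}{2}\right) = \frac{\mathrm{Var}(e^{U})}{2} = 0.1210$
where
$\mathrm{Var}(e^{U}) = E[e^{2U}] - (E[e^{U}])^2 = \frac{e^2 - 1}{2} - (e - 1)^2 = 0.2420$

$h(u) = e^{u}$ is clearly a monotone function, so take
$X_1 = e^{U}$, $X_2 = e^{1-U}$, where $U$ ~ U(0,1).
$\mathrm{Cov}(X_1, X_2) = E[e^{U} e^{1-U}] - E[e^{U}]\,E[e^{1-U}] = e - (e - 1)^2 = -0.2342$

$\mathrm{Var}\left(\frac{X_1 + X_2}{2}\right) = \frac{\mathrm{Var}(e^{U})}{2} + \frac{\mathrm{Cov}(e^{U}, e^{1-U})}{2} = 0.1210 - 0.1171 = 0.0039$

The variance reduction is 96.7 percent.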
Example 1: Matlab code
n = 1000;
u = rand(n,1);
x = exp([u; 1-u]);          % antithetic variables
theta = sum(x)/(2*n)        % estimator using antithetic variables
u0 = rand(n,1);
x0 = exp([u; u0]);          % independent variables
theta0 = sum(x0)/(2*n)      % estimator using independent variables

Example 1: Matlab code (mean square error over m replications)
n = 1000; m = 1000;
u = rand(n,m);              % each column is one replication
x = exp([u; 1-u]);          % antithetic variables
theta = sum(x)/(2*n);
theta_true = exp(1)-1;      % the true value is e-1 = 1.7183
mseav = sum((theta-theta_true).^2)/m    % mean square error, antithetic
u0 = rand(n,m);
x0 = exp([u; u0]);          % independent variables
theta0 = sum(x0)/(2*n);
mse0 = sum((theta0-theta_true).^2)/m    % mean square error, independent
reduction = 1-mseav/mse0
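Example 2
Estimate the value of the definite integral
$\theta = \int_0^\infty x^{0.9} e^{-x}\,dx$
Solution: First, we generate values from the probability density function $f(x) = e^{-x}$, $x > 0$. This is done by setting $X_i = -\ln U_i$, where $U_i$, $i = 1, \ldots, n$, are random numbers with $U_i \sim U(0,1)$, so that $\theta = E[X^{0.9}]$ is estimated from the values $[-\ln(U_i)]^{0.9}$.

An antithetic variable is
$[-\ln(1 - U_i)]^{0.9}$
so an unbiased combined estimator is
$\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{2}\left\{[-\ln(U_i)]^{0.9} + [-\ln(1 - U_i)]^{0.9}\right\}$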
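The combined estimator of Example 2 can be tried out as follows. This is a minimal Python sketch (not the slides' Matlab); it assumes the integrand is $x^{0.9}e^{-x}$ on $(0, \infty)$, and the function name `example2_antithetic` is hypothetical. The exact value $\Gamma(1.9)$ from `math.gamma` is printed only as a check.

```python
import math
import random

def example2_antithetic(n, rng):
    """Average of 0.5 * ([-ln U]^0.9 + [-ln(1 - U)]^0.9) over n uniforms."""
    total = 0.0
    for _ in range(n):
        u = rng.random()
        while u == 0.0:                 # random() may return 0.0; avoid log(0)
            u = rng.random()
        x1 = (-math.log(u)) ** 0.9          # from X = -ln U ~ Exp(1)
        x2 = (-math.log(1.0 - u)) ** 0.9    # antithetic counterpart
        total += 0.5 * (x1 + x2)
    return total / n

theta_hat = example2_antithetic(20_000, random.Random(0))
print("estimate:", theta_hat, " exact Gamma(1.9):", math.gamma(1.9))
```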
Additional Example
Consider the case where we want to estimate $E(X^2)$, where X ~ Normal(2, 1). How can we use antithetic variables to estimate it and reduce the variance of the estimator?
n_sim = 5000;                         % number of values to simulate
out1 = (2 + randn(1,n_sim)).^2;       % generate Normal(2,1) random variables and square them
mean(out1)
var(out1)

out2_1 = 2 + randn(1,n_sim/2);        % now use antithetic variables
out2_2 = 4 - out2_1;                  % antithetic value: reflection about the mean 2
out2 = 0.5*(out2_1.^2 + out2_2.^2);
mean(out2)
var(out2)

% how much variance is reduced?
var_reduction = (var(out1)/n_sim - var(out2)/(n_sim/2)) / (var(out1)/n_sim);
var_reduction

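Control Variates
The Use of Control Variates
Assume the desired simulation quantity is $\theta = E[X]$, and there is another simulation r.v. $Y$ with known mean $\mu_y = E[Y]$.

Then for any given constant $c$, the quantity
$X + c\,(Y - \mu_y)$
is also an unbiased estimator of $\theta$.
Consider its variance:
$\mathrm{Var}(X + c\,(Y - \mu_y)) = \mathrm{Var}(X) + c^2\,\mathrm{Var}(Y) + 2c\,\mathrm{Cov}(X, Y)$
It can be shown that this variance is minimized when $c$ is equal to
$c^* = -\frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)}$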
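The minimization can be checked directly. The short derivation below is a sketch using only the variance expansion of $X + c\,(Y - \mu_y)$; the symbol $g(c)$ is introduced here for convenience and does not appear on the slides.

```latex
\begin{align*}
g(c) &= \mathrm{Var}\bigl(X + c\,(Y - \mu_y)\bigr)
      = \mathrm{Var}(X) + c^{2}\,\mathrm{Var}(Y) + 2c\,\mathrm{Cov}(X, Y), \\
g'(c) &= 2c\,\mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y) = 0
      \;\Longrightarrow\;
      c^{*} = -\frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)}, \\
g''(c) &= 2\,\mathrm{Var}(Y) > 0
      \quad\text{(so $c^{*}$ is a minimum)}, \\
g(c^{*}) &= \mathrm{Var}(X) - \frac{[\mathrm{Cov}(X, Y)]^{2}}{\mathrm{Var}(Y)}.
\end{align*}
```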
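The variance of the new estimator is:
$\mathrm{Var}(X + c^*(Y - \mu_y)) = \mathrm{Var}(X) - \frac{[\mathrm{Cov}(X, Y)]^2}{\mathrm{Var}(Y)}$

$Y$ is called the control variate for the simulation estimator $X$.

We can re-express this by dividing both sides by $\mathrm{Var}(X)$:
$\frac{\mathrm{Var}(X + c^*(Y - \mu_y))}{\mathrm{Var}(X)} = 1 - [\mathrm{Corr}(X, Y)]^2$
where
$\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$
is the correlation between $X$ and $Y$.

The variance is therefore reduced by $100\,[\mathrm{Corr}(X, Y)]^2$ percent.

The controlled estimator is
$\bar{X} + c^*(\bar{Y} - \mu_y)$
and its variance is given by
$\mathrm{Var}(\bar{X} + c^*(\bar{Y} - \mu_y)) = \frac{1}{n}\left(\mathrm{Var}(X) - \frac{[\mathrm{Cov}(X, Y)]^2}{\mathrm{Var}(Y)}\right)$
Note: the goal is to choose $Y$ so that $Y \approx X$, with $Y$ easy to simulate and $\mu_y$ easy to find.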
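Estimation
If $\mathrm{Cov}(X, Y)$ and $\mathrm{Var}(Y)$ are unknown in advance, use the estimators
$\widehat{\mathrm{Cov}}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})$
and
$\widehat{\mathrm{Var}}(Y) = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$
The approximation of $c^*$ is then
$\hat{c}^* = -\frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}$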
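The estimation step can be sketched as below. This is a Python illustration rather than the slides' Matlab; the function name `control_variate_estimate` is hypothetical, and the demonstration assumes $X = e^{U}$ with control $Y = U$ (so $\mu_y = 0.5$).

```python
import math
import random

def control_variate_estimate(xs, ys, mu_y):
    """Estimate c* from the sample, then return (controlled estimate, c_hat)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    syy = sum((y - ybar) ** 2 for y in ys)
    c_hat = -sxy / syy                          # the (n - 1) factors cancel
    theta_hat = xbar + c_hat * (ybar - mu_y)    # controlled estimator
    return theta_hat, c_hat

rng = random.Random(1)
us = [rng.random() for _ in range(10_000)]      # control variate Y = U
xs = [math.exp(u) for u in us]                  # X = e^U (illustrative choice)
theta_hat, c_hat = control_variate_estimate(xs, us, 0.5)
print("c_hat:", c_hat, " theta_hat:", theta_hat)
```

Estimating $c^*$ from the same sample introduces a small bias in the controlled estimator, which is usually negligible for large $n$.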
Several Variables as a Control
We can use more than a single variable as a control. For example, if a simulation results in output variables $Y_i$, $i = 1, \ldots, k$, and $E[Y_i] = \mu_i$ is known, then for any constants $c_i$, $i = 1, \ldots, k$, we may use
$X + \sum_{i=1}^{k} c_i\,(Y_i - \mu_i)$
as an unbiased estimator of $E[X]$.
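Example 3
Estimate
$\theta = E[e^{U}] = \int_0^1 e^{x}\,dx$
A natural control variate is the random number $U$ itself:
$X = e^{U}$, $Y = U$, where $U$ ~ U(0,1).
$\mathrm{Cov}(X, Y) = E[U e^{U}] - E[U]\,E[e^{U}] = 1 - \frac{e - 1}{2} = 0.14086$

$\mathrm{Var}(Y) = \mathrm{Var}(U) = 1/12$, then
$c^* = -\frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(Y)} = -12 \times 0.14086 = -1.6903$

Thus, the controlled estimator is:
$\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n}\left(e^{U_i} - 1.6903\,(U_i - 0.5)\right)$

$\mathrm{Var}(X + c^*(Y - \mu_y)) = \mathrm{Var}(e^{U}) - 12\,(0.14086)^2 = 0.2420 - 0.2380 = 0.0039$
(from Example 1, $\mathrm{Var}(e^{U}) = 0.2420$).

The variance reduction is 98.4 percent:
$1 - 0.0039/0.2420 = 98.4\%$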
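Example 3: Matlab code
n = 1000;
m = 1000;
y = rand(n,m);                  % control variate Y = U
x = exp(y);
c = -1.6903;                    % optimal constant c*
z = x + c*(y-0.5);              % X + c*(Y - mu_y)
theta = sum(z)/n;               % controlled estimator, one per column
theta_true = exp(1)-1;
msecv = sum((theta-theta_true).^2)/m;   % mean square error, controlled
theta0 = sum(x)/n;              % raw estimator
mse0 = sum((theta0-theta_true).^2)/m;   % mean square error, raw
reduction = 1-msecv/mse0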
Example 4:
Suppose we wanted to estimate $\theta$, where
$\theta = \int_0^1 e^{x^2}\,dx$
a) Explain how control variables may be used to estimate $\theta$.
b) Do 100 simulation runs, using the control given in (a), to estimate first $c^*$ and then the variance of the estimator.

c) Explain how to use antithetic variables to estimate $\theta$. Using the same data as in (b), determine the variance of the antithetic variable estimator.

d) Which of the two types of variance reduction techniques worked better in this example?
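a) Let $X = e^{U^2}$, so that $\theta = E(e^{U^2})$. One possible choice of control variable is $Y = U^2$. The expected value of $Y$ is
$E[U^2] = \int_0^1 x^2\,dx = 1/3$
So we can use the unbiased estimator of $\theta$ given by
$\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n}\left(e^{U_i^2} + c\,(U_i^2 - 1/3)\right)$
where
$c = -\frac{\mathrm{Cov}(e^{U^2}, U^2)}{\mathrm{Var}(U^2)}$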
b) The following Matlab program can be used to answer the question:
m = 100;
U = rand(1,m);
Y = U.^2;                       % control variable
Ybar = 1/3;                     % known mean E[U^2]
X = exp(Y);
Xbar = sum(X)/m;
A = sum((X-Xbar).*(Y-Ybar));
B = sum((Y-Ybar).^2);
C = sum((X-Xbar).^2);
CovXY = A/(m-1);
VarY = B/(m-1);
VarX = C/(m-1);
c = -A/B;                       % estimate of c*
% Estimator:
Xc = X + c*(Y-Ybar);
Xcbar = sum(Xc)/m;
% Variance of estimator:
VarXc = (VarX - CovXY^2/VarY)/m
One run of the above program gave the following: the estimated value of $c$ was $-1.5950$, and the variance of the estimator $\bar{X}_c$ was $\mathrm{Var}(\bar{X}_c) = 4.5860 \times 10^{-5}$.
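c) The antithetic variable estimator can be:
$\hat{\theta}_a = \frac{1}{m}\sum_{i=1}^{m} \frac{e^{U_i^2} + e^{(1 - U_i)^2}}{2}$
Matlab code:
% Antithetic estimator
Xa = (exp(U.^2)+exp((1-U).^2))/2;
Xabar = sum(Xa)/m;
VarXa = var(Xa)/m
The variance of the antithetic variable estimator (using the same U) was: $\mathrm{Var}(\bar{X}_a) = 2.7120 \times 10^{-4}$.

d) Comparing parts (b) and (c), the control variable estimator has the smaller variance ($4.5860 \times 10^{-5}$ versus $2.7120 \times 10^{-4}$), so the control variable method works better in this example.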

Conditioning sampling
Variance Reduction by Conditioning
Review: Conditional Expectation
$E[X|Y]$ denotes that function of the random variable $Y$ whose value at $Y = y$ is $E[X|Y = y]$.

If $X$ and $Y$ are jointly discrete random variables,
$E[X|Y = y] = \sum_x x\,P\{X = x \mid Y = y\} = \frac{\sum_x x\,P\{X = x, Y = y\}}{P\{Y = y\}}$

If $X$ and $Y$ are jointly continuous with joint p.d.f. $f(x, y)$,
$E[X|Y = y] = \frac{\int x f(x, y)\,dx}{\int f(x, y)\,dx}$

Recall the law of conditional expectations (textbook page 34):
$E[E[X|Y]] = E[X]$
This implies that the estimator $E(X|Y)$ is also an unbiased estimator of $\theta = E[X]$.
Now, recall the conditional variance formula (textbook page 34):
$\mathrm{Var}(X) = E[\mathrm{Var}(X|Y)] + \mathrm{Var}(E[X|Y])$
Clearly, both terms on the right are non-negative, so that we have
$\mathrm{Var}(X) \ge \mathrm{Var}(E[X|Y])$
This implies that the estimator obtained by conditioning has a variance no larger than that of $X$.
Procedure:

Step 1: Generate the r.v. $Y = y_i$.

Step 2: Compute the (conditional) expected value of $X$ given $Y$: $E[X \mid y_i]$.

Step 3: An unbiased estimate of $\theta$ is $\sum_{i=1}^{n} E[X \mid y_i]/n$.

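Example 5: Estimate $\pi$
Recall the simulation introduced in Chapter 1:
$V_i = 2U_i - 1$, $i = 1, 2$, where $U_i$ ~ U(0,1).
Set
$I = 1$ if $V_1^2 + V_2^2 \le 1$, and $I = 0$ otherwise.
Then $E[I] = \pi/4$.
Use $E[I \mid V_1]$ rather than $I$ to estimate $\pi/4$.
(Figure: circle of radius 1 inscribed in the square with corners at $(\pm 1, \pm 1)$.)
Given $V_1 = v$, we have $I = 1$ exactly when $V_2^2 \le 1 - v^2$, and since $V_2$ is uniform on $(-1, 1)$,
$E[I \mid V_1 = v] = P\{-\sqrt{1 - v^2} \le V_2 \le \sqrt{1 - v^2}\} = \sqrt{1 - v^2}$
Hence, $E[I \mid V_1] = (1 - V_1^2)^{1/2}$.

Use $(1 - V_1^2)^{1/2}$ as the estimator.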

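The variance of the conditional estimator is
$\mathrm{Var}[(1 - V_1^2)^{1/2}] = E[1 - V_1^2] - (\pi/4)^2 = \frac{2}{3} - \left(\frac{\pi}{4}\right)^2 = 0.0498$

$I$ is a Bernoulli r.v. having mean $\pi/4$, so
$\mathrm{Var}(I) = \frac{\pi}{4}\left(1 - \frac{\pi}{4}\right) = 0.1686$

The conditioning results in a 70.44 percent reduction in variance.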
Procedure 2:
Step 1: Generate $V_i = 2U_i - 1$, $i = 1, \ldots, n$, where $U_i$ are i.i.d. ~ U(0, 1).

Step 2: Evaluate each $(1 - V_i^2)^{1/2}$ and take the average of all these values to estimate $\pi/4$.
Matlab code (Procedure 2):
n = 1000;
m = 1000;
u1 = rand(n,m);
v1 = 2*u1-1;
v = (1-v1.^2).^0.5;             % E[I|V1]
theta = 4*sum(v)/n;             % m estimates of pi
msecv = sum((theta-pi).^2)/m    % mean square error

Matlab program for comparison of the two simulation procedures:
n = 1000;
m = 1000;
u1 = rand(n,m);
v1 = 2*u1-1;
% ------------ raw simulation -------------------------
v2 = 2*rand(n,m)-1;
s = v1.^2+v2.^2 <= 1;           % indicator I
theta0 = 4*sum(s)/n;
mse0 = sum((theta0-pi).^2)/m;   % mean square error, raw
% ------------ conditioning sampling ------------------
v = (1-v1.^2).^0.5;             % E[I|V1]
theta = 4*sum(v)/n;
msecv = sum((theta-pi).^2)/m;   % mean square error, conditioning
reduction = 1-msecv/mse0        % reduction in mean square error
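Example 6:
Suppose that $Y$ ~ Exp(1).
Suppose that, conditional on $Y = y$, $X$ ~ N($y$, 4).
How do we estimate $\theta = P\{X > 1\}$?

Raw simulation:
Step 1: Generate $Y = -\log(U)$, where $U$ ~ U(0, 1).
Step 2: If $Y = y$, generate $X$ ~ N($y$, 4).
Step 3: Set
$I = 1$ if $X > 1$, and $I = 0$ otherwise;
then $E[I] = \theta$.

Improvement:
Can we express the exact value of $E(I \mid Y = y)$ in terms of $y$?

If $Y = y$, then $(X - y)/2$ is a standard normal r.v.

Then,
$E[I \mid Y = y] = P\{X > 1 \mid Y = y\} = P\left\{\frac{X - y}{2} > \frac{1 - y}{2}\right\} = \bar{\Phi}\left(\frac{1 - y}{2}\right)$
where $\bar{\Phi}(x) = 1 - \Phi(x)$ and $\Phi$ is the standard normal c.d.f.

Therefore, the average value of $\bar{\Phi}\left(\frac{1 - Y}{2}\right)$ obtained over many runs is superior to the raw simulation estimator.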
Procedure 2:
Step 1: Generate $Y_i = -\ln(U_i)$, $i = 1, \ldots, n$, where $U_i$ are i.i.d. ~ U(0, 1).

Step 2: Evaluate each $\bar{\Phi}\left(\frac{1 - Y_i}{2}\right)$ and take the average of all these values to estimate $\theta$.
Matlab code:
n=1000;
EIy=zeros(1,n);
for i=1:n
y=exprnd(1);
EIy(i)=1-normcdf((1-y)/2);
end
theta=mean(EIy)
Further Improvement: Using antithetic variables
Can we use antithetic variables to improve the simulation?
Because the conditional expectation estimator $\bar{\Phi}\left(\frac{1 - Y}{2}\right)$ is monotone in $Y$, the simulation can be improved by using antithetic variables.
Procedure 3:

Step 1: Generate $U_1, U_2, \ldots, U_m$ and form $1 - U_1, 1 - U_2, \ldots, 1 - U_m$.

Step 2: Evaluate
$\frac{1}{2}\left(\bar{\Phi}\left(\frac{1 + \log U_i}{2}\right) + \bar{\Phi}\left(\frac{1 + \log(1 - U_i)}{2}\right)\right)$, $i = 1, \ldots, m$,
and average all these values to estimate $\theta$.
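Procedure 3 can be sketched as follows. This is a Python illustration, not the slides' Matlab; the names `phi_bar` and `estimate_theta` are hypothetical, and the normal tail is computed from `math.erfc` instead of a statistics toolbox.

```python
import math
import random

def phi_bar(x):
    """Standard normal tail probability 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def estimate_theta(m, rng):
    """Conditioning plus antithetic variables for theta = P{X > 1}."""
    total = 0.0
    for _ in range(m):
        u = rng.random()
        while u == 0.0:                 # avoid log(0)
            u = rng.random()
        # With Y = -log(U): (1 - Y) / 2 = (1 + log(U)) / 2
        a = phi_bar((1.0 + math.log(u)) / 2.0)
        b = phi_bar((1.0 + math.log(1.0 - u)) / 2.0)   # antithetic value
        total += 0.5 * (a + b)
    return total / m

theta_hat = estimate_theta(5_000, random.Random(2))
print("theta_hat:", theta_hat)
```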


Example 7:
The random variable
$S = \sum_{i=1}^{N} X_i$
is said to be a compound random variable if $N$ is a nonnegative integer-valued random variable and $X_1, X_2, \ldots$ is a sequence of i.i.d. positive r.v.s that are independent of $N$.
In an insurance application, $X_i$ could represent the amount of the $i$th claim made to an insurance company, and $N$ could represent the number of claims made by some specified time $t$; $S$ would then be the total claim amount made by time $t$.

In such applications, $N$ is often assumed to be a Poisson random variable (in which case $S$ is called a compound Poisson random variable).
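Suppose that we want to use simulation to estimate
$p = P\{S > c\} = P\left\{\sum_{i=1}^{N} X_i > c\right\}$

Raw simulation
Step 1: Generate $N$, say $N = n$.
Step 2: Generate the values of $X_1, \ldots, X_n$.
Step 3: Set
$I = 1$ if $\sum_{i=1}^{n} X_i > c$, and $I = 0$ otherwise;
then $E[I] = p$.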
Improvement by conditioning
Introduce a random variable $M$:
$M = \min\{n : X_1 + \cdots + X_n > c\}$
What is $E[I \mid M = m]$?
We can prove:
$E[I \mid M = m] = P\{S > c \mid M = m\} = P\{N \ge m \mid M = m\} = P\{N \ge m\}$
where the last equality holds because $N$ and $M$ are independent.
Thus, given $M = m$, the value $E[I \mid M = m]$ obtained is $P\{N \ge m\}$.
Since the distribution of $N$ is known (in particular, when $N$ is Poisson), the probability $P\{N \ge m\}$ is easy to find.

Procedure of simulation improved by conditioning

Step 1: Generate $X_i$ in sequence, stopping when $S_m = \sum_{i=1}^{m} X_i > c$.

Step 2: Calculate $P\{N \ge m\}$ as the estimate of $p$ from this run.
Suppose that N is a Poisson r.v. with rate of 10 per day, the
amount of a claim X is an exponential r.v. with mean $1000,
and that c=$325000. Simulate the probability that the total
amount claimed within 30 days exceeds c.
Code:
n = 100; c = 325000; I = zeros(1,n);
for i = 1:n
    s = 0; m = 0;
    while s <= c                    % stop at the first m with S_m > c
        x = exprnd(1000);           % exponential claim with mean 1000
        s = s + x;
        m = m + 1;
    end
    p = 1 - poisscdf(m-1,300);      % P{N >= m}, N ~ Poisson(10 per day * 30 days)
    I(i) = p;
end
p_bar = sum(I)/n
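To see the effect of conditioning, the raw and conditioned procedures can be run side by side. The following is a Python sketch under the same parameters (Poisson mean 300, exponential claims with mean 1000, c = 325000). The helper names are hypothetical, and because the standard library has no Poisson sampler, N is generated here by counting rate-1 exponential arrivals in [0, 300], which is a substitute for whatever generator the course uses.

```python
import math
import random
import statistics

def poisson_tail(m, lam):
    """P{N >= m} for N ~ Poisson(lam), summing the pmf computed in log space."""
    cdf = sum(math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
              for k in range(m))
    return max(0.0, 1.0 - cdf)

def conditional_run(c, mean_claim, lam, rng):
    """Draw claims until their sum exceeds c (this gives M), return P{N >= M}."""
    s, m = 0.0, 0
    while s <= c:
        s += rng.expovariate(1.0 / mean_claim)
        m += 1
    return poisson_tail(m, lam)

def raw_run(c, mean_claim, lam, rng):
    """Raw simulation: indicator that the compound Poisson total exceeds c."""
    n, t = 0, rng.expovariate(1.0)
    while t <= lam:                     # arrivals of a rate-1 process in [0, lam]
        n += 1
        t += rng.expovariate(1.0)
    s = sum(rng.expovariate(1.0 / mean_claim) for _ in range(n))
    return 1.0 if s > c else 0.0

rng = random.Random(3)
runs = 500
cond = [conditional_run(325_000, 1000.0, 300.0, rng) for _ in range(runs)]
raw = [raw_run(325_000, 1000.0, 300.0, rng) for _ in range(runs)]
print("conditioned:", statistics.mean(cond), statistics.variance(cond))
print("raw        :", statistics.mean(raw), statistics.variance(raw))
```

Both sample means estimate the same p; the per-run variance of the conditioned values should be noticeably smaller than that of the raw indicators.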