
9.0 Lesson Plan

Answer Questions
Distribution of Linear Combinations
Point Estimation
Unbiased Estimators
Minimum Variance Unbiased Estimators
The Variance of a Uniform RV

9.1 Linear Combinations


A linear combination of random variables $X_1, \ldots, X_n$ is a new random
variable $Y$ such that
$$Y = a_1 X_1 + \cdots + a_n X_n = \sum_{i=1}^{n} a_i X_i,$$
where the $a_i$ are known constants.


Some important linear combinations include:
The sample mean, $\bar{X}$, in which each $a_i$ equals $1/n$.
A difference, $X_1 - X_2$, in which $a_1 = 1$ and $a_2 = -1$. This is helpful
when deciding whether, say, one brand of light bulb outlasts another
brand, or whether one company outperforms another.

Let $X_i$ have mean $\mu_i$ and variance $\sigma_i^2$. Then
$$\mathbb{E}[Y] = \mathbb{E}\Big[\sum_{i=1}^{n} a_i X_i\Big] = \sum_{i=1}^{n} a_i\, \mathbb{E}[X_i] = \sum_{i=1}^{n} a_i \mu_i.$$
This holds even when the $X_i$ are dependent. It follows because
integration is a linear operator: $\int a_i x f_i(x)\, dx = a_i \int x f_i(x)\, dx = a_i \mu_i$.
Also,
$$\mathrm{Var}[Y] = \mathrm{Var}\Big[\sum_{i=1}^{n} a_i X_i\Big] = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j\, \mathrm{Cov}[X_i, X_j].$$
Why? $\mathrm{Var}[Y] = \mathbb{E}[Y^2] - (\mathbb{E}[Y])^2$ and $Y^2 = (a_1 X_1 + \cdots + a_n X_n)(a_1 X_1 + \cdots + a_n X_n)$,
which generates the cross-product terms that define the $\mathrm{Cov}[X_i, X_j]$. It takes some algebra.
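As a quick numerical check of both the mean and the variance formulas, here is a minimal simulation sketch (not part of the notes; the weights $a_i$, the means, and the covariance matrix are made-up illustrative values):

```python
import numpy as np

# Minimal sketch: check E[Y] = sum_i a_i*mu_i and
# Var[Y] = sum_i sum_j a_i*a_j*Cov[X_i, X_j] by Monte Carlo
# for correlated normal X_i.  All constants below are made up.
rng = np.random.default_rng(0)

a = np.array([2.0, -1.0, 0.5])
mu = np.array([1.0, 3.0, -2.0])
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 2.0, 0.3],
                  [0.5, 0.3, 1.0]])    # Cov[X_i, X_j]

X = rng.multivariate_normal(mu, Sigma, size=200_000)   # rows are draws of (X_1, X_2, X_3)
Y = X @ a                                              # Y = a_1 X_1 + a_2 X_2 + a_3 X_3

print("E[Y]   theory:", a @ mu,         "  simulated:", Y.mean())
print("Var[Y] theory:", a @ Sigma @ a,  "  simulated:", Y.var())
```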
Theorem: If the $X_i$ have (possibly different, possibly correlated) normal
distributions, then $Y$ is normally distributed.

If one looks at the definitions, one sees that $\mathrm{Cov}[X_i, X_i]$ is just the
variance $\sigma_i^2$. So one can write
$$\mathrm{Var}[Y] = \sum_{i=1}^{n} a_i^2 \sigma_i^2 + 2 \sum_{i<j} a_i a_j\, \mathrm{Cov}[X_i, X_j].$$

In the special case when the random variables are independent, the
covariances are all zero and this simplifies to
$$\mathrm{Var}[Y] = \sum_{i=1}^{n} a_i^2 \sigma_i^2.$$

In particular, $\mathbb{E}[X_1 - X_2] = \mu_1 - \mu_2$. And if $X_1$ and $X_2$ are independent,
then $\mathrm{Var}[X_1 - X_2] = \sigma_1^2 + \sigma_2^2$.
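For a concrete illustration of the light-bulb comparison mentioned above (the numbers are made up, not from the notes): if brand 1 lifetimes have $\mu_1 = 1200$ hours and $\sigma_1 = 100$, brand 2 has $\mu_2 = 1100$ and $\sigma_2 = 80$, and the two measurements are independent, then
$$\mathbb{E}[X_1 - X_2] = 1200 - 1100 = 100 \text{ hours}, \qquad \mathrm{Var}[X_1 - X_2] = 100^2 + 80^2 = 16400,$$
so the standard deviation of the difference is about $128$ hours.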

9.2 Point Estimation


Statisticians provide two things:
a point estimate of some quantity of interest, and
a statement of the uncertainty in that estimate.

Other disciplines only provide the point estimate.


A parameter is some property of a distribution function, such as the
mean, median, standard deviation, and so forth. A point estimate for
a parameter is some statistic $h(X_1, \ldots, X_n)$ which, when evaluated for a
random sample, gives a sensible approximation to the parameter.

The Central Limit Theorem indicates one of many approaches. If the
parameter of interest is the population mean $\mu$, then the statistic
$h(X_1, \ldots, X_n) = \bar{X}$ provides a sensible estimate of $\mu$.
In particular, we know the uncertainty in that estimate: $\sigma/\sqrt{n}$.
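A minimal simulation sketch of that claim (not from the notes; the population parameters and sample sizes are arbitrary choices) showing that the spread of $\bar{X}$ shrinks like $\sigma/\sqrt{n}$:

```python
import numpy as np

# Minimal sketch: the standard deviation of the sample mean is sigma/sqrt(n).
# The population (normal with mu=10, sigma=2) and the sample sizes are arbitrary.
rng = np.random.default_rng(1)
mu, sigma = 10.0, 2.0

for n in (10, 100, 400):
    xbars = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)
    print(f"n={n:4d}  sd of sample means={xbars.std():.4f}  sigma/sqrt(n)={sigma/np.sqrt(n):.4f}")
```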



There are several properties one can want from a point estimate:
unbiasedness
minimum variance (i.e., minimum uncertainty)
minimum mean squared error.
We discuss these in the context of several estimation strategies.

Besides the mean, other point estimates for common parameters are:
the sample proportion $X/n$ for the population proportion $p$.
the sample variance,
$$\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2,$$
for the population variance.
the average squared deviation,
$$s^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2,$$
for the population variance.
the 10% trimmed sample mean for the population mean; this is the
average of the sample after removing the largest 5% of the values
and the smallest 5% of the values. A short computational sketch of
these estimators follows this list.
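Here is a minimal sketch (not from the notes) computing each of these estimators for a made-up sample; the sample itself, and the event treated as a "success" for the proportion, are arbitrary illustrative choices.

```python
import numpy as np

# Minimal sketch: compute the point estimates listed above for a made-up sample.
rng = np.random.default_rng(2)
x = rng.normal(50, 10, size=200)                 # arbitrary illustrative sample

p_hat = (x > 55).mean()                          # sample proportion X/n ("x > 55" counts as a success)

var_n_minus_1 = ((x - x.mean())**2).sum() / (x.size - 1)   # divides by n-1
var_n = ((x - x.mean())**2).sum() / x.size                 # divides by n

# 10% trimmed mean: drop the largest 5% and the smallest 5% of the values.
xs = np.sort(x)
k = int(0.05 * x.size)
trimmed_mean = xs[k:x.size - k].mean()

print(p_hat, var_n_minus_1, var_n, trimmed_mean)
```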

9.3 Unbiased Estimates


A point estimate $\hat{\theta} = h(X_1, \ldots, X_n)$ is said to be an unbiased
estimator for a population parameter $\theta$ if $\mathbb{E}[\hat{\theta}] = \theta$.

The bias in a point estimate is $\mathbb{E}[\hat{\theta}] - \theta = \mathrm{bias}(\hat{\theta})$. For unbiased
estimates, this is zero.

The mean squared error of a point estimate is $\mathrm{Var}[\hat{\theta}] + \mathrm{bias}^2(\hat{\theta})$.


The book does not cover the mean squared error, but it has many
attractive features. In particular, it is sometimes possible to trade off
a small bias for a large reduction in variance, and this leads to better
accuracy.
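A minimal sketch of that trade-off (not from the notes; the normal population, sample size, and number of replications are arbitrary): for normal data the divide-by-$n$ variance estimator is biased, yet its mean squared error can be smaller than that of the unbiased divide-by-$(n-1)$ estimator.

```python
import numpy as np

# Minimal sketch: compare the mean squared error of the 1/(n-1) and 1/n
# variance estimators on normal data.  All constants below are arbitrary.
rng = np.random.default_rng(3)
sigma2, n, reps = 4.0, 10, 100_000

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
unbiased = samples.var(axis=1, ddof=1)    # divides by n-1
biased = samples.var(axis=1, ddof=0)      # divides by n

print("MSE, divide by n-1:", ((unbiased - sigma2)**2).mean())
print("MSE, divide by n  :", ((biased - sigma2)**2).mean())
```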

Recall from the previous lecture how bias and variance contribute
differently to the total error. The left target has small variance and small
bias. The right target has large bias and large variance, the worst of both
worlds.

If $X$ has the $\mathrm{Bin}(n, p)$ distribution, then the sample proportion $X/n$ is
an unbiased estimate for the parameter $p$:
$$\mathbb{E}[X/n] = \frac{1}{n}\,\mathbb{E}[X] = \frac{1}{n}\, np = p.$$
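A quick Monte Carlo check of this (not from the notes; $n$ and $p$ are arbitrary illustrative values):

```python
import numpy as np

# Minimal sketch: the average of X/n over many binomial draws is close to p.
rng = np.random.default_rng(4)
n, p = 30, 0.4
x = rng.binomial(n, p, size=100_000)
print("mean of X/n:", (x / n).mean(), "  p:", p)
```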

The sample mean $\bar{X}$ of a random sample is unbiased for the population
mean $\mu$. We know this from the properties of linear combinations:
$$\mathbb{E}[\bar{X}] = \mathbb{E}\Big[\frac{1}{n}\sum_{i=1}^{n} X_i\Big] = \frac{1}{n}\sum_{i=1}^{n}\mathbb{E}[X_i] = \frac{1}{n}(n\mu) = \mu.$$
In this case we also know the variance of the estimator:
$$\mathrm{Var}[\bar{X}] = \Big(\frac{1}{n}\Big)^2 \sum_{i=1}^{n} \sigma^2 = \Big(\frac{1}{n}\Big)^2 (n\sigma^2) = \frac{\sigma^2}{n}.$$

Example: Second-Price Auctions. Let $X$ be uniformly distributed
on the interval $[0, \theta]$ where $\theta$ is the unknown parameter. You have
a random sample $X_1, \ldots, X_n$ of losing bids and use the statistic
$\hat{\theta}_1 = \max\{X_1, \ldots, X_n\}$ as an estimate of $\theta$, the winning bid. Let
$$F(z) = \mathbb{P}[X \le z] = \int_0^z \frac{1}{\theta}\, dx = \frac{z}{\theta}.$$

Then let $G(z) = \mathbb{P}[\hat{\theta}_1 \le z] = \mathbb{P}[\max\{X_1, \ldots, X_n\} \le z]$, and note that,
because the observations in a random sample are independent,
$$\mathbb{P}[\max\{X_1, \ldots, X_n\} \le z] = \mathbb{P}[X_1 \le z \text{ and } \cdots \text{ and } X_n \le z] = \prod_{i=1}^{n} \mathbb{P}[X_i \le z].$$
The $\prod$ symbol multiplies its arguments just as the $\sum$ symbol adds
them. We have shown that
$$G(z) = \prod_{i=1}^{n} \mathbb{P}[X_i \le z] = \prod_{i=1}^{n} \frac{z}{\theta} = \Big(\frac{z}{\theta}\Big)^n.$$

So the distribution of the sample maximum is $G(z) = (z/\theta)^n$ for
$0 \le z \le \theta$, and thus the probability density function of the maximum is
$g(z) = n\,(1/\theta)^n\, z^{n-1}$ on $0 \le z \le \theta$.
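A minimal simulation check of this CDF (not from the notes; $\theta$, $n$, and the evaluation point $z$ are arbitrary):

```python
import numpy as np

# Minimal sketch: the CDF of the maximum of n Unif(0, theta) draws is (z/theta)^n.
rng = np.random.default_rng(5)
theta, n, reps = 2.0, 5, 100_000

maxima = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
z = 1.5
print("empirical P[max <= z]:", (maxima <= z).mean(), "  (z/theta)^n:", (z / theta)**n)
```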


Since we know the density, we can find the expected value of $Z$, where $Z$
is the sample maximum:
$$\mathbb{E}[Z] = \int_0^\theta z\, g(z)\, dz = \int_0^\theta z\, n\, \theta^{-n} z^{n-1}\, dz
= \frac{n\,\theta^{-n}}{n+1}\, z^{n+1}\Big|_0^\theta = \frac{n\theta}{n+1}.$$
So the estimator $\hat{\theta}_1$ of $\theta$ has a small bias:
$$\frac{n\theta}{n+1} - \theta = -\frac{\theta}{n+1}.$$

One can unbias this $\hat{\theta}_1$ estimator by using the new estimator
$$\hat{\theta}_2 = \frac{n+1}{n}\,\hat{\theta}_1 = \frac{n+1}{n}\max\{X_1, \ldots, X_n\}.$$
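A minimal simulation sketch of the bias correction (not from the notes; $\theta$ and $n$ are arbitrary illustrative values):

```python
import numpy as np

# Minimal sketch: theta_hat_1 = max(X_i) has mean n*theta/(n+1) (biased low),
# while theta_hat_2 = (n+1)/n * max(X_i) is unbiased for theta.
rng = np.random.default_rng(6)
theta, n, reps = 10.0, 8, 200_000

theta1 = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
theta2 = (n + 1) / n * theta1

print("mean of theta_hat_1:", theta1.mean(), "  theory n*theta/(n+1):", n * theta / (n + 1))
print("mean of theta_hat_2:", theta2.mean(), "  theta:", theta)
```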

9.4 Minimum Variance Unbiased Estimators


The book holds that the first requirement for a good estimator of a
parameter is that it be unbiased. When there are several unbiased
estimators, one should use the one that has smallest variance.

This is not the only way to frame the problem of selecting an estimator.
For example, one might want the estimator which:
minimized the mean squared error,
had the largest probability of being within some fixed distance from
the true value,
was unbiased and minimized something more practical than the
variance.

Consider again the case of a random sample from the $\mathrm{Unif}(0, \theta)$ distribution.
Clearly, $\mathbb{E}[\bar{X}] = \theta/2$, so $\hat{\theta}_3 = 2\bar{X}$ is an unbiased estimator of $\theta$.
We now have two candidate estimators:
$$\hat{\theta}_2 = \frac{n+1}{n}\max\{X_1, \ldots, X_n\} \qquad \text{and} \qquad \hat{\theta}_3 = 2\bar{X}.$$

Which has the smaller variance?

Since $\hat{\theta}_3$ is a linear combination, we know that its variance is $4\,(\sigma^2/n)$,
where $\sigma^2$ is the variance of the $\mathrm{Unif}(0, \theta)$ distribution. And we know
that the variance of the $\mathrm{Unif}(0, \theta)$ distribution is $\theta^2/12$. Thus
$$\mathrm{Var}[\hat{\theta}_3] = \frac{\theta^2}{3n}.$$

To find the variance of $\hat{\theta}_2$ we first find
$$\mathbb{E}[Z^2] = \int_0^\theta z^2\, g(z)\, dz = \int_0^\theta z^2\, n\,\theta^{-n} z^{n-1}\, dz = \frac{n\theta^2}{n+2}.$$


Since $\mathrm{Var}[Z] = \mathbb{E}[Z^2] - (\mathbb{E}[Z])^2$, we have
$$\mathrm{Var}[Z] = \frac{n\theta^2}{n+2} - \Big(\frac{n\theta}{n+1}\Big)^2 = \frac{n\theta^2}{(n+2)(n+1)^2}.$$
Since $\hat{\theta}_2 = \frac{n+1}{n}\, Z$,
$$\mathrm{Var}[\hat{\theta}_2] = \Big(\frac{n+1}{n}\Big)^2 \frac{n\theta^2}{(n+2)(n+1)^2} = \frac{\theta^2}{n(n+2)}.$$
A little algebra shows that $n(n+2) > 3n$ for all $n > 1$, so $\hat{\theta}_2$ is better
than $\hat{\theta}_3$.
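A minimal simulation comparison (not from the notes; $\theta$ and $n$ are arbitrary) of the two estimators against the formulas $\theta^2/(n(n+2))$ and $\theta^2/(3n)$:

```python
import numpy as np

# Minimal sketch: simulated variances of theta_hat_2 and theta_hat_3 versus theory.
rng = np.random.default_rng(7)
theta, n, reps = 10.0, 8, 200_000

x = rng.uniform(0, theta, size=(reps, n))
theta2 = (n + 1) / n * x.max(axis=1)
theta3 = 2 * x.mean(axis=1)

print("Var[theta_hat_2]:", theta2.var(), "  theory:", theta**2 / (n * (n + 2)))
print("Var[theta_hat_3]:", theta3.var(), "  theory:", theta**2 / (3 * n))
```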

Variance of a Uniform RV
Let $X \sim \mathrm{Unif}(0, \theta)$. Then $\mathbb{E}[X] = \theta/2$. To find the variance of $X$ we first
calculate $\mathbb{E}[X^2]$, and then use $\mathrm{Var}[X] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$:
$$\mathbb{E}[X^2] = \int_0^\theta x^2 f(x)\, dx = \int_0^\theta x^2\, \frac{1}{\theta}\, dx = \frac{1}{\theta}\,\frac{x^3}{3}\Big|_0^\theta = \frac{\theta^2}{3},$$
so
$$\mathrm{Var}[X] = \frac{\theta^2}{3} - \Big(\frac{\theta}{2}\Big)^2 = \frac{\theta^2}{12}.$$
Quod erat demonstrandum.
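A quick numerical check of this result (not from the notes; $\theta$ is an arbitrary illustrative value):

```python
import numpy as np

# Minimal sketch: the variance of Unif(0, theta) draws is close to theta^2/12.
rng = np.random.default_rng(8)
theta = 6.0
x = rng.uniform(0, theta, size=1_000_000)
print("simulated Var[X]:", x.var(), "  theta^2/12:", theta**2 / 12)
```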
