
Statistics 512 Notes 1
D. Small
Reading: Section 5.1
Basic idea of statistical inference:
Statistical Experiment: Observe data X. The distribution of X is P. $P(X \in E)$ = "Probability X is in E".
Model: Family of possible P's, $\{P_\theta,\, \theta \in \Omega\}$.
We call $\theta$ a parameter of the distribution.
Examples:
[Diagram: a sample of data is drawn from a population, and inference about the population is made from the sample using statistical tools.]
1. Binomial model. Toss a coin n independent times, with P(Success) = p on each toss. X = # of successes. $\theta = p$, $\Omega = [0,1]$.
2. Normal location model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ independent and identically distributed (iid) with a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$:
$$f(x; \mu) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1}{2\sigma^2}(x - \mu)^2 \right\}$$
$\theta = \mu$, $\Omega = (-\infty, \infty)$.
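As a quick check on the formula (added here, not part of the original notes; the values of mu and sigma are illustrative), a minimal Python sketch that evaluates the density directly and compares it with scipy.stats.norm.pdf:

import numpy as np
from scipy.stats import norm

def normal_density(x, mu, sigma):
    # Density of N(mu, sigma^2), computed from the formula above
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

x = np.linspace(-3.0, 3.0, 7)
mu, sigma = 0.5, 1.2  # illustrative values, not from the notes
print(np.allclose(normal_density(x, mu, sigma), norm.pdf(x, loc=mu, scale=sigma)))  # True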
3. Normal model with unknown mean and variance. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid with a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. $\theta = (\mu, \sigma^2)$, $\Omega = (-\infty, \infty) \times (0, \infty)$.
4. Nonparametric model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ iid real valued. $\Omega = \{\text{all distributions on } \mathbb{R}\}$, $\theta$ = cdf of distribution of $X_i$.
5. Survey sampling. There is a finite population of units $1, \ldots, N$ that have variables $Y_1, \ldots, Y_N$ associated with them. We observe Y for n of the units $u_1, \ldots, u_n$, i.e., we observe $X_1 = Y_{u_1}, \ldots, X_n = Y_{u_n}$. $\theta = \{Y_1, \ldots, Y_N\}$.
We are usually interested in a particular function of $\theta$ such as the population mean,
$$\frac{Y_1 + \cdots + Y_N}{N}$$
Two methods of choosing the units:
(A) Sampling with replacement: $u_1, \ldots, u_n$ are iid from the uniform distribution on $\{1, 2, \ldots, N\}$.
(B) Sampling without replacement (simple random sample): Each unit will appear in the sample at most once. Each of the $\binom{N}{n}$ possible samples has the same probability.
If N is much greater than n, the two sampling methods are practically the same.
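A minimal Python sketch of the two schemes (added for illustration; N and n are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)
N, n = 1000, 10
population = np.arange(1, N + 1)  # unit labels 1, ..., N

# (A) Sampling with replacement: units drawn iid uniform on {1, ..., N}
with_replacement = rng.choice(population, size=n, replace=True)

# (B) Simple random sample: each of the C(N, n) subsets equally likely
without_replacement = rng.choice(population, size=n, replace=False)

print(with_replacement)
print(without_replacement)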
Statistical Inference: Statement about some aspect of $\theta$ based on a statistical experiment.
Note: We might not be interested in the entire $\theta$ but only some function of it, e.g., in Examples 3 and 4, we might only be interested in the mean of the distribution.
Types of Inferences we will study:
1. Point estimation: Give the best estimate of the function of $\theta$ we are interested in.
2. Interval estimation (confidence intervals): Give an interval (set) in which the function of $\theta$ lies, along with a statement about how certain we are that the function of $\theta$ lies in the interval.
3. Hypothesis testing: Choose between two hypotheses about $\theta$.
Point Estimation
The goal of point estimation is to provide the single best guess of some quantity of interest $g(\theta)$. $g(\theta)$ is a fixed unknown quantity.
A point estimator is any function of the data h(X). The point estimator depends on the data, so h(X) is a random variable.
Examples of point estimators:
Binomial model: X ~ Binomial(n, p), n known.
Point estimator for p: h(X) = X/n.
Notation: We sometimes denote a point estimator for a parameter by putting a hat on it, i.e., $\hat{p} = X/n$. Also, we sometimes add a subscript n to denote the sample size, $\hat{p}_n = X/n$.
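A minimal Python sketch of this estimator in action (added for illustration; the true values of n and p are arbitrary choices):

import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 0.3          # illustrative true values, not from the notes
X = rng.binomial(n, p)   # X = # of successes in n independent tosses
p_hat = X / n            # point estimator p-hat = X/n
print(p_hat)             # close to the true p = 0.3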
Normal model with unknown mean $\mu$ and known or unknown variance $\sigma^2$.
Point estimator for $\mu$:
$$\bar{X}_n = \frac{X_1 + \cdots + X_n}{n}$$
Sampling distribution: A point estimator h(X) is a function
of the sample so h(X) is a random variable. The
distribution of a point estimator h(X) for repeated samples
is called the sampling distribution of h(X).
Example: Normal location model. Observe $X = (X_1, \ldots, X_n)$, $X_i$ independent and identically distributed (iid) with a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$.
$$\bar{X}_n = \frac{X_1 + \cdots + X_n}{n}$$
Sampling distribution:
$$\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
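A minimal Python simulation of this sampling distribution (added for illustration; mu, sigma, and n are arbitrary choices):

import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n = 5.0, 2.0, 25   # illustrative values, not from the notes
reps = 100_000

# Draw `reps` samples of size n and compute the sample mean of each
xbars = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# Empirical mean and variance should match mu and sigma^2/n
print(xbars.mean(), xbars.var())  # approx. 5.0 and 4/25 = 0.16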
Properties of a point estimator:
1. Bias. The bias of an estimator of $g(\theta)$ is defined by
$$\text{bias}_\theta[h(X_1, \ldots, X_n)] = E_\theta[h(X_1, \ldots, X_n)] - g(\theta)$$
We say that $h(X_1, \ldots, X_n)$ is unbiased if
$$\text{bias}_\theta[h(X_1, \ldots, X_n)] = 0 \text{ for all } \theta$$
Here $E_\theta$ refers to the expectation with respect to the sampling distribution of the data, $f(x_1, \ldots, x_n; \theta)$. It does not mean we are averaging over a distribution for $\theta$.
An unbiased estimator is suitably centered.
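For instance (a quick check added here, not in the original notes), $\bar{X}_n$ is unbiased for $\mu$ whenever the $X_i$ are iid with mean $\mu$:
$$E_\mu[\bar{X}_n] = E_\mu\left[\frac{X_1 + \cdots + X_n}{n}\right] = \frac{1}{n}\sum_{i=1}^{n} E_\mu[X_i] = \frac{n\mu}{n} = \mu, \quad \text{so } \text{bias}_\mu[\bar{X}_n] = 0$$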
2. Consistency: A reasonable requirement for an estimator is that it should converge to the true parameter value as we collect more and more information.
A point estimator $h(X_1, \ldots, X_n)$ of a parameter $g(\theta)$ is consistent if $h(X_1, \ldots, X_n) \xrightarrow{P} g(\theta)$ for all $\theta$.
Recall the definition of convergence in probability (Section 4.2). $h(X_1, \ldots, X_n) \xrightarrow{P} g(\theta)$ means that for all $\epsilon > 0$,
$$\lim_{n \to \infty} P[\,|h(X_1, \ldots, X_n) - g(\theta)| \geq \epsilon\,] = 0$$
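As an example (added here; it follows from Chebyshev's inequality), $\bar{X}_n$ is consistent for $\mu$ whenever the $X_i$ are iid with mean $\mu$ and finite variance $\sigma^2$:
$$P(|\bar{X}_n - \mu| \geq \epsilon) \leq \frac{\text{Var}(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \to 0 \text{ as } n \to \infty$$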
3. Mean Square Error. A good estimator should on average be accurate. A measure of the accuracy of an estimator is the average squared error of the estimator:
$$\text{MSE}_\theta[h(X_1, \ldots, X_n)] = E_\theta[\{h(X_1, \ldots, X_n) - g(\theta)\}^2]$$
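A standard identity (added here; it follows by expanding the square) relates the MSE to the bias and the variance:
$$\text{MSE}_\theta[h] = \text{Var}_\theta(h) + \left(\text{bias}_\theta[h]\right)^2$$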
Example: Suppose that an iid sample $X_1, \ldots, X_n$ is drawn from the uniform distribution on $[0, \theta]$ where $\theta$ is an unknown parameter and the distribution of $X_i$ is
$$f_X(x; \theta) = \begin{cases} \frac{1}{\theta} & 0 < x < \theta \\ 0 & \text{elsewhere} \end{cases}$$
Consider the following estimator of $\theta$:
$$W = h(X_1, \ldots, X_n) = \max_i X_i$$
Sampling distribution of W:
If $w < 0$, $P(W \leq w) = 0$. If $0 \leq w < \theta$,
$$P(W \leq w) = P(X_1 \leq w, \ldots, X_n \leq w) = [P(X_1 \leq w)]^n = \left(\frac{w}{\theta}\right)^n$$
If $w \geq \theta$, $P(W \leq w) = 1$.
Thus,
$$F_W(w) = \begin{cases} 0 & w < 0 \\ \left(\frac{w}{\theta}\right)^n & 0 \leq w < \theta \\ 1 & w \geq \theta \end{cases}$$
and
$$f_W(w) = \begin{cases} \frac{n w^{n-1}}{\theta^n} & 0 \leq w \leq \theta \\ 0 & \text{elsewhere} \end{cases}$$
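A minimal Python simulation checking this cdf (added for illustration; theta and n are arbitrary choices):

import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 2.0, 5, 200_000  # illustrative values, not from the notes

# Simulate W = max of n iid Uniform(0, theta) draws, many times over
W = rng.uniform(0, theta, size=(reps, n)).max(axis=1)

# Empirical P(W <= w) should match (w/theta)^n for 0 <= w < theta
for w in (0.5, 1.0, 1.5):
    print(w, (W <= w).mean(), (w / theta) ** n)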
Bias:
$$E_\theta[W] = \int_0^\theta w f_W(w)\, dw = \int_0^\theta w \frac{n w^{n-1}}{\theta^n}\, dw = \frac{n w^{n+1}}{(n+1)\theta^n} \Big|_0^\theta = \frac{n\theta}{n+1}$$
$$\text{Bias} = E_\theta[W] - \theta = \frac{n\theta}{n+1} - \theta = -\frac{\theta}{n+1}$$
There is a bias in W but it might still be consistent.
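A quick simulation check of the bias calculation (added for illustration; theta and n are arbitrary choices):

import numpy as np

rng = np.random.default_rng(4)
theta, n, reps = 2.0, 5, 200_000  # illustrative values, not from the notes

W = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
print(W.mean(), n * theta / (n + 1))  # both approx. 5*2/6 = 1.667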
Consistency:
Let $W_n$ denote W for a sample of size n.
For any $\epsilon > 0$ (take $\epsilon < \theta$; for $\epsilon \geq \theta$ the probability below is already 1),
$$P(|W_n - \theta| < \epsilon) = P(\theta - \epsilon < W_n < \theta + \epsilon) = \int_{\theta - \epsilon}^{\theta} \frac{n w^{n-1}}{\theta^n}\, dw = 1 - \left(\frac{\theta - \epsilon}{\theta}\right)^n$$
Note that for any $\epsilon > 0$, it is possible to find an n making $[(\theta - \epsilon)/\theta]^n$ as small as desired. Thus,
$$\lim_{n \to \infty} P(|W_n - \theta| < \epsilon) = 1$$
and $W_n$ is consistent.
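A minimal Python simulation of this convergence (added for illustration; theta and epsilon are arbitrary choices):

import numpy as np

rng = np.random.default_rng(5)
theta, eps, reps = 2.0, 0.1, 10_000  # illustrative values, not from the notes

# P(|W_n - theta| < eps) should approach 1 as n grows
for n in (5, 20, 100, 500):
    W = rng.uniform(0, theta, size=(reps, n)).max(axis=1)
    print(n, (np.abs(W - theta) < eps).mean(), 1 - ((theta - eps) / theta) ** n)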