
Hypothesis Testing - Central Limit Theorem

A hypothesis test allows us to draw conclusions or make decisions regarding
population data from sample data. The logic behind hypothesis tests is as
follows:

• Assume a population distribution with a specified population mean.

• State the hypothesized population mean (null hypothesis).

• Draw a random sample from the population and calculate the sample
mean.

• Determine the “relative position” of the calculated mean on the
distribution of sample means. If the sample mean is “close” to the
specified population mean, we do not have evidence to reject the
hypothesized population mean.

• If the calculated sample mean is “not close” to the specified population
mean, we conclude that our sample is unlikely to have been drawn from
the hypothesized distribution, and thus we reject the null hypothesis.
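
The steps above can be sketched in Python as a two-sided z-test of a sample mean. This is only an illustration under stated assumptions: the population standard deviation sigma is treated as known, and the function name, the data, and the values mu0 = 100 and sigma = 9 are hypothetical, not taken from the text.

```python
import math
from statistics import NormalDist, mean

def z_test_mean(sample, mu0, sigma):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    n = len(sample)
    sample_mean = mean(sample)
    # Standard deviation of the distribution of sample means.
    standard_error = sigma / math.sqrt(n)
    # "Relative position" of the sample mean, in standard-error units.
    z = (sample_mean - mu0) / standard_error
    # Probability, under H0, of a sample mean at least this far from mu0.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample of 9 observations; H0 says the population mean is 100.
sample = [104, 110, 98, 112, 106, 101, 109, 108, 106]
z, p_value = z_test_mean(sample, mu0=100, sigma=9)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value corresponds to a sample mean that is "not close" to the hypothesized mean; a large one means the sample gives no evidence against it.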

When testing a statistical hypothesis, we generally take the mean of a
sample of a certain size from a population, and assume that this mean is
a value from a normal distribution. The Central Limit Theorem tells us that
this assumption is approximately correct for large samples, and tells us the
standard deviation to use.

The Central Limit Theorem states that when random samples of size N are
repeatedly drawn from a population with mean μ and finite standard
deviation σ, the distribution of the sample means becomes approximately
normal, with mean μ and standard deviation σ / √N, as the sample size N
becomes larger, irrespective of the shape of the population distribution.
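
A small simulation can make this statement concrete. The sketch below assumes a deliberately skewed population (exponential with rate 1, so μ = 1 and σ = 1), uses a finite number of repeated samples in place of the idealized infinite sequence, and compares the spread of the sample means with σ / √N. The function name, seed, and sizes are illustrative choices.

```python
import math
import random
from statistics import mean, stdev

def simulate_sample_means(sample_size, num_samples, seed=0):
    """Draw repeated samples from a skewed population and return their means."""
    rng = random.Random(seed)
    # Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
    return [
        mean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

sample_size = 50
sample_means = simulate_sample_means(sample_size, num_samples=5000)

observed_mean = mean(sample_means)
observed_sd = stdev(sample_means)
predicted_sd = 1.0 / math.sqrt(sample_size)  # sigma / sqrt(N) with sigma = 1

print(f"mean of sample means: {observed_mean:.3f}")
print(f"sd of sample means:   {observed_sd:.3f}")
print(f"sigma / sqrt(N):      {predicted_sd:.3f}")
```

Even though the population itself is far from normal, the mean of the sample means should land near μ and their standard deviation near σ / √N.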

Say you have a set of observations O and a null hypothesis H0. We want to
calculate P(O | H0), i.e., the probability that we would observe what we did
given that the null hypothesis is true. If that probability is sufficiently
small, we are confident in concluding that the null hypothesis is false.

We can use whatever level of confidence we want before rejecting the null
hypothesis, but most people choose 90%, 95%, or 99%. For example, if we
choose a 95% confidence level, we reject the null hypothesis if
P(O | H0) <= 1 - 0.95 = 0.05
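
As a minimal sketch, the rejection rule can be written as a one-line comparison. The function name and the p-value of 0.03 are hypothetical; the threshold is simply 1 minus the chosen confidence level.

```python
def reject_null(p_value, confidence_level=0.95):
    """Return True when P(O | H0) is at or below 1 - confidence_level."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    significance = 1.0 - confidence_level
    return p_value <= significance

# Hypothetical p-value, checked against the three common confidence levels.
p_value = 0.03
decisions = {level: reject_null(p_value, level) for level in (0.90, 0.95, 0.99)}
print(decisions)
```

The same observations can lead to rejection at one confidence level and not at a stricter one, which is why the level should be chosen before looking at the data.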

The Central Limit Theorem is the main piece of math here. Briefly, the
Central Limit Theorem says that the average of a large number of
independent, identically distributed random variables approximates a
normal distribution.

Remember our random variables from before? If we let p = Y/N, then p is
the proportion of successes (Y successes out of N). But by the Central Limit
Theorem we also know that p approximates a normal distribution. This means
we can estimate the standard deviation of p as √(p(1 - p) / N).
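
A minimal Python sketch of this estimate, assuming the usual binomial formula √(p(1 - p) / N) for the standard deviation of a sample proportion. The function name and the counts (40 successes out of 100) are hypothetical.

```python
import math

def proportion_standard_error(successes, trials):
    """Return (p, estimated standard deviation of p) for p = successes / trials."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    # Assumed formula: sqrt(p * (1 - p) / N), the binomial standard error.
    return p, math.sqrt(p * (1.0 - p) / trials)

# Hypothetical counts: Y = 40 successes out of N = 100.
p_hat, standard_error = proportion_standard_error(40, 100)
print(f"p = {p_hat:.2f}, estimated sd of p = {standard_error:.4f}")
```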
