Recommended Prior Knowledge Students must have studied S1 and Section 1 of Statistics 2 before starting this Section.
Context Sections 1 and 2 must be studied in order, since the ideas in Section 1 are required in Section 2.
Outline This Section introduces the ideas behind sampling and estimation. The idea of a sample mean and how sample means are normally distributed are considered. The Section continues with the central limit theorem and moves on to unbiased estimates and confidence intervals. The Section concludes with an introduction to hypothesis testing and the terminology associated with it. The study looks at hypothesis testing with sample means and with a single observation taken from the Poisson or binomial distribution. It concludes with a study of Type 1 and Type 2 errors.
Topic
Learning Outcomes
Suggested Teaching activities
Resources
On-Line Resources
4 Sampling and estimation
Understand the distinction between a sample and a population, and appreciate the necessity for randomness in choosing samples.
Discuss the distinction between a sample and a population. Discuss, in general terms, the ideas behind sampling and the reasons for taking samples. Talk about the idea of random samples and, in particular, the use of random numbers produced on a calculator or taken from random number tables. Discuss the differences between large and small samples, including the impracticality of taking a large sample when items such as a car or a plane need to be tested to destruction. Discuss how a large sample, where practical, gives greater accuracy in estimating statistics of the parent population, for example when forecasting election results.
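The use of random numbers to choose a sample can be illustrated with a short program. The following is a minimal sketch only: the population of 500 numbered items and the function name `draw_random_sample` are hypothetical, chosen for illustration, and Python's standard `random` module stands in for the calculator or random number tables.

```python
import random

def draw_random_sample(population, sample_size, seed=None):
    """Return a simple random sample (without replacement) from a list."""
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    return rng.sample(population, sample_size)

# Hypothetical sampling frame: items numbered 1 to 500.
items = list(range(1, 501))
sample = draw_random_sample(items, 10, seed=1)
print(sample)
```

Every item has the same chance of selection, which is the property that a telephone directory or other convenience list cannot guarantee.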
Explain in simple terms why a given sampling method may be unsatisfactory (knowledge of particular sampling methods, such as quota or stratified sampling, is not required, but candidates should have an elementary understanding of the use of random numbers in producing random samples).
Recognise that a sample mean can be regarded as a random variable, and use the fact that $\mathrm{E}(\bar{X}) = \mu$ and that $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$. Use the fact that $\bar{X}$ has a normal distribution if $X$ has a normal distribution.
Use the central limit theorem where appropriate.
Students are not required to have knowledge of specific sampling methods, such as quota or stratified sampling, but a general discussion of these and other methods is worthwhile if time allows. Similarly, students should realise why particular sampling methods (such as using a telephone directory or a database of car owners) can lead to bias.
Students should recognise that the mean of a sample (usually referred to as $\bar{X}$) is itself a random variable and that $\mathrm{E}(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$. Students should be able to prove these results by referring back to the work on linear combinations of random variables. By taking $\bar{X}$ as $\dfrac{X_1 + X_2 + \dots + X_n}{n}$, the results should follow.
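The proof can be set out as follows. This is a sketch that assumes the observations $X_1, X_2, \dots, X_n$ are independent, each with mean $\mu$ and variance $\sigma^2$; independence is needed for the variance step.

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\left(X_1 + X_2 + \dots + X_n\right) \\
\mathrm{E}(\bar{X}) &= \frac{1}{n}\left(\mathrm{E}(X_1) + \mathrm{E}(X_2) + \dots + \mathrm{E}(X_n)\right)
  = \frac{1}{n}\,(n\mu) = \mu \\
\mathrm{Var}(\bar{X}) &= \frac{1}{n^2}\left(\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \dots + \mathrm{Var}(X_n)\right)
  = \frac{1}{n^2}\,(n\sigma^2) = \frac{\sigma^2}{n}
\end{align*}
```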
www.mathsrevision.net A-Level Section Statistics Central Limit Theorem
Calculate unbiased estimates of the population mean and variance from a sample, using either raw or summarised data (only a simple understanding of the term unbiased is required).
Discuss with the students the advantage of taking the mean of a sample of readings and using the sample mean as a more accurate statistic in hypothesis testing. This will prepare the ground for future work on hypothesis testing. If time allows, students can take groups of 9 single-digit random numbers (generated on a calculator or by random number tables) and evaluate the mean of each group of 9 numbers. If a large number of these sample means is available, students can represent the results in a frequency table and hence plot a frequency curve (if sample means are recorded to the nearest integer) or a histogram (if sample means are recorded to 3 significant figures). The distribution can now be recognised as being approximately normal. This is a good introduction from which to test $\mathrm{E}(\bar{X}) = \mu$ and $\mathrm{Var}(\bar{X}) = \dfrac{\sigma^2}{n}$. It can also lead to the students' recognition that the sample means themselves are approximately normally distributed, and so to an introduction to the application of the central limit theorem. This is a good opportunity to test the central limit theorem with different distributions (discrete probability distributions, the binomial distribution, the Poisson distribution, the normal distribution and other continuous distributions).
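Where a computer is available, the classroom experiment with groups of 9 random digits can be simulated. The sketch below is illustrative only: the function name `simulate_sample_means`, the choice of 5000 samples and the seed are assumptions, not part of the scheme of work.

```python
import random
import statistics

def simulate_sample_means(num_samples, sample_size=9, seed=None):
    """Return the means of num_samples groups of random single digits (0-9)."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        digits = [rng.randint(0, 9) for _ in range(sample_size)]
        means.append(statistics.mean(digits))
    return means

means = simulate_sample_means(5000, sample_size=9, seed=2)

# A single digit 0-9 has mean 4.5 and variance (10**2 - 1) / 12 = 8.25,
# so theory predicts E(X-bar) = 4.5 and Var(X-bar) = 8.25 / 9.
print("mean of sample means:", statistics.mean(means))
print("variance of sample means:", statistics.pvariance(means))
print("theoretical variance:", 8.25 / 9)
```

The simulated mean and variance should be close to, but not exactly equal to, the theoretical values; a histogram of `means` would show the approximately normal shape.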
Introduce the idea of an unbiased estimate (a difficult concept to understand fully) and calculate unbiased estimates of the population mean and variance from a sample of raw data. In particular, students should be aware of the need to use (n − 1) rather than n in the formula for the unbiased estimate of the population variance.
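The calculation from raw or summarised data can be sketched as below. The function names and the eight data values are hypothetical; the formula used for the variance is the standard one, $s^2 = \dfrac{1}{n-1}\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)$.

```python
def unbiased_estimates(n, sum_x, sum_x2):
    """Unbiased estimates of the population mean and variance
    from summarised data: n, sum of x, and sum of x squared."""
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)  # divisor n - 1, not n
    return mean, variance

def unbiased_estimates_raw(data):
    """Same estimates, starting from a list of raw observations."""
    return unbiased_estimates(len(data), sum(data), sum(x * x for x in data))

# Hypothetical sample of 8 readings: sum = 40, sum of squares = 232.
readings = [2, 4, 4, 4, 5, 5, 7, 9]
est_mean, est_var = unbiased_estimates_raw(readings)
print(est_mean, est_var)  # mean 5.0, variance 32/7
```

Dividing by n instead would give 32/8 = 4, which underestimates the population variance on average.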
Determine a confidence interval for a population mean in cases where the population is normally distributed with known variance or where a large sample is used.
Determine, from a large sample, an approximate confidence interval for a population proportion.
Look at the idea of confidence intervals (90%, 95% and 99% in particular) for a population mean, in the case where the population is normally distributed with known variance or where a large sample is used, and for a population proportion estimated from a large sample.
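Both types of interval can be sketched in a few lines. The function names and the numerical inputs below are hypothetical; the formulas assumed are the standard ones, $\bar{x} \pm z\,\sigma/\sqrt{n}$ for a mean with known variance and $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$ for a proportion from a large sample.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, level=0.95):
    """Confidence interval for a population mean with known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

def proportion_confidence_interval(successes, n, level=0.95):
    """Approximate confidence interval for a proportion (large n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical data: sample mean 50 from n = 100 with sigma = 10,
# and 60 successes observed in a sample of 200.
mean_ci = mean_confidence_interval(50, 10, 100, level=0.95)
prop_ci = proportion_confidence_interval(60, 200, level=0.95)
print(mean_ci)  # roughly (48.04, 51.96)
print(prop_ci)  # roughly (0.236, 0.364)
```

Changing `level` to 0.90 or 0.99 shows students how the interval narrows or widens with the confidence level.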
5 Hypothesis tests Understand the nature of a hypothesis test, the difference between one-tail and two-tail tests, and the terms null hypothesis, alternative hypothesis, significance level, rejection region (or critical region), acceptance region and test statistic.
Formulate hypotheses and carry out a hypothesis test in the context of a single observation from a population which has a binomial or Poisson distribution, using either direct evaluation of probabilities or a normal approximation, as appropriate.
Discuss general ideas of hypothesis testing as used in quality control in the manufacturing industry, or in situations such as election surveys. Students must acquire a full understanding of the idea of a null ($H_0$) and an alternative ($H_1$) hypothesis and, in particular, of the difference between one-tail and two-tail tests. Students should be encouraged to discuss the difference between such hypotheses as x ≠ 60 (a two-tail test) and x > 60 (a one-tail test). Understanding of these principles comes more easily from a simple worked example.
A typical example might be quality control of light bulbs produced on a machine. Results, over a long period of time, show that the lifetime (L) of such bulbs is normally distributed with a mean of 200 hours and a standard deviation of 20 hours. It is suspected that a fault has developed in the process and that bulbs are being produced with a shorter lifetime. One such bulb has L = 170. Is this sufficient evidence, at the 5% significance level, that the machine is producing bulbs with a shorter lifetime than previously?
Have ready a number of tests that require the student to make a decision on whether the test is one-tail or two-tail.
Have ready an OHP showing the full solution of at least 2 questions (one using a sample mean, and the other a Poisson or binomial distribution), using all the terminology mentioned in the teaching activity.
www.mathsrevision.net A-Level Section Statistics Hypothesis Testing One and Two Tailed Tests
Formulate hypotheses and carry out a hypothesis test concerning the population mean in cases where the population is normally distributed with known variance or where a large sample is used.
Understand the terms Type 1 error and Type 2 error in relation to hypothesis tests.
Calculate the probabilities of making Type 1 and Type 2 errors in specific situations involving tests based on a normal distribution or direct evaluation of binomial or Poisson probabilities.
The students should recognise the following terms:
$H_0$: The machine is functioning as previously, with the mean value of L = 200.
$H_1$: The machine is producing bulbs for which the mean value of L < 200.
Test statistic: L = 170.
Significance level of the test: 5%.
Type of test: One-tail.
Rejection (or critical) region: Values of L that lie in the bottom 5% of the normal distribution.
Calculation: The limit for L is given by z = −1.64, from which the corresponding value L = 200 − 1.64 × 20 = 167.2 is obtained. The value L = 170 lies within the acceptance region. The result is not significant. Therefore $H_0$ is accepted and $H_1$ is rejected.
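The light bulb calculation can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist`; the variable names are illustrative. Note that the software uses the more precise value z = −1.645, giving a critical value of about 167.1 rather than the 167.2 obtained from z = −1.64; the conclusion is the same.

```python
from statistics import NormalDist

# Distribution of L under the null hypothesis: mean 200, sd 20.
null_dist = NormalDist(mu=200, sigma=20)

# One-tail test at the 5% level: reject H0 for small values of L.
critical_value = null_dist.inv_cdf(0.05)   # about 167.1
observed = 170

reject_h0 = observed < critical_value
p_value = null_dist.cdf(observed)          # P(L <= 170) under H0

print("critical value:", round(critical_value, 1))
print("reject H0:", reject_h0)
print("P(L <= 170):", round(p_value, 4))
```

The probability of a value as low as 170 is about 0.067, which exceeds 0.05, so the result is not significant.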
It is worth repeating the above exercise for similar questions using examples that involve the binomial and Poisson distributions, using either direct evaluation of probabilities or a normal approximation as appropriate.
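Direct evaluation of binomial and Poisson probabilities can be sketched as follows. Both scenarios are hypothetical examples invented for illustration: a single observation of 5 from B(20, p) testing $H_0$: p = 0.5 against $H_1$: p < 0.5, and a single observation of 7 from a Poisson distribution testing $H_0$: mean = 3 against $H_1$: mean > 3, each at the 5% level.

```python
from math import comb, exp, factorial

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), by direct summation."""
    return sum(comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(k + 1))

def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Po(mean), by direct summation."""
    return sum(exp(-mean) * mean ** r / factorial(r) for r in range(k + 1))

# Binomial example: lower-tail probability of an observation of 5 or fewer.
binomial_tail = binomial_cdf(5, 20, 0.5)    # about 0.0207
# Poisson example: upper-tail probability of an observation of 7 or more.
poisson_tail = 1 - poisson_cdf(6, 3)        # about 0.0335

print("P(X <= 5 | B(20, 0.5)):", round(binomial_tail, 4))
print("P(X >= 7 | Po(3)):", round(poisson_tail, 4))
print("significant at 5%:", binomial_tail < 0.05, poisson_tail < 0.05)
```

In both illustrative cases the tail probability is below 0.05, so the observation lies in the critical region and $H_0$ is rejected.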
It is good practice for the students to list the various terms for each example as in the solution to the question above.
Repeat the exercise with a population mean in cases where the population is normally distributed with known variance or where a large sample is used.
Discuss with students the meaning of Type 1 and Type 2 errors. Students should realise that a Type 1 error occurs when the null hypothesis is rejected when it is in fact true. The probability of a Type 1 error is therefore the same as the probability of a reading lying in the critical region, i.e. the significance level of the test.
A Type 2 error is made when the null hypothesis is accepted when in fact an alternative hypothesis is true. The probability of making a Type 2 error depends on the alternative hypothesis. For example, in the test above, if the information was subsequently given that the lifetime of light bulbs (L) is normally distributed with mean 150 hours (and standard deviation 20 hours), then a Type 2 error will be made whenever a value of L greater than 167.2 is obtained, where L ~ N(150, 400). Again, students need considerable practice in working with Type 1 and Type 2 errors.
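The two error probabilities for the light bulb test can be evaluated as a check. This is a sketch using the critical value 167.2 from the worked example and Python's standard `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

critical_value = 167.2                     # H0 is rejected when L < 167.2

null_dist = NormalDist(mu=200, sigma=20)   # L ~ N(200, 400) under H0
alt_dist = NormalDist(mu=150, sigma=20)    # L ~ N(150, 400) under the alternative

# Type 1 error: H0 is rejected although it is true.
p_type1 = null_dist.cdf(critical_value)

# Type 2 error: H0 is accepted although the mean is really 150.
p_type2 = 1 - alt_dist.cdf(critical_value)

print("P(Type 1 error):", round(p_type1, 4))   # close to the 5% level
print("P(Type 2 error):", round(p_type2, 4))   # about 0.195
```

The Type 1 probability comes out slightly above 0.05 only because the critical value was rounded using z = −1.64.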
Have ready worked examples illustrating the use of both Type 1 and Type 2 errors.