#5.48 ACT scores of high school seniors.

The scores of high school seniors on the ACT college entrance examination in 2003 had mean µ=20.8 and SD σ=4.8. The distribution of scores is only roughly Normal.

(a)    What is the approximate probability that a single student randomly chosen from all those taking the test scores 23 or higher?
(b)    Now take an SRS of 25 students who took the test. What are the mean and standard deviation of the sample mean score x-bar of these 25 students?
(c)    What is the approximate probability that the mean score of these students is 23 or higher?
(d)    Which of your two Normal probability calculations in (a) and (c) is more accurate? Why?

Solutions: Given that µ=20.8 and SD σ=4.8

(a)
Required Prob: P( X >= 23 )

P( X >= 23 ) = 1 - P( X < 23 )

z = (23 - 20.8) / 4.8 = 0.4583

P( X < 23 ) = 0.6766 (using Excel's NORMDIST() function)

P( X >= 23 ) = 1 - P( X < 23 ) = 1 - 0.6766 = 0.3234

(b)
By the properties of means and variances of random variables, the mean and variance of the sample mean x-bar are the following:

Mean of x-bar = µ
Variance of x-bar = σ² / n, so SD of x-bar = σ / sqrt(n)

Note: Here x-bar is also referred to as M

Now, when n=25

Expected value of M = 20.8

Standard deviation of M = σ / sqrt(n) = 4.8 / sqrt(25) = 4.8 / 5 = 0.96

(c)

Required Prob: P( sample mean (M) >= 23 )

P( M >= 23 ) = 1 - P( M < 23 )

z = (23 - 20.8) / 0.96 = 2.2917

P( M < 23 ) = 0.9890 (using Excel's NORMDIST() function)

P( M >= 23 ) = 1 - P( M < 23 ) = 1 - 0.9890 = 0.0110
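
The calculations in (a) to (c) can be checked without a spreadsheet. The following is a minimal sketch that assumes Python 3.8 or later and uses the standard library's NormalDist in place of Excel's NORMDIST(); the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 20.8, 4.8   # population mean and SD given in the exercise
n = 25                  # size of the SRS in parts (b) and (c)

# (a) One student: X is treated as approximately Normal(mu, sigma).
p_single = 1 - NormalDist(mu, sigma).cdf(23)

# (b) Sample mean M of n students: mean mu, SD sigma / sqrt(n).
sd_mean = sigma / sqrt(n)

# (c) M is treated as approximately Normal(mu, sd_mean).
p_mean = 1 - NormalDist(mu, sd_mean).cdf(23)

print(f"(a) P(X >= 23) = {p_single:.4f}")
print(f"(b) SD of M    = {sd_mean:.2f}")
print(f"(c) P(M >= 23) = {p_mean:.4f}")
```

The only difference between (a) and (c) is the SD passed to the Normal distribution: 4.8 for one student and 0.96 for the mean of 25 students.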

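(d)

The Normal probability calculation in (c) is more accurate. The distribution of individual scores is only roughly Normal, so (a) is a rough approximation. By the central limit theorem, the distribution of the sample mean of 25 scores is closer to Normal than the distribution of individual scores, so the Normal calculation in (c) is more reliable.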


#5.52 A Lottery payoff. A $1 bet in a state lottery’s Pick 3 game pays $500 if the three-digit number you choose exactly matches
the winning number, which is drawn at random. Here is the distribution of the payoff X:
Payoff X       $0      $500
Probability    0.999   0.001

Each day’s drawing is independent of other drawings.

(a)    What are the mean and SD of X?
(b)    Joe buys a Pick 3 ticket twice a week. What does the law of large numbers say about the average payoff Joe receives from his bets?
(c)    What does the central limit theorem say about the distribution of Joe’s average payoff after 104 bets in a year?
(d)    Joe comes out ahead for the year if his average payoff is greater than $1 (the amount he spent each day on a ticket).
What is the probability that Joe ends the year ahead?

Solutions:
(a)
x       $0      $500
P(x)    0.999   0.001
Note: Strictly, a losing bet nets -$1 rather than $0, but the payoff distribution is used as given to avoid confusion.

mean = Σ x*p(x) = 0*0.999 + 500*0.001 = 0.5

SD = sqrt( V(x) )

V(x) = Σ x*x*p(x) - (mean*mean) = 0*0*0.999 + 500*500*0.001 - 0.5*0.5 = 249.75

SD = sqrt( 249.75 ) = 15.803

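(b)
By the law of large numbers, as Joe makes more and more bets, the average payoff he receives will get closer and closer to the population mean payoff, µ = $0.50.

(c)
The central limit theorem (CLT) says that the mean of a sufficiently large number of independent random variables, each with finite mean and variance, is approximately Normally distributed.
So Joe's average payoff after 104 bets is approximately Normal with mean 0.5 and SD σ / sqrt(n) = 15.803 / sqrt(104) = 1.5496.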

(d)
Required prob: P( x-bar > 1 ), where x-bar is Joe's average payoff

As here, n = 104
Mean of sample mean = 0.5
SD of sample mean = σ / sqrt(n) = 15.803 / sqrt(104) = 1.5496
Therefore x-bar approximately follows a Normal distribution with mean = 0.5 and SD = 1.5496

z = (1 - 0.5) / 1.5496 = 0.32

P( x-bar > 1 ) = 1 - P( x-bar <= 1 )
= 1 - 0.63
= 0.37
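
The whole lottery calculation can be reproduced in a few lines. This is a minimal sketch that assumes Python 3.8 or later; the variable names are illustrative. The last step is an addition that is not part of the solution above: because a single $500 win already pushes the 104-bet average above $1, Joe ends the year ahead exactly when he wins at least once, so the exact probability can be computed directly for comparison with the Normal approximation.

```python
from math import sqrt
from statistics import NormalDist

payoffs = [0, 500]        # payoff values from the exercise
probs = [0.999, 0.001]    # their probabilities
n_bets = 104              # two bets a week for a year

# (a) Mean and SD of a single payoff.
mean = sum(x * p for x, p in zip(payoffs, probs))
var = sum(x * x * p for x, p in zip(payoffs, probs)) - mean ** 2
sd = sqrt(var)

# (c) SD of the average payoff over n_bets independent bets.
sd_avg = sd / sqrt(n_bets)

# (d) Normal (CLT) approximation to P(average payoff > 1).
p_ahead_normal = 1 - NormalDist(mean, sd_avg).cdf(1)

# Addition for comparison: exact probability of at least one win.
p_ahead_exact = 1 - probs[0] ** n_bets

print(f"mean = {mean:.2f}, SD = {sd:.3f}, SD of average = {sd_avg:.4f}")
print(f"Normal approximation: {p_ahead_normal:.2f}")
print(f"Exact (at least one win): {p_ahead_exact:.3f}")
```

The exact value comes out much smaller than the Normal approximation. The payoff distribution is extremely skewed, so 104 bets are too few for the central limit theorem approximation to be accurate here.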


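#5.60 Advertisements and brand image. Many companies place advertisements to improve the image of their brand rather than to promote specific products. In a randomized comparative experiment, business students read ads that cited either the Wall Street Journal or the National Enquirer for important facts about a fictitious company. The students then rated the trustworthiness of the source on a 7-point scale. Suppose that in the population of all students, scores for the Journal have mean 4.8 and SD 1.5, while scores for the Enquirer have mean 2.4 and SD 1.6.
(a)    There are 28 students in each group. Although individual scores are discrete, the mean score for a group of 28 will be close to Normal. Why?
(b)    What are the means and SD of the sample mean scores y-bar for the Journal group and x-bar for the Enquirer group?
(c)    We can take all 56 scores to be independent because students are not told each other’s scores. What is the distribution of the difference y-bar - x-bar between the mean scores in the two groups?
(d)    Find P( y-bar - x-bar ≥ 1 ).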

Solutions:
(a)

The central limit theorem (CLT) says that the mean of a sufficiently large number of independent random variables, each with finite mean and variance, is approximately Normally distributed.
So, by the CLT, the mean score for a group of 28 students will be close to Normal, even though the individual scores are discrete.

(b)

Given that the Journal scores have mean 4.8 and SD 1.5, and the Enquirer scores have mean 2.4 and SD 1.6

Here the sample size is n=28

By the properties of means and variances of random variables, the sample mean has mean µ and SD σ / sqrt(n).

Now, for the Journal group when n=28

Expected value of y-bar = 4.8

Standard deviation of y-bar = σ / sqrt(n) = 1.5 / sqrt(28) = 1.5 / 5.2915 = 0.283

Now, for the Enquirer group when n=28

Expected value of x-bar = 2.4

Standard deviation of x-bar = σ / sqrt(n) = 1.6 / sqrt(28) = 1.6 / 5.2915 = 0.302

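(c)

The distribution of y-bar - x-bar is approximately Normal, by the central limit theorem and the properties of the Normal distribution.

From the CLT, we already know that y-bar and x-bar are each approximately Normal. From the properties of the Normal distribution,
if x1 and x2 are independent and each follows a Normal distribution, then x1+x2 and x1-x2 also follow Normal distributions.
So y-bar - x-bar is approximately Normal with mean 4.8 - 2.4 = 2.4 and SD sqrt(1.5² / 28 + 1.6² / 28) = 0.4145.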

(d)
For the required probability, we first need the mean and SD of y-bar - x-bar

Using (b) and (c), y-bar has mean = 4.8 and var = 1.5² / 28 = 0.0804 (the square of its SD, 0.283)
x-bar has mean = 2.4 and var = 1.6² / 28 = 0.0914 (the square of its SD, 0.302)

y-bar - x-bar follows a Normal distribution with mean = 4.8 - 2.4 = 2.4 and var = 0.0804 + 0.0914 = 0.1718
SD = sqrt(0.1718) = 0.4145

z = (1 - 2.4) / 0.4145 = -3.38

P( y-bar - x-bar ≥ 1 ) = 1 - P( y-bar - x-bar < 1 )
= 1 - 0.0004
= 0.9996
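
The steps in (b) to (d) can be sketched in code as follows. This is a minimal illustration that assumes Python 3.8 or later; the variable names are hypothetical. Variances of independent sample means add, which is why the SD of the difference is the square root of the sum of the squared standard errors.

```python
from math import sqrt
from statistics import NormalDist

n = 28                      # students in each group
mu_y, sigma_y = 4.8, 1.5    # Journal group (y-bar)
mu_x, sigma_x = 2.4, 1.6    # Enquirer group (x-bar)

# (b) SD of each sample mean.
sd_ybar = sigma_y / sqrt(n)
sd_xbar = sigma_x / sqrt(n)

# (c) Difference of independent, approximately Normal sample means:
# the means subtract and the variances add.
diff_mean = mu_y - mu_x
diff_sd = sqrt(sd_ybar ** 2 + sd_xbar ** 2)

# (d) P(y-bar - x-bar >= 1).
p_at_least_1 = 1 - NormalDist(diff_mean, diff_sd).cdf(1)

print(f"SD of y-bar = {sd_ybar:.3f}, SD of x-bar = {sd_xbar:.3f}")
print(f"difference: mean = {diff_mean:.1f}, SD = {diff_sd:.4f}")
print(f"P(y-bar - x-bar >= 1) = {p_at_least_1:.4f}")
```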

#6.18 Mean OC in young women. Refer to the previous exercises. A biomarker for bone formation measured
in the same study was osteocalcin (OC), measured in the blood. The units are nanograms per milliliter (ng/ml).
For the 31 subjects in the study the mean was 33.4 ng/ml. Assume that the SD is known to be 19.6 ng/ml. Report the 95% confidence interval.

Confidence Interval Estimate for the Mean

Data
Population Standard Deviation 19.6
Sample Mean 33.4
Sample Size 31
Confidence Level 95%

Intermediate Calculations
Standard Error of the Mean 3.5203 (= 19.6 / sqrt(31))
Z Value -1.9600 (lower-tail critical value; z* = 1.96)
Interval Half Width 6.8996 (= 1.96 × 3.5203)

Confidence Interval
Interval Lower Limit 26.5004
Interval Upper Limit 40.2996

The 95% confidence interval for the mean OC is 33.4 ± 6.90, that is, 26.50 to 40.30 ng/ml.
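
The spreadsheet template above can be mirrored by a small function. This is a minimal sketch assuming Python 3.8 or later; z_interval is a hypothetical helper name, not something from the original workbook.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, level):
    """Confidence interval for a mean when the population SD is known."""
    # Upper critical value z*, e.g. about 1.96 for a 95% level.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z_star * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from exercise 6.18.
low, high = z_interval(xbar=33.4, sigma=19.6, n=31, level=0.95)
print(f"95% CI: {low:.4f} to {high:.4f} ng/ml")
```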


#6.32 Accuracy of a laboratory scale. To assess the accuracy of a laboratory scale,
a standard weight known to weigh 10 grams is weighed repeatedly. The scale readings are Normally
distributed with unknown mean (this mean is 10 grams if the scale has no bias). The SD of the scale readings is known to be 0.0002 gram.

(a)    The weight is measured five times. The mean result is 10.0023 grams. Give a 98% confidence interval for the mean of repeated measurements of the weight.
(b)    How many measurements must be averaged to get a margin of error of ±0.0001 with 98% confidence?
Solutions:
(a)

Confidence Interval Estimate for the Mean

Data
Population Standard Deviation 0.0002
Sample Mean 10.0023
Sample Size 5
Confidence Level 98%

Intermediate Calculations
Standard Error of the Mean 0.0000894 (= 0.0002 / sqrt(5))
Z Value -2.3263 (lower-tail critical value; z* = 2.3263)
Interval Half Width 0.000208 (= 2.3263 × 0.0000894)

Confidence Interval
Interval Lower Limit 10.0021
Interval Upper Limit 10.0025

(b)
Sample Size Determination

Data
Population Standard Deviation 0.0002
Sampling Error 0.0001
Confidence Level 98%

Intermediate Calculations
Z Value -2.33 (lower-tail critical value; z* = 2.3263)
Calculated Sample Size 21.65 (= (2.3263 × 0.0002 / 0.0001)²)

Result
Sample Size Needed 22
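
Because the spreadsheet cells round the small inputs to 0, it helps to redo both parts with full precision. The following is a minimal sketch assuming Python 3.8 or later; the variable names are illustrative. The sample size formula n = (z* σ / m)² is rounded up, since rounding down would give a margin of error larger than requested.

```python
from math import ceil, sqrt
from statistics import NormalDist

sigma = 0.0002            # known SD of the scale readings (grams)
xbar, n = 10.0023, 5      # mean of five measurements
level = 0.98

# Upper critical value z* for 98% confidence (about 2.326).
z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)

# (a) 98% confidence interval for the mean reading.
half_width = z_star * sigma / sqrt(n)
low, high = xbar - half_width, xbar + half_width

# (b) Measurements needed for a margin of error of 0.0001 gram.
margin = 0.0001
n_exact = (z_star * sigma / margin) ** 2
n_needed = ceil(n_exact)

print(f"(a) 98% CI: {low:.4f} to {high:.4f} grams")
print(f"(b) calculated n = {n_exact:.2f}, so use n = {n_needed}")
```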


#6.58 A two-sided test and the confidence interval. The P-value for a two-sided test of the null hypothesis H0:µ=30 is 0.04.
(a)    Does the 95% confidence interval include the value 30? Why?
(b)   Does the 90% confidence interval include the value 30? Why?

Solutions:
(a)
No, the 95% confidence interval does not include the value 30.
A two-sided test rejects the null hypothesis at significance level α exactly when the hypothesized value falls outside the (1 - α) confidence interval, and one rejects the null hypothesis if the P-value is smaller than or equal to the significance level.
Here, the P-value 0.04 < 0.05 (significance level), so the null hypothesis is rejected at the 5% level.
This means the 95% confidence interval does not include 30.

(b)
No, the 90% confidence interval does not include the value 30.
By the same reasoning, the P-value 0.04 < 0.10 (significance level), so the null hypothesis is rejected at the 10% level.
This means the 90% confidence interval does not include 30. (The 90% interval is narrower than the 95% interval, which already excludes 30.)
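
The link between the two-sided P-value and the confidence interval can be illustrated numerically. This is a minimal sketch assuming Python 3.8 or later. The exercise gives only the P-value, so the standard error of 1.0 and the resulting sample mean are hypothetical values chosen purely so that a z test of H0: µ=30 has P = 0.04; interval_contains_null is also a hypothetical helper name.

```python
from statistics import NormalDist

def interval_contains_null(p_value, level):
    """For a two-sided z test, the null value lies inside the
    level-C confidence interval when the P-value exceeds 1 - C."""
    return p_value > 1 - level

mu0, p_value = 30, 0.04

# Hypothetical numbers (not from the exercise) that give P = 0.04.
se = 1.0
z = NormalDist().inv_cdf(1 - p_value / 2)
xbar = mu0 + z * se

results = {}
for level in (0.95, 0.90):
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    low, high = xbar - z_star * se, xbar + z_star * se
    results[level] = low <= mu0 <= high
    print(f"{level:.0%} CI: {low:.3f} to {high:.3f}, contains 30: {results[level]}")
```

Both intervals built from the hypothetical data exclude 30, in agreement with the rule coded in interval_contains_null.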


#6.66 Are the pine trees randomly distributed north to south? In example 6.1 we looked at the distribution of longleaf pine trees in the Wade Tract.
One way to formulate hypotheses about whether or not the trees are randomly distributed in the tract is to examine the average location in the
north-south direction. The values range from 0 to 200, so if the trees are uniformly distributed in this direction, any difference from the middle
value (100) should be due to chance variation. The sample mean for the 584 trees in the tract is 99.74. A theoretical calculation based on the
assumption that the trees are uniformly distributed gives a SD of 58. Carefully state the null and alternative hypotheses in terms of this variable.
Note that this requires that you translate the research question about the random distribution of the trees into specific statements about the mean
of a probability distribution. Test your hypotheses, report your results, and write a short summary of what you have found.

Solution:

Let µ be the mean north-south location of the trees in the tract.
Null hypothesis H0 : The pine trees are randomly (uniformly) distributed north to south (µ = 100)
Alternative hypothesis H1 : The pine trees are not randomly distributed north to south (µ ≠ 100)

Z Test of Hypothesis for the Mean

Data
Null Hypothesis µ= 100
Level of Significance 0.05
Population Standard Deviation 58
Sample Size 584
Sample Mean 99.74

Intermediate Calculations
Standard Error of the Mean 2.4
Z Test Statistic -0.11

Two-Tail Test
Lower Critical Value -1.96
Upper Critical Value 1.96
p-Value 0.91
Do not reject the null hypothesis

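As the p-value (0.91) is greater than 0.05 (alpha), we do not reject the null hypothesis. The sample mean of 99.74 is well within chance variation of 100, so the data give no evidence against the pine trees being randomly distributed north to south.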
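
The two-tail z test in the table can be reproduced as follows. This is a minimal sketch assuming Python 3.8 or later; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 100, 58      # null mean and SD under uniform distribution
n, xbar = 584, 99.74      # number of trees and their mean location
alpha = 0.05

se = sigma / sqrt(n)                  # standard error of the mean
z = (xbar - mu0) / se                 # z test statistic

# Two-sided P-value: probability in both tails beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = p_value <= alpha

print(f"SE = {se:.2f}, z = {z:.2f}, p-value = {p_value:.2f}")
print("Reject H0" if reject else "Do not reject H0")
```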


#6.68 Who is the author? Statistics can help decide the authorship of literary works.
Sonnets by a certain Elizabethan poet are known to contain an average of µ=8.9 new words (words not used in the poet’s
other works). The SD of the number of new words is σ=2.5. Now a manuscript with [...] debating whether it is the poet’s work. The new sonnets contain an average of
x-bar=10.2 words not used in the poet’s known works. We expect poems by another author to contain more new words, so to see if we have evidence that the new sonnets are not by our poet we test
H0: µ = 8.9
Ha: µ > 8.9
Give the z test statistic and its P-value. What do you conclude about the authorship of the new poems?

Solution:
Z Test of Hypothesis for the Mean

Data
Null Hypothesis µ= 8.9
Level of Significance 0.05
Population Standard Deviation 2.5
Sample Size 1
Sample Mean 10.2

Intermediate Calculations
Standard Error of the Mean 2.5
Z Test Statistic 0.52

Upper-Tail Test
Upper Critical Value 1.6449
p-Value 0.3015
Do not reject the null hypothesis

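Since we do not reject the null hypothesis (p-value 0.3015 > 0.05), there is no significant evidence that the new sonnets contain more new words than the poet's known works. The data are consistent with the new poems being by the Elizabethan poet, although failing to reject H0 does not prove it.
Note: the calculation above uses a sample size of 1. If 10.2 is an average over n sonnets, the standard error is 2.5 / sqrt(n) and the z statistic is larger.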
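
The upper-tail z test can be written as a small function of the sample size. This is a minimal sketch assuming Python 3.8 or later; upper_tail_z_test is a hypothetical helper name. The call below uses n=1 only because that is the sample size entered in the table above; the number of sonnets is cut off in the problem text, so n is left as a parameter rather than guessed.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(xbar, mu0, sigma, n):
    """z statistic and P-value for H0: mu = mu0 against Ha: mu > mu0."""
    se = sigma / sqrt(n)               # standard error of the mean
    z = (xbar - mu0) / se
    p_value = 1 - NormalDist().cdf(z)  # area in the upper tail
    return z, p_value

# n=1 reproduces the table; substitute the actual number of sonnets.
z, p_value = upper_tail_z_test(xbar=10.2, mu0=8.9, sigma=2.5, n=1)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```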