
MAST20005 Statistics, Assignment 2

Brendan Hill - Student 699917 (Tutorial Thursday 10am)

November 19, 2016

Question 1
To determine the sample size n for which a 100(1 − α)% confidence interval has width ±ε, we must solve the following equation for n:

\[ \varepsilon = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]

We want a 95% confidence interval (α = .05) of width ±ε = 0.5, and from the previous experiment we can assume σ = √34.9. Hence:

\[ 0.5 = z_{0.025} \cdot \frac{\sqrt{34.9}}{\sqrt{n}} \]

\[ \Rightarrow n = \left( 1.96 \cdot \frac{\sqrt{34.9}}{0.5} \right)^2 \]

\[ \Rightarrow n = 536.2677 \]

Rounding up, the sample size required is n = 537.
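The calculation above can be checked numerically. The following is a minimal Python sketch, assuming the previous experiment gave a variance of 34.9 (the value consistent with n = 536.2677). The function name `required_sample_size` is introduced here for illustration only.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, half_width, confidence=0.95):
    """Return (raw n, rounded-up n) solving half_width = z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 point of N(0, 1)
    n_raw = (z * sigma / half_width) ** 2
    return n_raw, math.ceil(n_raw)


# Assumption: sigma^2 = 34.9, so sigma = sqrt(34.9).
n_raw, n_required = required_sample_size(math.sqrt(34.9), 0.5)
print(round(n_raw, 4), n_required)
```

The unrounded quantile z0.025 is used rather than 1.96, which is why the raw value agrees with 536.2677 to four decimal places.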

Question 2
I will use the textbook convention of yi = α + β(xi − x̄) + εi. Note that x̄ = 23.0667.

(a)
The least squares regression line is:

y = 26.33333 + 0.5062(x − 23.0667)

(b)
The scatterplot with regression line is:

[Figure: assignment 2 Q2b.PNG (scatterplot with fitted regression line)]
While there is significant variance, the linear model may still be appropriate.

(c)
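Point estimates for the linear model (where σ̂² is calculated using (n − 2) degrees of freedom):

\[ \hat{\alpha} = \bar{y} = 26.3333 \]

\[ \hat{\beta} = \frac{\sum_{i=1}^{n} y_i (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = 0.5062 \]

\[ \hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left( y_i - \hat{\alpha} - \hat{\beta}(x_i - \bar{x}) \right)^2 = 16.29896 \]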
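The point estimates in the centred parameterisation can be computed as in the sketch below. The assignment data are not reproduced in this document, so the `xs` and `ys` values here are hypothetical and serve only to illustrate the formulas. The function name `fit_centered_line` is likewise an assumption.

```python
def fit_centered_line(xs, ys):
    """Least squares estimates for y_i = alpha + beta * (x_i - x_bar) + eps_i."""
    n = len(xs)
    x_bar = sum(xs) / n
    alpha_hat = sum(ys) / n  # alpha_hat equals y_bar in this parameterisation
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sum(y * (x - x_bar) for x, y in zip(xs, ys)) / sxx
    rss = sum((y - alpha_hat - beta_hat * (x - x_bar)) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = rss / (n - 2)  # (n - 2) degrees of freedom, as in part (c)
    return alpha_hat, beta_hat, sigma2_hat


# Hypothetical data, not the assignment's data set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
alpha_hat, beta_hat, sigma2_hat = fit_centered_line(xs, ys)
print(alpha_hat, beta_hat, sigma2_hat)
```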

(d)
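The 95% confidence intervals for α, β and σ² are given by:

\[ \alpha : \hat{\alpha} \pm t_{0.025}(n-2) \frac{\hat{\sigma}}{\sqrt{n}} = [24.08137, 28.58530] \]

\[ \beta : \hat{\beta} \pm t_{0.025}(n-2) \left[ \frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}} \right] = [0.0445, 0.9678] \]

\[ \sigma^2 : \left[ \frac{n\hat{\sigma}^2}{\chi^2_{0.975}(n-2)}, \frac{n\hat{\sigma}^2}{\chi^2_{0.025}(n-2)} \right] = [9.88390, 48.81145] \]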

(e)
Let x0 = 25. Then, the 95% confidence interval for the mean score is:
\[ y_c : \hat{\alpha} + \hat{\beta}(x_0 - \bar{x}) \pm t_{0.025}(n-2) \cdot \hat{\sigma} \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = [24.88953, 29.73430] \]

And the 95% prediction interval:


\[ y_p : \hat{\alpha} + \hat{\beta}(x_0 - \bar{x}) \pm t_{0.025}(n-2) \cdot \hat{\sigma} \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = [18.25994, 36.36390] \]

Question 3
(a)
Given H0 : θ = 2, the probability of a Type I error is:

\[ \alpha = P(X > 3 \mid \theta = 2) = 1 - (1 - e^{-3/\theta}) = e^{-3/2} = 0.223130 \]

(b)
Given H1 : θ = 5, the probability of a Type II error is:

\[ \beta = P(X \le 3 \mid \theta = 5) = 1 - e^{-3/\theta} = 1 - e^{-3/5} = 0.451188 \]

(c)
The power of the test is:
1 − β = 0.548812

(d)
Note that under H0 :

\[ P(X > 5.991465 \mid \theta = 2) = 1 - (1 - e^{-5.991465/2}) \approx 0.05 \]


So the following test of H0 against H1 has a significance level of 0.05:

Reject H0 if the observed value x > 5.991465.
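The error probabilities and the 0.05-level cutoff in this question follow directly from the distribution function 1 − e^(−x/θ) used above. The sketch below recomputes them. The helper name `exp_cdf` and the variable names are introduced here for illustration.

```python
import math


def exp_cdf(x, theta):
    """P(X <= x) for an exponential distribution with mean theta."""
    return 1.0 - math.exp(-x / theta)


theta0, theta1, cutoff = 2.0, 5.0, 3.0

type_1 = 1.0 - exp_cdf(cutoff, theta0)  # P(X > 3 | theta = 2)
type_2 = exp_cdf(cutoff, theta1)        # P(X <= 3 | theta = 5)
power = 1.0 - type_2

# Cutoff c with P(X > c | theta = 2) = 0.05, i.e. c = -theta0 * ln(0.05).
critical_value = -theta0 * math.log(0.05)

print(type_1, type_2, power, critical_value)
```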

Question 4
(a)
Assume that X ∼ N(µ, σ²). Given H0 : µ = 0.5 and the one-sided alternative H1 : µ > 0.5, the test with significance level 0.05 rejects H0 when:

\[ t = \frac{\bar{X} - 0.5}{s/\sqrt{n}} \ge t_{0.05}(n-1) \]

(b)
The sample provided yields n = 10, x̄ = 0.484, s = 0.2398, so:

\[ t = \frac{0.484 - 0.5}{0.2398/\sqrt{10}} = -0.210973 \]

\[ t_{0.05}(9) = 1.833113 \]

Since it is not the case that −0.210973 ≥ 1.833113, this sample does not provide enough evidence to reject H0.

(c)
The two-sided 95% confidence interval is given by the following formula:

\[ \bar{x} \pm t_{0.025}(n-1) \cdot \frac{s}{\sqrt{n}} \]

So the two-sided confidence interval given by the sample is:

\[ \begin{aligned}
0.484 \pm t_{0.025}(9) \cdot \frac{0.2398}{\sqrt{10}} &= 0.484 \pm 2.262157 \cdot 0.075839 \\
&= 0.484 \pm 0.1715597 \\
&= [0.3124403, 0.6555597]
\end{aligned} \]
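The test statistic from part (b) and the interval from part (c) can be recomputed from the summary statistics, as in the sketch below. The critical values are copied from the text rather than computed, since the Python standard library has no t quantile function. Because s is entered rounded to four decimals, the later decimal places may differ slightly from the figures above.

```python
import math

n, x_bar, s = 10, 0.484, 0.2398
mu_0 = 0.5

# Critical values quoted from the text (not computed here).
t_crit_one_sided = 1.833113  # t_{0.05}(9)
t_crit_two_sided = 2.262157  # t_{0.025}(9)

std_err = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / std_err
reject_h0 = t_stat >= t_crit_one_sided  # one-sided test of mu = 0.5 against mu > 0.5

half_width = t_crit_two_sided * std_err
ci_lower, ci_upper = x_bar - half_width, x_bar + half_width

print(t_stat, reject_h0, (ci_lower, ci_upper))
```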

(d)
The test statistic t did not fall in the (one-sided) critical region, so we do not reject H0 : µ = 0.5 in favour of the alternative hypothesis H1 : µ > 0.5 at the 0.05 significance level.

Additionally, the hypothesised value µ = 0.5 falls within the 95% (two-sided) confidence interval for µ, so H0 would also not be rejected against a two-sided alternative H2 : µ ≠ 0.5 at the 0.05 significance level.

Question 5
(a)
The test statistic t and critical region are given by the following inequality:

\[ t = \frac{\bar{W} - 0}{s/\sqrt{n}} \le -t_{0.05}(n-1) \]

(b)
The sample provided yields n = 20, w̄ = −0.325, s = 0.6463, so:

\[ t = \frac{-0.325}{0.6463/\sqrt{20}} = -2.248709 \]

\[ -t_{0.05}(19) = -1.729133 \]

Since −2.248709 ≤ −1.729133, the observed value of w̄ is more extreme than we would expect under H0 at the 0.05 significance level, so we reject the null hypothesis.

(c)
At the 0.01 significance level we have the critical value:

\[ -t_{0.01}(19) = -2.539483 \]

Since t = −2.248709 > −2.539483, however, we cannot reject H0 at this significance level.

(d)
The p-value is 0.018295.
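One way to check this p-value without a statistics package is to integrate the t density numerically. The sketch below does so with Simpson's rule. The function names `t_pdf` and `t_cdf` are introduced here for illustration, and the result is an approximation whose accuracy depends on the number of steps.

```python
import math


def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + u * u / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) using Simpson's rule on [0, |t|] and symmetry about 0."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 - area if t < 0 else 0.5 + area


n, w_bar, s = 20, -0.325, 0.6463
t_stat = w_bar / (s / math.sqrt(n))
p_value = t_cdf(t_stat, n - 1)  # lower-tail p-value for the one-sided test
print(t_stat, p_value)
```

The p-value lies between 0.01 and 0.05, which agrees with rejecting H0 in part (b) but not in part (c).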

Question 6
We shall assume that the plant growth rates distribute normally.

So the growth rate of plants exposed to normal air distributes according to N(µX, σX²), and the growth rate of plants exposed to enriched air distributes according to N(µY, σY²).

The sample standard deviations are sX = 0.9562 and sY = 1.6098. Given the difference, we will not assume that the variances are equal.

Let the null hypothesis be H0 : µX = µY and the alternative hypothesis be H1 : µX < µY .

The test statistic and critical value for 95% confidence are given by:

\[ t = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{S_X^2}{n} + \frac{S_Y^2}{m}}} \le -t_{0.05}(n + m - 2) \]

The sample yields n = 12, m = 8, x̄ = 4.16333, ȳ = 5.105, sX = 0.9562 and sY = 1.6098, so:

\[ t = \frac{4.16333 - 5.105}{\sqrt{\frac{0.9562^2}{12} + \frac{1.6098^2}{8}}} = -1.488675 \]

\[ -t_{0.05}(18) = -1.734064 \]


Since it is not the case that t ≤ −1.734064, we cannot reject H0 at the 0.05 significance level.

Hence, there is not enough evidence from this sample to conclude that the enriched air increased plant growth.
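The two-sample statistic above can be recomputed from the summary values with a short sketch. The critical value is copied from the text rather than computed, and the function name `two_sample_t` is introduced here for illustration.

```python
import math


def two_sample_t(x_bar, s_x, n, y_bar, s_y, m):
    """Two-sample t statistic with separate (unpooled) variance estimates."""
    std_err = math.sqrt(s_x ** 2 / n + s_y ** 2 / m)
    return (x_bar - y_bar) / std_err


t_stat = two_sample_t(4.16333, 0.9562, 12, 5.105, 1.6098, 8)
t_crit = -1.734064  # -t_{0.05}(18), quoted from the text
reject_h0 = t_stat <= t_crit
print(t_stat, reject_h0)
```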

Question 7
Suppose the null hypothesis is H0 : σ² = σ0² and the alternative hypothesis is H1 : σ² > σ0².

Then, the usual test statistic t and critical region at significance level α are:

\[ t = \frac{(n-1)s^2}{\sigma_0^2} \ge \chi^2_{\alpha}(n-1) \]
In general, a χ² distribution approaches a normal distribution as the degrees of freedom v become large, according to the following relationship:

\[ \frac{\chi^2(v) - v}{\sqrt{2v}} \approx N(0, 1), \quad \text{as } v \to \infty \]
Hence for large enough n, given degrees of freedom v = (n − 1), the following test statistic can be used:

\[ z = \frac{\frac{(n-1)s^2}{\sigma_0^2} - (n-1)}{\sqrt{2(n-1)}} \ge z_{\alpha} \]
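So for large enough n, an approximate critical region for testing H0 against H1 at the α significance level is given by:

\[ \begin{aligned}
& \frac{\frac{(n-1)s^2}{\sigma_0^2} - (n-1)}{\sqrt{2(n-1)}} \ge z_{\alpha} \\
\Rightarrow\ & \frac{(n-1)s^2}{\sigma_0^2} - (n-1) \ge z_{\alpha} \sqrt{2(n-1)} \\
\Rightarrow\ & \frac{(n-1)s^2}{\sigma_0^2} \ge (n-1) + z_{\alpha} \sqrt{2(n-1)} \\
\Rightarrow\ & \frac{s^2}{\sigma_0^2} \ge \frac{(n-1)}{(n-1)} + z_{\alpha} \frac{\sqrt{2(n-1)}}{(n-1)} \\
\Rightarrow\ & \frac{s^2}{\sigma_0^2} \ge 1 + z_{\alpha} \sqrt{\frac{2}{n-1}} \\
\Rightarrow\ & s^2 \ge \sigma_0^2 \left( 1 + z_{\alpha} \sqrt{\frac{2}{n-1}} \right)
\end{aligned} \]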
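The final inequality gives a cutoff for s² directly. The sketch below evaluates it and checks that a sample variance exactly at the cutoff yields z = zα in the standardised statistic. The inputs σ0² = 4, n = 101 and α = 0.05 are hypothetical values chosen for illustration, and the function names are assumptions.

```python
import math
from statistics import NormalDist


def approx_variance_cutoff(sigma0_sq, n, alpha=0.05):
    """Large-n cutoff: reject H0 when s^2 >= sigma0^2 * (1 + z_alpha * sqrt(2 / (n - 1)))."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return sigma0_sq * (1.0 + z_alpha * math.sqrt(2.0 / (n - 1)))


def z_statistic(s_sq, sigma0_sq, n):
    """Standardised form of (n - 1) s^2 / sigma0^2 using mean n - 1 and variance 2(n - 1)."""
    return ((n - 1) * s_sq / sigma0_sq - (n - 1)) / math.sqrt(2.0 * (n - 1))


# Hypothetical inputs for illustration only.
cutoff = approx_variance_cutoff(sigma0_sq=4.0, n=101, alpha=0.05)
z_at_cutoff = z_statistic(cutoff, 4.0, 101)
print(cutoff, z_at_cutoff)
```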

Question 8
(a)
Given the large sample sizes, the normal distribution can be used instead of the t distribution.

So the test statistic and critical region are:

\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \ge z_{0.05} = 1.64 \]

where p̂ = (Y1 + Y2)/(n1 + n2), given that under the null hypothesis p1 = p2.

(b)
Note that p̂ = (135 + 77)/(900 + 700) = 0.1325, p̂1 = (135/900) = 0.15 and p̂2 = (77/700) = 0.11. So the test statistic is:

\[ z = \frac{0.15 - 0.11}{\sqrt{0.1325(1 - 0.1325)} \sqrt{\frac{1}{900} + \frac{1}{700}}} = 2.3411 \]

Since 2.3411 > 1.64, we reject H0 at the 0.05 significance level.

(c)
If α = 0.01 then the critical region is given by z > z0.01 = 2.3263.

Since z = 2.3411 > 2.3263, we reject the null hypothesis at the 0.01 significance level as well.

(d)
The p-value of this test is 0.009613
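The pooled two-proportion statistic and its one-sided p-value can be recomputed with the standard library, as in the following sketch. The function name `two_proportion_z` is introduced here for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_z(y1, n1, y2, n2):
    """z statistic for H0: p1 = p2, using the pooled estimate of the common proportion."""
    p1_hat, p2_hat = y1 / n1, y2 / n2
    pooled = (y1 + y2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1_hat - p2_hat) / std_err


z = two_proportion_z(135, 900, 77, 700)
p_value = 1.0 - NormalDist().cdf(z)  # upper-tail p-value for the one-sided test
print(z, p_value)
```

The p-value is just below 0.01, which is consistent with rejecting H0 at both levels in parts (b) and (c).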

Question 9
Given a random sample of size n from a population distributed according to N(µ, σ²) with known σ², the sample mean X̄ distributes according to a normal distribution (for large enough n), which when standardized under H0 : µ = µ0 is:

\[ t_N = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1) \]
Since the sum of k standard normal distributions each squared is a χ²(k) distribution, and we have a single standard normal distribution on the LHS, squaring both sides gives the following test statistic:

\[ t_{\chi^2} = \left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right)^2 \sim \chi^2(1) \]
Hence for large enough n, the hypothesis H0 : µ = µ0 can be tested against the alternative H1 : µ ≠ µ0 using the following test statistic and critical region:

\[ t_{\chi^2} = \left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right)^2 \ge \chi^2_{\alpha}(1) \]
The squaring of the standard normally distributed variable causes both the left and right tails (each of area α/2) to
map to the right tail of the χ2 . Since this reduces to a single tail, a significance level α is appropriate.
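The tail argument can be checked numerically: the upper-α point of χ²(1) should equal the square of z(α/2), and the χ²(1) upper-tail probability at that point should be α. The sketch below uses only the standard normal distribution from the standard library. The function names are introduced here for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def chi2_1_critical(alpha):
    """Upper-alpha point of chi-square(1), obtained as the square of z_{alpha/2}."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0) ** 2


def chi2_1_upper_tail(c):
    """P(Z^2 >= c) = P(|Z| >= sqrt(c)) = 2 * (1 - Phi(sqrt(c))) for Z ~ N(0, 1)."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(c ** 0.5))


crit = chi2_1_critical(0.05)
tail = chi2_1_upper_tail(crit)
print(crit, tail)
```

With α = 0.05 the critical value is about 3.84, so the squared statistic rejects exactly when |tN| ≥ z0.025.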
