MAST20005 Statistics Assignment 2

Brendan Hill - Student 699917 (Tutorial Thursday 10am)
November 19, 2016
Question 1
To determine the sample size n that gives a 100(1 − α)% confidence interval of width ±ε, we must solve the following equation for n:

$$\varepsilon = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$

We want a 95% confidence interval (α = 0.05) of width ±ε = 0.5, and from the previous experiment we can assume σ = √34.9. Hence:

$$
\begin{aligned}
0.5 &= z_{0.025} \cdot \frac{\sqrt{34.9}}{\sqrt{n}} \\
\Rightarrow n &= \left(1.96 \cdot \frac{\sqrt{34.9}}{0.5}\right)^2 \\
\Rightarrow n &= 536.2677
\end{aligned}
$$
Rounding up, the sample size required is n = 537.
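The calculation above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `required_sample_size` and its parameters are illustrative choices, not part of the assignment.

```python
import math
from statistics import NormalDist


def required_sample_size(variance: float, half_width: float, confidence: float = 0.95) -> int:
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= half_width."""
    alpha = 1.0 - confidence
    # Upper alpha/2 point of the standard normal distribution (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z * math.sqrt(variance) / half_width) ** 2
    return math.ceil(n_exact)


# Values from Question 1: variance 34.9, half-width 0.5.
n = required_sample_size(variance=34.9, half_width=0.5)
print(n)
```

Rounding up with `math.ceil` matters here: rounding to the nearest integer would give an interval slightly wider than requested.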
Question 2
I will use the textbook convention of $y_i = \alpha + \beta(x_i - \bar{x}) + \varepsilon_i$. Note that $\bar{x} = 23.0667$.
(a)
The least squares regression line is:
y = 26.33333 + 0.5062(x − 23.0667)
(b)
The scatterplot with regression line is:
[Figure: scatterplot of the data with the fitted regression line (assignment 2 Q2b.PNG)]
Although the points show considerable scatter about the line, the linear model may still be appropriate.
(c)
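Point estimates for the linear model (where $\hat{\sigma}^2$ is calculated using n − 2 degrees of freedom):

$$\hat{\alpha} = \bar{y} = 26.3333$$

$$\hat{\beta} = \frac{\sum_{i=1}^{n} y_i (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = 0.5062$$

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left( y_i - \hat{\alpha} - \hat{\beta}(x_i - \bar{x}) \right)^2 = 16.29896$$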
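The three estimators above translate directly into code. The sketch below is illustrative only: the assignment's data set is not reproduced in this document, so the `xs` and `ys` values are hypothetical, and the function name `fit_centered_line` is an assumption.

```python
def fit_centered_line(xs, ys):
    """Least squares fit of y = alpha + beta * (x - x_bar) + error.

    Returns (alpha_hat, beta_hat, sigma2_hat), where sigma2_hat uses
    the divisor n - 2.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    alpha_hat = sum(ys) / n  # in the centered form, alpha_hat is the mean of y
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sum(y * (x - x_bar) for x, y in zip(xs, ys)) / sxx
    sse = sum((y - alpha_hat - beta_hat * (x - x_bar)) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = sse / (n - 2)
    return alpha_hat, beta_hat, sigma2_hat


# Hypothetical data, NOT the assignment's sample.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_hat, beta_hat, sigma2_hat = fit_centered_line(xs, ys)
print(alpha_hat, beta_hat, sigma2_hat)
```

Centering x at its mean is what makes the intercept estimate equal to ȳ, which is why (a) reports the line in the form y = α̂ + β̂(x − x̄).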
(d)
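The 95% confidence intervals for α, β and σ² are given below, where $t_p$ and $\chi^2_p$ denote upper p points. Because $\hat{\sigma}^2$ is computed with the divisor n − 2, the interval for σ² uses $(n-2)\hat{\sigma}^2$ in the numerator.

$$\alpha:\ \hat{\alpha} \pm t_{0.025}(n-2) \cdot \frac{\hat{\sigma}}{\sqrt{n}} = [24.08137,\ 28.58530]$$

$$\beta:\ \hat{\beta} \pm t_{0.025}(n-2) \cdot \frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = [0.0445,\ 0.9678]$$

$$\sigma^2:\ \left[ \frac{(n-2)\hat{\sigma}^2}{\chi^2_{0.025}(n-2)},\ \frac{(n-2)\hat{\sigma}^2}{\chi^2_{0.975}(n-2)} \right] = [8.56605,\ 42.30326]$$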
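The interval for α can be reproduced from the summary values alone. This is a sketch under stated assumptions: the sample size n = 15 is not given in the text and is inferred here from x̄ = 23.0667 ≈ 346/15, and the critical value t₀.₀₂₅(13) = 2.160369 is hard-coded from a t table because the Python standard library has no t quantile function. The helper name `t_interval` is illustrative.

```python
import math


def t_interval(estimate, std_error, t_crit):
    """Symmetric interval: estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


n = 15                      # ASSUMPTION: inferred, not stated in the text
alpha_hat = 26.33333        # from part (c)
sigma_hat = math.sqrt(16.29896)
t_crit = 2.160369           # table value for t_{0.025}(13)

lo, hi = t_interval(alpha_hat, sigma_hat / math.sqrt(n), t_crit)
print(round(lo, 4), round(hi, 4))
```

The result agrees with the interval reported for α to about four decimal places; the small discrepancy comes from using rounded inputs.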
(e)
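Let $x_0 = 25$. Then the 95% confidence interval for the mean score is:

$$y_c:\ \hat{\alpha} + \hat{\beta}(x_0 - \bar{x}) \pm t_{0.025}(n-2) \cdot \hat{\sigma} \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = [24.88953,\ 29.73430]$$

And the 95% prediction interval is:

$$y_p:\ \hat{\alpha} + \hat{\beta}(x_0 - \bar{x}) \pm t_{0.025}(n-2) \cdot \hat{\sigma} \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}} = [18.25994,\ 36.36390]$$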
Question 3
(a)
Given H0 : θ = 2, the probability of a Type I error is:
$$\alpha = P(X > 3 \mid \theta = 2) = 1 - (1 - e^{-3/\theta}) = e^{-3/2} = 0.223130$$
(b)
Given H1 : θ = 5, the probability of a Type II error is:
$$\beta = P(X \le 3 \mid \theta = 5) = 1 - e^{-3/\theta} = 1 - e^{-3/5} = 0.451188$$
(c)
The power of the test is:
1 − β = 0.548812
(d)
Note that under H0:

$$P(X > 5.991465 \mid \theta = 2) = 1 - (1 - e^{-5.991465/2}) = e^{-5.991465/2} \approx 0.05$$

So the following test of H0 against H1 has a significance level of 0.05:
Reject H0 if the observed value x > 5.991465.
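The four quantities in this question all follow from the exponential CDF F(x) = 1 − e^(−x/θ). A short sketch using only the standard library is given below; the helper name `exp_cdf` is illustrative.

```python
import math


def exp_cdf(x: float, theta: float) -> float:
    """CDF of an exponential distribution with mean theta."""
    return 1.0 - math.exp(-x / theta)


type_1_error = 1.0 - exp_cdf(3, theta=2)   # P(X > 3 | theta = 2)
type_2_error = exp_cdf(3, theta=5)         # P(X <= 3 | theta = 5)
power = 1.0 - type_2_error

# Critical value c with P(X > c | theta = 2) = 0.05, i.e. exp(-c / 2) = 0.05.
critical_value = -2.0 * math.log(0.05)

print(type_1_error, type_2_error, power, critical_value)
```

Solving e^(−c/2) = 0.05 for c gives c = −2 ln 0.05, which is where the value 5.991465 in part (d) comes from.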
Question 4
(a)
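Assume that X is approximately distributed as $N(\mu, \sigma^2)$. To test $H_0: \mu = 0.5$ against $H_1: \mu > 0.5$ at significance level 0.05, reject $H_0$ if:

$$t = \frac{\bar{X} - 0.5}{s/\sqrt{n}} \ge t_{0.05}(n-1)$$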
(b)
The sample provided yields n = 10, x̄ = 0.484, s = 0.2398, so:

$$t = \frac{0.484 - 0.5}{0.2398/\sqrt{10}} = -0.210973$$

$$t_{0.05}(9) = 1.833113$$
Since it is not the case that −0.210973 > 1.833113, this sample does not provide enough evidence to reject H0 .
(c)
The two-sided 95% confidence interval is given by the following formula:

$$\bar{x} \pm t_{0.025}(n-1) \cdot \frac{s}{\sqrt{n}}$$

So the two-sided confidence interval given by the sample is:

$$
\begin{aligned}
&0.484 \pm t_{0.025}(9) \cdot \frac{0.2398}{\sqrt{10}} \\
&= 0.484 \pm 2.262157 \cdot 0.075839 \\
&= 0.484 \pm 0.1715597 \\
&= [0.3124403,\ 0.6555597]
\end{aligned}
$$
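The test statistic and interval above can be reproduced from the summary statistics. In this sketch the critical values 1.833113 and 2.262157 are copied from the text rather than computed, since the standard library has no t quantile function; the function names are illustrative.

```python
import math


def one_sample_t(x_bar: float, mu_0: float, s: float, n: int) -> float:
    """t statistic for H0: mu = mu_0 from summary statistics."""
    return (x_bar - mu_0) / (s / math.sqrt(n))


def t_confidence_interval(x_bar: float, s: float, n: int, t_crit: float):
    """Two-sided interval x_bar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


t_stat = one_sample_t(0.484, 0.5, 0.2398, 10)
reject_h0 = t_stat >= 1.833113            # one-sided test, t_{0.05}(9)
lo, hi = t_confidence_interval(0.484, 0.2398, 10, 2.262157)  # t_{0.025}(9)

print(t_stat, reject_h0, lo, hi)
```

Because the inputs are the rounded values quoted in the text, the output matches the reported figures only to about four decimal places.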
(d)
The test statistic t did not fall in the (one-sided) critical region, so we cannot reject H0 in favour of the alternative hypothesis H1 : µ > 0.5 at the 0.05 significance level.
Additionally, the null value µ = 0.5 falls within the 95% (two-sided) confidence interval for µ, so H0 would also not be rejected against the two-sided alternative H2 : µ ≠ 0.5 at the 0.05 significance level.
Question 5
(a)
The test statistic t and critical region are given by the following inequality:

$$t = \frac{\bar{W} - 0}{s/\sqrt{n}} \le -t_{0.05}(n-1)$$
(b)
The sample provided yields n = 20, w̄ = −0.325, s = 0.6463, so:

$$t = \frac{-0.325}{0.6463/\sqrt{20}} = -2.248709$$

$$-t_{0.05}(19) = -1.729133$$

Since −2.248709 ≤ −1.729133, the observed value of w̄ is more extreme than we would expect under H0 at the 0.05 significance level, so we reject the null hypothesis.
(c)
At the 0.01 significance level we have the critical value:

$$-t_{0.01}(19) = -2.539483$$

Since t = −2.248709 > −2.539483, however, we cannot reject H0 at this significance level.
(d)
The p-value is 0.018295.
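The p-value is the lower-tail probability P(T ≤ −2.248709) for T following a t distribution with 19 degrees of freedom. The sketch below approximates it by numerically integrating the t density with Simpson's rule, using only the standard library; the functions `t_pdf` and `t_cdf` are illustrative helpers, and a statistics package would normally be used instead.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(T <= x), via Simpson's rule on [0, |x|] and symmetry about 0.

    steps must be even.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + area if x >= 0 else 0.5 - area


p_value = t_cdf(-2.248709, 19)
print(p_value)
```

The value lies between 0.01 and 0.05, which is consistent with rejecting H0 in (b) but not in (c).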
Question 6
We shall assume that the plant growth rates are normally distributed.
So the growth rate of plants exposed to normal air is distributed according to $N(\mu_X, \sigma_X^2)$, and the growth rate of plants exposed to enriched air is distributed according to $N(\mu_Y, \sigma_Y^2)$.
The sample standard deviations are $s_X = 0.9562$ and $s_Y = 1.6098$. Given the difference, we will not assume that the variances are equal.
Let the null hypothesis be H0 : µX = µY and the alternative hypothesis be H1 : µX < µY .
The test statistic and critical value for 95% confidence are given by:
$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{S_X^2}{n} + \dfrac{S_Y^2}{m}}} \le -t_{0.05}(n + m - 2)$$
The sample yields n = 12, m = 8, x̄ = 4.16333, ȳ = 5.105, sX = 0.9562 and sY = 1.6098, so:
$$t = \frac{4.16333 - 5.105}{\sqrt{\dfrac{0.9562^2}{12} + \dfrac{1.6098^2}{8}}} = -1.488675$$

$$-t_{0.05}(18) = -1.734064$$

Since it is not the case that t < −1.734064, we cannot reject H0 at the 95% confidence level.
Hence, there is not enough evidence from this sample to conclude that the enriched air increased plant growth.
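The unpooled statistic can be recomputed from the summary values as follows. This is a sketch, not the author's code: the function name `unpooled_t` is illustrative, and the Welch-Satterthwaite degrees of freedom it also returns are an addition that does not appear in the text, which uses n + m − 2 = 18. They are included only because they are the usual companion to an unpooled standard error.

```python
import math


def unpooled_t(x_bar, y_bar, s_x, s_y, n, m):
    """Two-sample t statistic with separate variances.

    Also returns the Welch-Satterthwaite approximate degrees of freedom
    (an addition, not used in the text above).
    """
    v_x = s_x ** 2 / n
    v_y = s_y ** 2 / m
    t = (x_bar - y_bar) / math.sqrt(v_x + v_y)
    welch_df = (v_x + v_y) ** 2 / (v_x ** 2 / (n - 1) + v_y ** 2 / (m - 1))
    return t, welch_df


t_stat, welch_df = unpooled_t(4.16333, 5.105, 0.9562, 1.6098, n=12, m=8)
reject_h0 = t_stat <= -1.734064   # critical value -t_{0.05}(18) from the text

print(t_stat, welch_df, reject_h0)
```

With these inputs the Welch approximation gives roughly 10 degrees of freedom, a stricter critical value than the one for 18, so the conclusion not to reject H0 would be the same.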
Question 7
Suppose the null hypothesis is $H_0: \sigma^2 = \sigma_0^2$ and the alternative hypothesis is $H_1: \sigma^2 > \sigma_0^2$.
Then, the usual test statistic t at significance level α is:

$$t = \frac{(n-1)s^2}{\sigma_0^2} \ge \chi^2_\alpha(n-1)$$
In general, a χ² distribution approaches a normal distribution as the degrees of freedom v become large, according to the following relationship:

$$\frac{\chi^2(v) - v}{\sqrt{2v}} \approx N(0, 1), \quad \text{as } v \to \infty$$
Hence for large enough n, given degrees of freedom v = n − 1, the following test statistic can be used:

$$z = \frac{\dfrac{(n-1)s^2}{\sigma_0^2} - (n-1)}{\sqrt{2(n-1)}} \ge z_\alpha$$
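So for large enough n, an approximate critical region for testing H0 against H1 at the α significance level is given by:

$$
\begin{aligned}
& \frac{\dfrac{(n-1)s^2}{\sigma_0^2} - (n-1)}{\sqrt{2(n-1)}} \ge z_\alpha \\
\Rightarrow\ & \frac{(n-1)s^2}{\sigma_0^2} - (n-1) \ge z_\alpha \sqrt{2(n-1)} \\
\Rightarrow\ & \frac{(n-1)s^2}{\sigma_0^2} \ge (n-1) + z_\alpha \sqrt{2(n-1)} \\
\Rightarrow\ & \frac{s^2}{\sigma_0^2} \ge \frac{n-1}{n-1} + z_\alpha \frac{\sqrt{2(n-1)}}{n-1} \\
\Rightarrow\ & \frac{s^2}{\sigma_0^2} \ge 1 + z_\alpha \sqrt{\frac{2}{n-1}} \\
\Rightarrow\ & s^2 \ge \sigma_0^2 \left( 1 + z_\alpha \sqrt{\frac{2}{n-1}} \right)
\end{aligned}
$$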
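The final inequality gives a threshold for s² that is easy to evaluate. The sketch below uses hypothetical inputs (σ₀² = 4, n = 101, α = 0.05), since the question is stated in general terms; the function name `approx_variance_threshold` is illustrative.

```python
import math
from statistics import NormalDist


def approx_variance_threshold(sigma0_sq: float, n: int, alpha: float) -> float:
    """Large-sample threshold c such that H0 is rejected when s^2 >= c.

    Implements c = sigma0^2 * (1 + z_alpha * sqrt(2 / (n - 1))).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # upper alpha point of N(0, 1)
    return sigma0_sq * (1.0 + z_alpha * math.sqrt(2.0 / (n - 1)))


# Hypothetical example values, not taken from the assignment.
threshold = approx_variance_threshold(sigma0_sq=4.0, n=101, alpha=0.05)
print(threshold)
```

As n grows the factor multiplying σ₀² tends to 1, so increasingly small excesses of s² over σ₀² become significant.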
Question 8
(a)
Given the large sample sizes, the normal distribution can be used instead of the t distribution.
So the test statistic and critical region are:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \ge z_{0.05} = 1.64$$

where $\hat{p} = \dfrac{Y_1 + Y_2}{n_1 + n_2}$, given that under the null hypothesis $p_1 = p_2$.
(b)
Note that $\hat{p} = (135 + 77)/(900 + 700) = 0.1325$, $\hat{p}_1 = 135/900 = 0.15$ and $\hat{p}_2 = 77/700 = 0.11$. So the test statistic is:

$$z = \frac{0.15 - 0.11}{\sqrt{0.1325(1 - 0.1325)\left(\dfrac{1}{900} + \dfrac{1}{700}\right)}} = 2.3411$$

Since 2.3411 > 1.64, we reject H0 at the 0.05 significance level.
(c)
If α = 0.01 then the critical region is given by $z > z_{0.01} = 2.3263$.
Since z = 2.3411 > 2.3263, we reject the null hypothesis at the 0.01 significance level as well.
(d)
The p-value of this test is 0.009613.
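The z statistic and p-value can be reproduced from the counts given in (b). This is a minimal sketch using the standard library's `NormalDist`; the function name `two_proportion_z` is illustrative.

```python
import math
from statistics import NormalDist


def two_proportion_z(y1: int, n1: int, y2: int, n2: int):
    """Pooled two-proportion z statistic and one-sided (upper-tail) p-value."""
    p1, p2 = y1 / n1, y2 / n2
    pooled = (y1 + y2) / (n1 + n2)   # common estimate of p under H0: p1 = p2
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


z_stat, p_value = two_proportion_z(135, 900, 77, 700)
print(z_stat, p_value)
```

The p-value falls just below 0.01, which is why the test rejects H0 in (c) by only a narrow margin.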
Question 9
Given a random sample of size n from a population distributed according to $N(\mu, \sigma^2)$ with known $\sigma^2$, the sample mean $\bar{X}$ is normally distributed (for large enough n), and when standardized under $H_0: \mu = \mu_0$ gives:

$$t_N = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$$
Since the sum of the squares of k independent standard normal variables has a χ²(k) distribution, and we have a single standard normal variable on the LHS, squaring both sides gives the following test statistic:

$$t_{\chi^2} = \left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right)^2 \sim \chi^2(1)$$
Hence for large enough n, the hypothesis $H_0: \mu = \mu_0$ can be tested against the alternative $H_1: \mu \ne \mu_0$ using the following test statistic and critical region:

$$t_{\chi^2} = \left( \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \right)^2 \ge \chi^2_\alpha(1)$$
The squaring of the standard normally distributed variable causes both the left and right tails (each of area α/2) to
map to the right tail of the χ2 . Since this reduces to a single tail, a significance level α is appropriate.
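The tail-mapping argument can be checked numerically: squaring the two-sided normal critical value should give the one-sided χ²(1) critical value at the same α. The sketch below uses only the standard library and α = 0.05 as an example; the variable names are illustrative.

```python
import math
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Two-sided normal critical value z_{alpha/2}, then its square.
z_half_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
chi2_critical = z_half_alpha ** 2

# P(Z^2 >= c) = P(|Z| >= sqrt(c)) = 2 * (1 - Phi(sqrt(c))): both normal tails
# of area alpha/2 land in the single right tail of the chi-square(1) variable.
right_tail = 2.0 * (1.0 - std_normal.cdf(math.sqrt(chi2_critical)))

print(chi2_critical, right_tail)
```

The squared value agrees with the tabulated upper 5% point of χ²(1), about 3.8415, and the right-tail probability comes back as α.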
