
IE 28

Statistical Analysis for Industrial Engineers

11.18.2010

Agenda
1. Submission of Homework
2. Type II Error and Choice of Sample Size
   1. OC Curves
3. Goodness of Fit

Type II Error and Choice of Sample Size

Type II Error
Finding the probability of Type II Error


Sample Size Computation


Two-sided alternative hypothesis

One-sided alternative hypothesis
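
A minimal sketch of the usual sample-size approximations for a z-test on one mean with known $\sigma$: $n \approx (z_{\alpha/2} + z_\beta)^2 \sigma^2 / \delta^2$ for a two-sided alternative and $n = (z_\alpha + z_\beta)^2 \sigma^2 / \delta^2$ for a one-sided alternative. The function name `sample_size_z` and its arguments are illustrative, and the inputs are the values assumed for the M&Ms example ($\sigma = 2$, $\delta = 1$, $\alpha = 0.05$, $\beta = 0.10$).

```python
from math import ceil
from statistics import NormalDist


def sample_size_z(delta, sigma, alpha, beta, two_sided=True):
    """Approximate (unrounded) n for a z-test on one mean, sigma known."""
    z = NormalDist()  # standard normal distribution
    # Upper-tail critical value: alpha/2 for two-sided, alpha for one-sided.
    z_alpha = z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = z.inv_cdf(1 - beta)
    return (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


# Assumed inputs from the M&Ms example.
n_two = sample_size_z(delta=1, sigma=2, alpha=0.05, beta=0.10)
n_one = sample_size_z(delta=1, sigma=2, alpha=0.05, beta=0.10, two_sided=False)

# Always round up: a fractional observation cannot be collected.
print(ceil(n_two), ceil(n_one))
```

The two-sided result rounds up to 43, close to the $n \approx 45$ read from the OC chart; chart readings are only approximate.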

Example
From the M&Ms weight hypothesis test, we want to design the test so that if the true mean weight differs from 50 grams by as much as 1 gram, the test will detect this (i.e., reject $H_0: \mu = 50$) with a probability of 0.90.
$\sigma = 2$
$\delta = 51 - 50 = 1$
$\alpha = 0.05$
$\beta = 0.10$

Or we could use an OC curve.

OC Curves: Variance Known

Operating characteristic (OC) curves can help in determining sample size and Type II error.

The curves plot the Type II error probability, $\beta$, against a parameter $d$ for various sample sizes $n$:

$$d = \frac{|\mu - \mu_0|}{\sigma} = \frac{|\delta|}{\sigma}$$

Curves are provided for both $\alpha = 0.05$ and $\alpha = 0.01$ in the Appendix.

OC Curve Example
[Figure: OC curves plotting Type II error ($\beta$) against $d$]

Example
For the M&Ms design problem ($\sigma = 2$, $\delta = 1$, $\alpha = 0.05$, $\beta = 0.10$), $d = |\delta|/\sigma = 1/2 = 0.5$. Reading the OC curve at $d = 0.5$ and $\beta = 0.10$ gives $n \approx 45$.

Example
Suppose we use a sample of $n = 25$ every time we test the weights. How often would we accept (fail to reject) the hypothesis if the true mean is 51 grams?

OC Curve Example
Type II error: $\beta \approx 30\%$
If the true mean weight is 51 grams, there is approximately a 30% chance that the test with sample size $n = 25$ will not detect it.
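
The chart reading can be cross-checked numerically. The sketch below assumes a two-sided z-test with known $\sigma$, for which $\beta = \Phi(z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-z_{\alpha/2} - \delta\sqrt{n}/\sigma)$. The function name `type_ii_error` is illustrative, and the inputs ($\sigma = 2$, $\delta = 1$, $\alpha = 0.05$) are the values assumed for the M&Ms example.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(delta, sigma, n, alpha=0.05):
    """Beta for a two-sided z-test on one mean with known sigma."""
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma  # standardized distance from mu_0
    # Probability that the test statistic stays inside the acceptance region.
    return z.cdf(z_crit - shift) - z.cdf(-z_crit - shift)


beta_25 = type_ii_error(delta=1, sigma=2, n=25)
beta_45 = type_ii_error(delta=1, sigma=2, n=45)
print(f"n=25: beta = {beta_25:.3f}")
print(f"n=45: beta = {beta_45:.3f}")
```

With $n = 25$ the computed value is close to the 30% read from the chart, and with $n = 45$ it falls below the 0.10 target.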

Notes on OC curves
1. The further the true value of the mean $\mu$ is from $\mu_0$, the smaller the probability of Type II error $\beta$ for a given $n$ and $\alpha$. For a specified sample size and $\alpha$, large differences in the mean are easier to detect than small ones.

[Figure: small versus big difference between $\mu_0$ and the true mean]

Notes on OC Curves
2. For a given $\delta$ and $\alpha$, the probability of Type II error $\beta$ decreases as $n$ increases. To detect a specified difference in the mean, we may make the test more powerful by increasing the sample size.

From the OC Curves


These curves can be used to evaluate the $\beta$-error (or power) associated with a particular test.

They can also be used to design a test.

OC Curves: Variance Unknown

The Type II error $\beta$ is plotted against a parameter $d$ (two-sided and one-sided), where $\mu$ is estimated by the sample average $\bar{X}$ and $\mu_0$ is the stated mean value in $H_0$:

$$d = \frac{|\mu - \mu_0|}{\sigma} = \frac{|\delta|}{\sigma}$$

It is the value of the ratio $d$ that is important in determining sample size.

We can use $s$ to estimate $\sigma$.

Example
A 1992 article in the Journal of the American Medical Association ("A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlich") reported body temperatures, gender, and heart rate for a number of subjects. The body temperatures for 25 female subjects follow:
97.8, 97.2, 97.4, 97.6, 97.8, 97.9, 98.0, 98.9, 98.0, 98.1, 98.2,
98.3, 98.3, 98.4, 98.4, 98.4, 98.5, 98.6, 98.6, 98.7, 98.8, 98.8,
98.9, 98.9, and 99.0
Test the hypothesis $H_0: \mu = 98.6$ versus $H_1: \mu \neq 98.6$ using $\alpha = 5\%$ (use the p-value).
Compute the power of the test if the true mean female body temperature is as low as 98.0.
What sample size would be required to detect a true mean female body temperature as low as 98.2 if we wanted the power of the test to be at least 0.9?

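OC Curves
Two Means, Variance Known
One-sided/two-sided formula:

$$d = \frac{|\mu_1 - \mu_2 - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$$

If the sample sizes are not equal ($n_1 \neq n_2$), use the curves with the equivalent sample size

$$n = \frac{\sigma_1^2 + \sigma_2^2}{\sigma_1^2/n_1 + \sigma_2^2/n_2}$$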

Example
The plant manager of an orange juice canning facility is interested in comparing the performance of two different production lines in his plant. As line number 1 is relatively new, he suspects that its output in number of cases per day is greater than the number of cases produced by the older line 2. Ten days of data are selected at random for each line, for which it is found that $\bar{X}_1 = 824.9$ cases per day and $\bar{X}_2 = 818.6$ cases per day. From experience with operating this type of equipment, it is known that $\sigma_1^2 = 40$ and $\sigma_2^2 = 50$. If the true difference in mean production rates were 10 cases per day, find the sample size required to detect this difference with a probability of 0.90. Use $\alpha = 0.05$.
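
As a hedged cross-check on the chart-based approach, the sketch below computes the OC parameter $d$ and the standard one-sided closed-form approximation $n = (z_\alpha + z_\beta)^2(\sigma_1^2 + \sigma_2^2)/(\Delta - \Delta_0)^2$ for equal sample sizes. It assumes $\Delta_0 = 0$ and a one-sided alternative (line 1 is suspected to be faster); the variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

# Values taken from the example; Delta_0 = 0 is an assumption.
var1, var2 = 40.0, 50.0   # known variances sigma_1^2 and sigma_2^2
delta = 10.0              # true difference in means to be detected
alpha, beta = 0.05, 0.10  # power 0.90 corresponds to beta = 0.10

# Abscissa used to enter the OC chart.
d = delta / sqrt(var1 + var2)

# One-sided closed-form approximation with n1 = n2 = n.
z = NormalDist()
z_alpha = z.inv_cdf(1 - alpha)
z_beta = z.inv_cdf(1 - beta)
n_exact = (z_alpha + z_beta) ** 2 * (var1 + var2) / delta ** 2
n = ceil(n_exact)  # round up to a whole number of days per line

print(f"d = {d:.3f}, n per line = {n}")
```

Under these assumptions the formula suggests about 8 days of data per line, so the 10 days already collected would be sufficient.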

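OC Curves: Two Means, Variance Unknown and Equal
Two/one-sided:

$$d = \frac{|\mu_1 - \mu_2 - \Delta_0|}{2\sigma} = \frac{|\Delta - \Delta_0|}{2\sigma}$$

Note: the OC curves must be used with sample size $n^* = 2n - 1$. The pooled variance $s_p^2$ is

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$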

Example
Two different catalysts are being analyzed to determine how they affect the mean yield of a chemical process. Specifically, catalyst 1 is currently in use but catalyst 2 is acceptable. Since catalyst 2 is cheaper, it should be adopted if it does not change the process yield. Pilot plant data yield $n_1 = 8$, $\bar{X}_1 = 91.73$, $s_1^2 = 3.89$, $n_2 = 8$, $\bar{X}_2 = 93.75$, $s_2^2 = 4.02$. Assume that the two normal populations have equal variances and $\alpha = 5\%$. Suppose that if catalyst 2 produces a yield that differs from the yield of catalyst 1 by 3.0%, we would like to reject the null hypothesis with probability of at least 0.85. What sample size is required?
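
The first step in this example is to compute the chart parameter $d$. The sketch below assumes the pooled standard deviation $s_p$ is used as the estimate of $\sigma$ and that $\Delta_0 = 0$. The helper `n_from_chart` is hypothetical: it only converts a chart reading $n^*$ back to a per-group sample size via $n^* = 2n - 1$, and the reading itself must still come from the OC chart.

```python
from math import ceil, sqrt

# Pilot plant data from the example.
n1, s1_sq = 8, 3.89
n2, s2_sq = 8, 4.02
diff = 3.0  # difference in mean yield to be detected

# Pooled variance and standard deviation (assumed estimate of sigma).
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
sp = sqrt(sp_sq)

# Abscissa for the two-sample t-test OC chart.
d = diff / (2 * sp)


def n_from_chart(n_star):
    """Convert a chart reading n* = 2n - 1 to per-group n, rounded up."""
    return ceil((n_star + 1) / 2)


print(f"sp^2 = {sp_sq:.3f}, sp = {sp:.3f}, d = {d:.3f}")
```

For instance, a hypothetical chart reading of $n^* = 19$ would correspond to `n_from_chart(19)`, that is, 10 observations per catalyst.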

OC Curves: Two Means, Variance Unknown and Unequal
OC curves are not reliable for this test.

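OC Curves
Test on Variance
Two-sided: the curves are plotted against

$$\lambda = \frac{\sigma}{\sigma_0}$$

where $\sigma$ is the true value and $\sigma_0$ is the hypothesized value.

One-sided:
Upper: $H_1: \sigma^2 > \sigma_0^2$
Lower: $H_1: \sigma^2 < \sigma_0^2$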

OC Curve for chi-square (two sided)

Example
A machine is being used to fill cans with a soft drink beverage. If the variance of the fill volume exceeds 0.02 (fluid ounces)$^2$, then an unacceptably large percentage of the cans will be underfilled. The bottler is interested in testing the hypothesis

$$H_0: \sigma^2 = \sigma_0^2 = 0.02$$
$$H_1: \sigma^2 > \sigma_0^2 = 0.02$$

A random sample of $n = 20$ cans yields a sample variance of $s^2 = 0.0225$. Find the probability of rejecting $H_0$ if the true variance is as large as 0.03. Use $\alpha = 5\%$.
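
A minimal sketch of the quantities this example needs, assuming the upper one-sided chi-square test on a variance. It computes the test statistic $\chi_0^2 = (n-1)s^2/\sigma_0^2$, the chart parameter $\lambda = \sigma/\sigma_0$, and the threshold against which a $\chi^2_{19}$ variable must be compared to obtain the power. The critical value 30.14 is the tabulated $\chi^2_{0.05,19}$; the variable names are illustrative.

```python
from math import sqrt

n = 20
s_sq = 0.0225          # sample variance
sigma0_sq = 0.02       # hypothesized variance
sigma_true_sq = 0.03   # true variance of interest
CHI2_CRIT = 30.14      # tabulated chi-square_{0.05, 19}

# Test statistic for H0: sigma^2 = sigma0^2.
chi2_0 = (n - 1) * s_sq / sigma0_sq
reject = chi2_0 > CHI2_CRIT

# Abscissa for the chi-square OC chart.
lam = sqrt(sigma_true_sq / sigma0_sq)

# Power = P(chi-square_19 > threshold) when the true variance is
# sigma_true_sq; evaluating it needs a chi-square table or the OC chart.
threshold = CHI2_CRIT * sigma0_sq / sigma_true_sq

print(f"chi2_0 = {chi2_0:.3f}, reject H0: {reject}")
print(f"lambda = {lam:.3f}, power threshold = {threshold:.2f}")
```

The standard library has no chi-square distribution, so the final tail probability is left to a table or to the OC chart with $\lambda$ and $n = 20$.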

Example

OC Curves
Test on Ratio of Two Variances
Two-sided: $H_1: \sigma_1^2 \neq \sigma_2^2$

One-sided: define $s_1^2$ as the larger sample variance, so that the alternative hypothesis is always $H_1: \sigma_1^2 > \sigma_2^2$.

These curves assume that $n_1 = n_2 = n$.

Example
In semiconductor manufacturing, wet chemical etching is often used to remove silicon from the backs of wafers prior to metalization. The etch rate is an important characteristic in this process and is known to follow a normal distribution. Two different etching solutions have been compared, using two random samples of 10 wafers for each solution. The observed etch rates are as follows (in mils per minute):
Solution 1
Solution 2
9.9 10.6
10.2 10.0
9.4 10.3
10.6 10.2
9.3 10.0
10.7 10.7
9.6 10.3
10.4 10.4
10.2 10.1
10.5 10.3
Suppose that if one population variance is twice as large as the other, we want to detect this with probability at least 0.90 (using $\alpha = 0.05$). Are the sample sizes $n_1 = n_2 = 10$ adequate?

Proportions
OC curves cannot be used for tests on proportions, but Montgomery (Section 9-5) outlines a procedure for computing sample size and Type I and Type II error.

Goodness of Fit

Testing for Goodness of Fit


The test is based on the chi-square distribution.
Assume there is a sample of size $n$ from a population whose probability distribution is unknown.
Let $O_i$ be the observed frequency in the ith class interval.
Let $E_i$ be the expected frequency in the ith class interval.

Goodness of Fit Methodology

Test Statistic

$$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Rejection Region

Reject $H_0$ if $\chi_0^2 > \chi_{0.05,v}^2$, where
$v = k - p - 1$
$v$ = degrees of freedom
$k$ = number of classes
$p$ = number of parameters estimated from the sample

Example
A die is tossed 60 times. The result is as follows:

Face        1    2    3    4    5    6
Frequency   15   6    12   14   7    6

Is the die fair at the 5% level of significance?

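Example
Parameter of interest: Fairness of die
Null Hypothesis: $H_0$: The die is fair
Alternative Hypothesis: $H_1$: The die is not fair
Level of Significance: $\alpha = 5\%$
Test Statistic: $\chi_0^2 = \sum_{i=1}^{6} (O_i - E_i)^2 / E_i$
Rejection Criteria: Reject $H_0$ if $\chi_0^2 > \chi_{0.05,5}^2$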
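
The die example can be worked through numerically. The sketch below assumes a fair die gives equal expected frequencies ($E_i = 60/6 = 10$), uses $p = 0$ estimated parameters so that $v = k - p - 1 = 5$, and takes the tabulated critical value $\chi^2_{0.05,5} = 11.07$. The variable names are illustrative.

```python
# Observed frequencies for faces 1 to 6 from the example.
observed = [15, 6, 12, 14, 7, 6]

n = sum(observed)           # total number of tosses
k = len(observed)           # number of classes
expected = [n / k] * k      # fair die: equal expected frequency per face

# Chi-square goodness-of-fit statistic.
chi2_0 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

p = 0                       # no parameters estimated from the data
dof = k - p - 1
CHI2_CRIT = 11.07           # tabulated chi-square_{0.05, 5}

reject = chi2_0 > CHI2_CRIT
print(f"chi2_0 = {chi2_0:.2f}, dof = {dof}, reject H0: {reject}")
```

Because the statistic (8.6) is below the critical value, the data do not give sufficient evidence at the 5% level that the die is unfair.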

Testing for Goodness of Fit


Example 9-12
The number of defects in printed circuit boards is hypothesized
to follow a Poisson Distribution. A random sample of n=60
printed boards has been collected, and the following number of
defects are observed.


Testing for Goodness of Fit

1. From the problem context, identify the parameter of interest.
   Form of the distribution of defects in printed circuit boards
2. State the null hypothesis, $H_0$.
   The form of the distribution of defects is Poisson
3. Specify an appropriate alternative hypothesis, $H_1$.
   The form of the distribution of defects is not Poisson
4. Choose a significance level, $\alpha$.
   5%
5. Determine an appropriate test statistic.
   $$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
6. State the rejection region for the statistic.
   Reject $H_0$ if $\chi_0^2 > \chi_{0.05,1}^2 = 3.84$
7. Compute any necessary sample quantities, substitute these into the equation for the test statistic, and compute that value.
8. Decide whether or not $H_0$ should be rejected and report that in the problem context.

Testing for Goodness of Fit


Example 9-13
A manufacturing engineer is testing a power supply used in a notebook computer and, using $\alpha = 0.05$, wishes to determine whether output voltage is adequately described by a normal distribution. Sample estimates of the mean and standard deviation of $\bar{x} = 5.04$ V and $s = 0.08$ V are obtained from a random sample of $n = 100$ units.

Class Interval           Oi
x < 4.948                12
4.948 ≤ x < 4.986        14
4.986 ≤ x < 5.014        12
5.014 ≤ x < 5.040        13
5.040 ≤ x < 5.066        12
5.066 ≤ x < 5.094        11
5.094 ≤ x < 5.132        12
5.132 ≤ x                14
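
A hedged sketch of how this table can be evaluated, assuming the eight class intervals were chosen to be equally probable under the fitted normal distribution (so each $E_i = 100/8 = 12.5$) and that two parameters were estimated ($p = 2$). The boundaries are recomputed from the fitted distribution as a check, and 11.07 is the tabulated $\chi^2_{0.05,5}$. The variable names are illustrative.

```python
from statistics import NormalDist

# Observed frequencies for the eight class intervals.
observed = [12, 14, 12, 13, 12, 11, 12, 14]
n, k = sum(observed), len(observed)

# Fitted normal distribution from the sample estimates.
fitted = NormalDist(mu=5.04, sigma=0.08)

# Interior boundaries that split the distribution into k equal-probability cells.
boundaries = [fitted.inv_cdf(i / k) for i in range(1, k)]

# Equal-probability cells give the same expected frequency everywhere.
expected = [n / k] * k
chi2_0 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

p = 2                  # mean and standard deviation were estimated
dof = k - p - 1
CHI2_CRIT = 11.07      # tabulated chi-square_{0.05, 5}
reject = chi2_0 > CHI2_CRIT

print([round(b, 3) for b in boundaries])
print(f"chi2_0 = {chi2_0:.2f}, dof = {dof}, reject H0: {reject}")
```

The recomputed boundaries agree with the class limits in the table to three decimals, and the small statistic gives no reason to reject normality.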

Testing for Goodness of Fit


MINIMUM FREQUENCY CORRECTION TO THE $\chi^2$ STATISTIC
All the expected frequency values $E_i$ used in the $\chi^2$ goodness-of-fit test must have a minimum frequency of 5 for the $\chi^2$ approximation to be accurate.

If necessary, combine the $E_i$ and $O_i$ values of adjacent classes so that every $E_i$ is at least 5. Combine classes in a way that does not produce an implausible grouping of the variable values.
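
One possible way to apply the minimum-frequency rule is sketched below. It merges adjacent classes from left to right until each merged class has an expected frequency of at least 5, folding any leftover tail into the last class. The function `merge_small_cells` is hypothetical, the greedy left-to-right strategy is only one reasonable choice, and the frequencies are made-up illustration values, not data from the examples.

```python
def merge_small_cells(observed, expected, min_expected=5.0):
    """Merge adjacent classes until every expected frequency >= min_expected."""
    merged_o, merged_e = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            # The accumulated class is large enough: close it.
            merged_o.append(acc_o)
            merged_e.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0:
        # Leftover tail is too small on its own: fold it into the last class.
        if merged_e:
            merged_o[-1] += acc_o
            merged_e[-1] += acc_e
        else:
            merged_o.append(acc_o)
            merged_e.append(acc_e)
    return merged_o, merged_e


# Hypothetical frequencies, for illustration only.
obs = [30, 18, 8, 3, 1]
exp = [28.0, 20.0, 8.0, 3.0, 1.0]
new_obs, new_exp = merge_small_cells(obs, exp)
print(new_obs, new_exp)
```

After merging, the number of classes $k$ used in $v = k - p - 1$ is the number of merged classes, here 3.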

fin
