Simple Comparative Experiments: Design of Experiments - Montgomery Section 2-4 and 2-5

Simple Comparative Experiments
Comparing Two Means
Design of Experiments - Montgomery

Section 2-4 and 2-5
STAT 514
Statistical Inference
Will review two common types of statistical inference
Hypothesis testing
Confidence intervals
Will consider data collected under two different experimental designs
Completely randomized design (CRD)
Randomized complete block design (RCBD)
Method of analysis depends on the design used

Completely Randomized Design

Our focus now is on two treatments and one population
Experimental units obtained from the population
Randomly assign treatments to each experimental unit
Key: All treatment allocations are equally likely
Each EU has the same allocation probabilities
If $n_1$ EUs are to be assigned Trt 1 and $n_2$ are to be assigned Trt 2, then
$P(\text{EU is assigned Trt 1}) = n_1/(n_1 + n_2)$
Response data typically denoted using subscripts

Treatment 1: $y_{11}, y_{12}, \ldots, y_{1n_1}$
Treatment 2: $y_{21}, y_{22}, \ldots, y_{2n_2}$
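
The randomization step of a CRD can be sketched in a few lines of Python. This is only an illustration, not part of the course materials: the unit labels `EU1` ... `EU10`, the function name `assign_treatments`, and the seed are hypothetical. Drawing a simple random sample of $n_1$ units for Trt 1 makes every allocation equally likely, so each EU receives Trt 1 with probability $n_1/(n_1 + n_2)$.

```python
import random

def assign_treatments(units, n1, seed=None):
    """Randomly assign Trt 1 to n1 of the units and Trt 2 to the rest."""
    rng = random.Random(seed)
    # Every subset of size n1 is equally likely to be chosen for Trt 1.
    trt1 = set(rng.sample(range(len(units)), n1))
    return {u: (1 if i in trt1 else 2) for i, u in enumerate(units)}

units = [f"EU{k}" for k in range(1, 11)]          # ten hypothetical experimental units
allocation = assign_treatments(units, n1=5, seed=514)
print(allocation)
```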

Two-sample t Test
Interest in comparing mean responses to Trt 1 and Trt 2
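$H_0: \mu_1 = \mu_2$ (Null Hypothesis)
$H_1: \mu_1 < \mu_2$, $\mu_1 > \mu_2$, or $\mu_1 \neq \mu_2$ (Alternative Hypothesis)
Compute t statistic
$t_0 = (\bar{y}_1 - \bar{y}_2) / \sqrt{S_1^2/n_1 + S_2^2/n_2}$
Is the observed $t_0$ unusual if $H_0: \mu_1 = \mu_2$ is true?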

Assumptions
1. Independent responses
2. Normally distributed observations
Assuming $H_0: \mu_1 = \mu_2$, the sampling distribution of $t_0$
is approximately t distributed with $k$ degrees of freedom.
Why is it approximate?
$k$ typically estimated using a Satterthwaite approximation
"Unusual" is quantified as the probability that a randomly
drawn t is more extreme than $t_0$ (tail region)
Reject the null hypothesis when this probability (the
P-value) is small. "Small" is typically based on the choice of
significance level $\alpha$
Equal Variance t test

One can assume variances of treatment response are
equal
In that case, we pool our two variance estimates
$t_0 = (\bar{y}_1 - \bar{y}_2) / \left( S_p \sqrt{1/n_1 + 1/n_2} \right)$, $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
Then assuming $H_0: \mu_1 = \mu_2$ results in $t_0$ being t distributed with $n_1 + n_2 - 2$ degrees of freedom
This method is generally not recommended. Used to be
more popular when the P -value was determined from t
tables rather than a computer

Example
(Samuels 7.36) In a study of lettuce growth, ten seedlings
were randomly allocated to be grown in either a standard
nutrient solution or in a solution containing extra
nitrogen. After 22 days, the plants were harvested and
weighed. The table below summarizes the results. Can we
conclude that extra nitrogen enhances growth?
Leaf Dry Weight (gm)

Nutrient Solution    n    Mean    SD
Standard             5    3.62    0.54
Extra                5    4.17    0.67

Solution I
The test statistic is
$t_0 = \frac{4.17 - 3.62}{\sqrt{0.54^2/5 + 0.67^2/5}} = 1.43$
The degrees of freedom are approximated by
$df = \frac{(0.54^2/5 + 0.67^2/5)^2}{\frac{(0.54^2/5)^2}{5-1} + \frac{(0.67^2/5)^2}{5-1}} = 7.6546$
This question asks about a one-sided alternative. With 7.655 degrees of freedom, the P-value equals 0.09625. At the $\alpha = 0.05$ significance level, we would not reject the null and conclude that there is not sufficient evidence to state that extra nitrogen enhances growth.
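
The arithmetic in Solution I can be checked with a short Python sketch. It is illustrative only: the function name `welch_t` is made up here, and the inputs are the rounded summary statistics from the lettuce table, so the degrees of freedom differ slightly from what software reports when it works from the raw data.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t0, df) for the unequal-variance two-sample t test."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2                 # S_i^2 / n_i
    t0 = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t0, df

t0, df = welch_t(4.17, 0.67, 5, 3.62, 0.54, 5)        # Extra minus Standard
print(round(t0, 2), round(df, 4))
```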

Solution II
The pooled variance is
$S_p^2 = (4(0.54)^2 + 4(0.67)^2)/8 = 0.37$
Our test statistic is
$t_0 = (4.17 - 3.62) / \sqrt{2(0.37)/5} = 1.43$
This question asks about a one-sided alternative. With 8 degrees of freedom, the P-value equals 0.09502. If $\alpha = 0.10$, we would reject the null and conclude that extra nitrogen enhances growth. For $\alpha = 0.05$, we would not reject the null and conclude there is not sufficient evidence to state that the extra nitrogen enhances growth.
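
A minimal Python sketch of the pooled calculation follows. It uses only the standard library, so the upper-tail probability of the t distribution is approximated by numerically integrating the t density with Simpson's rule; this is a stand-in for a statistical library call, not how SAS or R compute it. The function names are hypothetical, and the inputs are the rounded summary statistics, so the P-value agrees with the slide only to about three decimals.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 - total * h / 3

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t0, df) for the equal-variance two-sample t test."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df   # pooled variance
    return (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2)), df

t0, df = pooled_t(4.17, 0.67, 5, 3.62, 0.54, 5)
p_one_sided = t_upper_tail(t0, df)                        # H1: extra > standard
print(round(t0, 2), df, round(p_one_sided, 3))
```

Because `t_upper_tail` accepts non-integer degrees of freedom, the same helper could also be applied to a Satterthwaite df.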

Statistical Model Approach

For pooled two-sample t test, can express model as

$y_{ij} = \mu + \tau_i + \epsilon_{ij}$, $i = 1, 2$; $j = 1, 2, \ldots, n_i$
where
$\mu + \tau_i$ = mean for treatment $i$
$\epsilon_{ij}$ are iid random errors: $\epsilon_{ij} \sim N(0, \sigma^2)$
Can express Null in terms of treatment effects $\tau_1$ and $\tau_2$
$H_0: \tau_1 = \tau_2 = 0$
$H_1$: at least one $\tau_i$ different from 0

Will use these linear model representations throughout the course

Using SAS - t Test

tandpermtest.sas
data lettuce;
input solution $ weight @@;
cards;
Standard 3.25 Standard 3.83 Standard 4.28
Standard 2.91 Standard 3.83 Extra 4.85
Extra 3.30 Extra 4.23 Extra 4.76 Extra 3.71
;
proc ttest;
class solution;
var weight;
run;

Output
The TTEST Procedure
Variable: weight
solution      N    Mean      Std Dev   Std Err   Minimum   Maximum
Extra         5    4.1700    0.6676    0.2985    3.3000    4.8500
Standard      5    3.6200    0.5396    0.2413    2.9100    4.2800
Diff (1-2)         0.5500    0.6070    0.3839

solution      Method           Mean      95% CL Mean          Std Dev
Extra                          4.1700     3.3411   4.9989     0.6676
Standard                       3.6200     2.9500   4.2900     0.5396
Diff (1-2)    Pooled           0.5500    -0.3352   1.4352     0.6070
Diff (1-2)    Satterthwaite    0.5500    -0.3421   1.4421

Method           Variances    DF        t Value    Pr > |t|
Pooled           Equal        8         1.43       0.1898
Satterthwaite    Unequal      7.6633    1.43       0.1914

Using SAS - Permutation Test

proc multtest data=lettuce permutation nsample=1000 outsamp=pdist seed=612;
   test mean(weight);
   contrast 'Extra vs Standard' 1 -1;
   class solution;
run;
*********Generate Histogram of Sampling Distribution***********;
* Suppress printed output while PROC GLM runs once per permutation sample;
ods graphics off; ods exclude all; ods noresults;
proc glm data=pdist;
   class _class_;
   model weight = _class_;
   by _sample_;
   estimate 'Extra vs Standard' _class_ 1 -1;
   ods output Estimates=pdist1;   * one t value per permutation sample;
run;
ods graphics on; ods exclude none; ods results;
proc univariate data=pdist1 noprint;
   histogram tValue;
run;

Output
The Multtest Procedure
Model Information
Test for continuous variables     Mean t-test
Degrees of Freedom Method         Pooled
Tails for continuous tests        Two-tailed
Strata weights                    None
P-value adjustment                Permutation
Center continuous variables       No
Number of resamples               1000
Seed                              612

p-Values
Variable    Contrast             Raw       Permutation
weight      Extra vs Standard    0.1898    0.2390
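
With only ten seedlings, the permutation distribution can also be enumerated exactly instead of sampled. The Python sketch below is an illustration under stated assumptions, not a reproduction of PROC MULTTEST: it lists all 252 ways to choose which five seedlings are labelled "Extra" and uses the difference in sample means as the test statistic. With equal group sizes, ordering allocations by the absolute mean difference is equivalent to ordering them by the absolute pooled t statistic, so the two-sided P-value should be close to, but not identical to, the resampling estimate in the output above.

```python
from itertools import combinations

standard = [3.25, 3.83, 4.28, 2.91, 3.83]
extra = [4.85, 3.30, 4.23, 4.76, 3.71]
pooled = standard + extra
observed = sum(extra) / 5 - sum(standard) / 5      # observed mean difference

diffs = []
for idx in combinations(range(10), 5):             # which 5 units are labelled "Extra"
    grp_sum = sum(pooled[i] for i in idx)
    rest_sum = sum(pooled) - grp_sum
    diffs.append(grp_sum / 5 - rest_sum / 5)

# Two-sided P-value: share of allocations at least as extreme as observed
# (small tolerance guards against floating-point ties).
p_two_sided = sum(abs(d) >= abs(observed) - 1e-9 for d in diffs) / len(diffs)
print(len(diffs), round(observed, 2), round(p_two_sided, 4))
```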

Approximate Sampling Distribution
[Figure: histogram of the permutation t values (tValue) from the SAS code above; image not reproduced]

Type I and Type II errors

In hypothesis testing, two types of errors are possible
                 TEST RESULT
REALITY       DNR        R
H0                       I
H1            II
(DNR = do not reject $H_0$, R = reject $H_0$)
Type I error: P(reject $H_0$ | $H_0$ true) (false positive)
Type II error: P(do not reject $H_0$ | $H_0$ false) (false negative)
Power of a test (for specific $H_1$) is 1 - P(Type II error)
Significance level $\alpha$ is our selected P(Type I error)
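
The two error rates can be illustrated by simulation. The Python sketch below makes several assumptions that are not in the slides: a two-sided z test with known $\sigma = 1$, $n = 10$ per group, a true difference of 1 under the alternative, 5000 replications, and an arbitrary seed. When $H_0$ is true, the rejection rate estimates P(Type I error) and should sit near $\alpha$; when $H_0$ is false, it estimates the power.

```python
import random
from statistics import NormalDist, mean

def reject_h0(y1, y2, sigma, alpha=0.05):
    """Two-sided z test of H0: mu1 = mu2 with known sigma and equal n."""
    n = len(y1)
    z0 = (mean(y1) - mean(y2)) / (sigma * (2 / n) ** 0.5)
    return abs(z0) > NormalDist().inv_cdf(1 - alpha / 2)

def rejection_rate(delta, sigma=1.0, n=10, reps=5000, seed=514):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        y1 = [rng.gauss(delta, sigma) for _ in range(n)]
        y2 = [rng.gauss(0.0, sigma) for _ in range(n)]
        hits += reject_h0(y1, y2, sigma)
    return hits / reps

type1 = rejection_rate(delta=0.0)   # H0 true: estimates P(Type I error)
power = rejection_rate(delta=1.0)   # H0 false: estimates 1 - P(Type II error)
print(type1, power)
```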

Choice of Sample Size/Power
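[Figure: sampling distributions under the null hypothesis (region labelled P(Type I)) and under the alternative hypothesis (region labelled P(Type II)); horizontal axes run from -4 to 4]
Goal of test: Detect diff of size $\delta = |\mu_1 - \mu_2|$ with high prob
Choice of $\delta$ based on science (e.g., practical significance)
Probability to detect difference is the power
Power depends on $\alpha$, $\delta$, $\sigma$, and $n$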

Calculating Power
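Assume $\sigma$ is known (i.e., use Normal) and $n_1 = n_2 = n$
$H_0$: $\bar{y}_1 - \bar{y}_2 \sim N(0, 2\sigma^2/n)$
$H_1$: $\bar{y}_1 - \bar{y}_2 \sim N(\delta, 2\sigma^2/n)$
Reject if (using the $H_0$ sampling dist)
$\bar{y}_1 - \bar{y}_2 > z_{\alpha/2} \sqrt{2\sigma^2/n}$ or $\bar{y}_1 - \bar{y}_2 < -z_{\alpha/2} \sqrt{2\sigma^2/n}$
Power: (using the $H_1$ sampling dist and the rejection region)
$P\left(z > z_{\alpha/2} - \delta/\sqrt{2\sigma^2/n}\right) + P\left(z < -z_{\alpha/2} - \delta/\sqrt{2\sigma^2/n}\right)$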

Example I
Suppose $\alpha = .05$, $\sigma^2 = 12.5$, $n = 25$, and $\delta = 3.5$
$z_{\alpha/2} = 1.96$ and $2\sigma^2/25 = 1$
Power = $P(z > 1.96 - 3.5) + P(z < -1.96 - 3.5)$
= .9382 + .0000
If n were only 10, $\delta/\sqrt{2\sigma^2/10} = 2.214$ and
Power = $P(z > 1.96 - 2.214) + P(z < -1.96 - 2.214)$
= .6001 + .0000
The larger the sample size, the greater the power
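
The known-$\sigma$ power formula is easy to evaluate directly. The sketch below uses Python's standard-library `NormalDist`; the function name `z_test_power` is hypothetical, and the inputs are the values from Example I.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(delta, sigma2, n, alpha=0.05):
    """Power of the two-sided z test with known variance and n per group."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # z_{alpha/2}
    shift = delta / sqrt(2 * sigma2 / n)       # delta / sqrt(2 sigma^2 / n)
    upper = 1 - z.cdf(z_crit - shift)          # P(z >  z_crit - shift)
    lower = z.cdf(-z_crit - shift)             # P(z < -z_crit - shift)
    return upper + lower

power_25 = z_test_power(delta=3.5, sigma2=12.5, n=25)
power_10 = z_test_power(delta=3.5, sigma2=12.5, n=10)
print(round(power_25, 4), round(power_10, 4))
```

Calling the function over a range of `n` values gives a quick way to see how fast power grows with the sample size.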

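Power Calculations ($\sigma$ unknown)
Can reference an Operating Characteristic Curve (Power curve) for different levels of $\alpha$, $\delta/\sigma$, and $n$
Can use non-central and central t distribution functions
Reject if:
$\bar{y}_1 - \bar{y}_2 > t_{\alpha/2,\,2(n-1)} \sqrt{2S_p^2/n}$ or $\bar{y}_1 - \bar{y}_2 < -t_{\alpha/2,\,2(n-1)} \sqrt{2S_p^2/n}$
Power: P(reject | $H_1$)
Under $H_1$: $\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{2S_p^2/n}} \sim t_{2(n-1)}\left(\delta/\sqrt{2\sigma^2/n}\right)$ (noncentral t)
Noncentral parameter $\delta/\sqrt{2\sigma^2/n}$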
Compute probability of rejection given noncentral t

Using SAS - Example on Page 47

tpower.sas
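proc power;
   twosamplemeans alpha=.05 nulldiff=0 sides=2
      meandiff=.5 npergroup=. stddev=.25
      power=.95;
run;
proc power;
   twosamplemeans alpha=.05 nulldiff=0 sides=2
      meandiff=.25 npergroup=. stddev=.25
      power=.90;
run;
* Power curve: test options assumed to match the calls above;
proc power;
   twosamplemeans alpha=.05 nulldiff=0 sides=2
      meandiff=.25 stddev=.25 power=.
      npergroup=2 to 25 by 1;
   plot interpol=join yopts=(ref=0.80);
run;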

Output
The POWER Procedure
Two-Sample t Test for Mean Difference
Fixed Scenario Elements

Distribution Normal
Method Exact
Number of Sides 2
Null Difference 0
Alpha 0.05
Mean Difference 0.5
Standard Deviation 0.25
Nominal Power 0.95
Computed N Per Group
Actual Power    N Per Group
0.960           8
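
The computed power can be sanity-checked by simulation without any noncentral t routine. The Python sketch below is a Monte Carlo approximation, not the exact method PROC POWER uses: it repeatedly draws two normal samples with mean difference 0.5, standard deviation 0.25, and $n = 8$ per group, then applies the pooled two-sided t test. The critical value $t_{0.025,14} = 2.145$ is hard-coded from a t table, and the replication count and seed are arbitrary choices.

```python
import random
from statistics import mean, variance

T_CRIT_14 = 2.145   # t_{0.025, 14} from a t table (hard-coded assumption)

def simulated_power(mean_diff, sd, n, t_crit, reps=10000, seed=514):
    """Monte Carlo estimate of the power of the pooled two-sided t test."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        y1 = [rng.gauss(mean_diff, sd) for _ in range(n)]
        y2 = [rng.gauss(0.0, sd) for _ in range(n)]
        sp2 = (variance(y1) + variance(y2)) / 2        # pooled variance, equal n
        t0 = (mean(y1) - mean(y2)) / (2 * sp2 / n) ** 0.5
        rejections += abs(t0) > t_crit
    return rejections / reps

power_n8 = simulated_power(mean_diff=0.5, sd=0.25, n=8, t_crit=T_CRIT_14)
print(power_n8)
```

The estimate should land near the "Actual Power" reported above, within simulation error.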

Output
The POWER Procedure
Two-Sample t Test for Mean Difference

Distribution Normal
Method Exact
Number of Sides 2
Null Difference 0
Alpha 0.05
Mean Difference 0.25
Nominal Power 0.9
Computed N Per Group
Actual Power    N Per Group
0.912           23

Confidence Intervals
In addition to an estimate, want statement of precision
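100(1 - $\alpha$)% confidence intervals
$(\bar{y}_1 - \bar{y}_2) \pm t_{n_1+n_2-2,\,\alpha/2}\, S_p \sqrt{1/n_1 + 1/n_2}$ (equal variances)
$(\bar{y}_1 - \bar{y}_2) \pm t_{df,\,\alpha/2} \sqrt{S_1^2/n_1 + S_2^2/n_2}$ (unequal variances, Satterthwaite df)
If underlying assumptions are true, $\mu_1 - \mu_2$ will fall within 100(1 - $\alpha$)% of the possible intervals
You are 100(1 - $\alpha$)% confident your single CI is one of these intervals that contains $\mu_1 - \mu_2$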

Hypothesis Testing and CIs
Consider a two-sided hypothesis test at level $\alpha$
Reject if $|\bar{y}_1 - \bar{y}_2| > t_{n_1+n_2-2,\,\alpha/2}\, S_p \sqrt{1/n_1 + 1/n_2}$
Now consider a 100(1 - $\alpha$)% CI
Half-width of CI is $t_{n_1+n_2-2,\,\alpha/2}\, S_p \sqrt{1/n_1 + 1/n_2}$
0 not in interval if $|\bar{y}_1 - \bar{y}_2| > t_{n_1+n_2-2,\,\alpha/2}\, S_p \sqrt{1/n_1 + 1/n_2}$
Will reject $H_0$ if 0 not in confidence interval
Allows us to immediately test any $H_0: \mu_1 - \mu_2 = \delta_0$ at level $\alpha$
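
The link between the interval and the test can be seen on the lettuce summary statistics. The Python sketch below is illustrative: the function name `pooled_ci` is made up, the critical value $t_{0.025,8} = 2.306$ is hard-coded from a t table, and the inputs are the rounded means and SDs, so the limits differ slightly in the third decimal from software output based on the raw data.

```python
import math

T_CRIT_8 = 2.306   # t_{0.025, 8} from a t table (hard-coded assumption)

def pooled_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Equal-variance confidence interval for mu1 - mu2."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    half_width = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - half_width, diff + half_width

lo, hi = pooled_ci(4.17, 0.67, 5, 3.62, 0.54, 5, T_CRIT_8)
reject_h0 = not (lo <= 0 <= hi)     # two-sided test of mu1 = mu2 at alpha = 0.05
print(round(lo, 3), round(hi, 3), reject_h0)
```

Replacing `0` with any other hypothesized difference tests that value at the same level using the same interval.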

Sample Size Determination
Want the sample size for a desired precision/half-width

Consider equal variance and n1 = n2 = n
Half-width of CI depends on Sp and n
Problem: Sp is a random variable (depends on sample)
Solution: Specify desired probability that the observed
half width is less than or equal to the desired precision

Using SAS
tpower.sas
proc power;
twosamplemeans ci=diff alpha=.05 halfwidth=.25 stddev=.25
npergroup=. probwidth=0.80;
run;
proc power;
twosamplemeans ci=diff alpha=.05 halfwidth=.25 stddev=.25
npergroup=. probwidth=0.50;
run;

Output
The POWER Procedure
Confidence Interval for Mean Difference
Distribution Normal
Method Exact
Alpha 0.05
CI Half-Width 0.25
Nominal Prob(Width) 0.8
Number of Sides 2
Prob Type Conditional
Actual Prob(Width)    N Per Group
0.801                 11

Output
The POWER Procedure
Confidence Interval for Mean Difference
Distribution Normal
Method Exact
Alpha 0.05
CI Half-Width 0.25
Nominal Prob(Width) 0.5
Number of Sides 2
Prob Type Conditional
Actual Prob(Width)    N Per Group
0.533                 9

Paired Comparison
Can often improve precision by pairing similar EUs
Removes variation between EUs
Twins for drug/health studies - subject variability
Same tissue specimen given both trts - specimen variability
Similar plots in a field - plot variability
Because of the pairing, we perform inference on the pair differences $d_i$. This reduces the $2n$ observations into $n$ independent differences
$d_i = y_{1i} - y_{2i}$, $S_d^2 = \frac{1}{n-1} \sum_i (d_i - \bar{d})^2$
$t_0 = \bar{d} / (S_d / \sqrt{n})$
$t_0 \sim t_{n-1}$

Randomized Complete Block Design
Design allows you to remove known sources of variability that you are not interested in and therefore improve precision
Group EUs into blocks such that the EUs in a block are as similar as possible. EUs across blocks can be very different.
Randomly assign treatments to EUs within a block using a process similar to CRD
Creates a restriction on the possible allocations: fewer possible allocations than with CRD
A pair of EUs is the smallest block size

Statistical Model
Pairing included as block effect ($\beta_j$) in linear model
$y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$, $i = 1, 2$; $j = 1, 2, \ldots, n$
$E(\bar{y}_{1.} - \bar{y}_{2.})$ is still $\tau_1 - \tau_2$ as the $\beta$s cancel
Lose degrees of freedom because of block effects

                                     Two sample        Paired
$V(\bar{y}_{1.} - \bar{y}_{2.})$     $2\sigma^2/n$     $2(\sigma^2 - Cov(Y_1, Y_2))/n$
DF                                   $2(n-1)$          $n-1$

Pairing advantageous when positive correlation. If the covariance is slight relative to $\sigma^2$, then loss of dfs may result in blocking being a disadvantage. Only block on known sources of variability that you want to remove.
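
The variance comparison can be made concrete with a few lines of Python. The numbers are hypothetical ($\sigma^2 = 4$, $n = 10$, and three correlation values), and the covariance is written as $\rho\sigma^2$, which assumes both responses in a pair have the same variance.

```python
def var_two_sample(sigma2, n):
    """Variance of the mean difference under the two-sample (CRD) design."""
    return 2 * sigma2 / n

def var_paired(sigma2, cov, n):
    """Variance of the mean difference under the paired design."""
    return 2 * (sigma2 - cov) / n

sigma2, n = 4.0, 10                      # hypothetical values
for rho in (0.0, 0.5, 0.9):
    cov = rho * sigma2                   # Cov(Y1, Y2) = rho * sigma^2
    print(rho, var_two_sample(sigma2, n), var_paired(sigma2, cov, n))
```

With zero correlation the two variances coincide, so pairing only costs degrees of freedom; the gain in precision appears as the correlation grows.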
Example
Paired T-test/Randomization Paired Test
In a study of egg cell maturation, the eggs from each of four female frogs were divided into two batches and one batch was exposed to progesterone. After two minutes, the cAMP content was measured. It is believed that cAMP is a substance that can mediate cellular response to hormones.

        cAMP Content
FROG    Control    Progesterone    Diff
1       6          4                2
2       4          5               -1
3       5          2                3
4       4          2                2

Solutions
t-test: $d = \{2, -1, 3, 2\}$, $\bar{d} = 1.5$, $s_d = 1.732$, and $s_d/\sqrt{n} = 0.866$. The test
statistic is $1.5/0.866 = 1.732$. With 3 df, the two-sided P-value is close to 0.18.
randomization: Under $H_0$, assume the magnitude of each difference does not depend on
the allocation of treatments; only its sign does. There are $2^4 = 16$ allocations.
The observed outcome is $\sum d = 2 - 1 + 3 + 2 = 6$.

$|\sum d|$    # of occurrences
8             2
6             2
4             4
2             6
0             2

From the table, four of the sixteen outcomes are as extreme as or more extreme than the observed outcome simply
due to chance. Thus the P-value is 4/16 = 0.25.
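
The table of outcomes can be generated by enumerating all 16 sign assignments. The Python sketch below is a minimal illustration using the four observed differences; the variable names are arbitrary.

```python
from collections import Counter
from itertools import product

d = [2, -1, 3, 2]                         # observed pair differences
observed = abs(sum(d))                    # |sum d| = 6

# Under H0 each difference keeps its magnitude but its sign is random.
totals = [abs(sum(s * abs(x) for s, x in zip(signs, d)))
          for signs in product([1, -1], repeat=len(d))]

counts = Counter(totals)                  # occurrences of each |sum d|
p_value = sum(t >= observed for t in totals) / len(totals)
print(sorted(counts.items(), reverse=True), p_value)
```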

Using R
- Using simulation to approximate the P-value -
#### t-test ####
diff <- c(2,-1,3,2)                        ## pair differences (control - progesterone)
tdiff <- (mean(diff)/sqrt(var(diff)/4))    ## observed paired t statistic, n = 4
pvalue <- 2*pt(-abs(tdiff),3)              ## two-sided P-value with 3 df
#### permutation test ####

absdiff <- abs(diff)
tdist <- numeric(length=10000)
for(iperm in 1:10000)
{
randassign <- 2*rbinom(4,1,.5)-1 ##Creates set of +s and -s
diff1 <- absdiff*randassign
tdist[iperm] <- (mean(diff1)/sqrt(var(diff1)/4)) ##Creates sampling dist of t statistics
}
hist(tdist,nclass=530)
pvalue1 <- mean(abs(tdist) >= abs(tdiff))  ## share of simulated t at least as extreme
print(c(pvalue,pvalue1))

[Figure: Histogram of tdist; x-axis tdist (about -4 to 4), y-axis Frequency (0 to 1500)]

Using SAS
data camp;
input frog ctrl prog;
diff = ctrl - prog;   /* same sign convention as the Diff column and the PAIRED output */
cards;
1 6 4
2 4 5
3 5 2
4 4 2
;
proc ttest;
paired ctrl*prog;
run;

Output
The TTEST Procedure
Difference: ctrl - prog
N    Mean      Std Dev   Std Err   Minimum    Maximum
4    1.5000    1.7321    0.8660    -1.0000    3.0000

Mean      95% CL Mean           Std Dev   95% CL Std Dev
1.5000    -1.2561   4.2561      1.7321    0.9812   6.4580

DF    t Value    Pr > |t|
3     1.73       0.1817
