
Analysis of Variance and Design of Experiments-II

MODULE III: EXPERIMENTAL DESIGN
LECTURE - 12: EXPERIMENTAL DESIGN MODELS

Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
We consider the models which are used in designing an experiment.
The experimental conditions, the experimental setup and the objective of the study essentially determine what type of design is to be used, and hence which type of design model can be used for the further statistical analysis to arrive at conclusions. These models are based on one-way classification, two-way classification (with or without interaction), etc. We discuss them now under the setup of one-way and two-way classifications, which can be extended further to any order of classification.

It may be noted that it has already been described how to develop the likelihood ratio tests for testing the hypothesis of equality of more than two means from normal distributions, and now we will concentrate on deriving the same tests through the least squares principle under the setup of the linear regression model. The design matrix is assumed to be not necessarily of full rank and to consist of 0s and 1s only.
One-way classification
Let $p$ random samples from $p$ normal populations with the same variance but different means and different sample sizes be drawn independently. Let the observations follow the linear regression model setup, and let $Y_{ij}$ denote the $j$th observation of the dependent variable $Y$ when the effect of the $i$th level of the factor is present. Then the $Y_{ij}$ are independently normally distributed with

$E(Y_{ij}) = \mu + \alpha_i, \quad i = 1,2,\ldots,p, \; j = 1,2,\ldots,n_i,$
$Var(Y_{ij}) = \sigma^{2},$
where
- $\mu$ is the general mean effect:
  - it is fixed;
  - it gives an idea about the general conditions of the experimental units and treatments;
- $\alpha_i$ is the effect of the $i$th level of the factor:
  - it can be fixed or random.
Example: Consider a medicine experiment in which there are three different dosages of medicine - 2 mg., 5 mg., 10 mg. - which are given to patients having fever. These are the 3 levels of medicine. Let $Y$ denote the time taken by the medicine to reduce the body temperature from high to normal. Suppose two patients have been given the 2 mg. dosage, so $Y_{11}$ and $Y_{12}$ will denote their responses. So we can write that when $\alpha_1 = 2$ mg. is given to the two patients, then

$E(Y_{1j}) = \mu + \alpha_1; \quad j = 1,2.$

Similarly, if 5 mg. and 10 mg. of dosages are given to 4 and 7 patients respectively, then the responses follow the models

$E(Y_{2j}) = \mu + \alpha_2; \quad j = 1,2,3,4,$
$E(Y_{3j}) = \mu + \alpha_3; \quad j = 1,2,\ldots,7.$

Here $\mu$ denotes the general mean effect, which may be thought of as follows: the human body has a tendency to fight against the fever, so the time taken by the medicine to bring down the temperature depends on many factors like body weight, height, etc. So $\mu$ denotes the general effect of all these factors, which is present in all the observations.

In the terminology of the linear regression model, $\mu$ denotes the intercept term, which is the value of the response variable when all the independent variables are set to zero. In experimental designs, models with an intercept term are more commonly used, and so generally we consider these types of models.
Also, we can express

$Y_{ij} = \mu + \alpha_i + \epsilon_{ij}; \quad i = 1,2,\ldots,p, \; j = 1,2,\ldots,n_i,$

where $\epsilon_{ij}$ is the random error component in $Y_{ij}$. It indicates the variations due to uncontrolled causes which can influence the observations. We assume that the $\epsilon_{ij}$'s are identically and independently distributed as $N(0, \sigma^{2})$ with $E(\epsilon_{ij}) = 0$, $Var(\epsilon_{ij}) = \sigma^{2}$.

Note that the general linear model considered is

$E(Y) = X\beta,$

for which $Y_{ij}$ can be written as $E(Y_{ij}) = \mu_i$ when all the entries in $X$ are 0s or 1s. This model can also be re-expressed in the form

$E(Y_{ij}) = \mu + \alpha_i.$
This gives rise to some more issues. Consider

$E(Y_{ij}) = \mu_i = \mu + (\mu_i - \mu) = \mu + \alpha_i,$

where $\alpha_i = \mu_i - \mu$ and $\mu = \dfrac{1}{p}\sum_{i=1}^{p} \mu_i.$
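For concreteness, here is a minimal simulation sketch of this model for the medicine example above; the numerical values of $\mu$, the $\alpha_i$ and $\sigma$ are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Minimal simulation of Y_ij = mu + alpha_i + eps_ij for the medicine
# example: p = 3 dosage levels with n = (2, 4, 7) patients.
# mu, alpha and sigma below are illustrative assumptions.
rng = np.random.default_rng(12)

mu = 30.0                           # general mean effect
alpha = np.array([5.0, 0.0, -5.0])  # effects of the 2, 5 and 10 mg. levels
sigma = 2.0                         # common error standard deviation
n = [2, 4, 7]                       # sample sizes n_1, n_2, n_3

# eps_ij ~ N(0, sigma^2), independent across observations
y = [mu + alpha[i] + sigma * rng.standard_normal(n[i]) for i in range(3)]

for i, yi in enumerate(y):
    # each group mean estimates mu + alpha_i, not alpha_i alone
    print(f"level {i + 1}: mean = {yi.mean():.2f}, mu + alpha_i = {mu + alpha[i]:.2f}")
```

Note that each group mean estimates $\mu + \alpha_i$ jointly; as discussed below, $\mu$ and $\alpha_i$ cannot be separated without a further restriction.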
Now let us see the changes in the structure of the design matrix and the vector of regression coefficients. The model can now be written as

$E(Y) = X^{*}\beta^{*}, \quad Cov(Y) = \sigma^{2} I,$

where $\beta^{*} = (\mu, \alpha_1, \alpha_2, \ldots, \alpha_p)'$ is a $(p+1) \times 1$ vector, $X^{*} = [\mathbf{1}, X]$ is an $n \times (p+1)$ matrix with $n = \sum_{i=1}^{p} n_i$ and $\mathbf{1}$ the $n \times 1$ vector of ones, and $X$ denotes the earlier defined design matrix with the first $n_1$ rows as $(1,0,0,\ldots,0)$, the next $n_2$ rows as $(0,1,0,\ldots,0)$, and so on up to the last $n_p$ rows as $(0,0,0,\ldots,1)$.

We earlier assumed that rank$(X) = p$, but can we also say that $X^{*}$ has full column rank $p+1$ in the present case? Since the first column of $X^{*}$ is the vector sum of all its remaining $p$ columns, rank$(X^{*}) = p$. It is thus apparent that not all linear parametric functions of $\mu, \alpha_1, \alpha_2, \ldots, \alpha_p$ are estimable. The question now arises: what kind of linear parametric functions are estimable?
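This rank deficiency is easy to verify numerically; a small sketch for the medicine example sizes $n_1 = 2$, $n_2 = 4$, $n_3 = 7$:

```python
import numpy as np

# Design matrices for the medicine example: n_1 = 2, n_2 = 4, n_3 = 7, p = 3.
n = [2, 4, 7]
p = len(n)

X = np.repeat(np.eye(p), n, axis=0)            # indicator columns, 0s and 1s
X_star = np.hstack([np.ones((sum(n), 1)), X])  # X* = [1, X]

print(np.linalg.matrix_rank(X))       # 3 = p (full column rank)
print(np.linalg.matrix_rank(X_star))  # 3, not p + 1 = 4
# the first column of X* equals the sum of its remaining columns:
print(np.allclose(X_star[:, 0], X_star[:, 1:].sum(axis=1)))  # True
```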
Consider any linear estimator

$L = \sum_{i=1}^{p}\sum_{j=1}^{n_i} a_{ij} Y_{ij} \quad \text{with} \quad C_i = \sum_{j=1}^{n_i} a_{ij}.$
Now

$E(L) = \sum_{i=1}^{p}\sum_{j=1}^{n_i} a_{ij} E(Y_{ij}) = \sum_{i=1}^{p}\sum_{j=1}^{n_i} a_{ij} (\mu + \alpha_i) = \mu \sum_{i=1}^{p}\sum_{j=1}^{n_i} a_{ij} + \sum_{i=1}^{p} \alpha_i \sum_{j=1}^{n_i} a_{ij} = \mu \sum_{i=1}^{p} C_i + \sum_{i=1}^{p} C_i \alpha_i.$
Thus $\sum_{i=1}^{p} C_i \alpha_i$ is estimable if and only if

$\sum_{i=1}^{p} C_i = 0,$

i.e., $\sum_{i=1}^{p} C_i \alpha_i$ is a contrast.

Thus, in general, neither $\mu$ nor any of $\alpha_1, \alpha_2, \ldots, \alpha_p$ is estimable. If the function is a contrast, then it is estimable.
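Estimability can also be checked numerically using the standard criterion that $\lambda'\beta^{*}$ is estimable if and only if $\lambda'$ lies in the row space of $X^{*}$. A sketch (the helper `is_estimable` is a hypothetical name introduced here for illustration):

```python
import numpy as np

# lambda' beta* is estimable iff lambda' lies in the row space of X*,
# i.e. appending lambda' as an extra row does not increase the rank.
n = [2, 4, 7]
p = len(n)
X_star = np.hstack([np.ones((sum(n), 1)), np.repeat(np.eye(p), n, axis=0)])

def is_estimable(lam):  # beta* = (mu, alpha_1, alpha_2, alpha_3)'
    return (np.linalg.matrix_rank(np.vstack([X_star, lam]))
            == np.linalg.matrix_rank(X_star))

print(is_estimable([0, 1, -1, 0]))  # True : alpha_1 - alpha_2, a contrast
print(is_estimable([0, 1, 0, 0]))   # False: alpha_1 alone
print(is_estimable([1, 0, 0, 0]))   # False: mu alone
print(is_estimable([1, 1, 0, 0]))   # True : mu + alpha_1 = E(Y_1j)
```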
This effect and outcome can also be seen from the following explanation based on the estimation of the parameters $\mu, \alpha_1, \alpha_2, \ldots, \alpha_p$. Consider the least squares estimation of $\mu, \alpha_1, \alpha_2, \ldots, \alpha_p$.
Minimize the sum of squares due to the $\epsilon_{ij}$'s,

$S = \sum_{i=1}^{p}\sum_{j=1}^{n_i} \epsilon_{ij}^{2} = \sum_{i=1}^{p}\sum_{j=1}^{n_i} (y_{ij} - \mu - \alpha_i)^{2},$

to obtain $\hat{\mu}, \hat{\alpha}_1, \ldots, \hat{\alpha}_p$:

(a) $\dfrac{\partial S}{\partial \mu} = 0 \;\Rightarrow\; \sum_{i=1}^{p}\sum_{j=1}^{n_i} (y_{ij} - \mu - \alpha_i) = 0,$

(b) $\dfrac{\partial S}{\partial \alpha_i} = 0 \;\Rightarrow\; \sum_{j=1}^{n_i} (y_{ij} - \mu - \alpha_i) = 0, \quad i = 1,2,\ldots,p.$

Note that (a) can be obtained by summing the equations in (b), so (a) and (b) are linearly dependent in the sense that there are $(p+1)$ unknowns and only $p$ linearly independent equations. Consequently $\hat{\mu}, \hat{\alpha}_1, \ldots, \hat{\alpha}_p$ do not have a unique solution. The same applies to the maximum likelihood estimation of $\mu, \alpha_1, \ldots, \alpha_p$.
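This non-uniqueness can be seen numerically: if $\hat{\beta}$ solves the normal equations, so does $\hat{\beta} + t\,v$ for any $t$, where $v = (1, -1, \ldots, -1)'$ spans the null space of $X^{*}$. A small sketch with toy data:

```python
import numpy as np

# If beta-hat solves the normal equations, so does beta-hat + t*v for any t,
# where v = (1, -1, ..., -1)' satisfies X* v = 0.
rng = np.random.default_rng(0)
n = [2, 4, 7]
p = len(n)
X_star = np.hstack([np.ones((sum(n), 1)), np.repeat(np.eye(p), n, axis=0)])
y = rng.normal(30.0, 2.0, size=sum(n))                # toy responses

beta_hat = np.linalg.lstsq(X_star, y, rcond=None)[0]  # one particular solution
v = np.array([1.0, -1.0, -1.0, -1.0])                 # null-space vector

for t in (0.0, 5.0, -3.0):
    b = beta_hat + t * v
    same_fit = np.allclose(X_star @ b, X_star @ beta_hat)
    # fitted values and the contrast alpha_1 - alpha_2 are invariant,
    # while mu-hat itself shifts with t
    print(f"t={t}: same fit: {same_fit}, mu-hat={b[0]:.3f}, "
          f"alpha1-alpha2={b[1] - b[2]:.3f}")
```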
If a side condition

$\sum_{i=1}^{p} n_i \alpha_i = 0 \quad \text{or} \quad \frac{1}{n}\sum_{i=1}^{p} n_i \alpha_i = 0, \quad \text{where } n = \sum_{i=1}^{p} n_i,$

is imposed, then (a) and (b) have a unique solution:

$\hat{\mu} = \bar{y}_{oo} = \frac{1}{n}\sum_{i=1}^{p}\sum_{j=1}^{n_i} y_{ij},$
$\hat{\alpha}_i = \bar{y}_{io} - \bar{y}_{oo}, \quad \text{where } \bar{y}_{io} = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}.$
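These closed-form estimates are easy to verify on toy data; a sketch, where the group means used to generate the data are assumptions for illustration:

```python
import numpy as np

# Unique solution under sum_i n_i alpha_i = 0:
#   mu-hat = grand mean, alpha_i-hat = group mean - grand mean.
rng = np.random.default_rng(1)
n = [2, 4, 7]
y = [rng.normal(m, 2.0, size=k) for m, k in zip([35.0, 30.0, 25.0], n)]

y_oo = np.concatenate(y).mean()            # grand mean (bar y_oo)
y_io = np.array([yi.mean() for yi in y])   # group means (bar y_io)
mu_hat = y_oo
alpha_hat = y_io - y_oo

print(mu_hat, alpha_hat)
print(np.dot(n, alpha_hat))  # ~0: the side condition holds exactly
```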
In case all the sample sizes are the same, then the condition $\sum_{i=1}^{p} n_i \alpha_i = 0$ or $\frac{1}{n}\sum_{i=1}^{p} n_i \alpha_i = 0$ reduces to $\sum_{i=1}^{p} \alpha_i = 0$ or $\frac{1}{p}\sum_{i=1}^{p} \alpha_i = 0$.

So the model $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$ needs to be rewritten so that all the parameters can be uniquely estimated. Thus
$Y_{ij} = \mu + \alpha_i + \epsilon_{ij} = (\mu + \bar{\alpha}) + (\alpha_i - \bar{\alpha}) + \epsilon_{ij} = \mu^{*} + \alpha_i^{*} + \epsilon_{ij},$

where

$\mu^{*} = \mu + \bar{\alpha}, \quad \alpha_i^{*} = \alpha_i - \bar{\alpha} \quad \text{and} \quad \bar{\alpha} = \frac{1}{p}\sum_{i=1}^{p} \alpha_i,$

so that

$\sum_{i=1}^{p} \alpha_i^{*} = 0.$

This is a reparameterized form of the linear model.
Thus in a linear model, when $X$ is not of full rank, the parameters do not have unique estimates. A restriction $\sum_{i=1}^{p} \alpha_i = 0$ (or equivalently $\sum_{i=1}^{p} n_i \alpha_i = 0$ in case all the $n_i$'s are not the same) can be added, and then the least squares (or maximum likelihood) estimators obtained are unique. The model

$E(Y_{ij}) = \mu^{*} + \alpha_i^{*}; \quad \sum_{i=1}^{p} \alpha_i^{*} = 0$

is called a reparametrization of the original linear model.
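As a closing sketch, the reparametrized model can be fitted directly by giving the design full column rank, e.g. by encoding the condition $\sum_i \alpha_i^{*} = 0$ through sum-to-zero (effect) coding; the toy numbers below are assumptions for illustration:

```python
import numpy as np

# Fit E(Y_ij) = mu* + alpha_i* with sum_i alpha_i* = 0 via effect coding:
# drop the last level's column and code its rows as -1 in the others.
rng = np.random.default_rng(2)
p, m = 3, 5                          # p levels, m observations each (equal n_i)
y = np.concatenate([rng.normal(mu_i, 2.0, size=m) for mu_i in (35.0, 30.0, 25.0)])

Z = np.repeat(np.eye(p), m, axis=0)  # indicator columns
W = np.hstack([np.ones((p * m, 1)), Z[:, :-1] - Z[:, [-1]]])  # full column rank

coef = np.linalg.lstsq(W, y, rcond=None)[0]
mu_star = coef[0]
alpha_star = np.append(coef[1:], -coef[1:].sum())  # recover alpha_p*

print(mu_star, alpha_star, alpha_star.sum())  # sum of alpha_i* is 0
```

With equal sample sizes, $\hat{\mu}^{*}$ is the grand mean and each $\hat{\alpha}_i^{*}$ is a group mean minus the grand mean, matching the unique solution derived above.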
