
Estimating Reliability for Multidimensional Measure

Wahyu Widhiarso
Gadjah Mada University
wahyu_psy@ugm.ac.id
Measurement in psychology often relies on instruments that consist of
multiple unidimensional scales, yet the case in which the composite of all
test items measures several psychological attributes has been little
discussed. This study compared the precision of seven methods for
estimating reliability under three multidimensional factor-structure
conditions. Data were generated by simulation for three multidimensional
conditions (parallel, tau-equivalent, and congeneric). The results
revealed that three methods (Stratified Alpha, Mosier's Coefficient, and
Wang's Coefficient) estimated the true reliability much better than the
Composite, Construct, and Maximal Reliability coefficients. The study
also found that, under multidimensional conditions, Coefficient Alpha
consistently acted as a lower-bound estimator of reliability. Comparisons
among the seven methods are outlined.
Keywords: Reliability Coefficient, Estimation Precision, Multidimensional
Measurement

When citing this paper, please use the following format:

Widhiarso, W. (2007). Estimating Reliability for Multidimensional Measure.
Unpublished Research Summary. Faculty of Psychology, Gadjah Mada
University.

Introduction
The purpose of this study is to compare reliability coefficients for
multidimensional measures. Brunner and Süß (2005) showed that very few reliability
analyses take the multidimensional structure of empirical data into account. In practice,
however, researchers often work with multidimensional measures because of the need
for parsimony or inadequate examination of measurement models (Anderson & Gerbing,
1988). Most empirical problems in psychological measurement are multidimensional
because it is difficult to develop items that measure only one dimension.
Many researchers use a popular coefficient (e.g., coefficient alpha) to estimate
the reliability of their data without checking its assumptions, even though coefficient
alpha is an internal consistency estimate of composite test reliability that is suited to
measures of a single attribute (Green et al., 1977). When the measure is
multidimensional, most reliability coefficients, including coefficient alpha,
underestimate the true reliability of the scale. The internal consistency of a test is
therefore not a sufficient criterion for multidimensional measurement.
The dimensionality of a measure is defined as the number of latent variables that
account for the correlations among item responses in a particular data set (Camilli
et al., 1995). If there is more than one latent variable, the measure is called
multidimensional. There are two types of multidimensionality associated with
items linked to the measured attribute: (a) dependent multidimensionality, which
describes unspecified latent factors that are shared by more than one item, and
(b) independent multidimensionality, which describes the uniqueness components of
items as latent combinations of item-specific factors and error.

Reliability Estimation for Multidimensional Measures


There are several coefficients for estimating reliability when the measure is
multidimensional.
Mosier's Composite Reliability. Mosier (1943) developed a reliability
coefficient for composites: a general formula for the reliability of a weighted
composite that can be estimated from knowledge of the weights (whatever their
source), reliabilities, dispersions, and intercorrelations of the components.

$$
r_{xx'} = 1 - \frac{\sum_j w_j^2 s_j^2 - \sum_j w_j^2 s_j^2 r_{jj'}}{\sum_j w_j^2 s_j^2 + 2 \sum_{j<k} w_j w_k s_j s_k r_{jk}} \tag{1}
$$
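As a rough illustration of Equation (1), the sketch below evaluates Mosier's formula for a small hypothetical case. The function name `mosier_reliability` and all input values are assumptions made for this example (two equally weighted subtests, each with variance 93 and reliability 81/93, correlated 0.243/93); they are not taken from the study's data.

```python
from math import sqrt


def mosier_reliability(weights, sds, reliabilities, corr):
    """Reliability of a weighted composite following Mosier's formula.

    weights, sds, reliabilities: one value per component.
    corr: square matrix of correlations between components
          (only entries above the diagonal are used).
    """
    n = len(weights)
    # sum_j w_j^2 s_j^2: weighted sum of component variances
    var_sum = sum(weights[j] ** 2 * sds[j] ** 2 for j in range(n))
    # sum_j w_j^2 s_j^2 r_jj': reliable part of those variances
    rel_sum = sum(
        weights[j] ** 2 * sds[j] ** 2 * reliabilities[j] for j in range(n)
    )
    # sum over pairs j < k of w_j w_k s_j s_k r_jk
    cov_sum = sum(
        weights[j] * weights[k] * sds[j] * sds[k] * corr[j][k]
        for j in range(n)
        for k in range(j + 1, n)
    )
    return 1 - (var_sum - rel_sum) / (var_sum + 2 * cov_sum)


# Hypothetical inputs, chosen only for illustration.
s = sqrt(93.0)
r12 = 0.243 / 93.0
rel = mosier_reliability(
    weights=[1, 1],
    sds=[s, s],
    reliabilities=[81 / 93, 81 / 93],
    corr=[[1, r12], [r12, 1]],
)
print(round(rel, 4))
```

With a single component the function reduces to that component's own reliability, which is a quick sanity check on the implementation.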

Cronbach's Stratified Alpha. Stratified alpha was proposed by Cronbach et al.
(1965). It is intended for cases where the components of a test can be grouped into
subtests on the basis of content. It assumes k components, where the ith component
(i = 1, ..., k) consists of $n_i$ items. Stratified $\alpha$ is obtained by

$$
\alpha_s = 1 - \frac{\sum_{i=1}^{k} \sigma_i^2 (1 - \alpha_i)}{\sigma_x^2} \tag{2}
$$

Wang's Composite Reliability. Wang and Stanley (1970, p. 672) provide a
general formula for the reliability of a composite L composed of n variables. Solving
for n = 2 variables and simplifying gives the reliability of a composite as a function
of the weights, the component reliabilities, and the positive correlation between the
components. For n components the formula is

$$
r_{xx'} = \frac{\sum_{i=1}^{n} w_i^2 r_{ii'} + \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} w_i w_j r_{ij}}{\sum_{i=1}^{n} w_i^2 + \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} w_i w_j r_{ij}} \tag{3}
$$
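A minimal sketch of the Wang and Stanley formula follows, assuming standardized components so that only weights, component reliabilities, and intercorrelations are needed. The function name `wang_stanley_reliability` and the example values (reliabilities 0.80 and 0.90, correlation 0.30) are hypothetical.

```python
def wang_stanley_reliability(weights, reliabilities, corr):
    """Composite reliability from weights, component reliabilities,
    and the correlation matrix between (standardized) components."""
    n = len(weights)
    # sum_i w_i^2 r_ii': reliable variance contributed by each component
    true_part = sum(weights[i] ** 2 * reliabilities[i] for i in range(n))
    # sum_i w_i^2: total variance contributed by each component
    total_part = sum(weights[i] ** 2 for i in range(n))
    # double sum over i != j of w_i w_j r_ij (appears in both parts)
    cross = sum(
        weights[i] * weights[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return (true_part + cross) / (total_part + cross)


# Hypothetical two-component example with unit weights.
wang = wang_stanley_reliability(
    weights=[1, 1],
    reliabilities=[0.80, 0.90],
    corr=[[1.0, 0.30], [0.30, 1.0]],
)
print(round(wang, 4))  # (1.7 + 0.6) / (2 + 0.6)
```

Because the cross-product term enters both numerator and denominator, a higher correlation between components pulls the composite reliability toward 1 for fixed component reliabilities.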

In the stratified alpha formula, $\sigma_i^2$ is the variance of the items in the ith
component, $\alpha_i$ is the reliability of the ith component, and $\sigma_x^2$ is the
variance of the test. Stratified alpha may be suitable for estimating
reliability for multidimensional composite scores.
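The stratified alpha computation can be sketched in a few lines. The function name `stratified_alpha` and the inputs are assumptions for illustration: two subtests, each with score variance 93 and within-subtest alpha 81/93, and a total test variance of 186.486.

```python
def stratified_alpha(stratum_variances, stratum_alphas, total_variance):
    """Stratified alpha: 1 - sum(var_i * (1 - alpha_i)) / total variance."""
    # Unreliable variance contributed by each stratum (subtest)
    unreliable = sum(
        v * (1 - a) for v, a in zip(stratum_variances, stratum_alphas)
    )
    return 1 - unreliable / total_variance


# Hypothetical inputs, chosen only for illustration.
alpha_s = stratified_alpha(
    stratum_variances=[93.0, 93.0],
    stratum_alphas=[81 / 93, 81 / 93],
    total_variance=186.486,
)
print(round(alpha_s, 4))
```

Only the unreliable part of each subtest's variance is subtracted, so near-zero correlations between subtests do not lower the estimate the way they lower ordinary coefficient alpha.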

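Composite Reliability. The composite reliability $\rho_Y$ of the p indicators $Y_i$ of a
scale score $Y = Y_1 + Y_2 + \ldots + Y_p$ can be expressed as

$$
r_{xx'} = \rho_Y = \frac{\operatorname{var}\left(\sum_{i=1}^{p} \sum_{j=1}^{q} \lambda_{ij} \eta_j\right)}{\operatorname{var}\left(\sum_{i=1}^{p} \sum_{j=1}^{q} \lambda_{ij} \eta_j + \sum_{i=1}^{p} E_i\right)} \tag{4}
$$

where $\lambda_{ij}$ is the unstandardized pattern coefficient of indicator $Y_i$ on factor
$\eta_j$ (j = 1, ..., q) and $E_i$ is the error term of indicator $Y_i$ (Raykov & Shrout, 2002).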
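For computation, the two variances in the composite reliability ratio can be written in matrix form. The expansion below is a sketch under assumptions that the text does not state explicitly: the error terms are uncorrelated with the factors and with each other. The symbols are introduced here for illustration only: $\Lambda$ is the p-by-q matrix of pattern coefficients $\lambda_{ij}$, $\Phi$ is the q-by-q covariance matrix of the factors, $\Theta$ is the diagonal matrix of error variances $\theta_{ii}$, and $\mathbf{1}$ is a vector of p ones.

```latex
\begin{aligned}
\operatorname{var}\Big(\sum_{i=1}^{p}\sum_{j=1}^{q} \lambda_{ij}\eta_j\Big)
  &= \mathbf{1}^{\top} \Lambda \Phi \Lambda^{\top} \mathbf{1}, \\
\operatorname{var}\Big(\sum_{i=1}^{p} E_i\Big)
  &= \mathbf{1}^{\top} \Theta \mathbf{1} = \sum_{i=1}^{p} \theta_{ii}, \\
\rho_Y
  &= \frac{\mathbf{1}^{\top} \Lambda \Phi \Lambda^{\top} \mathbf{1}}
          {\mathbf{1}^{\top} \Lambda \Phi \Lambda^{\top} \mathbf{1}
           + \mathbf{1}^{\top} \Theta \mathbf{1}}.
\end{aligned}
```

With two factors and each indicator loading on only one of them, the numerator reduces to the squared sum of loadings on each factor plus twice the product of the two sums times the factor covariance.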
Maximal Reliability. Maximal Reliability was derived by Li, Rosenthal, and
Rubin (1996). Hancock and Mueller (2001) called it coefficient H. Maximal
Reliability should be suitable for estimating reliability for multidimensional composite
scores when items within subtests are parallel. However, if items within subtests are not
parallel, underestimation of reliability is expected.

$$
w = \frac{\sum_{i=1}^{p} \dfrac{l_i^2}{1 - l_i^2}}{1 + \sum_{i=1}^{p} \dfrac{l_i^2}{1 - l_i^2}} \tag{5}
$$

where $l_i$ is the standardized pattern coefficient of indicator i. The coefficient w can
be regarded as the squared correlation between a factor and the optimum linear
composite formed by its indicators (Hancock & Mueller, 2001).
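The maximal reliability formula is simple to evaluate once standardized loadings are available. The sketch below assumes a single factor with three indicators; the function name `coefficient_h` and the loadings 0.8, 0.7, and 0.6 are hypothetical values for illustration.

```python
def coefficient_h(loadings):
    """Maximal reliability (coefficient H) from standardized loadings."""
    for l in loadings:
        if not -1 < l < 1:
            raise ValueError("standardized loadings must lie strictly in (-1, 1)")
    # sum_i l_i^2 / (1 - l_i^2): each indicator's signal-to-noise ratio
    ratio_sum = sum(l ** 2 / (1 - l ** 2) for l in loadings)
    return ratio_sum / (1 + ratio_sum)


# Hypothetical standardized loadings for one factor.
h = coefficient_h([0.8, 0.7, 0.6])
print(round(h, 4))
```

Each indicator adds a non-negative term to the sum, so the coefficient is never smaller than the largest squared loading among the indicators.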

METHOD
Reliability estimation methods were evaluated through simulations. In order to
generate data, the reliability values and covariance matrix of the true scores for the 10
components were predetermined. A simulated dataset X with N = 5000 cases and k = 6
items was generated according to the model.
Two measured attributes (T1 and T2) underlie the multidimensional model. We divided
the model into three conditions: multidimensional parallel, in which all items have the
same true-score and error coefficients; multidimensional tau-equivalent, in which only
the true-score coefficients are the same; and multidimensional congeneric, in which all
parameters differ.

| Multidimensional Parallel Model | Multidimensional Tau-Equivalent Model | Multidimensional Congeneric Model |
|---|---|---|
| $X_{a1} = 3T_1 + 2E_1$ | $X_{a1} = 3T_1 + 1.2E_1$ | $X_{a1} = 2T_1 + 1.2E_1$ |
| $X_{a2} = 3T_1 + 2E_2$ | $X_{a2} = 3T_1 + 1.4E_2$ | $X_{a2} = 2.2T_1 + 1.4E_2$ |
| $X_{a3} = 3T_1 + 2E_3$ | $X_{a3} = 3T_1 + 1.6E_3$ | $X_{a3} = 2.4T_1 + 1.6E_3$ |
| $X_{b1} = 3T_2 + 2E_4$ | $X_{b1} = 3T_2 + 1.8E_4$ | $X_{b1} = 2.6T_2 + 1.7E_4$ |
| $X_{b2} = 3T_2 + 2E_5$ | $X_{b2} = 3T_2 + 2E_5$ | $X_{b2} = 2.8T_2 + 1.8E_5$ |
| $X_{b3} = 3T_2 + 2E_6$ | $X_{b3} = 3T_2 + 2.2E_6$ | $X_{b3} = 3T_2 + 2E_6$ |
| $\rho_{T_1 T_2} = 0.003$ | $\rho_{T_1 T_2} = 0.003$ | $\rho_{T_1 T_2} = 0.003$ |
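The three data-generating models listed above could be simulated along the following lines. This is only a sketch, not the procedure actually used in the study: it assumes that the true scores and errors are standard normal with unit variance (the text states only that the errors are zero-mean normal), and the names `MODELS` and `simulate` as well as the seed are hypothetical.

```python
import random
from math import sqrt

# (true-score coefficient, error coefficient) for each of the six items.
# The first three items depend on T1, the last three on T2.
MODELS = {
    "parallel": [(3, 2), (3, 2), (3, 2), (3, 2), (3, 2), (3, 2)],
    "tau_equivalent": [(3, 1.2), (3, 1.4), (3, 1.6), (3, 1.8), (3, 2), (3, 2.2)],
    "congeneric": [(2, 1.2), (2.2, 1.4), (2.4, 1.6), (2.6, 1.7), (2.8, 1.8), (3, 2)],
}


def simulate(model, n_cases=5000, rho=0.003, seed=0):
    """Return n_cases rows of six item scores for the named model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_cases):
        t1 = rng.gauss(0, 1)
        # Build T2 so that corr(T1, T2) = rho (assumes unit variances).
        t2 = rho * t1 + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        row = []
        for idx, (true_coef, error_coef) in enumerate(MODELS[model]):
            t = t1 if idx < 3 else t2
            row.append(true_coef * t + error_coef * rng.gauss(0, 1))
        rows.append(row)
    return rows


data = simulate("parallel")
print(len(data), len(data[0]))
```

Under these assumptions each item in the parallel model has a population variance of 3^2 + 2^2 = 13, which gives a simple check on the generated data.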

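T1 and T2 were true scores and E1 to E6 were zero-mean normal errors. Since all
parameters of the data-generating model are known, the (true) reliability of the scale is
known from the squared correlation between the true and observed scores ($\rho_{TX}^2$).
From the equations in the model, we can also obtain the (true) reliability of X as:

$$
\rho_{xx} = \frac{\left(\sum \lambda_{ij}\right)^2 + \left(\sum \lambda_{ik}\right)^2 + 2 \left(\sum \lambda_{ij}\right) \left(\sum \lambda_{ik}\right) r_{jk}}{\left[\left(\sum \lambda_{ij}\right)^2 + \left(\sum \lambda_{ik}\right)^2 + 2 \left(\sum \lambda_{ij}\right) \left(\sum \lambda_{ik}\right) r_{jk}\right] + \sum \theta_{ii}}
$$

where $\lambda_{ij}$ and $\lambda_{ik}$ are the true-score coefficients of the items on
attributes j and k, $r_{jk}$ is the correlation between the two attributes, and
$\theta_{ii}$ is the error variance of item i.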

For example, for the multidimensional congeneric model, the true reliability of the
composite is:

$$
\rho_{xx} = \frac{(2 + 2.2 + 2.4)^2 + (2.6 + 2.8 + 3)^2 + (2 \times 6.6 \times 8.4 \times 0.003)}{(2 + 2.2 + 2.4)^2 + (2.6 + 2.8 + 3)^2 + (2 \times 6.6 \times 8.4 \times 0.003) + 16.09} = 0.8775
$$

Applying the same equation, we found 0.8700 for the multidimensional parallel model and
0.9001 for the multidimensional tau-equivalent model.
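The true-reliability computation can be reproduced from the model coefficients with a short script. This sketch assumes unit-variance true scores and errors, so that each error variance is the squared error coefficient (consistent with the value 16.09 used for the congeneric model); the function name `true_reliability` is hypothetical. Its output need not agree with the reported values to every decimal place.

```python
def true_reliability(coefs_t1, coefs_t2, error_coefs, rho):
    """Population reliability of the unit-weighted sum of all items."""
    s1, s2 = sum(coefs_t1), sum(coefs_t2)
    # True-score variance of the composite
    true_var = s1 ** 2 + s2 ** 2 + 2 * s1 * s2 * rho
    # Error variance of the composite: sum of squared error coefficients
    error_var = sum(e ** 2 for e in error_coefs)
    return true_var / (true_var + error_var)


RHO = 0.003
parallel = true_reliability([3, 3, 3], [3, 3, 3], [2] * 6, RHO)
tau_equivalent = true_reliability(
    [3, 3, 3], [3, 3, 3], [1.2, 1.4, 1.6, 1.8, 2, 2.2], RHO
)
congeneric = true_reliability(
    [2, 2.2, 2.4], [2.6, 2.8, 3], [1.2, 1.4, 1.6, 1.7, 1.8, 2], RHO
)
for name, value in [
    ("parallel", parallel),
    ("tau-equivalent", tau_equivalent),
    ("congeneric", congeneric),
]:
    print(f"{name}: {value:.4f}")
```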

RESULT
We found that coefficient alpha largely underestimated the true reliability, and did
so consistently across all three conditions. Therefore, it is not appropriate to use
coefficient alpha as an estimate of the reliability of a multidimensional composite scale
score. This finding is consistent with previous literature (Cortina, 1993). With the
exception of Raykov's composite reliability, the other methods estimated the true
reliability much better than coefficient alpha in all simulation conditions. Tables 1 to 3
show the estimate and bias for each reliability coefficient.
Table 1. Simulation Results for the Multidimensional Parallel Model ($\rho_{xx}$ = 0.8700)

| No | Reliability Coefficient | Estimate | Bias |
|---|---|---|---|
| 1 | Alpha | 0.7009 | -0.1725 |
| 2 | Stratified Alpha | 0.8741 | 0.0008 |
| 3 | Maximal Reliability (Hancock & Mueller) | 0.9326 | 0.0593 |
| 4 | Composite Reliability (Raykov) | 0.6343 | -0.2390 |
| 5 | Composite Reliability (Wang) | 0.8741 | 0.0008 |
| 6 | Composite Reliability (Mosier) | 0.8741 | 0.0008 |
| 7 | Construct Reliability (McDonald) | 0.9326 | 0.0592 |
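The alpha row for the parallel model can be cross-checked against the population covariance matrix implied by that model. The sketch below assumes unit-variance true scores and errors, a loading of 3, an error coefficient of 2, and a correlation of 0.003 between the two attributes; the function names are hypothetical. Under these assumptions the population alpha comes out at roughly 0.70, well below the true reliability.

```python
def cronbach_alpha(cov):
    """Coefficient alpha from a k-by-k item covariance matrix."""
    k = len(cov)
    item_var = sum(cov[i][i] for i in range(k))   # sum of item variances
    total_var = sum(sum(row) for row in cov)      # variance of the sum score
    return k / (k - 1) * (1 - item_var / total_var)


def parallel_model_cov(loading=3.0, error_sd=2.0, rho=0.003, items_per_factor=3):
    """Population covariance matrix of the six items in the parallel model."""
    k = 2 * items_per_factor
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            same_factor = (i < items_per_factor) == (j < items_per_factor)
            # True-score covariance: loading^2 within a factor,
            # loading^2 * rho between factors
            cov[i][j] = loading ** 2 * (1.0 if same_factor else rho)
        cov[i][i] += error_sd ** 2  # add error variance on the diagonal
    return cov


alpha = cronbach_alpha(parallel_model_cov())
print(round(alpha, 4))
```

The near-zero covariances between the two item clusters inflate the share of item variance in the total, which is why alpha falls so far below the composite's true reliability.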

Table 2. Simulation Results for the Multidimensional Tau-Equivalent Model ($\rho_{xx}$ = 0.9001)

| No | Reliability Coefficient | Estimate | Bias |
|---|---|---|---|
| 1 | Alpha | 0.7230 | -0.1781 |
| 2 | Stratified Alpha | 0.9020 | 0.0009 |
| 3 | Maximal Reliability (Hancock & Mueller) | 0.9562 | 0.0551 |
| 4 | Composite Reliability (Raykov) | 0.6540 | -0.2472 |
| 5 | Composite Reliability (Wang) | 0.9028 | 0.0017 |
| 6 | Composite Reliability (Mosier) | 0.9020 | 0.0009 |
| 7 | Construct Reliability (McDonald) | 0.9499 | 0.0488 |
Table 3. Simulation Results for the Multidimensional Congeneric Model ($\rho_{xx}$ = 0.8775)

| No | Reliability Coefficient | Estimate | Bias |
|---|---|---|---|
| 1 | Alpha | 0.6932 | -0.1600 |
| 2 | Stratified Alpha | 0.8646 | 0.0114 |
| 3 | Maximal Reliability (Hancock & Mueller) | 0.9323 | 0.0791 |
| 4 | Composite Reliability (Raykov) | 0.5554 | -0.2978 |
| 5 | Composite Reliability (Wang) | 0.8677 | 0.0145 |
| 6 | Composite Reliability (Mosier) | 0.8646 | 0.0114 |
| 7 | Construct Reliability (McDonald) | 0.9312 | 0.0780 |

Three alternative methods (Mosier's Composite Reliability, stratified alpha, and
Wang's Composite Reliability) estimated the true reliability much better than the others.
The magnitude of their bias was relatively small (0.02 at most). In the parallel
and tau-equivalent models these coefficients gave essentially equal estimates, whereas
in the congeneric model only stratified alpha and Mosier's coefficient gave equal
estimates. Comparing across models, bias increased as the model changed to
congeneric. Bias increased for all coefficients, although the differences were small.
Coefficients based on factor analysis (e.g., Maximal Reliability, Raykov's Composite
Reliability, and Construct Reliability) tended to have larger bias.

DISCUSSION
Generally, any of the three alternative methods (Mosier's Composite Reliability,
stratified alpha, and Wang's Composite Reliability) can be recommended when the scale
is not unidimensional. In the majority of cases where a measure possesses some degree
of multidimensionality, the use of coefficient alpha is inappropriate. Although it is
possible to obtain a high reliability value, coefficient alpha does not appear to be
sensitive to the dimensionality of the measure. Cortina (1993) demonstrated the general
tendency of alpha to decrease as a function of multidimensionality and to increase as a
function of item intercorrelations, but noted that the effect of test length can create
high values of alpha even in the case of three orthogonal subscales.
Stratified alpha appeared to be a good procedure among the methods we
investigated, although the other two methods may outperform stratified alpha depending
on the condition. However, the differences among the three methods were very small.
This study also revealed that reliability can be overestimated by alternative
methods, especially by MD-omega, which can result in a larger error. Therefore,
estimating reliability by multiple methods and taking the highest value as the best
estimate is not recommended.
Although this study investigated three different factor-structure conditions, the
performance differences among the seven methods were rather subtle. We will further
investigate the characteristics of these reliability estimates in order to make more
specific recommendations, including the effects of the number of items and the number of
dimensions. A measurement model is needed to specify the multidimensional structure of
the measurement, but it must be stressed that reliability is a property of the
measurement scale.

REFERENCES
Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A
review and recommended two-step approach. Psychological Bulletin, 103(3),
411-423.
Brunner, M., & Süß, H.-M. (2005). Analyzing the reliability of multidimensional
measures: An example from intelligence research. Educational and
Psychological Measurement, 65(2), 227-240.
Camilli, G., Wang, M., & Fesq, J. (1995). The effects of dimensionality on equating the
Law School Admission Test. Journal of Educational Measurement, 32(1),
79-96.
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and
applications. Journal of Applied Psychology, 78, 98-104.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests.
Psychometrika, 16, 297-334.
Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as
an index of test unidimensionality. Educational and Psychological
Measurement, 37, 827-838.
Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent
variable systems. In R. Cudeck, S. du Toit, & D. Sörbom (Eds.), Structural
equation modeling: Present and future. A Festschrift in honor of Karl
Jöreskog (pp. 195-216). Lincolnwood, IL: Scientific Software International.
Mosier, C. I. (1943). On the reliability of a weighted composite. Psychometrika, 8,
161-168.
Raykov, T., & Shrout, P. E. (2002). Reliability of scales with general structure: Point
and interval estimation using a structural equation modeling approach.
Structural Equation Modeling, 9, 195-212.
Wang, M. D., & Stanley, J. C. (1970). Differential weighting: A review of methods and
empirical studies. Review of Educational Research, 40, 663-705.