Wahyu Widhiarso

Gadjah Mada University

wahyu_psy@ugm.ac.id

Psychological measures often consist of multiple unidimensional scales, yet whether the composite of all test items measures several psychological attributes has been little discussed. This simulation study compared the estimation precision of seven reliability coefficients under three multidimensional factor-structure conditions. The results revealed that three methods (Stratified Alpha, Mosier's Coefficient, and Wang's Coefficient) estimated the true reliability much better than the Composite, Construct, and Maximal Reliability coefficients. The study also found that, in the multidimensional conditions, Coefficient Alpha was consistently a lower-bound estimator of reliability. Comparisons among the seven methods are outlined.

Keywords: Reliability Coefficient, Estimation Precision, Multidimensional

Measurement

Widhiarso, W. (2007). Estimating Reliability for Multidimensional Measure.

Unpublished Research Summary. Faculty of Psychology. Gadjah Mada

University

Introduction

The purpose of this study is to compare reliability coefficients for multidimensional measures. Brunner and Süß (2005) showed that very few reliability analyses take the multidimensional structure of empirical data into account. In practice, however, researchers often work with multidimensional measures because of the need for parsimony or inadequate examination of measurement models (Anderson & Gerbing, 1988). Most empirical problems in psychological measurement are multidimensional because it is difficult to develop items that measure only one dimension.

Many researchers use a popular coefficient (i.e., coefficient alpha) to estimate the reliability of their data without checking its assumptions, even though coefficient alpha is an internal consistency estimate of composite test reliability that is appropriate for a measure of a single attribute (Green et al., 1977). When the measure is multidimensional, most reliability coefficients, including coefficient alpha, underestimate the true reliability of the scale. The internal consistency of a test is therefore not a sufficient criterion for multidimensional measurement.

The dimensionality of a measure is defined as the number of latent variables that account for the correlations among item responses in a particular data set (Camilli et al., 1995). If there is more than one latent variable, the measure is called multidimensional. There are two types of multidimensionality associated with items linked to the attribute being measured: (a) dependent multidimensionality, which describes unspecified latent factors that are shared by more than one item, and (b) independent multidimensionality, which describes the unique components of items as latent combinations of item-specific factors and error.

Several coefficients are available for estimating reliability when a measure is multidimensional.

Mosier's Composite Reliability. Mosier (1943) developed a general formula for the reliability of a weighted composite, which can be estimated from knowledge of the weights (whatever their source), reliabilities, dispersions, and intercorrelations of the components.

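$$
r_{xx'} = 1 - \frac{\sum_j w_j^2 s_j^2 - \sum_j w_j^2 s_j^2 r_{jj'}}{\sum_j w_j^2 s_j^2 + 2 \sum_{j<k} w_j w_k s_j s_k r_{jk}} \tag{1}
$$

where $w_j$, $s_j$, and $r_{jj'}$ are the weight, standard deviation, and reliability of component $j$, and $r_{jk}$ is the correlation between components $j$ and $k$.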
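As a minimal sketch of how Equation 1 can be evaluated, the Python function below computes the reliability of a weighted composite from the four ingredients named in the text. The function name `mosier_reliability` and the input values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def mosier_reliability(weights, sds, reliabilities, corr):
    """Reliability of a weighted composite following Equation 1.

    weights, sds, reliabilities: one value per component.
    corr: square matrix (list of lists) of component intercorrelations.
    """
    n = len(weights)
    # Weighted component variances: w_j^2 * s_j^2
    wvar = [(weights[j] * sds[j]) ** 2 for j in range(n)]
    # Error part of the composite: sum of w_j^2 * s_j^2 * (1 - r_jj')
    error = sum(wvar[j] * (1.0 - reliabilities[j]) for j in range(n))
    # Weighted covariances over all pairs j < k
    cov = sum(
        weights[j] * weights[k] * sds[j] * sds[k] * corr[j][k]
        for j in range(n)
        for k in range(j + 1, n)
    )
    # Composite variance: weighted variances plus twice the covariances
    total = sum(wvar) + 2.0 * cov
    return 1.0 - error / total


# Hypothetical example: two unit-weighted subtests with SD 2,
# reliability 0.8 each, and an intercorrelation of 0.5.
r = mosier_reliability([1, 1], [2.0, 2.0], [0.8, 0.8], [[1.0, 0.5], [0.5, 1.0]])
print(round(r, 4))  # 0.8667
```

With a single component the function returns that component's own reliability, which is a quick sanity check on the formula.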

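Stratified Alpha. Stratified alpha was introduced in 1965. It is intended for cases where the components of a test can be grouped into subtests on the basis of content. It assumes k components, where the ith component (i = 1, ..., k) consists of $n_i$ items. Stratified alpha is obtained by

$$
\alpha_s = 1 - \frac{\sum_{i=1}^{k} \sigma_i^2 (1 - \alpha_i)}{\sigma_x^2} \tag{2}
$$

where $\sigma_i^2$ is the variance of the score on the ith component, $\alpha_i$ is the reliability of the ith component, and $\sigma_x^2$ is the variance of the test. Stratified alpha may be suitable for estimating reliability for multidimensional composite scores.

Wang and Stanley Composite Reliability. Wang and Stanley (1970) gave a general formula for the reliability of a composite L composed of n variables. Solving for n = 2 variables and simplifying gives the reliability of a composite as a function of the weights, the component reliabilities, and the positive correlation between the components.

$$
r_{xx'} = \frac{\sum_{i=1}^{n} w_i^2 r_{ii'} + \sum_{i=1}^{n} \sum_{j \ne i} w_i w_j r_{ij}}{\sum_{i=1}^{n} w_i^2 + \sum_{i=1}^{n} \sum_{j \ne i} w_i w_j r_{ij}} \tag{3}
$$

where $w_i$ is the weight of component $i$, $r_{ii'}$ is its reliability, and $r_{ij}$ is the correlation between components $i$ and $j$.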
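The two coefficients described here, stratified alpha and the Wang and Stanley composite, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the Wang and Stanley function assumes standardized components with the component reliabilities entering the numerator in place of the unit diagonal, and the input values are invented for the example.

```python
def stratified_alpha(stratum_vars, stratum_alphas, total_var):
    """alpha_s = 1 - sum(sigma_i^2 * (1 - alpha_i)) / sigma_x^2."""
    err = sum(v * (1.0 - a) for v, a in zip(stratum_vars, stratum_alphas))
    return 1.0 - err / total_var


def wang_stanley(weights, reliabilities, corr):
    """Composite reliability for standardized, weighted components (assumed form)."""
    n = len(weights)
    # Off-diagonal part shared by numerator and denominator
    cross = sum(
        weights[i] * weights[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    num = sum(w * w * r for w, r in zip(weights, reliabilities)) + cross
    den = sum(w * w for w in weights) + cross
    return num / den


# Hypothetical example: two subtests with variance 4 and alpha 0.8 each,
# correlated 0.5, so the total variance is 4 + 4 + 2 * 2 * 2 * 0.5 = 12.
s_alpha = stratified_alpha([4.0, 4.0], [0.8, 0.8], 12.0)
ws = wang_stanley([1, 1], [0.8, 0.8], [[1.0, 0.5], [0.5, 1.0]])
print(round(s_alpha, 4), round(ws, 4))  # 0.8667 0.8667
```

In this equal-variance, unit-weight example the two coefficients coincide, which is the kind of agreement one would expect when the subtests are parallel.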

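Composite Reliability. The reliability of the scale score $Y = Y_1 + Y_2 + \dots + Y_p$ can be expressed as

$$
r_{xx'} = \frac{\operatorname{var}\left(\sum_{i=1}^{p} \sum_{j=1}^{q} \lambda_{ij} \eta_j\right)}{\operatorname{var}\left(\sum_{i=1}^{p} \sum_{j=1}^{q} \lambda_{ij} \eta_j + \sum_{i=1}^{p} E_i\right)} \tag{4}
$$

where $\lambda_{ij}$ is the loading of indicator $Y_i$ on factor $\eta_j$, $q$ is the number of factors, and $E_i$ is the error term of indicator $Y_i$ (Raykov & Shrout, 2002).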
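A small Python sketch may help make the variance ratio concrete. It assumes uncorrelated error terms and takes the loadings, factor covariance matrix, and error variances as known; in practice these would be estimated from a fitted structural equation model. The function name and the numbers are hypothetical.

```python
def composite_reliability(loadings, factor_cov, error_vars):
    """Ratio of true-score variance to total variance of the sum score.

    loadings[i][j]: loading of indicator i on factor j.
    factor_cov[j][k]: covariance between factors j and k.
    error_vars[i]: variance of the error term of indicator i
                   (errors are assumed to be uncorrelated).
    """
    q = len(factor_cov)
    # Column sums of the loading matrix: c_j = sum_i lambda_ij
    c = [sum(row[j] for row in loadings) for j in range(q)]
    # Variance of the true part of the sum score: c' * Phi * c
    true_var = sum(c[j] * factor_cov[j][k] * c[k] for j in range(q) for k in range(q))
    error_var = sum(error_vars)
    return true_var / (true_var + error_var)


# Hypothetical two-factor example with four indicators
loadings = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]]
factor_cov = [[1.0, 0.3], [0.3, 1.0]]
error_vars = [0.36, 0.51, 0.64, 0.19]
rho = composite_reliability(loadings, factor_cov, error_vars)
print(round(rho, 4))  # 0.7748
```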

Maximal Reliability. Maximal Reliability was derived by Li, Rosenthal, and

Rubin (1996). Hancock and Mueller (2001) called it coefficient H. Maximal

Reliability should be suitable for estimating reliability for multidimensional composite

scores when items within subtests are parallel. However, if items within subtests are not

parallel, underestimation of reliability is expected.

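$$
H = \frac{\sum_{i=1}^{p} \dfrac{l_i^2}{1 - l_i^2}}{1 + \sum_{i=1}^{p} \dfrac{l_i^2}{1 - l_i^2}} \tag{5}
$$

where $l_i$ is the standardized loading of indicator $i$. This coefficient is the squared correlation between a factor and the optimum linear composite formed by its indicators (Hancock & Mueller, 2001).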
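The maximal reliability formula is short enough to sketch directly. The Python function below assumes standardized loadings strictly between -1 and 1; the function name and the example loadings are hypothetical.

```python
def coefficient_h(std_loadings):
    """Maximal reliability (coefficient H) from standardized loadings."""
    # Each indicator contributes l^2 / (1 - l^2)
    s = sum(l * l / (1.0 - l * l) for l in std_loadings)
    return s / (1.0 + s)


# Hypothetical example: three indicators with equal loadings of 0.7
h = coefficient_h([0.7, 0.7, 0.7])
print(round(h, 4))  # 0.7424
```

Because every indicator adds a non-negative term to the sum, the coefficient is never smaller than the largest squared loading (here 0.49).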

METHOD

Reliability estimation methods were evaluated through simulations. In order to

generate data, the reliability values and covariance matrix of the true scores for the 10

components were predetermined. A simulated dataset X with N = 5000 cases and k = 6

items was generated according to the model.

The model contains two measured attributes (T1 and T2), which makes it multidimensional. We examined the model under three conditions: multidimensional parallel, in which the items share the same true-score and error coefficients; multidimensional tau-equivalent, in which only the true-score coefficients are equal; and multidimensional congeneric, in which all parameters differ.

Multidimensional          Multidimensional          Multidimensional
Parallel Model            Tau-Equivalent Model      Congeneric Model

Xa1 = 3T1 + 2E1           Xa1 = 3T1 + 1.2E1         Xa1 = 2T1 + 1.2E1
Xa2 = 3T1 + 2E2           Xa2 = 3T1 + 1.4E2         Xa2 = 2.2T1 + 1.4E2
Xa3 = 3T1 + 2E3           Xa3 = 3T1 + 1.6E3         Xa3 = 2.4T1 + 1.6E3
Xb1 = 3T2 + 2E4           Xb1 = 3T2 + 1.8E4         Xb1 = 2.6T2 + 1.7E4
Xb2 = 3T2 + 2E5           Xb2 = 3T2 + 2E5           Xb2 = 2.8T2 + 1.8E5
Xb3 = 3T2 + 2E6           Xb3 = 3T2 + 2.2E6         Xb3 = 3T2 + 2E6

r(T1,T2) = 0.003          r(T1,T2) = 0.003          r(T1,T2) = 0.003
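The data-generating models above can be sketched as a short simulation. The sketch below is an assumption-laden illustration, not the code used in the study: it assumes that T1, T2, and the errors are standard normal with unit variance (the text specifies only zero-mean normal errors), uses Python's standard `random` module with an arbitrary seed, and uses hypothetical function names. Coefficient alpha is computed on the simulated parallel-model data as an example.

```python
import random


def simulate(loadings, error_sds, r12=0.003, n=5000, seed=1):
    """Generate n cases of six items: X_i = loading_i * T + error_sd_i * E_i.

    The first three items load on T1 and the last three on T2.
    T1, T2, and E_i are assumed to be standard normal (an assumption).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t1 = rng.gauss(0, 1)
        # Build T2 so that its correlation with T1 is r12
        t2 = r12 * t1 + (1 - r12 ** 2) ** 0.5 * rng.gauss(0, 1)
        factors = [t1] * 3 + [t2] * 3
        row = [loadings[i] * factors[i] + error_sds[i] * rng.gauss(0, 1) for i in range(6)]
        data.append(row)
    return data


def variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def cronbach_alpha(data):
    """Coefficient alpha from a cases-by-items list of lists."""
    k = len(data[0])
    item_vars = [variance([row[i] for row in data]) for i in range(k)]
    total_var = variance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Multidimensional parallel model: all loadings 3, all error coefficients 2
parallel = simulate([3] * 6, [2] * 6)
alpha = cronbach_alpha(parallel)
print(round(alpha, 3))
```

Under these assumptions alpha for the parallel model should come out near 0.70, well below the true reliability of the composite.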

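T1 and T2 were true scores and E1 to E6 were zero-mean normal errors. Since all parameters of the data-generating model are known, the (true) reliability of the scale is known from the squared correlation between true and observed scores ($\rho_{TX}^2$). From the equations in the model, the (true) reliability of X can also be determined as

$$
\rho_{xx} = \frac{\left(\sum \lambda_{ij}\right)^2 + \left(\sum \lambda_{ik}\right)^2 + 2 \sum \lambda_{ij} \sum \lambda_{ik}\, r_{jk}}{\left[\left(\sum \lambda_{ij}\right)^2 + \left(\sum \lambda_{ik}\right)^2 + 2 \sum \lambda_{ij} \sum \lambda_{ik}\, r_{jk}\right] + \sum \theta_{ii}}
$$

where $\lambda_{ij}$ and $\lambda_{ik}$ are the loadings of the items on factors $j$ and $k$, $r_{jk}$ is the correlation between the two factors, and $\theta_{ii}$ is the error variance of item $i$.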

For example, in the multidimensional congeneric model, the true reliability of the composite is:

$$
\rho_{xx} = \frac{(2 + 2.2 + 2.4)^2 + (2.6 + 2.8 + 3)^2 + (2 \times 6.6 \times 8.4 \times 0.003)}{(2 + 2.2 + 2.4)^2 + (2.6 + 2.8 + 3)^2 + (2 \times 6.6 \times 8.4 \times 0.003) + 16.09} = 0.8775
$$

Applying that equation, we found 0.8700 for multidimensional parallel, and 0.9001 for

multidimensional tau-equivalent.
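The true-reliability calculation can be sketched in Python as follows. The sketch assumes unit-variance true scores and errors, so that each error variance is the square of the error coefficient in the model equations; the function name and dictionary layout are hypothetical. The tau-equivalent case reproduces the reported 0.9001, and the other two come out within about 0.002 of the reported values, so small differences in rounding or intermediate steps should be expected.

```python
def true_reliability(load_a, load_b, error_coefs, r_ab=0.003):
    """True-score variance of the sum score divided by its total variance."""
    sa, sb = sum(load_a), sum(load_b)
    # True variance: squared loading sums plus twice their covariance term
    true_var = sa ** 2 + sb ** 2 + 2 * sa * sb * r_ab
    # Error variance: squared error coefficients (unit-variance errors assumed)
    error_var = sum(e ** 2 for e in error_coefs)
    return true_var / (true_var + error_var)


# Loadings on T1, loadings on T2, and error coefficients for each model
models = {
    "parallel": ([3, 3, 3], [3, 3, 3], [2, 2, 2, 2, 2, 2]),
    "tau-equivalent": ([3, 3, 3], [3, 3, 3], [1.2, 1.4, 1.6, 1.8, 2.0, 2.2]),
    "congeneric": ([2, 2.2, 2.4], [2.6, 2.8, 3], [1.2, 1.4, 1.6, 1.7, 1.8, 2.0]),
}

results = {name: true_reliability(*spec) for name, spec in models.items()}
for name, value in results.items():
    print(name, round(value, 4))
```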

RESULT

We found that coefficient alpha substantially underestimated the true reliability, and did so consistently across all three conditions. Therefore, it is not appropriate to use coefficient alpha as an estimate of the reliability of a multidimensional composite scale score. This finding is consistent with previous literature (Cortina, 1993). With the exception of Composite Reliability (Raykov), the other methods estimated the true reliability better than coefficient alpha in all simulation conditions. Tables 1 to 3 show the estimation precision of each reliability coefficient.

Table 1. Simulation Results for Multidimensional Parallel Model (ρxx = 0.8700)

No  Reliability Coefficient                    Estimate   Bias
1   Alpha                                      0.7009     -0.1725
2   Stratified Alpha                           0.8741      0.0008
3   Maximal Reliability (Hancock & Mueller)    0.9326      0.0593
4   Composite Reliability (Raykov)             0.6343     -0.2390
5   Composite Reliability (Wang)               0.8741      0.0008
6   Composite Reliability (Mosier)             0.8741      0.0008
7   Construct Reliability (McDonald)           0.9326      0.0592

Table 2. Simulation Results for Multidimensional Tau-Equivalent Model (ρxx = 0.9001)

No  Reliability Coefficient                    Estimate   Bias
1   Alpha                                      0.7230     -0.1781
2   Stratified Alpha                           0.9020      0.0009
3   Maximal Reliability (Hancock & Mueller)    0.9562      0.0551
4   Composite Reliability (Raykov)             0.6540     -0.2472
5   Composite Reliability (Wang)               0.9028      0.0017
6   Composite Reliability (Mosier)             0.9020      0.0009
7   Construct Reliability (McDonald)           0.9499      0.0488

Table 3. Simulation Results for Multidimensional Congeneric Model (ρxx = 0.8775)

No  Reliability Coefficient                    Estimate   Bias
1   Alpha                                      0.6932     -0.1600
2   Stratified Alpha                           0.8646      0.0114
3   Maximal Reliability (Hancock & Mueller)    0.9323      0.0791
4   Composite Reliability (Raykov)             0.5554     -0.2978
5   Composite Reliability (Wang)               0.8677      0.0145
6   Composite Reliability (Mosier)             0.8646      0.0114
7   Construct Reliability (McDonald)           0.9312      0.0780

Three methods (Stratified Alpha, Mosier's Composite Reliability, and Wang Composite Reliability) estimated the true reliability much better than the others. The magnitude of their bias was relatively small (0.02 at most). In the parallel and tau-equivalent models these coefficients gave nearly equal estimates, whereas in the congeneric model only Stratified Alpha and Mosier's coefficient gave equal estimates. Across models, bias tended to increase as the model changed to congeneric, although the differences were small. Coefficients based on factor analysis (e.g., Maximal Reliability) showed larger bias in estimation.

DISCUSSION

In general, any of the three alternative methods (Mosier's Composite Reliability, Stratified Alpha, and Wang Composite Reliability) can be recommended when the scale is not unidimensional. In the majority of cases where a measure possesses some degree of multidimensionality, the use of coefficient alpha is inappropriate. Although it may still yield a high reliability value, coefficient alpha does not appear to be sensitive to the dimensionality of a measure. Cortina (1993) demonstrated the general tendency of alpha to decrease as a function of multidimensionality and to increase as a function of item intercorrelations, but noted that the length of a measure can produce high values of alpha even in the case of three orthogonal subscales.

Stratified alpha appeared to be a good procedure among the methods we investigated, although the other two methods may outperform it depending on the condition. However, the differences among the three methods were very small. This study also revealed that reliability can be overestimated by alternative methods, especially by MD-omega, which can result in a larger error. Therefore, estimating reliability by multiple methods and taking the highest value as the best estimate is not recommended.

Although this study investigated three different factor-structure conditions, the performance differences among the seven methods were rather subtle. We will further investigate the characteristics of these reliability estimates, including the effects of the number of items and the number of dimensions, to make more specific recommendations. A measurement model is needed to specify the multidimensional structure of the measurement, but it must be stressed that reliability is a property of the measurement scale.

REFERENCES

Anderson, J. C., & Gerbing, D. W. (1988). Structural equation modeling in practice: A

review and recommended two-step approach. Psychological Bulletin, 103(3),

411-423.

Brunner, M., & Süß, H.-M. (2005). Analyzing the reliability of multidimensional
measures: An example from intelligence research. Educational and
Psychological Measurement, 65(2), 227-240.

Camilli, G., Wang, M., & Fesq, J. (1995). The effects of dimensionality on equating the
law school admission test. Journal of Educational Measurement, 32(1), 79-96.

Cortina, J. M (1993). What is coefficient alpha? An examination of theory and

application. Journal of Applied Psychology, 78, 98-104.

Cronbach, L. J. (1951). Coefficient alpha and the internal structure of a test.

Psychometrika, 16, 297-334.

Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as
an index of test unidimensionality. Educational and Psychological
Measurement, 37, 827-838.

Hancock, G. R., & Mueller, R. O. (2001). Rethinking construct reliability within latent
variable systems. In R. Cudeck, S. du Toit, & D. Sörbom (Eds.), Structural
equation modeling: Present and future. A festschrift in honor of Karl
Jöreskog (pp. 195-216). Lincolnwood, IL: Scientific Software International.

Mosier, C. I. (1943). On the reliability of a weighted composite. Psychometrika, 8,
161-168.

Raykov, T., & Shrout, P. E. (2002). Reliability of scales with general structure: Point
and interval estimation using a structural equation modeling approach.
Structural Equation Modeling, 9, 195-212.

Wang, M. D., & Stanley, J. C. (1970). Differential weighting: A review of methods and
empirical studies. Review of Educational Research, 40, 663-705.

