Quality and Quantity, 18 (1984) 225-245
Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands

A Three-Level Measurement Model

KENNETH D. BAILEY
Department of Sociology, University of California, Los Angeles, CA, U.S.A.

Some sociologists have advocated what might be termed single-level approaches to
measurement. While some verbal theorists have attacked empiricism (Parsons, 1937, p. 23),
strict operationalists have shown equal disdain for purely verbal theorizing (Lundberg,
1939, pp. 51-69). Recent writers, such as Blalock (1968), have moved beyond such
single-level approaches to a two-level approach. This two-level approach is seen as
bridging the gap between theory and empirical research. Blalock (1968, p. 12) stated
that:
Perhaps the most common practice in sociology is to refer to underlying or unmeasured
concepts, on the one hand, and indicators or composite indices, on the other. The problem
of bridging the gap between theory and research is thus seen as one of measurement error
(italics in the original).
Costner (1969, p. 245) wrote in a similar vein, saying:
Traditionally, sociological theorists have focused on abstractions with loose and ill-defined
implications about matters of fact. More recently, some sociological formulations have
shifted to the opposite extreme, stating only connections between measures without any
attempt to make more abstract claims. Either of these modes of theory construction is
costly, sacrificing either the clarity of empirical implications or the integrating potential of
abstract concepts.
There seems to be a general acceptance of the two-level measurement model among
mainstream sociologists. Sociologists can still be found who chiefly write verbal essays,
as well as those who eschew verbal theorizing in favor of empirical investigation.
However, most of these would probably see this extant theoretical/empirical split more
as a division of labor in an age of specialization, and would recognize the usefulness
of work in the other camp.
0033-5177/84/$03.00 © 1984 Elsevier Science Publishers B.V.
In spite of the general acceptance of the two-level model, probably few of its adherents
would argue that it is completely adequate. One irony is that while the two-level model is
often embraced by positivistic sociological statisticians, such as Blalock (1968), Costner
(1969), Hauser and Goldberger (1971), Werts et al. (1972) and Mayer and Younger (1974),
the existence of a link between the two levels is not verifiable. This must be discomfiting
to a school of thought which emphasizes verification and has often chastised verbal
theorists for their shortcomings in this regard. Still, it is a fact they must live with.
As Blalock (1968, p. 12) stated:
Unfortunately, however, measurement errors can never be known quantities, though they
may be estimated if one is willing to make certain untestable simplifying assumptions.
Another symptom of the unfinished nature of the two-level model is the chronic confusion
over the concept of validity. Blalock (1968, p. 13) stated that there are at least two
types of validity, and proposed giving them completely different names.
Such anomalies in the two-level model clearly signal the need for a new model. The purpose
of this paper is to show that an adequate model of sociological measurement must contain
at least three levels of analysis. A three-level model is necessary simply because the
actual measurement process entails work on all three levels. Two-level models confuse and
conceal these three levels. Two-level models are insufficient for their task and will
inevitably cause problems for their users.
The three-level model will not solve all of the major problems of sociological measurement,
such as measurement error and the notion of validity. However, it can facilitate further
work on these problems by reducing some of the confusion present in an incomplete model.
We can best proceed with our task by first analyzing the two-level model in some detail.
Next, we present the three-level model and compare it with alternative models. We end the
paper with an evaluation of the three-level model and suggestions for needed future
research.
The Two-Level Model
The two-level model is firmly ensconced in the sociological literature. It is common for
sociologists to speak of the conceptual and empirical levels of analysis. One of the most
thorough and explicit discussions of the two-level model was given by Blalock (1968). In
the course of presenting his own two-level approach, Blalock (1968, p. 12) noted the
variety of two-level formulations in the literature. These include Northrop's (1947)
distinction between concepts by intuition and concepts by postulation, Bierstedt's (1959)
nominal versus real definitions, and Coombs' (1953) distinction between phenotypic and
genotypic levels of analysis. Blalock (1968, p. 12) concluded that most of these
terminological differences are primarily semantic in nature.
Two-level models are also prevalent in the fields of theory construction
and typology construction. In theory construction, Coleman (1964, p. 9)
made a distinction between synthetic and explanatory theories. Costner
(1969) also utilized the two-level approach to theory construction. He
expressed this dichotomy alternatively as abstract concepts and concrete
implications, abstract conceptions and concrete events, and abstract
and empirical levels (Costner, 1969, p. 245). In discussing typologies, Winch
(1947) utilized the two-level distinction in distinguishing between heuristic
and empirical typologies. The distinction is also represented by Capecchi's
(1966) abstract versus nonabstract types, Hempel's (1952) ideal versus classificatory types, and McKinney's (1966) ideal versus extracted types.
Blalock (1968) presented a relatively comprehensive and sophisticated two-level
formulation built (at least partly) upon Northrop's (1947) distinction between concepts by
intuition and concepts by postulation, and his notion of an epistemic correlation which
connects the two types of concepts. Since Blalock's (1968) discussion of the two-level
model is probably the best and most complete, it is somewhat ironic that it will bear the
brunt of our criticism of the two-level model. It should be made clear, however, that our
basic criticism applies to the two-level model and not to Blalock's analysis of it.
Concepts by intuition and concepts by postulation have very different
roles in measurement theory. Concepts by intuition are those that can be
sensed directly, thus minimizing, if not eliminating, measurement problems.
Examples of such individual characteristics are skin color and height. Concepts by
intuition can be said to denote (Blalock, 1968, p. 10). In contrast, a concept by
postulation is "one the meaning of which in whole or part is designated by the postulates
of the deductive theory in which it occurs" (Northrop, 1947, p. 83). Northrop associated
concepts by intuition with an early, descriptive stage of science, while concepts by
postulation are associated with a later, deductive stage.
Deductive theory, according to Northrop, can contain only concepts by
postulation, and confusion and error will result if an attempt is made to mix
both concepts by postulation and concepts by intuition in the same theory.
Instead, a deductive theory is constructed entirely from concepts by postulation. In order for the theory to be testable, at least some (but not necessarily
all) of these concepts by postulation must be connected to corresponding
concepts by intuition. The link between a concept by postulation and its
corresponding concept by intuition is called an epistemic correlation. An
epistemic correlation is "a relation joining an unobserved component of
anything designated by a concept by postulation to its directly inspected
component denoted by a concept by intuition" (Northrop, 1947, p. 119).
Epistemic correlations are thus links between concepts and the empirical
world. As such they are not observable or subject to proof. Rather, scientists
simply must agree, before research is undertaken, that the observed phenomenon represents the concept. For example, as a concept by postulation,
blue is the number of a wavelength in electromagnetic theory, and as a
concept by intuition it is sensed directly (Blalock, 1968, p. 10). Scientists
agree that the latter is the empirical counterpart of the former, and that an
epistemic correlation exists between them.
In commenting upon Northrop's two-level formulation, Blalock (1968, p. 10) stated:
In a sense we seem to have two distinct languages, each composed of concepts defined in a
very special way. Tests of hypotheses are made in the one language; our thinking is done
in the other. At least some of the concepts in what might be termed the theoretical
language must be associated (through epistemic correlations) with concepts defined
operationally.
Notice the subtle change here, however, from Northrop's formulation. Instead of speaking
of "concepts by postulation," Blalock has switched to a "theoretical language." This will
probably not cause any confusion. However, instead of speaking of "concepts by intuition"
as Northrop does, Blalock is now speaking of "concepts defined operationally." It is not
at all clear that the term "concepts defined operationally" can be used synonymously with
"concepts by intuition." For example, consider the concept of crime. It is easy to write a
deductive theory in which crime appears. Whether this concept is termed a concept by
postulation, as Northrop does, or a concept in the theoretical language, as Blalock does,
would seem to be of little consequence. However, it is not so clear that crime expressed as
a concept by intuition is the same as crime defined operationally. As a concept by
intuition, crime must be "a concept the complete meaning of which is given by something
which can be immediately apprehended" (Northrop, 1947, p. 36). In my mind, such a concept
is one which is understood by observing an actual violation of law. On the other hand, an
operational definition of crime would be represented by some crime index compiled from
official statistics, or by a pencil-and-paper test of criminality. To include both the
crime itself (concept by intuition) and what is essentially a secondary indicator of the
crime (the operational definition) under the same rubric is overly broad, and may cause
confusion.
Blalock (1968) presented a rather comprehensive causal model of measurement. He stated
that "We shall see both how theory and measurement can become intertwined in a confusing
way and why it is necessary to introduce simplifying assumptions in the process" (Blalock,
1968, pp. 13-14). A central distinction in Blalock's model is between concepts that can
themselves be measured directly, and concepts that cannot be sensed directly, but must be
measured in terms of their effects, which can be observed. Mass, power and discrimination
are examples of the latter. Blalock argued that to study concepts that cannot be measured
directly, it is necessary not only to find a set of operations with a high degree of
reliability, but also to make a series of theoretical assumptions concerning variables
other than the one being measured (Blalock, 1968, p. 14).
Blalock also introduced the notion of an auxiliary theory. His strategy was to formulate a
general theory consisting of a number of definitions, assumptions, and propositions
modeled after the ideal of a completely deductive system of thought (Blalock, 1968, p. 24).
This general theory is constructed only according to the rules of deductive theory, and so
can be true to these rules without worrying about testability. This general theory is
augmented by an auxiliary theory which will be specific to the research design, population
studied, and measuring instruments used (Blalock, 1968, p. 25). The relationship between
the main theory and the auxiliary theory is shown in Blalock's (1968, p. 25) Fig. 1.5,
reproduced here as Fig. 1. The auxiliary theory (below the dotted line) includes primarily
(in Blalock's terms) operational indicators of theoretical concepts (marked with a
superscript), but includes one theoretical concept (X7). Conversely, the main theory does
not include any operational indicators, but does include one variable (X6) which is said
to be measured. Notice that it is the only concept in Fig. 1 that is measured without being
causally linked to an operational indicator (there is no X6'). Also notice that an
unmeasured concept can be causally linked to the operational definition of another
unmeasured concept (X, is linked with both X; and Xi), and that a single theoretical
concept can have more than one operational indicator (X5 has two, X5' and X5''). According
to the principles of strict operationalism, such multiple indicators of a single
theoretical concept should be avoided. As Blalock (1968, p. 8) stated, Bridgman points out
that in changing the operation we are in effect changing the concept. However, Blalock
(1968, p. 11) considered that such single-indicator measurement is an ideal that cannot now
be attained in a science in its infancy.

Fig. 1. Blalock's Model involving Distinctions between (1) Main and Auxiliary Theories and
(2) Measured and Unmeasured Variables. Source: Blalock, 1968.

Before closing this discussion of Blalock's two-level formulation it will be helpful to
consider in some detail his notion of directly measured variables. Blalock (1968, pp.
25-26) stated that the variables X,-X, in Fig. 1 are not considered to be measured
directly. Two of these variables (X, and X,) do not have operational indicators in Fig. 1,
and so can never be included in testable hypotheses. Variables X,-X, all have operational
indicators, and so can be included in testable hypotheses. According to Blalock, variables
of this sort are the most common. The unidirectional causal arrows between each variable
and its operational indicator or indicators (e.g., from X, to X;) indicate that the
variable is measured not directly, but only through the causal influence that it has on its
operational indicator. For example, it could be said that a pencil-and-paper test of
intelligence does not measure intelligence directly, but does indicate intelligence
indirectly inasmuch as the degree of intelligence a person possesses causes him or her to
achieve a certain score on the test.
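The causal reading of indirect measurement can be illustrated with a small simulation. The sketch below is not taken from Blalock or from the present paper. It assumes, purely for illustration, a linear relation in which an unmeasured variable U (say, intelligence) causes a single indicator I (a test score) plus random error. The function names, the loading of 0.8 and the error standard deviation of 0.6 are hypothetical choices.

```python
import random
import statistics


def simulate_indirect_measurement(n=2000, loading=0.8, noise_sd=0.6, seed=0):
    """Generate an unmeasured variable U and one indicator I caused by U.

    The linear form and all parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # The indicator reflects U only through U's causal influence, plus error.
    i = [loading * value + rng.gauss(0.0, noise_sd) for value in u]
    return u, i


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    u, i = simulate_indirect_measurement()
    # Under the assumed parameters the value should be roughly 0.8.
    print(round(pearson(u, i), 2))
```

In actual research only I would be available. The simulation can compare I with U only because U was generated artificially.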
Apart from these indirectly measured variables, Blalock considers relatively few concepts
to be measured directly. An example is X6 in Fig. 1. It is not causally linked to an
operational indicator, yet it is considered to be measured (Blalock, 1968, p. 25). There
seem to be three basic criteria for determining whether or not a variable is considered to
be measured directly. One is that the theoretical concept is close to the operational
level. Blalock (1968, pp. 25-26) stated that the variable X6 in Fig. 1 is sufficiently
close to the operational level that one is willing to simplify the diagram by referring to
its measurement as direct. A second criterion for labeling a variable as measured directly
is that there is very little measurement error involved in gathering data concerning this
variable. For example, one such directly measured variable is sex. Blalock (1968, p. 19)
stated that "Indicators of sex are so reliable, except in certain deviant quarters, that
one usually assumes that there will be relatively minor random errors that occur primarily
as a result of the coding process." A third criterion for labeling a variable as directly
measured is that it is seen as a property of an individual, rather than as a property which
has been conceived theoretically in terms of its causal implications, and thus must be
measured indirectly in terms of its effects rather than being measured directly. One
physical example of such an indirectly measured property is mass (Blalock, 1968, p. 14).
The problem here is that Blalock's notion of a directly measured variable does not do
justice to the two-level model. The basic point is that, as Northrop made clear, concepts
by postulation and concepts by intuition are radically different concepts which can appear
in pairs, but which can never be identical. Thus, Blalock's statement that the theoretical
level is close to the operational level is quite misleading here. The dimensions of
postulation and intuition, like the dimensions of time and space, are quite distinct and
invariant, and must remain so. It simply does not conform to the logic of the two-level
model to say that the postulation level is close to the intuition level. The two levels are
always a constant distance apart, if one insists on speaking in terms of distance, although
it is questionable whether the notion of distance between levels is fruitful. One can
indeed say that there is an epistemic correlation of 1.0 between a concept by postulation
and a concept by intuition, but this is not in any way equivalent to saying that the two
levels are close.
Among examples of variables that Blalock assumes can be measured directly are age and sex
(Blalock, 1968, p. 19), and education and income (Blalock, 1968, p. 20). He states, "The
variable age and the attribute sex are ordinarily regarded as directly measured although of
course this is not strictly true" (Blalock, 1968, p. 19; italics added). Let us examine
exactly what Blalock is doing here by discussing a property that meets his three criteria
for direct measurement, but also has clear meaning in terms of Northrop's two-level model.
A good example is the property of skin color. It is a property of an individual, and it can
be measured with a degree of consensus similar to that of age or sex. Blalock (1968, p.
10), in discussing Northrop's model, stated that "The sensed color blue is given as an
example of the former type of concept [intuition] whereas blue in the sense of the number
of a wavelength in electromagnetic theory would be a concept by postulation."
Thus, it is clear that a person's skin color (black, brown, red, white, yellow, etc.) is
sensed directly by the sociological researcher, and is obviously a concept by intuition in
Northrop's terms. Each of these colors also has a number on a wavelength in electromagnetic
theory. Thus, there is a concept by postulation related to each skin color by an epistemic
correlation. This is also true for age and sex, which can generally be sensed by the
observer with a relatively small degree of error just as the color blue can. Clearly these
are concepts by intuition. Clearly a corresponding concept by postulation can be formulated
deductively for each of these examples also. However, Blalock characterizes them not as
concepts by intuition, but as concepts by postulation which are close to the operational
level, but still must be translated to the level of concepts by intuition. For example, in
speaking of age, sex and religion, he states that "In many instances the problem of
translating from one language to the other is relatively straightforward" (Blalock, 1968,
p. 14).
The salient point here is that his direct measurement does not represent translation from
the postulation to the intuition language at all. Rather, the concepts that are labeled by
Blalock as directly measured are actually concepts by intuition in Northrop's terms. Thus,
in Fig. 1, the sixth variable is labeled incorrectly: it should be X6' instead of X6. The
reason that Blalock sees such variables as age and sex as close to the level of concepts by
intuition is because they are indeed concepts by intuition, pure and simple. There is no
way that such concepts, as discussed by Blalock, can be considered concepts by postulation.
A concept by postulation could be formulated for each of the so-called directly measured
variables. Further, direct measurement cannot be represented in Blalock's formulation by a
single variable such as X6, as is done in Fig. 1. At a minimum it must be diagrammed
causally as X6 → X6'. This is a simple causal diagram of direct measurement, where X6 is a
concept by postulation, X6' is its concept-by-intuition counterpart, and the arrow
represents the epistemic correlation between them (which could be very high in value).
In fact, in first diagramming examples of direct measurement, Blalock (1968, p. 19) does
represent them in this fashion, as in Fig. 2, where X represents the true value and X' the
measured value. Yet in Fig. 1 he switches, apparently inadvertently, to a single-variable
diagram of direct measurement (X6). Earlier, between pp. 19 and 21, both direct measurement
and indirect measurement are diagrammed with causal arrows. The only way one can tell the
difference (aside from the fact that indirectly measured variables are generally diagrammed
with multiple indicators) is that while Fig. 2 represents direct measurement in terms of
Xs, indirect measurement is coded in terms of U for unmeasured variables and I for
operational indicators, as in Blalock's (1968, p. 20) Fig. 1.1, reproduced here as Fig. 3.
However, in his Fig. 1.5 (reproduced here in Fig. 1), Blalock (1968, p. 25) switches from
this diagramming practice and diagrams indirectly measured variables with Xs instead of Us
and Is (e.g., variables X,, X, and X5). He also symbolizes directly measured variables with
a single X, rather than with a causal arrow as he did in Fig. 2. Further, he incorrectly
uses the symbol for a concept by postulation (X) rather than for a concept by intuition
(X').

Fig. 2. A Diagram of Direct Measurement, where X represents the True Value and X' the
Measured Value. Source: Blalock, 1968.

Fig. 3. A Diagram of Indirect Measurement, where U is an Unmeasured Variable and I is its
Indicator. Source: Blalock, 1968.

The basic reason for the confusion is that the two-level model is inadequate to its task.
The problem stems from the fact that Blalock, apparently without realizing it, is
discussing two quite different situations when he thinks he is only discussing one. In some
cases, as in the relationship between X4 and X4' in Fig. 1, he is discussing the
relationship between a concept by postulation and its operationalization. This is termed
indirect measurement in Fig. 1. However, in so-called direct measurement, as in Fig. 2, he
is discussing the relationship between a concept by intuition (such as sex) and its
operationalization. Blalock considers that both indirect and direct measurement involve
translating between the same two levels, but they do not. He is actually working with three
levels without realizing it.
As remarked earlier, such confusion is not Blalock's fault, but is endemic in the two-level
model, and is present in some degree in all applications of the two-level model, including
earlier discussions by the present author (Bailey, 1973, 1978). The points that I have been
making may not be readily evident to a reader accustomed to evaluating measurement within
the confines of the two-level model, a model which is so familiar that many take it for
granted and do not question it. These points will become clearer in discussing the
three-level model. I then return to Blalock's analysis and clarify it in terms of the
latter model.
The Three-Level Model
A problem with any extant model, such as the two-level measurement model, is that it often
becomes so familiar to long-term users that they take it as a given, and thus cannot see
its limitations. As long as we are working with the two-level model we will be bound by its
limitations. Rather than extending the two-level model to a three-level model, it is more
efficacious to set the two-level model aside temporarily, and approach the problem of
measurement anew. Later we can return to the two-level model and compare it with the
three-level model.
It is axiomatic in any measurement model that it is necessary to assume
the existence of an empirically determinable entity of some sort that we wish
to measure. Let us use the familiar example of intelligence. It is a vexing and
controversial example, but a challenging one, with a long history of analysis
which will perhaps provide some continuity between the present model and
earlier ones. Pursuant to the axiom just stated, it is necessary to assume that
individuals actually possess some degree of intelligence. Thus, as a beginning, it can be assumed that Jack has intelligence level X1' and Judy has
intelligence level X2'. Whether these individual intelligence levels can be
measured directly or indirectly, or even whether they can ever be measured
accurately, is somewhat beside the point at this time.
Once it has been assumed that a phenomenon such as intelligence exists empirically, the
next issue concerns its role in the process of social research. It may be that researchers
never recognize that an empirical phenomenon exists, or do recognize its existence but do
not feel that it is important to study. Assuming that a researcher does recognize the
existence of the phenomenon of intelligence which Jack and Judy possess, then the
researcher has some mental (cognitive) image of the phenomenon of intelligence. We may at
this point call this mental image a "concept" if we can do so without causing confusion. It
is important to stress that this mental image is merely an inner phenomenon in the
researcher's mind, and is not an external reality. It is not a written definition on paper
or any other material carrier. This point may not seem important, but it is, and its
importance will become clear. The major point to keep in mind is that the concept is a
mental image, and we can symbolize it by X. We are thus assuming for the present that a
single investigator has a single mental image of the social phenomenon of intelligence (X),
and that two empirical values of this exist as possessed by Jack and Judy and symbolized by
X1' and X2'.
At this point we should be able to see just how different the three-level model really is
from the two-level model. So far we have sketched out two levels. Are these the same two
levels of the two-level model? These two levels are clearly not synonymous with Northrop's
two levels, of concepts by intuition and concepts by postulation. Concepts are just that,
whether they are derived intuitively or by postulation. Thus, following the previous symbol
system, blue as a concept by intuition would be X (not X') and blue as a concept by
postulation would also be X. The actual color blue on an empirical object would be X', but
this would not be a concept by intuition.
However, if the two levels sketched out so far are not synonymous with Northrop's two
levels, might not they be synonymous with Blalock's (1968)? Blalock (1968, p. 19)
distinguished between the true value of a variable (X) and its measured value (X'). His
true value X is represented by our empirical value (X'), but his measured value is lacking
in the present model (bear in mind that only two of the three levels have been sketched out
thus far). Blalock (1968, p. 10) also made a distinction between concepts in a theoretical
language, which seem to be our X, and concepts defined operationally (which are lacking at
this point in the present formulation). Thus, at one place Blalock (1968, p. 10) links our
X to an operational definition, and at another (Blalock, 1968, p. 19) he links our X' to an
operational definition, while presenting these different respective formulations as the
same two-level model (which they clearly are not), rather than as two separate two-level
models (which they clearly are).
The key to the unified model, and the chief way to reduce the confusion rampant in the
extant two-level model, is to introduce the third level, which is of course the operational
level, or in terms of the present example, the individual's score on an intelligence test
(X''). The operational level X'' is in a very real sense created from both the conceptual
level X and the empirical level X'. In mathematical terms it can be said that X and X' are
mapped into X''. Thus, it is little wonder that this level is often omitted in two-level
models. In fact, it is not so much omitted as merged into one of the other levels.
This third level is here referred to as either the indicator level or the operational
level. The term "operational level" is somewhat misleading and may have negative
connotations, but it will be retained (at least for the present) for want of a better term.
It is misleading because operations can be performed at all three levels. Concept
formulation is an operation resulting in X (the concept of intelligence); one can perform
the operation of attempting to observe evidence of intelligent behavior (X'); and one can
perform the operation of constructing an intelligence test (X'').
The three-level measurement model is shown in Fig. 4, where X is the concept, X' is the
corresponding empirical occurrence of that concept, and X'' is the operational definition
or indicator of both the concept and the empirical occurrence. The three paths a, b and c
represent the degree of congruence between respectively X and X', X' and X'', and X and
X''. Path a refers to the degree of correspondence between the concept X and the empirical
entity X'. This is what is often termed the epistemic correlation. Path c refers to the
congruence between the concept X and the operational indicator X''. This is also sometimes
called an epistemic correlation in the two-level model (Blalock, 1968, p. 10; Costner,
1969, p. 245), and in effect the two-level model generally merges paths a and c. The degree
of congruence along path c is what is generally referred to as validity, or the extent to
which the test measures the concept it is supposed to measure (Blalock, 1968, p. 13). Path
b refers to measurement error, or the degree of congruence between the indicator or test
X'' and the actual empirical occurrence X'. This is termed measurement error in Blalock's
(1968, p. 19) two-level model, which I think is an adequate representation. However,
Blalock's (1968, pp. 10-19) two-level model tends to merge measurement error (path b) with
path c. This stems from the frequent failure in the two-level model to maintain the
distinction between the empirical occurrence of a phenomenon X' and its indicator X''.

Fig. 4. The Three-Level Measurement Model. Levels shown: X (Conceptual level), X''
(Indicator level), X' (Empirical level).

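One way to see why a two-level model blurs these paths is to write the three levels and the three paths down as a small data structure and then treat two of the levels as one. The sketch below is an illustration only and is not part of the original argument. The primed labels (X for the concept, X' for the empirical occurrence, X'' for the indicator), the `PATHS` table and the `collapse` helper are all assumed names introduced here.

```python
# Levels of the three-level model; the primed labels are a notational assumption.
LEVELS = {
    "X": "conceptual level",
    "X'": "empirical level",
    "X''": "indicator (operational) level",
}

# Each path joins two levels, following the description of paths a, b and c.
PATHS = {
    "a": frozenset({"X", "X'"}),    # concept vs. empirical occurrence
    "b": frozenset({"X'", "X''"}),  # empirical occurrence vs. indicator
    "c": frozenset({"X", "X''"}),   # concept vs. indicator
}


def collapse(level_1, level_2):
    """Group the paths by their endpoints after two levels are treated as one.

    Paths that end up in the same group can no longer be told apart.
    """
    merged = {level_1, level_2}
    groups = {}
    for label, ends in PATHS.items():
        new_ends = frozenset("merged" if end in merged else end for end in ends)
        groups.setdefault(new_ends, []).append(label)
    return groups


if __name__ == "__main__":
    # Treat the empirical occurrence and its indicator as a single level.
    for ends, labels in collapse("X'", "X''").items():
        print(sorted(ends), labels)
```

When the empirical and indicator levels are merged in this sketch, paths a and c share the same endpoints and path b has only one endpoint left, which corresponds to the merging of paths a and c described above.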
The relations portrayed in paths a, b and c are symmetrical in the sense that as the
researcher works with the model, a change at either end of one of the three dyadic
relationships may be deemed by the researcher to necessitate change at the other end of
that relationship. Any of the three levels (X, X' or X'') may be the origin of such change.
For example, suppose that the researcher makes new observations of X' which differ from
past observations. This may mean that X and X' are no longer congruent, and thus may
necessitate change in the concept X, particularly if it was originally formulated through
observation of X'. Further, change in the indicator X'' may be necessitated, so that it
will be congruent with both the new form of X and the new form of X'.
Similarly, a researcher's decision to reformulate a concept X may change the particular
phenomenon X' that is studied, and thus may necessitate change in X'' so that it becomes
congruent with the new X and X'. Sociologists other than strict operationalists would work
with X and X' before working with X'' (although it may be moot whether X or X' is worked
with first). Strict operationalists might work with X'' first (or even exclusively), and
might say that change in X'' would be followed by subsequent changes in X and X'. Very
strict operationalists might even deny that X and X' (particularly X) exist independently
of X''. However, the strength of the three-level model is that all variants of the
measurement paradigm are clearly included, be they strict operationalist,
nonoperationalist, two-level or even single-level.
A crucial question concerns the way in which the congruence between
levels should be expressed in the three-level model. The notion of epistemic
correlation has been used in the two-level model (Blalock, 1968). This notion
should probably be replaced for several reasons. One reason is that the term
epistemic correlation is closely identified with the two-level model, and its
use could conceivably carry some of the limitations of the two-level model,
or lead to erroneous or limited perceptions of the three-level model. Further,
the term correlation has an empirical connotation in sociological statistics
which is misleading in the present application, as there is no way to compute
a value for an epistemic correlation. Still further, the term correlation
connotes a relationship between continuous, generally intervally measured
variables. This can be extremely misleading in the present context, which is
quite a different situation altogether.
Rather than use the term correlation, it seems preferable to express the
degree of congruence between X, X' and X'' in terms of isomorphism. The
levels X and X', for example, are isomorphic inasmuch as there is a
point-by-point similarity between them. This notion is sufficiently
broad, for example, that whether X is a unidimensional concept such as age
or a multidimensional concept such as bureaucracy, we can still easily talk of
X' being isomorphic to X. Similarly, X'' is isomorphic to both X and X' to
the degree to which it is a point-by-point representation of each.
EXAMPLES
In order to ensure that the three-level model is sufficiently clear let us
look briefly at four additional examples. These include a very familiar but
somewhat complex example (crime), a relatively abstract construct
(authoritarianism), a directly measured variable in Blalock's (1968, p. 19)
terms (age), and a variable which Blalock (1968, p. 14) measures indirectly in
terms of its supposed effects (racial discrimination). The concept of crime
(X) entails the perception of acts which violate laws and are thus subject to
sanctions. An individual instance of this behavior (e.g., robbery) can be
observed (X'), and indicators can be formulated, such as crime statistics or a
scale of criminality (X''). Similarly, a concept of authoritarianism as a
mental image in the investigator's mind can be constructed (X), instances of
behavior consistent with this image can be observed (X'), and an
authoritarianism scale can be constructed (X'').
Next, let us consider the directly measured variable of age. The concept
of age is labeled X, while the actual value of age exhibited individually by a
given person is X'. Some indicator of age, such as the response to a
questionnaire item, constitutes the third level (X''). Likewise, variables
considered by Blalock to be measured indirectly are easily accommodated in
the three-level model. Consider the example of racial discrimination. The
concept of racial discrimination is X, an actual observable act of racial
discrimination, as when a majority person is hired rather than an equally
qualified minority person, is X', and an indicator of discrimination, such as a
discrimination index or a questionnaire item asking a person to list how
many times he or she has been the object of discrimination, is X''. As such,
this variable is measured directly, and need not be measured indirectly at all.
Indeed, it is not dealt with in the three-level model any differently from the
so-called directly measured variables which Blalock states are closer to
the operational level. Blalock (1968, p. 14) noted that discrimination is not
ordinarily thought of as a property of an individual, as are the variables said
to be directly measured. However, apparently, the basic reason for measuring
discrimination indirectly rather than directly is that it is a member of a
class of properties that have been theoretically conceived in terms of their
causal implications and that imply measurement in terms of their supposed
effects (Blalock, 1968, p. 14). Theoretical interest in the effect of
discrimination would be a sufficient reason for studying its relationship to
other variables in a hypothesis, but is not sufficient reason for failure to
measure discrimination directly (which hypothesis testing would require, of
course). If one did wish to study an effect of discrimination, such as income,
this would fit easily into the three-level model. The concept of income is X,
an individual's actual income is X', and indicated income (e.g., as embodied in
a response to a questionnaire item) is X''. The concept of an income
differential is merely an extension of this example. The relationship between
discrimination and income in the three-level model is shown in Fig. 5.
Fig. 5. The Relationship between Discrimination (X1) and its Effect (Income,
X2) in the Three-Level Model.
Fig. 6. An Unmeasured Variable (X1) and its Effect (X2) in the Three-Level
Model.
Figure 5 assumes that discrimination can be measured directly (X'').
Variables which are not directly but are indirectly measurable in terms of
their effects can easily be accommodated in the three-level model as in Fig.
6. In Fig. 6, the unmeasured variable is thought to exist as a concept (X1)
and to exist empirically (X1'), but for some reason we cannot construct its
indicator (X1''). However, the unmeasured variable is causally related to its
effect (X2), which does exist at all three levels (X2, X2' and X2''). Further
analysis of this model is complex and is outside the major thrust of this
paper, and so will be considered at a later date.
THE OPERATIONAL LEVEL
The indicator level X'' is quite complex and is easily confused with one of
the other two levels (as in the two-level model). Thus, it requires further
explication. The indicator level X'' is the result of the mathematical mapping
of both the conceptual level X and the empirical level X' into a third level
(X''). Thus, X'' is quite literally a combination of X and X'. Ideally, X and
X' will be isomorphic with each other, and X'' will be isomorphic with each
of them. This is the goal of measurement using the three-level model.
Generally, it is assumed that X and X' are isomorphic before X'' is
constructed, meaning that the concept (X) is consistent with empirical
reality (X'). The first two levels (X and X') are combined on the third level
by representing a combination of X and X' on some physical carrier. That is,
X and X' are combined to form a new piece of information (X''), and this
information must be carried by some physical object. This object is usually
paper, which carries the information in the form of symbols such as letters or
numbers. However, this medium for conveying information is increasingly a
piece of film or a computer tape or disc. Such a medium for carrying
information is called a marker in information theory (Miller, 1978, p. 12).
Among the examples of markers that Miller listed are the stones of
Hammurabi's day which bore cuneiform writing, parchments, writing paper, and
waves emanating from a radio station.
At least two sociologists have used some form of the concept of marker,
although neither used the term marker, this term being coined in information
theory. Durkheim's (1954) concept of the totem is clearly a form of
marker. The Australian totems he studied were generally living objects such
as birds or trees. These objects had great symbolic significance, as they
represented religion as well as the clan. A modern example of such a marker
is a national flag, which elicits feelings such as patriotism. Sorokin (1964,
p. 4) also discussed markers without using the term:
Any empirical sociocultural phenomenon consists of three separate components:
(1) immaterial, spaceless, and timeless meaning; (2) material (physiochemical
and biological) vehicles that materialize, externalize, or objectify the
meanings; and (3) human agents that bear, use, and operate the meanings with
the help of the material vehicles (italics in the original).
The simplest form of indicator is the dichotomously labeled type, such as
the classes of intelligent/nonintelligent, male/female, young/old, etc. As
mere labels with no empirical content, such types represent the mapping of
the concept X into the indicator level X''. It is a simple matter to map in the
empirical level by locating empirical examples of the cells in a type and
computing a frequency for each cell. That is, the two subtype labels
male/female (in a table without cell frequencies) simply represent the
mapping of the dichotomous concept of gender (X) into the indicator level
X'', but the frequency tabulation constructed by enumerating the numbers of
individuals observed to be in each type represents a mapping of both X and
X' into X'', and is thus a complete formulation of the three-level model for
nominal measurement.
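As a minimal sketch of the nominal case just described, the following Python
fragment separates the two mappings: the bare labels stand for the mapping of
the concept into the indicator level, and the frequency table adds the
empirical level. The names GENDER_LABELS, tabulate and observed, and the data
themselves, are hypothetical illustrations and do not come from the original
paper.

```python
from collections import Counter

# Mapping of the concept X (gender as a dichotomy) into the indicator
# level X'': labels only, with no empirical content.
GENDER_LABELS = ("male", "female")


def tabulate(labels, observations):
    """Map empirical observations (X') into the labeled type (X'').

    Returns a frequency for every cell of the type, including empty cells.
    Observations that fit no cell are rejected rather than silently dropped.
    """
    counts = Counter(observations)
    unknown = set(counts) - set(labels)
    if unknown:
        raise ValueError(f"observations outside the type: {sorted(unknown)}")
    return {label: counts.get(label, 0) for label in labels}


# Hypothetical observations of individuals (X'), invented for illustration.
observed = ["female", "male", "female", "female", "male"]
table = tabulate(GENDER_LABELS, observed)
print(table)
```

Before tabulate is called, only the labels exist; the returned table is the
point at which both the concept and the observed cases are carried by the
same indicator.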
The situation is similar at the ordinal level. A set of questions can be
constructed that are said to represent a particular concept. This again is a
mapping of X into X''. We conduct such a mapping of X into X'' whenever
we construct a set of agree/disagree items said to measure alienation,
authoritarianism, etc.
At this stage we can even distinguish between partial order and simple
order. If we seek merely to establish partial order we can say that a value on
the scale is attained merely by summing the number of items that are agreed
with. This is done in all summated rating scales, such as Likert (1932)
scaling. In such scaling, any combination of three questions agreed with
yields a value of three on the scale. In contrast, we can stipulate that
agreement on items must be cumulative, in the sense that one cannot be
answered unless a prior one has been answered. This stipulation, utilized in
Guttman (1944) scaling, yields a simple order scale with each score being
achieved in a unique fashion. Similarly, by utilizing a different set of
operations for the manipulation of a given set of items we can construct
what is said to be an interval scale through Thurstone scaling (Thurstone
and Chave, 1929). In discussing Likert, Guttman and Thurstone scaling, we
still have not discussed the application of these techniques to actual
empirical data, but have discussed only the mapping of X into X''. The
difference in the three techniques lies primarily in the particular operations
that are specified and in the assumptions that are made for mapping X into X''
and for manipulating the items at the indicator level. When X' is mapped into
the respective scales, it then becomes possible to assign a scale score to a
given individual respondent.
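The contrast between summated and cumulative scoring can be sketched in a few
lines of Python. This is an illustration under stated assumptions, not the
procedure of Likert (1932) or Guttman (1944): items are reduced to
agree/disagree answers, the items are assumed to be listed from easiest to
hardest to endorse, and the function names are hypothetical.

```python
def summated_score(responses):
    """Summated score: the number of items agreed with.

    Different response patterns with the same number of agreements
    receive the same score.
    """
    return sum(1 for agreed in responses if agreed)


def is_cumulative(responses):
    """True if no item is agreed with after an earlier item was rejected.

    Assumes the items are ordered from easiest to hardest to endorse.
    """
    seen_disagreement = False
    for agreed in responses:
        if agreed and seen_disagreement:
            return False
        if not agreed:
            seen_disagreement = True
    return True


def cumulative_score(responses):
    """Cumulative score, or None if the pattern is not cumulative."""
    return summated_score(responses) if is_cumulative(responses) else None


# Two hypothetical respondents answering five agree/disagree items.
pattern_a = [True, True, True, False, False]
pattern_b = [True, False, True, False, True]

print(summated_score(pattern_a), summated_score(pattern_b))
print(cumulative_score(pattern_a), cumulative_score(pattern_b))
```

Both patterns receive the same summated score, whereas only the first is
admissible under the cumulative stipulation, so that each cumulative score
corresponds to exactly one response pattern.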
Comparison With Other Formulations

This paper has shown that the measurement process necessarily involves
working with three different levels. Extant two-level models have utilized
either only the conceptual-empirical levels (X-X'), only the
conceptual-indicator levels (X-X''), or only the empirical-indicator levels
(X'-X''), or have unwittingly merged two of these levels into one. Probably the
most common such merger is to fail to discriminate between X' and X'' by
combining them under either the label of "data", "empirical level",
"operational definition", or "indicator". It is clear that the two-level model
is not sufficiently precise, and that its use generally leads to confusion. The
question remains, however, of whether in fact the three-level model is
sufficient. As far as I can tell, there are only three basic levels involved in
the measurement process. There are, however, a great many different operations
that can be performed, especially at the indicator level. Thus, the
measurement process can be presented in a great many variations, but all seem
to operate on the three basic levels presented here.
THEORY-MODEL-DATA TRIANGLE
For example, consider the theory-model-data triangle presented by
Leik and Meeker (1975, p. 10). Leik and Meeker (1975, pp. 9-10) stated
that:
At the substantive theory point of the triangle we are concerned with various general
propositions about the interrelationships of sociological variables such that we can predict
and explain social facts. At the data point of the triangle we are concerned with
observations of these social facts, whether by interview, questionnaire, field observation,
laboratory observation, experimentation, or other sources of information about social
relationships.
As for the third point, Leik and Meeker (1975, p. 10) stated that:
A model is in this sense strictly general - that is, devoid of any empirical or
theoretical content. It operates as a kind of logical machine, which may be of
use for either of the other points of the triangle if adequate correspondence
between theoretical terms and mathematical terms or between observational
procedures and mathematical terms can be established.
Leik and Meeker's data point is clearly our X', while the mathematical
model is X''. The substantive theory point is clearly X'' also, although
written in specific verbal terms rather than abstract mathematical symbols.
Our third level X, consisting of mental images in the mind of the
investigator, is not included explicitly in the triangle, although it may be
implied by the substantive theory point.
Leik and Meeker's description of the mathematical model illustrates the
point about the large range of operations possible on the indicator level
(X''). We have generally been discussing the situation where the researcher
first works with X or X' (or often both), and then forms X'' by mapping
from both X and X'. In mathematical modeling it is common, in a sense, to
work with the indicator level (X'') first. This is done by using general
symbols such as X, Y and Z, and formulating quantitative relationships
between them.
THE MODEL OF COOMBS ET AL.
Coombs et al. (1954) divided the process of mathematical modeling into
six parts. They discussed two alternative means of arriving at physical
conclusions. One is by proceeding from the real world through experimental
design, through observations, to physical conclusions. The alternative is
to proceed from the real world through mathematical systems, through
mathematical conclusions, to physical conclusions (Coombs et al., 1954, p.
134). However, it is clear upon examination that all of these six facets fall
easily within the present three-level model, and do not indicate an extension
of the model.
Discussion
The major value of the three-level model is that it includes all three levels
actually used in the research process. A two-level model may be merely
incomplete, or it may entail the unwitting merger of two of the levels. In the
former instance we have an incomplete understanding of the measurement
process, and in the latter instance we have a resulting confusion which may
effectively preclude the development of further knowledge concerning the
measurement process.
However, with the three-level model clearly in mind as a guiding framework,
there may be occasions when two-level or single-level models can be
used efficaciously. Thus, the three-level model in no way precludes the use of
one- or two-level models. It simply subsumes them, and thus helps in
discerning confusions in these models and in understanding the relationships
among these models. Within the context of the three-level model, verbal
theorists can derive concepts deductively (X) and map them into verbal
essays on paper (X'') without the constraints imposed by empirical
investigation (X'). The concepts they derive which are isomorphic with work
done by researchers confining themselves solely to empirical data-gathering
(X') can be combined in an effective division of labor to yield a finished
measurement model. Concepts (X) which are not quickly seen to be isomorphic
with X' can be stored in documents (X'') until such time as they are
seen to be isomorphic, or until it is decided that they are not theoretically
or empirically viable.
The three-level model does not solve the considerable number of problems
now present in the measurement literature, but does provide a sound basis
for attacking these problems. One of the major areas where the three-level
model may facilitate progress is research on the notions of validity and
reliability. Blalock (1968, p. 13) noted that the notion of validity is used in
two different ways. One is the degree to which an indicator measures what it
is supposed to measure. However, in attempting to assess validity, two
indicators are often compared, as in the case of criterion validity. Blalock
(1968, p. 13) stated that these are two different types of validity that should
be given different names to lessen confusion. The three-level model clarifies
the notion of validity considerably. The classical notion of validity, as in the
case where an indicator measures what it is supposed to measure, has
generally been interpreted as a high degree of congruence between a concept
(X) and an indicator (X''). That is, an indicator Xi'' is said to be valid if
it is isomorphic with the concept Xi rather than with some other concept Xj.
The three-level model shows the notion of validity to be somewhat more
complicated. It is not sufficient for X and X'' to be isomorphic if X'' is not
isomorphic with X'. Reconsider the example of intelligence, where X is the
concept of intelligence, X' the empirical occurrence of intelligence, and X''
the indicator of intelligence (e.g., a pencil and paper intelligence test). It
is not sufficient that X and X'' are isomorphic if X' and X'' are not
isomorphic, because the latter indicates a high degree of measurement error.
That is, if X and X'' were isomorphic, but X' and X'' were not, then X'' would
indeed be measuring intelligence, but it would be measuring the wrong value of
it. If Jim has an intelligence level of 150 but X'' measures his intelligence
as 100, then this would seem to be little more valid than a case where X'' was
not measuring the concept of intelligence (X) at all, but was measuring some
other concept. Within the context of the three-level measurement model,
measurement is truly valid only when X'' is isomorphic with both X and X'.
This matter deserves extensive discussion and is clearly beyond the scope of
this paper. The discussion of validity and reliability within the context of the
three-level measurement model will be the topic of a subsequent paper.
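The two-part requirement for validity stated above can be illustrated with a
deliberately crude Python sketch. It rests on strong assumptions that are not
in the original paper: isomorphism between X and X'' is reduced to the
indicator tapping the intended concept, isomorphism between X' and X'' is
reduced to numerical agreement within a tolerance, and the true value X' is
treated as known, which in practice it is not. The class and function names
are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    intended_concept: str   # the concept X the researcher has in mind
    measured_concept: str   # the concept the indicator X'' actually taps
    true_value: float       # the empirical value X' (assumed known here)
    indicated_value: float  # the value reported by the indicator X''


def is_valid(m, tolerance=0.0):
    """Valid only if X'' agrees with both the concept X and the value X'."""
    matches_concept = m.intended_concept == m.measured_concept
    matches_value = abs(m.indicated_value - m.true_value) <= tolerance
    return matches_concept and matches_value


# The example from the text: the right concept, but the wrong value.
jim = Measurement("intelligence", "intelligence", 150, 100)

# A hypothetical contrasting case: the right value, but the wrong concept.
other = Measurement("intelligence", "test anxiety", 150, 150)

print(is_valid(jim), is_valid(other))
```

Under these assumptions both cases fail, which mirrors the argument that
measuring the right concept at the wrong value is little better than
measuring some other concept.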
There are a number of other facets of the measurement process which
remain to be examined within the context of the three-level model. These
include the use of multiple indicators of a single concept, direct and indirect
measurement, the relationship between inductive and deductive reasoning
(including the notion of grounded theory), and the role of mathematical
models and formal logics in the measurement process. Also of interest is the
distinction between discrete classes and variables, because concepts (X) are
often stated as classes, whereas indicators (X'') and statements about actual
values (X') are often stated as variables. Also notice that in this paper I have
not only extended the number of levels in the measurement model from two
to three, but have also broadened the notion of an indicator. That term is
often limited to scales that are ordinal or interval, and I have broadened it to
include not only nominal variables but even verbal-type labels. All of these
are indicators (X'') and must be carefully distinguished from concepts (X)
and empirical values (X').
Still a further point for consideration is whether the term "level" is the
best, or whether some other term, such as "dimension" or "coordinate", should
be used to signify X, X' and X''. Strictly speaking, there is no hierarchy
among X, X' and X'' as the term "level" implies. Nevertheless, two-level
models have generally been presented as though there was a hierarchy, with
the conceptual level written above the empirical level in the diagram. I
cannot recall ever seeing the empirical level on top and the conceptual level
on the bottom.
References
Bailey, K.D. (1973). Monothetic and polythetic typologies and their relation to
conceptualization, measurement and scaling, American Sociological Review 38:
18-33.
Bailey, K.D. (1978). Methods of Social Research. New York: Free Press.
Bierstedt, R. (1959). Nominal and real definitions in sociological theory, pp.
121-144 in Llewellyn Gross, ed., Symposium in Sociological Theory. New York:
Harper and Row.
Blalock, H.M., Jr. (1968). The measurement problem: a gap between the languages
of theory and research, pp. 5-27 in Hubert M. Blalock and Ann B. Blalock, eds.,
Methodology in Social Research. New York: McGraw-Hill.
Capecchi, V. (1966). Typologies in relation to mathematical models, Ikon
supplementary number 58: 1-62.
Coleman, J.S. (1964). Introduction to Mathematical Sociology. New York: Free
Press.
Coombs, C.H. (1953). Theory and methods of social measurement, pp. 471-535 in
Leon Festinger and Daniel Katz, eds., Research Methods in the Behavioural
Sciences. New York: Dryden Press.
Coombs, C.H., Raiffa, H. and Thrall, R.M. (1954). Some views on mathematical
models and measurement theory, Psychological Review 61: 132-144.
Costner, H.L. (1969). Theory, deduction, and rules of correspondence, American
Journal of Sociology 75: 245-263.
Durkheim, E. (1954). The Elementary Forms of Religious Life. New York: Free
Press.
Guttman, L. (1944). A basis for scaling qualitative data, American Sociological
Review 9: 139-150.
Hauser, R.M. and Goldberger, A.S. (1971). The treatment of unobservable
variables in path analysis, pp. 81-117 in Herbert L. Costner, ed., Sociological
Methodology 1971. San Francisco: Jossey-Bass.
Hempel, C.G. (1952). Typological methods in the natural and social sciences,
Proceedings of the American Philosophical Association, Eastern Division 1:
65-86.
Leik, R.K. and Meeker, B.F. (1975). Mathematical Sociology. Englewood Cliffs,
NJ: Prentice-Hall.
Likert, R. (1932). A technique for the measurement of attitudes, Archives of
Psychology 21, No. 140: 1-30.
Lundberg, G.A. (1939). Foundations of Sociology. New York: Macmillan.
Mayer, L.S. and Younger, M.S. (1974). Multiple indicators and the relationship
between abstract variables, pp. 191-211 in David R. Heise, ed., Sociological
Methodology 1975. San Francisco: Jossey-Bass.
McKinney, J.C. (1966). Constructive Typology and Social Theory. New York:
Appleton-Century-Crofts.
Miller, J.G. (1978). Living Systems. New York: McGraw-Hill.
Northrop, F.S.C. (1947). The Logic of the Sciences and the Humanities. New York:
Macmillan.
Parsons, T. (1937). The Structure of Social Action. Glencoe: Free Press.
Sorokin, P.A. (1964). Sociocultural Causality, Space, Time. New York: Russell
and Russell.
Thurstone, L.L. and Chave, E.J. (1929). The Measurement of Attitudes. Chicago:
University of Chicago Press.
Werts, C.E., Joreskog, K.G. and Linn, R.L. (1972). Identification and estimation
in path analysis with unmeasured variables, American Journal of Sociology 78:
1469-1483.
Winch, R.F. (1947). Heuristic and empirical typologies: a job for factor
analysis, American Sociological Review 12: 68-75.
