0033-2909/95/$3.00
Psychological Bulletin
1995, Vol. 117, No. 2, 187-215
This article benefited greatly from the counsel of a number of colleagues. I must exculpate them regarding its remaining deficiencies. My
especial thanks go to Lew Goldberg, David Harrington, Robert Hogan,
Oliver John, Robert McCrae, Philip Shaver, and Auke Tellegen, among
others.
Preparation of this article was supported in part by National Institute
of Mental Health Grant MH 16080.
Correspondence concerning this article should be addressed to Jack
Block, Department of Psychology, University of California, Berkeley,
California 94720-1650.
JACK BLOCK.
Because the Big Five formulation was entirely atheoretical, usage of the term "model" may be premature.
Moreover, Big Five research has been based only on the relations among a set of variables across individuals, what Cattell
(1946), in his incisive formulation of the "data box," termed
the R-technique (or "variable-centered") approach to the analysis of personality data. As he observed, although the R-approach to data analysis is important, there are other important
ways of looking at personality. In particular, it should be recognized that no matter how satisfying on descriptive or other
grounds the variable-centered factor structure of the FFA may
be, it cannot represent a personality structure. Personality
structures lie within individuals (see also Block, 1971, p. 13;
John, 1990, p. 96). It is the personality structure of an individual that, energized by motivations, dynamically organizes perceptions, cognitions, and behaviors so as to achieve certain "system" goals. No functioning psychological "system," with its
rules and bounds, is designated or implied by the "Big Five"
formulation; it does not offer a sense of what goes on within
the structured, motivation-processing, system-maintaining
individual.
On this analysis, it follows that a more appropriate, and more
limiting, label for the "Big Five" body of work is something like
the inordinately cumbersome phrase, "the five-factor, variable-centered approach to personality descriptions." Of practical ne-
able, shoelace-tying competency, in one form or another. Variable 100 is a fully reliable, fully valid but anonymous or unrecognized measure of the latent variable, general intelligence,
which is presumed here to not correlate at all with the first 99
measures. Factor analysis of this matrix will issue one general
factor explaining all of the communal variance. An unsophisticated enthusiast of factor analysis would be impressed by the
finding of so powerful a factor "explaining" so much of the covariance. Further, the lonely, nameless, intelligence measure
would likely be viewed as a "residual" and dismissively consigned to obscurity. Yet, relating these factor analytic findings
to the wider, external world beyond the mathematical solipsism
of the particular factor analysis would reveal that the "powerful" general factor has no or trivial implications, whereas
the ignored residual variable has momentous behavioral
significance.
Of course, this example is a contrived extreme. But the point
made applies to the real world of data to an unrecognized or
unacknowledged extent: the amount of variance "explained"
internally by a factor need not testify to the external psychological importance of the factor.
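The contrived example can be checked numerically. The sketch below is an illustration only, not an analysis reported in this article: it simulates 99 noisy measures of a single latent variable plus a 100th variable that is unrelated to them, and it uses the eigenvalues of the correlation matrix (a principal-components stand-in for a factor analysis) to see how much variance the first dimension claims. The sample size, the loading of 0.8, and the external "outcome" criterion are all assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people = 2000            # assumed sample size
n_redundant = 99           # the 99 "shoelace-tying" measures

# Latent shoelace-tying competency and an independent latent intelligence.
shoelace = rng.standard_normal(n_people)
intelligence = rng.standard_normal(n_people)

# 99 measures that each load 0.8 (an assumed value) on shoelace competency.
loading = 0.8
noise = rng.standard_normal((n_people, n_redundant))
measures = loading * shoelace[:, None] + np.sqrt(1 - loading**2) * noise

# Variable 100: a fully reliable measure of intelligence.
data = np.column_stack([measures, intelligence])

# Eigenvalues of the correlation matrix, largest first.
corr = np.corrcoef(data, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]
share_first = eigvals[0] / eigvals.sum()

# A hypothetical external criterion that depends only on intelligence.
outcome = intelligence + 0.5 * rng.standard_normal(n_people)
general_factor = measures.mean(axis=1)    # simple proxy for the general factor
r_factor = np.corrcoef(general_factor, outcome)[0, 1]
r_residual = np.corrcoef(data[:, -1], outcome)[0, 1]

print(f"share of total variance on first dimension: {share_first:.2f}")
print(f"general factor vs. outcome: r = {r_factor:.2f}")
print(f"variable 100 vs. outcome:   r = {r_residual:.2f}")
```

Under these assumptions the first dimension absorbs most of the variance in the matrix yet is essentially unrelated to the criterion, whereas the lone "residual" variable predicts it strongly.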
The mix of variables submitted to factor analysis can be varied with the consequence that the factors then obtained will, in
effect, be entailed rather than represent substantive findings. A
small factor can be made large, a large factor can be made small,
residuals can be made into "factors" (or discarded as
irrelevant). One can make for a "simple structure" by dropping
factorially complex ("interstitial") variables or prevent "simple
structure" (and even create a circumplex configuration) by
adding factorially complex variables that "blend" different
sources of variance. If the locations of axes in one "simple structure" are not satisfying, it is possible by judicious selection and
deletion of variables to relocate the axes and have quite another
"simple structure." One can achieve such ends by just varying
the degree and the location of redundancy within the variable
set. In particular, a set of variables can be "prestructured." That
is, wittingly or unwittingly, the set of variables subjected to factor analysis may have been previously selected so as to contain
several quite different subsets or "clusters" of redundant variables. Such "prestructuring" preordains the "factors" subsequently "found" by the algorithmic operations of factor
analysis.
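The effect of varying redundancy within the variable set can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any data discussed here: two independent latent traits are each measured by several equally good indicators, and the helper `first_two_shares` (a hypothetical name) reports the share of total variance carried by the two largest eigenvalues of the correlation matrix. The only thing that changes between the two runs is how many redundant indicators of trait A are included.

```python
import numpy as np

def first_two_shares(n_a, n_b, n_people=2000, loading=0.7, seed=1):
    """Share of total variance on the two largest eigenvalues when
    trait A has n_a redundant measures and trait B has n_b."""
    rng = np.random.default_rng(seed)
    trait_a = rng.standard_normal(n_people)
    trait_b = rng.standard_normal(n_people)   # independent of trait A
    unique = np.sqrt(1 - loading**2)
    a_vars = loading * trait_a[:, None] + unique * rng.standard_normal((n_people, n_a))
    b_vars = loading * trait_b[:, None] + unique * rng.standard_normal((n_people, n_b))
    corr = np.corrcoef(np.column_stack([a_vars, b_vars]), rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # largest first
    return eigvals[:2] / eigvals.sum()

balanced = first_two_shares(n_a=4, n_b=4)    # equal redundancy for both traits
padded = first_two_shares(n_a=12, n_b=4)     # trait A padded with extra indicators

print("balanced set:", np.round(balanced, 2))
print("padded set:  ", np.round(padded, 2))
```

Nothing about the two traits differs between the runs; the apparent size of each dimension follows from how the variable set was assembled.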
It is my view that the frequent presence and the powerful
effects of "prestructuring" are often not sufficiently recognized
by those using the method of exploratory factor analysis. In particular, it will be suggested later that, in crucial ways, influential
demonstrations of the sufficiency of the FFA may have been unduly influenced by prior prestructuring of the personality variables used in these analyses.1 If so, then the "recurrence" and
"robustness" over diverse samples of factor structures may be
attributable more to the sameness of the variable sets used than
to the intrinsic structure of the personality-descriptive domain.
Although the method of factor analysis has been used for al
1
Protagonists of the FFA recently have been acknowledging the crucial influence on factor solutions of the particular sets of measures used.
For example, Costa and McCrae observe that "the axes chosen by a
varimax rotation will depend completely on the selection of variables"
(1992a, p. 661). Goldberg (1993) also acknowledges this problem.
Role of Allport
The FFA approach in America may be said to have begun
with Allport and Odbert's (1936) onerous compilation of all
the terms in the 1925 unabridged, 400,000 word edition of
Webster's New International Dictionary they judged as usable
"to distinguish the behavior of one human being from that of
another" (p. 24). They came up with 17,953 single-word descriptor terms. To this enormous collection, Allport and Odbert
applied their definition of "trait" as "generalized and personalized determining tendencies--consistent and stable modes of
an individual's adjustment to his environment" (p. 26) and
came up with a primary list of only (sic) 4,504 nonjudgmental
"trait-names," still a very large set. They suggested that their
alphabetical listing of trait-names might prove to be a useful
resource for psychologists developing rating scales and the like.
However, Allport warned that "common speech is a poor guide
to psychological subtleties" (Allport, 1961, p. 356).
Role of Cattell
Next on the scene was Raymond Cattell (1943a, 1943b,
1947). Cattell subscribed to what subsequently has become
known as the lexical hypothesis: "All aspects of human personality which are or have been of importance, interest, or utility
have already become recorded in the substance of language"
(1943b, p. 483). He began with the Allport and Odbert trait-name listing but, noteworthy and often not noted (but see John,
Angleitner, & Ostendorf, 1988), he deemed it insufficient.
To make the list of traits as complete as possible. . . in addition to
all that could be obtained from the dictionary, the substance of
all syndromes and types which psychologists have observed and
described in the past century or so [were added]. (Cattell, 1943a,
p. 491)
An issue must be raised here that will be more fully considered later in this article. Can the psychological perceptiveness of
Air Force officers and officer candidates, as quickly expressed
by 3-point ratings on 30 or so scales in an officially required
research program regarding 12 to 30 of their peers known for
such short periods, provide a fundamental data basis for discerning the essential dimensions for the scientifically sufficient
description of personality?
In six of the analyses, 8 orthogonal factors were extracted; 5
factors were extracted in another analysis and 12 factors were
extracted in the analysis of the female sample. Summarizing
their findings, Tupes and Christal concluded that in each of the
eight analyses based upon the Cattell variables, "five fairly
strong and recurrent factors emerged." They were impressed,
and later other psychologists were impressed, by the marked
congruence among the five-factor patterns derived in so many
analyses.
I suggest that, for technical reasons, the degree of recurrence
of this five-factor structure over their eight analyses may not be
so striking as has been assumed. Although the analysis of their
first Air Force sample used the centroid method of factor analysis prevalent at the time, followed by subjective rotation, the
remaining seven factor analyses all employed the multiple-group method of factor analysis. The multiple-group method of
factor analysis (see, e.g., Harman, 1967, chap. 11), now long-abandoned and mostly forgotten, was frequently used in the
precomputer era to lessen the laborious calculation centroid
factor analysis otherwise required. The multiple-group method
permitted the extraction of a number of factors in one labor-saving operation. The centroid method, on the other hand, extracted factors serially and involved protracted computations
of successive residual matrices. To lessen the subsequent computations, the multiple-group factor analyst first had to partition the set of variables into a number of groups (i.e., anticipated factors) according to preconceived hypotheses. All these
preconceived factors then were extracted simultaneously, followed by the calculation of a residual matrix. This residual matrix, if its elements were sizable, might then be subjected to further analysis, usually by the centroid method.
In the first of the Tupes and Christal factor analyses, eight
factors were arduously extracted via the centroid method.
These factors were then subjectively rotated toward orthogonal
simple structure so as to residualize three of them (i.e., via rotation, the loadings of variables on three factors were deliberately made generally low enough so that these factors could be
said to be unimportant, thus justifying their elimination from
subsequent consideration). However, in the subsequent seven
factor analyses, wherein the multiple-group method was used,
the variables were grouped into five subsets prestructured so as
to correspond to the five rotated factors decided upon in the first
study. That is, the orthogonal factor structures of the last seven
analyses were created to conform insofar as possible with the
factor structure solution settled upon in the first analysis. In
effect, Tupes and Christal used their first factor solution as the
target matrix for all their subsequent analyses. Factors representing other than these five pregrouped sets of variables were
residualized, except in the female sample in which the fifth factor was split into two subfactors. As Horn (1967) has compellingly observed, Procrustean rotations to fit a target matrix can
show seemingly impressive congruence with the target even
when random variables are involved. Inevitably, then, the prestructured solutions used in the last seven Tupes and Christal
multiple-group factor analyses fitted the target matrix at least in
part because of "fitting error." The extent to which capitalization on chance was involved has never been evaluated, but it
certainly underlies a portion of the factor "recurrence"
observed.
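The point about fitting a target can be sketched with an orthogonal Procrustes rotation applied to loadings that are pure noise. This is an illustration only: it uses an orthogonal Procrustes solution computed from a singular value decomposition, which is not the multiple-group procedure Tupes and Christal used, and the matrix sizes, the target pattern, and the noise level are assumptions. The Tucker congruence coefficient is used as the index of agreement with the target.

```python
import numpy as np

rng = np.random.default_rng(2)
n_vars, n_factors = 30, 5

# Target: an idealized simple structure, six variables per factor.
target = np.zeros((n_vars, n_factors))
for j in range(n_factors):
    target[j * 6:(j + 1) * 6, j] = 0.7

# "Loadings" that are pure noise and carry no structure at all.
random_loadings = rng.normal(0.0, 0.3, size=(n_vars, n_factors))

def congruence(a, b):
    """Tucker congruence coefficient for each pair of matching columns."""
    num = (a * b).sum(axis=0)
    den = np.sqrt((a**2).sum(axis=0) * (b**2).sum(axis=0))
    return num / den

# Orthogonal Procrustes: the rotation R that minimizes ||A R - target||.
u, _, vt = np.linalg.svd(random_loadings.T @ target)
rotation = u @ vt
rotated = random_loadings @ rotation

before = congruence(random_loadings, target)
after = congruence(rotated, target)
print("mean congruence before rotation:", round(before.mean(), 2))
print("mean congruence after rotation: ", round(after.mean(), 2))
```

Because the rotation is chosen to fit the target, agreement with the target cannot decrease and no factor ends up negatively aligned with it, even though the input contains no structure. How large the gain is depends on the number of variables and factors, which is why the contribution of chance needs to be evaluated rather than assumed away.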
A more important concern to register, however, is that the
semantic structure underlying the Cattellian variables used in
all the analyses by Tupes and Christal may have intrinsically
predestined the factor solutions subsequently observed. For example, their first factor was defined by variables labeled as "secretive," "silent," "self-contained or reclusive" as opposed to
"sociable," and "talkative." Referring to Soule's Dictionary of
English Synonyms (Sheffield, 1959), under the term "secretive," there is the following entry (cited completely): "reserved,
reticent, close, uncommunicative, cautious, wary, taciturn" (p.
470). As these synonyms attest, the variables identifying the
first factor of Tupes and Christal were highly redundant. Therefore, if the subjects providing the basic data were not responding
incoherently, there is no problem in understanding how such
redundant rating scales will, in a factor analytic context, generate a factor dimension. A second factor was defined by such
variables labeled as "composed," "calm," "placid," and
"poised." Again referring to Soule, under the heading "composed" is the following: "calm, quiet, unruffled, undisturbed,
unmoved, tranquil, placid, sedate, collected, self-possessed, imperturbable, cool" (p. 109). Although not included under
"composed," the term "poised" when referenced identifies
"composure" as a synonym. It is obvious how such virtually
synonymous rating scales will generate a mathematical factor.
A third factor was based on such ratings as "good-natured,"
"cooperative," and "mild;" a fourth factor stemmed from ratings of "artistic," "imaginative," and "intellectual;" a fifth factor emerged from ratings of the equivalent scales "responsible,"
"scrupulous," and "seeing a job through in spite of difficulties
or temptations." It is not surprising when semantically related
variable sets prove to load on the same factor; as these terms are
used by often inarticulate or language-insensitive raters, their
redundancies are great. Consequently, their factorial equivalencies may only testify to the reliability and coherence of the ratings made of the subjects. Rather than representing truly substantive findings, the Tupes and Christal factor findings may
simply reflect the prestructuring that somehow crept into and
characterizes the Cattell-specified variables. Therefore, although five factors may well characterize the data sets Tupes
and Christal created or used, this finding may not be of great
moment.2
2
The preceding discussion does not reintroduce the Shweder (1975)
argument that personality judgments are "no more than statements
about how respondents classify things as alike in meaning" (p. 482). As
Block, Weiss, and Thorne (1979) observed, "in no way whatsoever is it
Agreeableness--under which any number of importantly distinguishable personality qualities may escape analysis. I further
suggest, on similar grounds, that the other four far-flung, allusive labels taken over by Tupes and Christal should not have
entered the personality firmament on the basis of this early factor analytic review.
Role of Norman
The unpublished Air Force technical report by Tupes and
Christal might well have languished, unattended and without
consequence, had not Norman (1963) picked up the baton.
Norman had taxonomic concerns. He noted that "the construction of more effective theories of the development, structure, and functioning of personality will be facilitated by having
available an extensive and well-organized vocabulary by means
of which to denote the phenotypic attributes of persons" (p.
574). And he accepted the Allport and Odbert argument that
"perceptible differences between persons in their characteristic
appearance or manner of behaving or changes over time and
situations of single individuals in these regards become codified
as a subset of the descriptive predicates of the natural language
in the course of its development" (p. 574). Norman's discussion
of the issues involved is sober and sophisticated: "It is explicitly
not assumed that complete theories of personality will simply
emerge automatically from such taxonomic efforts. . . . There
is a good deal more to theory construction . . . than the
development of an observation language--even a good one"
(p. 574).
However, his empirical offering consisted only of a replication
of the study of Tupes and Christal. On the basis of the factors
identified by Tupes and Christal, Norman selected 20 of their
Cattellian variables (the four variables best representing each of
the five factors). By his variable selection procedure, he limited
himself to the variables most likely to demonstrate simple structure. He then had several groups of undergraduates offer peer
ratings using this restricted set of variables. By this narrowed
selection from among what has already been suggested may be a
prestructured set of variables, it follows that the factor structure
subsequently observed may have been a foregone conclusion.
Norman's findings, however, were subsequently viewed as further empirical support for the existence, primacy, and perhaps
sufficiency of the five orthogonal personality factors reported
by Tupes and Christal (Norman renamed the "Dependability"
factor as Conscientiousness, a replacement label that subsequently achieved preference).
It should be noted that in collecting peer ratings by laypersons
on a restricted set of 20 variables and in emphasizing the basic
importance of the five personality factors subsequently extracted from this kind of data, Norman would seem to have
shifted from his announced focus on the development of a scientifically oriented language for the description of personalities
to a study of the way laypersons use a constrained language to
characterize other laypeople. There are connections--important, useful, even crucial connections--between these two emphases, but they are not the same. Psychologists certainly can
learn a great deal for their science by studying the nature of lay
observations. But it does not follow that lay usages should be
taken over to provide the basic concepts of the field of personality psychology.
In contemplating his 1963 study, and prompted by the suggestion of Tupes and Christal that the Cattellian variables might
have omitted some of the personality concepts residing in the
Allport-Odbert adjectives, Norman decided that "it was time to
return to the total pool of trait names in the natural language
there to search for additional personality indicators not easily
subsumed under one or another of these five recurrent factors"
(1963, p. 582).
To locate personality terms that were new or previously omitted, Norman (1967) searched a later unabridged edition of
Webster and found 175 single-word descriptors to add to the list
compiled by Allport and Odbert. Then, applying his own set
of decision rules for sorting through this descriptor lexicon, he
excluded terms judged to be "purely evaluative and mere quantifiers," "ambiguous, vague, and . . . metaphorical," "very
difficult, obscure, and little-known," and "anatomical, physical
and grooming characteristics." The remaining terms he further
sorted into the categories of "stable traits," "temporary states,"
and "social roles." Excluding the latter two categories as inappropriate for the purpose of personality description, Norman
was left with 2,800 single-word descriptors deemed to represent
"stable traits." These remaining terms were then administered
to undergraduates to empirically assess their understandability
by laypersons, their social desirability, and the degree to which
undergraduates believed these terms were descriptive of themselves and their peers. After Norman removed terms he judged
to be ambiguous or unfamiliar to the typical college student or
redundant, 1,431 terms remained. With this last culling, Norman had a pool of descriptors he believed approached suitability for "the development of a structured taxonomy."
Norman then proceeded to a further, semantic sorting of his
"stable trait" terms that, although never formally reported and
published by him (but described by Briggs, 1992; Goldberg,
1981, 1990; John et al., 1988; John, 1990), may have influenced later research findings. Impressed by the factor structure
that he, Tupes, and Christal had identified earlier, and using his
understanding of the psychological meaning of the factors, Norman personally sorted his 1,431 terms according to their judged
fit into his five dimensions and assigned the terms to the positive
or negative pole of each dimension. As a final step, he examined
the terms at each pole and further formulated what he judged
to be semantic clusters within each pole. In all, he sorted his
1,431 terms into 75 semantic clusters specifying one or the other
end of the five-factor dimensions he viewed as paramount.
Fewer than 25 terms were left unclassified by Norman. The
John, Angleitner, and Ostendorf (1988) historical review may
be consulted for closer information regarding this Norman
effort. Although Norman did not proceed further with his terms
and trait clusters, his delineation and structuring of the trait
lexicon provided an important starting point for subsequent
efforts to advance a lexically based trait taxonomy.
By the mid-1960s, the initial phase of the FFA may be said
to have ended. Although a number of subsequent articles (e.g.,
Borgatta, 1964; Digman & Takemoto-Chock, 1981; Smith,
1967) also reported similar five-factor solutions of rating data,
these studies all were based on various versions of the Cattellian
or Norman variable sets and so these later studies can be classi-
It is my belief that, for scientific purposes, single-word descriptors, although useful for many purposes, cannot convey
crucial features of personality, its dynamic functioning, the conditionalities of behavior consequent upon character structure,
the relations among phenotypical behavioral characteristics
that underlie "equifinality" (Bertalanffy, 1952) or permit
"multipotentiality." We need to use sentences, paragraphs,
pages, chapters, and books to begin to do justice to the understandings we have or must develop.
For example, how does one convey with a single adjective or
a number of separate, unlinked adjectives what may be called
the "pecking order personality," the kind of person who is affable with peers, deferent to superiors, and nasty to individuals of
lower rank? How does one convey in a single word or in a number of separate unlinked words the nature of the hysteric personality--its rigidity conjoined with impulsivity and with "la
belle indifference?" How does one convey the kind of individual
who is so disorganized or capturable by a compelling social surround as to be negligent in fulfilling responsibilities but who
subsequently is racked by guilt? How does one convey with suitable single-word descriptors the person who, confronted with
an anxiety-inducing decision situation, is quickly decisive, not
with the confidence that rapid decision is so often interpreted to
imply but only to get past the stress of the situation? How does
one convey the kind of person who, in desperate circumstances,
becomes unnaturally calm and poised? As an exercise, the
reader is invited to read or re-read Shapiro's penetrating volume on Neurotic Styles (1965) to evaluate how well single-word
descriptors, unconnected, uncontextualized, unconditioned,
can represent the complexities and the complications of personalities. My own belief is that they cannot do the scientific job
the field of personality psychology requires.
Regarding Reliance on Laypersons to Specify Personality
Descriptors and to Provide Personality Descriptions
Throughout his work, Goldberg has used "stable trait" descriptors identified by Norman or by himself as understandable
by laypersons. Trait-adjectives not understandable by laypeople
have been excluded from subsequent consideration. Typically,
undergraduates have served as criterion groups for assessing understandability. Using various sets of adjectives selected as being
understandable to laypersons, laypeople have rated themselves
or others. The Big Five factor structures have derived from these
descriptions offered by laypersons based on trait-adjectives
judged to be understandable to laypersons. Several aspects of
this dependence on lay judgments concern me.
Given the aspirations of the FFA to provide a scientifically
compelling representation of the dimensions of personality
differences, consider the filtering of the set of "stable traits" to
exclude those terms unfamiliar or unclear to college undergraduates. Why were undergraduates used as the criterion group to
determine the set of personality descriptors to be used? Why
not 12-year-olds? Why not 5-year-olds? Why not 45-year-old
psychological clinicians?
Reflexively, we reject the idea of basing our personality language solely on the terms familiar to and understandable by
young children or by preadolescents. The reason why, when
some thought brings it forward explicitly, is that we believe a
certain level of cognitive development has to be reached before
one can talk sensibly about personality functioning and personality description. The work of Peevers and Secord (1973) and
Livesley and Bromley (1973), among others, is relevant here.
These developmental studies of how language is used for interpersonal description show that, over time, the words used to
describe people progress through a predictable sequence of
Graziano (1992) adds evidence that many of the standard adjectives used in personality assessment are unfamiliar to college
students; a study by Beck, McCauley, Segal, and Hershey
(1988) further illustrates the great individual differences in the
way trait-adjectives are understood.
These findings are perturbing for those who would use adjectives with laypersons for self-ratings and other-ratings. It,
therefore, is of special interest to observe what positively influences the consistency (implying an underlying, consensual
meaningfulness) of adjective usage. Goldberg and Kilkowski
find that verbal intelligence is associated with consistent usage
(see also Hampson, John, & Goldberg, 1986). Willingness to
work on an assigned task also contributed to meaningful adjective ratings. An index of "general adjustment" was strongly
related to consistency of adjective usage (r = .71). Elaboration
on the single-word descriptor to provide several phrases to define, clarify, and calibrate its meaning greatly improved the
internal consistency and, implicitly, the validity of subjects'
ratings, thus further demonstrating the shortcomings of single-word adjectival descriptors.
Goldberg argues for the "inherent advantages" of the cluster-sampling method on the ground that "cluster sampling provides
a simple-structured set of variables" (p. 28) that can be expected to issue orthogonal factor markers.
However, the use of the cluster sampling approach necessarily
presumes that one knows which are the interstitial variables,
that is, one already has firmly fixed the locations of reference
axes. The mathematical method of factor analysis can only indicate the number of dimensions in the factor space; it offers no
guidance whatsoever as to where reference axes within the n-dimensional space should be positioned. For this fundamental
decision, conceptual and substantive considerations external to
the factor analytic method ordinarily must come into play. In
factor analyses, core variables and interstitial variables in the
conceptual context of one placement of the reference axes
would reverse their positions of primacy within an equally tenable but alternative placement of reference axes. An illustration
of such a reversal is the striking 45° divergence, on crucial theoretical grounds, of Gray (1981) and Eysenck (1967) in their
preferred positionings of two reference axes. The dimensions
Eysenck calls Neuroticism and Extraversion are conceptualized
after Gray's rotation as Anxiety and Impulsivity. Another example of a change in interpretation by a difference in factor
positioning is the preference of McCrae and Costa (1989b) to
reinterpret the Wiggins (1982) dimensions of Dominance and
Nurturance as exemplifying their own dimensions of Extraversion and Agreeableness.
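The reversal of core and interstitial variables under an alternative axis placement can be shown with a two-dimensional sketch. The loadings below are hypothetical values invented for the illustration; they are not taken from Gray, Eysenck, or any study cited here. The same four variables are described first on one pair of orthogonal axes and then on a pair rotated by 45 degrees.

```python
import numpy as np

# Hypothetical loadings on two orthogonal reference axes (axis 1, axis 2).
loadings = np.array([
    [0.80, 0.00],    # "core" marker of axis 1
    [0.00, 0.80],    # "core" marker of axis 2
    [0.57, 0.57],    # interstitial: blends both axes
    [0.57, -0.57],   # interstitial: blends both axes
])

# Orthogonal rotation of the reference axes by 45 degrees.
theta = np.deg2rad(45)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
rotated = loadings @ rotation

# Communalities (row sums of squared loadings) are unchanged by rotation.
communality_before = (loadings**2).sum(axis=1)
communality_after = (rotated**2).sum(axis=1)

print(np.round(rotated, 2))
```

After the rotation, the two former core markers load about equally on both axes, while the two former interstitial variables load on only one axis each. The fit to the data is identical in both descriptions, because an orthogonal rotation leaves every communality unchanged; only the interpretation differs.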
A way to fix axis locations is to prevent competing possibilities from arising. Then, of course, only one possibility of locating the reference axes can be found. As Goldberg notes, Norman (1963), by selecting the strongest factor representatives reported by Tupes and Christal, directly minimized interstitial
variables and almost guaranteed a simple factor structure with
factor-univocal variables. It may be that in the sequence of adjective culling, cluster culling, and cluster forming procedures
by Norman and by Goldberg, which eventuated in very much
the same five factors, adjectives and cluster-based variables interstitial to these factors were unknowingly de-emphasized. Unfortunately, one cannot be sure that this thinning out of structure-complicating adjectives and synonym clusters has occurred. It, therefore, would be helpful if the immense task of
adjective and cluster culling and cluster forming were to be replicated by disinterested investigators. Until an independent verification has been established, the proposal that the lexically
based five factors be accepted as the conceptual framework for
the scientific study of personality would seem premature.4
Regarding the Restricted Context in Which Five Factors
Emerge and Are Orthogonal
Repeatedly, the lexical Big Five factors have been described
as orthogonal or "nearly orthogonal" to each other (e.g., Goldberg, 1992). However, the empirical research findings indicate
that the five factors are frequently importantly correlated
with each other, usually to reflect an overriding evaluative
component.
Thus, consider the serially developed, highly refined 100-item
five-factor marker set recently presented by Goldberg (1992,
Study 4). A "nearly orthogonal" Big Five factor structure
emerges from the markers when data restricted to self-descriptions or the descriptions of liked others are analyzed. However,
3
Anomalous findings persist, however. For example, in Study 3, "passionless" is a central component of the cluster, "placidity," which in turn
is a definer of the positive pole of Factor 3, Emotional Stability. In Study
1, the cluster, "passionlessness," is a definer of the positive pole of Factor
3, Conscientiousness.
4
A recent monograph by Ostendorf (1990) warrants mention. He
reports, based on a complicated, arduous, and methodologically sophisticated study of lay self- and other-ratings on 430 German single-word
descriptors, that five factors similar to the English lexical Big Five
emerge. I agree but note that he obtained at least eight highly replicable
factors, according to the well-regarded criterion of Everett (1983) (see
Ostendorf's Tables 46 and 47).
tions, the steps along Costa and McCrae's path of Big Five advocacy must be retraced.
Beginnings of the Costa and McCrae FFA
In their first study prefiguring what became their particular
FFA, Costa and McCrae (1976) applied cluster analysis (Tryon
& Bailey, 1970) to the Sixteen Personality Factor (16PF) Questionnaire of Cattell (Cattell, Eber, & Tatsuoka, 1970), using
data from 3 groups of subjects. Three clusters were extracted. A
first cluster, reflecting about 21% of the variance, was specifically likened by them to the Neuroticism concept of Eysenck
(1970). A second cluster, encompassing about 14% of the variance, was specifically likened to Eysenck's concept of Extraversion-Introversion.
As Costa and McCrae duly noted, versions of both of these
cluster dimensions had been "consistently observed in the personality literature for over 50 years" (p. 569) and had long been
known to have many and diverse behavioral implications (see,
for only one example, Block, 1965). Indeed, Wiggins (1968)
had already designated these two omnipresent questionnaire dimensions as "the Big Two." However, although the Big Two had
long been recognized, large differences of opinion had long existed, and still exist, as to their latent or conceptual meaning.
The Eysenckian meaning subscribed to at this time by Costa
and McCrae has by no means been uniformly accepted (see,
e.g., Block, 1965, chap. 8; Block & Block, 1980, pp. 44-47, 49-50; Guilford, 1975, 1977; Tellegen, 1985; Tellegen & Waller, in
press; Watson & Tellegen, 1985). Some personologists have
viewed these dimensions broadly, and others have viewed them
more narrowly. By these differences in scope or emphasis, the
psychological flavor of the Big Two changes appreciably.
The third cluster in this Costa and McCrae study was not well
represented; it accounted for only about 6% of the variance and
was inconsistently and unreliably present in the data analyzed.
Only Cattell's "imaginativeness" variable consistently defined
this cluster, and its estimated reliability averaged only .47. Nevertheless, Costa and McCrae were intrigued by this third cluster
and suggested that it intimated, albeit inadequately, a dimension of "openness to experience."
In a subsequent study using the same samples of veterans
(Costa & McCrae, 1978), Costa and McCrae sought to amplify
the measurement of this third cluster. Twelve 16PF variables
were joined with six additional scales specifically "intended as
a replacement for the unstable third cluster of the 16PF" (p.
128). Three of these new scales came from Coan's (1972) prior
inventory to measure openness to experience, and three more
were rationally constructed by Costa and McCrae who were influenced by Tellegen and Atkinson's (1974) report on "absorption." These latter psychologists had developed an unusual questionnaire scale, not related to "the Big Two," that in its content
and correlates seemed to reflect a susceptibility or openness to
environmental surrounds.
Applying factor analysis and varimax rotation to these 18
variables, the desired three-factor solution was obtained. The
third factor was now fattened because of the content redundancy provided by the six newly introduced openness-to-experience scales. However, although the third factor had become
more of a presence, it still was not clearly definable. While the
variables defining Neuroticism and Extraversion had mean factor loadings of .76 and .69, respectively, the mean factor loading
of the variables defining Openness was only .49. Nevertheless,
this last dimension was conceptually attractive to Costa and
McCrae and, as measured, Openness seemed to be essentially
unrelated to the indubitable Big Two of Neuroticism and
Extraversion.
At this juncture, and with no anticipations of a subsequent
connection with the still inconspicuous FFA, Costa and
McCrae made a strong, unwavering, conceptual and research
decision to focus their research attention on three broad
constructs:
Whereas the 'true' number of dimensions of human personality is
a metaphysical rather than a scientific question, a long history of
fact finding shows that at least the two dimensions of Extraversion
and Neuroticism must be reckoned with in any personality model.
To these two, our research suggests the addition of a third broad
domain, which we call O, Openness to Experience. (1980, p. 69)
In choosing to concentrate on Neuroticism (N) and Extraversion (E), Costa and McCrae took no intellectual risks: these
two broad dimensions in their various operationalizations already had been studied and had demonstrated in many ways
their pervasive influences on behavior. In adding an emphasis
on Openness to Experience (O), they were including a dimension less well specified and investigated but one they believed to
be psychologically consequential. They titled this decision on
their subsequent focus as "the NEO trait model" (Costa &
McCrae, 1980).
Although Costa and McCrae viewed existing measures of N
and E as "serviceable," they also believed N and E measures
could be considerably improved. In addition, they believed that
O, as they conceptualized and had measured the domain, was
only poorly represented by existing questionnaires. So, they
embarked on the process of creating and validating a new personality inventory . . . devoting considerable attention to the specification of more focused facets of each of the three domains. Global
estimates . . . do not allow much precision in showing which
forms of (a) trait domain are most characteristic of a person. . . .
An inventory providing measures of a half dozen forms or facets of
Extraversion or Neuroticism (or Openness) would allow a more
fine-grained analysis. (1980, p. 92)
In constructing their questionnaire, Costa and McCrae recognized that specifying the components, or facets, or subdimensions of each domain could not be done logically or theoretically
but instead required an intelligent arbitrariness; they expressly
hoped to provide a good and interesting sample of the traits
within each of their three domains (1980, p. 93).
In designing their questionnaire, Costa and McCrae distinguished and permanently fixed upon a half dozen facets each
for their broad constructs of Neuroticism, Extraversion, and
Openness. The facet distinctions they offered were not rooted
in factor analysis, formal theorizing, or ineluctable empirical
findings. Rather, the facets derived from their personal thinking
about how the three domains could be further articulated. The
six facets Costa and McCrae nominated to represent the Neuroticism domain were Depression, Impulsiveness, Anxiety,
Hostility, Self-consciousness, and Vulnerability. For the Extraversion domain, they posited the facets of Warmth, Gregarious-
Costa and McCrae named their questionnaire the NEO Inventory. They duly noted that their facet scale reliabilities and factor loadings were inflated to an unknown extent by being based
on much the same sample, the ABLSA sample, as that on which
item selection was based.
What had Costa and McCrae achieved at this stage in the
early 1980s? They had an inventory, the NEO, carefully tailored
to fit their considered decision to focus on three particular personality domains: the previously well-established and well-researched dimensions called N and E and the less well-studied O
dimension. By their particular choice of underlying facets, their
inventory scales operationalizing N and E may have taken on a
somewhat different psychological coloration or meaning than
these concepts held for Eysenck and for others. There were no
intimations as yet of the FFA. In commenting on their approach
at this time, Costa and McCrae were attractively modest albeit
somewhat ahistoric: "We do not wish to give the impression
that the NEO model exhausts the 'personality sphere.'. . . The
NEO model is provisional, but it seems to us to cover enough
important traits to form a useful starting point" (Costa &
McCrae, 1980, pp. 94-95).
naire items represented two additional factors beyond the initial NEO three factors. To create tentative questionnaire scales
for A and C, McCrae and Costa referenced the 80-item adjective
rating data already available (see earlier discussion). These lexical data, strongly structured to represent the Goldberg lexical
factors as further modified by McCrae and Costa (1985), provided factor scores on A and C. The controlling criterion invoked for selection of A and C inventory items was that a rationally anticipated A or C questionnaire item correlates more
highly with the appropriate adjective factor than with the other
adjective factors. In this way, items were chosen to constitute
preliminary questionnaire measures of the A and C dimensions. These items, interspersed with the original NEO items in
the NEO Rating Form, were administered to peers of the subjects. A joint factor analysis indicated two new questionnaire
factors had indeed been introduced beyond the initial three factors in the NEO. For ratings by peers, inventory scales for A
and C were constructed by a bootstrapping procedure, selecting
items that also correlated well with the scales measuring A and
C stemming from the self-report analyses. In this manner, questionnaire scales were created in both self-report and peer rating
forms to represent the A and C dimensions.
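The controlling criterion described above can be illustrated with a brief sketch. This is not McCrae and Costa's actual procedure or data: the function name `select_items`, the simulated responses, the sample size of 300, and the use of absolute correlations (so that reverse-keyed items could qualify) are all assumptions introduced here for illustration only.

```python
import numpy as np


def select_items(items, factor_scores, target):
    """Return indices of candidate items whose strongest correlation is with
    the target factor.

    items:         (n_subjects, n_items) array of item responses
    factor_scores: (n_subjects, n_factors) array of adjective factor scores
    target:        column index of the intended factor (e.g., A or C)
    """
    n_factors = factor_scores.shape[1]
    selected = []
    for j in range(items.shape[1]):
        # Correlate item j with each adjective factor in turn.
        r = np.array([
            np.corrcoef(items[:, j], factor_scores[:, k])[0, 1]
            for k in range(n_factors)
        ])
        # Assumption: compare absolute correlations so reverse-keyed items qualify.
        others = np.delete(np.abs(r), target)
        if np.abs(r[target]) > others.max():
            selected.append(j)
    return selected


# Simulated (hypothetical) data: 300 subjects, five factor scores (N, E, O, A, C).
rng = np.random.default_rng(0)
factor_scores = rng.standard_normal((300, 5))
noise = 0.5 * rng.standard_normal((300, 3))
items = np.column_stack([
    factor_scores[:, 3] + noise[:, 0],                              # intended A item
    factor_scores[:, 4] + noise[:, 1],                              # intended C item
    factor_scores[:, 0] + 0.3 * factor_scores[:, 3] + noise[:, 2],  # mostly N, slight A
])

print("Items retained for A:", select_items(items, factor_scores, target=3))
print("Items retained for C:", select_items(items, factor_scores, target=4))
```

In this simulation the third item, although it correlates modestly with the A factor, correlates more strongly with N and so would not be retained as an A item. The rule is thus discriminant as well as convergent: an item must relate to its intended factor more strongly than to any other.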
This sequence of interlocking analyses for assuring correspondence between A and C questionnaire measures and the
previous A and C adjectival measures is unusual and astute.
The new (global) A and C inventory scales were added to the
previous three (faceted) NEO scales and, together, were published as the NEO Personality Inventory (NEO-PI; Costa &
McCrae, 1985).
Subsequently, McCrae and Costa (1987) reported analyses of
the NEO-PI and adjectival data as providing "validation of the
five-factor model of personality across instruments and observers" (p. 81). This paper has frequently been cited by them and
by others as foundational support for their particular five factors. It, therefore, warrants particular analytic attention.
McCrae and Costa report that, whether peer ratings of subjects culminate in five adjective factor scores or culminate in
five NEO scale scores from the enlarged NEO Inventory, the
result is much the same: There is appreciable correspondence
between what peers say about a target person via single-word
descriptors and what they say via longer, similarly oriented sentences. Further, McCrae and Costa find that different peers display an attractive degree of consensuality in the way they rate a
subject on the five factors, whether they describe a subject by
adjectives or by the longer NEO items. Finally, they report that
there is appreciable congruence between a subject's self-evaluation with respect to the five factors and the way peers evaluate
the subject on the five factors, whether the descriptive medium
involves adjectives or the more elaborated NEO items.
The overall structure of these findings provides impressive
evidence for the coherence, and therefore meaningfulness, of
personality assessment. Doubts regarding such coherence and
meaningfulness earlier had been seriously raised (see, e.g., Mischel, 1968; Shweder, 1975) and had received appreciable acceptance. It is, therefore, helpful to have the McCrae and Costa
results demonstrating the reliable distinguishability of individuals. However, these findings are by no means unique nor are
they specifically validating of the NEO five-factor structure.
Similar findings with regard to adjectival and sentence correspondence in meaning, consensuality among raters in their descriptions of subjects, and agreement of self-evaluations with
rater evaluations had been and have been reported frequently
albeit not within the specific terms of the five-factor framework.
A recent integrating article concludes that, as measured by various procedures, "different judges of the same personality, including the person in question, tend to agree with one another to
an impressive degree on a wide variety of personality attributes"
(Funder & Dobroth, 1987, p. 417). Thus, for example, at the
Institute of Personality Assessment and Research in Berkeley
where personality raters have used adjectival personality
descriptions, sentence-long Q-sort personality descriptions, and
questionnaire scales, appreciable and consistent meaning correspondence between the several approaches has been found
(see, e.g., Block & Petersen, 1955; Gough, McKee, & Yandell,
1955). Any of the thousands of routine mentions in journal
articles of appreciable interrater agreement in evaluations of
subjects on various personality dimensions, using diverse procedures, is an instance of consensuality among raters. For example, Funder and Dobroth report that 87 of the 100 personality variables they evaluated show significant interjudge
agreement (1987, p. 416). Demonstrations of a congruence between self-evaluations and evaluations by peers or other raters
are remarkably abundant.6 To partially illustrate these several
kinds of convergence, consider the studies by Andersen (1984),
Block (1965, 1971), Block and Block (1980), Cheek (1982),
Edwards and Klockars (1981), Funder (1980), Funder and
Colvin (1988), Funder and Dobroth (1987), Hase and Goldberg (1967), Kenrick and Stringfield (1980), Monson, Tanke,
and Lund (1980), Moskowitz (1990), Plomin (1974), and
Woodruffe (1985), among others.
Thus, let us suppose that a quite different set of reliable variables (an alternative personality framework) had been assessed by McCrae and Costa within their ABLSA sample: a
taxonomy consisting, perhaps, of measures of ego control
(overcontrol versus undercontrol), ego resiliency, agency-communion, introspectiveness, energy level, and liberalism-conservatism.7 These particular variables are highly reliable, are of
conceptual interest, and are empirically relatively independent.
There is certainly reason to believe from so many prior studies
of so many different variables in so many different samples that
McCrae and Costa in their own sample would have duplicated
in pattern and strength the kind of findings they invoked as specific "validation" of their own Big Five. Correspondence between single-word descriptors and longer statements would have
been observed in their ABLSA sample, the well-acquainted
peers of the ABLSA subjects would have shown equivalent consensuality in their ratings, and there would have been a comparable degree of congruence in the ABLSA sample between self-evaluations and peer ratings. But these findings would in no way
have provided evidence of the special merits and descriptive inclusiveness of this alternative dimension set. Such results would
have provided only further testimony of the reliability and consensuality that generally underlies seriously attempted personality assessment by human beings of reasonably meaningful
dimensions.
With regard to the specific correspondence between instruments reported by McCrae and Costa, a possible bias may be
noted. Guided by their a priori conceptual orientation, McCrae
6
However, there is also appreciable evidence of consequential discrepancies between self-evaluations and evaluations by others (Block
& Thomas, 1955; Colvin & Block, 1993; John & Robins, 1993, 1994;
Shedler, Mayman, & Manis, 1993).
7
I do not offer this set of variables as definitive or as without problems
similar to those that beset the FFA.
tor, have split it into two large dimensions, Sociability and Ambition, and have six, not five factors, named differently and with
psychologically different connotations. Tellegen and Waller (in
press) have revamped the meanings of the five factors and have
added two more. Krug and Johns (1986), Noller, Law, and
Comrey (1987), and Boyle (1989), in analyses emphasizing the
Cattell Sixteen Personality Factor Questionnaire (Cattell et al.,
1970), all come up with "five robust factors," more or less, but
not the lexical or NEO Big Five factors. Johnson and Ostendorf
(1993), who accept that fiveness has been demonstrated to be
warranted, are troubled by the "inherent promiscuity" (i.e.,
myriad lateral lexical connections) of personality terms and
offer a differently labeled five factors (e.g., Agreeableness is renamed "Softness," Conscientiousness is renamed "Constraint,"
etc.). Caprara, Barbaranelli, Borgogni, and Perugini (1993)
have developed an Italian five-factor questionnaire wherein Extraversion has become Energy and Neuroticism has become
Control of Impulse and Emotions, importantly different labels.
Zuckerman (in press; Zuckerman, Kuhlman, Joireman, Teta,
& Kraft, 1993) provides "an alternative five factor model for
personality," but there are fundamental differences between his
broad dimensions and those more customarily called the Big
Five. How are the differences among these factor-oriented personality psychologists to be stilled, to escape intramural cacophony? By assertions? By the differential frequency of publication
of these differing views? By a convening of the interested parties
and the subsequent issuance of a conceptual treaty? By the adventitiously influenced direction the assessment community
happens to take?8
Furthermore, the acknowledged breadth of each of the five
factors poses a serious problem for the FFA that has not been
confronted by its advocates. There seems to be a continuing
indefiniteness or inconsistency or oscillation in the way five-factor advocates represent the Big Five or say the factors are to be
understood. We are told by FFA advocates that the factors are
the five basic, pervasively important dimensions of personality
"having an explanatory power that specific traits lack" (Costa
& McCrae, in press-b, p. 30). When questions are raised regarding the descriptive coarseness and psychological confoundings of the Big Five, we are told by FFA advocates that the five
factors only represent "domains," defined as "sphere[s] of concern or function" (Costa & McCrae, in press-b, p. 5) and are
nothing more than abstract, broadly inclusive, global categories, behavioral themes, initial rough distinctions, wide bandwidth ways of schematizing personality qualities. "Proponents
of the five-factor model have never intended to reduce the rich
tapestry of personality to a mere five traits" (Goldberg, 1993, p.
27). I submit that FFA advocates should not have it both ways;
the Big Five cannot simultaneously serve as both basic, broadly
useful factors and initial rough distinctions.
More importantly, it has been primarily in terms of the five
global factors, without further articulation, that psychological
interpretation has most often been delivered. However, it is now
being acknowledged (e.g., Costa & McCrae, in press-b) that, for
an adequate understanding of personality, it is necessary to
think and measure more specifically than at this global level if
behaviors and their mediating variables are to be sufficiently,
incisively represented. "A better acquaintance with the individual comes from a consideration of the facets. . . . The information facets offer is more specific, more easily tied to the client's
problems in living" (Costa & McCrae, in press-b, pp. 27, 29-30). As Wiggins (1992) observes, increasingly, "subordinate
qualities appear to be generally regarded as more scientifically
desirable than the superordinate qualities" (p. 529). McAdams
(1992) also notes that
the Big Five are in no way akin to the basic 'elements' of personality. They are not pure elemental types: basic ingredients, as it
were, of personality. Instead, they exist as polyglot generic arenas
with fuzzy, overlapping boundaries. Adequate prediction and description in personality studies will usually require a judicious and
informed selection of many different constructs within the various
arenas. (pp. 339-340)
expressions covering in fact the same basic situation, but sounding different, as though they were in truth different" (p. 64).
Within personality psychology, the jangle fallacy also abounds
and is exemplified when the NEO five global dimensions are
put forward without recognition of the earlier or alternative,
differently named personality constructs to which they are intrinsically linked or which they blend and confound.
The jingle and jangle fallacies, by no means limited to personality psychology, waste scientific time. The one suggests
agreements that do not exist; the other involves useless redundancies, sometimes because of the absence of historical knowledge, that lead to the "reinvent[ion of] constructs under new
labels" (Holroyd & Coyne, 1987, p. 367). Together, these errors
work to prevent the recognition of correspondences that could
help build cumulative knowledge.
Because of these prevailing difficulties, serious theorizing and
the construction of an improved set of scientifically useful and
consensually understood personality dimensions should be
taken up actively, sustainedly, and systematically by more personologists. It is not enough to say that the FFA is insufficiently
justified and fails to make crucial distinctions; the further responsibility devolves to provide or reintroduce integrating recognitions that are empirically consequential. This necessary
task remains a daunting one, not to be assumed lightly. We
might fruitfully recommence by deconstructing the myriad, often disparate, meanings that have been ascribed to the Big Five.
Alternative conceptual and empirical analyses of these and related dimensions are required, followed by constructive disputation and efforts at concept calibration. Such close conceptual
reflection, further informed by focused empiricism, will spiral
our field forward.
My final, most ambitious suggestion is that, in the
conceptual/empirical arguments to be made for specific dimensions, of whatever number, these constructs be situated
within a coherent, intraindividual theoretical framework. As
earlier noted, the study of personality psychology has been
overly preoccupied with the study of interindividual differences;
indeed, the field is often defined simply as "the study of
[inter]individual differences." And, of differences between individuals, there is no end. An infinite number of sets of descriptive variables can be formulated, each being preferred by its
progenitor and contestable otherwise. What is needed is a basis
for choosing among these alternative sets. Efforts to study or
conceptualize the dynamics underlying intraindividual functioning might well move the study of personality toward such a
basis.
The integral connection between a theory of the individual
on the one hand and a theory of individual differences on the
other is crucial to recognize (Cronbach, 1957; Lewin, 1946;
Underwood, 1975). Lewin has expressed well the relation between these two approaches: The "problems of individual
differences, of age levels, of personality, of specific situations,
and of general laws are closely interwoven. A law is expressed in
an equation which relates certain variables. Individual differences have to be conceived of as various specific values which
these variables have in a particular case. In other words, general
laws and individual differences are merely two aspects of one
problem; they are mutually dependent on each other and the
Postscript
This article has sought to present an alternative view on what
may be the crucial issue for personality psychology in this decade. It is my wish that the concerns expressed and the questions
raised be viewed within the recognition that the scientific understanding of personality functioning and personality development is intellectually fascinating and of fundamental importance for general psychology. In the last generation, we have
come far in illuminating the coherence of personality. It is my
hope that the present disquisition may encourage constructive
dialogue leading to further conceptual/methodological/
substantive recognitions and the next stage of the scientific
study of personality.
9
In a generative, dynamic system theory of the individual, the laws
or production rules or "if . . . then" relations on which experience and
behavior are predicated would be specified. In a system theory, there are
system goals (functions to be maximized or minimized or "satisficed") and parameters influencing system functioning. "For systems
that change through time, explanation takes the form of laws acting
on the current state of the system to produce a new state, endlessly"
(Simon, 1992, p. 160).
References
Allport, G. W. (1961). Pattern and growth in personality. New York:
Holt, Rinehart & Winston.
Allport, G. W., & Odbert, H. S. (1936). Trait names: A psycho-lexical
study. Psychological Monographs, 47(1, Whole No. 211).
Andersen, S. (1984). Self-knowledge and social inference: II. The diagnosticity of cognitive/affective and behavioral data. Journal of Personality and Social Psychology, 46, 294-307.
Angleitner, A., Ostendorf, F., & John, O. P. (1990). Towards a taxonomy of personality descriptors in German: A psycho-lexical study.
European Journal of Personality, 4, 89-118.
Armstrong, J. S. (1967). Derivation of theory by means of factor analysis or Tom Swift and his electric factor analysis machine. American
Statistician, 21, 17-21.
ogy to embrace the five-factor model of personality? American Psychologist, 42, 887-889.
Watson, D., & Tellegen, A. (1985). Toward a consensual structure of
mood. Psychological Bulletin, 98, 219-235.
Webb, E. (1915). Character and intelligence. British Journal of Psychology, 1, (Suppl. 3).
Wiggins, J. S. (1968). Personality structure. Annual Review of Psychology, 19, 293-350.
Wiggins, J. S. (1982). Circumplex models of interpersonal behavior in
clinical psychology. In P. C. Kendall & J. N. Butcher (Eds.), Handbook of research methods in clinical psychology (pp. 183-221). New
York: Wiley.
Wiggins, J. S. (1992). Have model, will travel. Journal of Personality,
60, 527-532.
Woodruff, D. S. (1983). The role of memory in personality continuity:
A 25 year follow-up. Experimental Aging Research, 9, 31-34.
Woodruff, D. S., & Birren, J. E. (1972). Age changes and cohort differences in personality. Developmental Psychology, 6, 252-259.
Woodruffe, C. (1985). Consensual validation of personality traits: Additional evidence and individual differences. Journal of Personality
and Social Psychology, 48, 1240-1252.
Zuckerman, M. (in press). An alternative five factor model for personality. In C. F. Halverson, G. A. Kohnstamm, & R. P. Martin (Eds.),
The developing structure of temperament and personality from infancy to adulthood. Hillsdale, NJ: Erlbaum.
Zuckerman, M., Kuhlman, D. M., Joireman, J., Teta, P., & Kraft, M.
(1993). A comparison of three structural models for personality: The
Big Three, the Big Five, and the Alternative Five. Journal of Personality and Social Psychology, 65, 757-768.
Received December 30, 1993
Revision received April 27, 1994
Accepted April 29, 1994