Copyright 1995 by the American Psychological Association, Inc.

0033-2909/95/$3.00

Psychological Bulletin
1995, Vol. 117, No. 2, 187-215

A Contrarian View of the Five-Factor Approach to Personality Description
Jack Block
University of California, Berkeley
The 5-factor approach (FFA) to personality description has been represented as a comprehensive
and compelling rubric for assessment. In this article, various misgivings about the FFA are delineated. The algorithmic method of factor analysis may not provide dimensions that are incisive. The
"discovery" of the five factors may be influenced by unrecognized constraints on the variable sets
analyzed. Lexical analyses are based on questionable conceptual and methodological assumptions,
and have achieved uncertain results. The questionnaire version of the FFA has not demonstrated the
special merits and sufficiencies of the five factors settled upon. Serious uncertainties have arisen in
regard to the claimed 5-factor structure and the substantive meanings of the factors. Some implications of these problems are drawn.

During the last decade, the "Big-Five" approach has begun
to loom large in the field of personality psychology. It is being
said that "rapid progress has been made toward a consensus
on personality structure" (Costa & McCrae, 1992d, p. 344).
Goldberg (1992) has talked of "a quiet revolution occurring in
personality psychology.... An age-old scientific problem has
recently begun to look tractable. . . . Gradually, agreement has
been growing about the number of orthogonal factors needed
to account for the interrelations among English-language trait
descriptors" (p. 26). The contention is that, via the mathematical method of factor analysis, the basic dimensions of personality description have been "discovered": "Their number is five,
and their nature can be summarized by the broad concepts of
Surgency, Agreeableness, Conscientiousness, Emotional Stability versus Neuroticism, and Openness to Experience" (John,
1990, p. 96). Digman (1990), reviewing the field, also celebrates the "emergence" of "the five robust factors of
personality."
Personality psychologists are being asked to accept this specific set of five orthogonal factors and to use these factor dimensions as the conceptual structure for descriptively representing
different personalities. Widely, frequently, and enthusiastically
promulgated by vigorous, resourceful, talented, ingenious adherents, and with support mustered from many and various
studies, the five-factor approach (FFA) has achieved appreciable popularity; a tide seems underway in the field. Adoption of
the FFA as the universal framework for personality description
would, of course, fundamentally shape the subsequent course of
thinking and research.

This article benefited greatly from the counsel of a number of colleagues. I must exculpate them regarding its remaining deficiencies. My
especial thanks go to Lew Goldberg, David Harrington, Robert Hogan,
Oliver John, Robert McCrae, Philip Shaver, and Auke Tellegen, among
others.
Preparation of this article was supported in part by National Institute
of Mental Health Grant MH 16080.
Correspondence concerning this article should be addressed to Jack
Block, Department of Psychology, University of California, Berkeley,
California 94720-1650.

In the words of advocates of this argument, the five factors
"are both necessary and reasonably sufficient for describing at
a global level the major features of personality" (McCrae &
Costa, 1986, p. 1001); the five-factor approach provides "a universal descriptive framework... for the comprehensive assessment of individuals" (McCrae, 1989, p. 243); "the five-factor
model developed in studies of normal personality is fully adequate to account for the dimensions of abnormal personality as
well" (Costa & McCrae, 1992d, p. 347). Why are there five and
only five factors? "We believe it is an empirical fact, like the
fact that there are seven continents on earth or eight American
presidents from Virginia" (McCrae & John, 1992, p. 194).
The claims of an emerging consensus about the FFA have
also, after a lag, prompted some expression of concerns about
the FFA (cf., e.g., Ben-Porath & Waller, 1992a, 1992b; Eysenck,
1992; Hough, 1992; McAdams, 1992; Mershon & Gorsuch,
1988; Tellegen, 1993; Waller & Ben-Porath, 1987), about the
claims of the approach, its nominal empirical basis, and its science-directing implications. What is meant by the phrase, "major features of personality," and what criteria should serve to
ensure their identification? What does the term, "global," denote? How does one conclude that a description is "reasonably
sufficient" or "comprehensive" or "fully adequate"? How compelling and indisputable were the procedures by which the five
factors were discovered and settled upon? What is the role of
concept and theory in the field of personality psychology?
The present article tries to make explicit and bring together
some of the reasons (methodological, empirical, semantic, theoretical) underlying my own discontent with the FFA. Some
prefatory remarks are necessary in order to convey the personal
perspective from which I shall be speaking of the FFA. Not all
personality psychologists will subscribe to my views.
My concern is, like the concerns of FFA advocates, with the
problem (it is really an ideal) of establishing a set of constructs for the scientific description of personality (see, e.g.,
Block, 1961; Block & Block, 1980). For a set of constructs to be
scientifically sufficient, a number of criteria will need to be
jointly served. The psychological constructs (often primitive
terms) should receive sufficient elaboration (necessarily, of
course, via words) so that their meaning can attain consensuality among psychological scientists. The constructs should be
sufficient in number so that the personalities of individuals and
the dynamics of behaviors can be represented in articulated
ways and discriminating explanations and predictions may be
formulated. They should demonstrate a superior usefulness in
prediction or in economy of conceptualization over competing
sets of constructs. The constructs should be theory-reflecting
rather than constructs issued and controlled by confounded societal evaluations. As scientific constructs, they should be formulated with no necessary regard for their usability and understandability by laypersons. These several criteria constitute a
tall order of achievement and it is not suggested that their
achievement is imminent; rather, they serve as a guiding
aspiration.
Some words are also due about the terminology used in this
essay. The Big Five are often called "the five-factor model of
personality" and are referred to as providing an understanding
of the "structure of personality." As the term "model" is used
in conventional parlance among psychologists, it means a theoretically based, logically coherent, working representation or
simulation that, in operation, attempts to generate psychological phenomena of interest. However, no identifiable hypotheses,
theories, or models guided the emergence of or decision on this
five-fold space (although some have been offered post hoc).
There is no theoretical reason why it should be these five [factors]
rather than some other five. . . . [There were] no a priori predictions as to what factors should emerge, and a coherent and falsifiable explanation for the five factors has yet to be put forward.
(Briggs, 1989, p. 249)

Because the Big Five formulation was entirely atheoretical, usage of the term "model" may be premature.
Moreover, Big Five research has been based only on the relations among a set of variables across individuals, what Cattell
(1946), in his incisive formulation of the "data box," termed
the R-technique (or "variable-centered") approach to the analysis of personality data. As he observed, although the R-approach to data analysis is important, there are other important
ways of looking at personality. In particular, it should be recognized that no matter how satisfying on descriptive or other
grounds the variable-centered factor structure of the FFA may
be, it cannot represent a personality structure. Personality
structures lie within individuals (see also Block, 1971, p. 13;
John, 1990, p. 96). It is the personality structure of an individual that, energized by motivations, dynamically organizes perceptions, cognitions, and behaviors so as to achieve certain "system" goals. No functioning psychological "system," with its
rules and bounds, is designated or implied by the "Big Five"
formulation; it does not offer a sense of what goes on within
the structured, motivation-processing, system-maintaining
individual.
On this analysis, it follows that a more appropriate, and more
limiting, label for the "Big Five" body of work is something like
the inordinately cumbersome phrase, "the five-factor, variable-centered approach to personality descriptions." Of practical necessity, in the following discussion reference will be made to the
"five-factor approach," but it should be understood that this
phrase is shorthand for a lengthier, more awkward, but perhaps
more fitting, wording.
In what follows, I often deliberately write in the first person
instead of using the traditional third person. In much of the
literature presenting and evaluating the FFA, opinions (informed opinions but nevertheless opinions) inevitably have
been, are, and will continue to be offered. In my own remarks,
rather than using the seeming objectivity of third-person phrasing, the egocentric "I" will serve to identify personal views. Because previous interpretations of the FFA have come primarily
from its proponents, it may be useful to provide a more dubitative reckoning. FFA advocates certainly will wish to and are encouraged to dispute my interpretive account. The ensuing dialogue can only be constructive and advancing regarding the issues involved.
No apologies are offered for the specifics to be presented. Often slighted, they provide a basis for evaluating certain central,
highly influential studies. If the shaping of the field of personality is to be based on reasoning and evidence rather than claim
and counterclaim, we must be religious about the details.
The structure of this essay is as follows. Because it has been
said that the five-factor "theoretical model. . . [is] rooted in
factor analysis" (McCrae & Costa, 1989a, p. 108), some forgotten or slighted knowledge regarding this extraordinarily useful
method is brought forward. I maintain that the task of evolving
the theoretical constructs necessary for the scientific study of
personality cannot be entrusted solely to the pervasively useful
method of factor analysis. The origins and early history of the
FFA are then reviewed. Concerns and questions are raised regarding the origins of the FFA in the work, primarily, of Cattell
(e.g., 1943a, 1943b, 1945), Tupes and Christal (1992), and
Norman (1963). I will suggest that, to the extent this early work
on ratings is said to provide a foundation for the FFA, subsequent edifices may be shakier than is now recognized. The extensive lexical labors of Goldberg (e.g., 1977, 1981, 1982, 1990,
1992) are then considered. The widely ranging research of
Costa and McCrae (e.g., 1985, 1992b; McCrae & Costa, 1985,
1987), which brought the FFA, until then based exclusively on
lexical analyses and adjective ratings of self or others, into the
questionnaire realm, is next examined. It is primarily these last
two research programs, vigorously pursued and promoted, that
have moved the FFA toward its current prominence in personality psychology. On the basis of this series of evaluations,
largely methodological in nature, the current status of the FFA
is appraised, its logical and psychological problems discussed,
and some positively oriented suggestions offered. It should be
noted that a number of the ideas brought forward in this effort
are not novel; many come from others, generally cited but sometimes perhaps not because I do not remember their origin.

A Viewpoint Regarding the Method and Possibilities of Factor Analysis
Because the FFA is acknowledged to be "rooted" in the
method of factor analysis, an evaluation of the approach must
be predicated upon an understanding of the method, its
strengths and its vagaries. Although factor analysis is a method
usable in diverse ways in a variety of contexts to analyze the
correlations among a set of variables, the problems or arbitrariness of the method are often not given sufficient recognition.
The mathematics of the method generates a succession of latent variables or factors that, subsequently, can be used, in reverse, to regenerate the initial correlation matrix to any desired
degree of precision. If as many principal component factors are
extracted as there are variables in the beginning variable set, the
reproduction of the initial correlation matrix is perfect.
There would be little point to the factor analytic method,
however, if all it provided was as many factors at the end of the
process as one had variables at the outset. It is because the serially extracted factors necessarily decrease in their influence on
reproducing the initial matrix that the method can offer cognitive economy. If the extraction of relatively few factors permits
the reproduction of the initial correlation matrix reasonably
well, then one may consider the information contained in the
starting correlation matrix to be expressible or "explained" in
terms of these relatively few factors. One can then seek to provide psychological meaning (usually via factor rotation) for
these summarizing latent variables.
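As a minimal numerical sketch of the two preceding points (an illustration added here, not an analysis from the literature under review), the Python code below builds a small correlation matrix from simulated data and rebuilds it from its k largest principal components. The simulated data, the function name `reproduce_from_components`, and the choice of six variables are all assumptions made for the example.

```python
import numpy as np

def reproduce_from_components(R, k):
    """Rebuild a correlation matrix from its k largest principal components."""
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder: largest component first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Component loadings: eigenvectors scaled by the square roots of the eigenvalues
    loadings = eigvecs[:, :k] * np.sqrt(np.clip(eigvals[:k], 0.0, None))
    return loadings @ loadings.T

# Hypothetical data: 500 cases on 6 variables, with two correlated pairs
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 6))
data[:, 1] += data[:, 0]
data[:, 3] += data[:, 2]
R = np.corrcoef(data, rowvar=False)

# Size of the residual (Frobenius norm) after keeping 1, 2, ..., 6 components
residual_norms = [
    np.linalg.norm(R - reproduce_from_components(R, k)) for k in range(1, 7)
]
for k, err in enumerate(residual_norms, start=1):
    print(f"{k} components: residual norm = {err:.6f}")
```

The residual shrinks as components are added and vanishes, up to rounding error, when the number of components equals the number of variables. How many components count as reproducing the matrix "reasonably well" is left to the analyst.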
As a method of data reduction, exploratory factor analysis
can often simplify and make reportable masses of data (networks of interrelations) otherwise cognitively unencompassable. For some, the promise of factor analysis has seemed to be
that the method would permit easy empiricism rather than
tough theory to develop our scientific constructs. However, it is
necessary to know and to remember the rigid logic of the
method and its consequent quirkiness when it is unwarily set
down in the real world. The method can issue marvelous, previously obscure connections; it can also issue mindless results
(e.g., see discussion of this issue by Armstrong, 1967; Lykken,
1971). In considering the results afforded by factor analysis, one
must be mindful of the ways the method may suggest more than
is supportable. Herewith are some cautionary remarks about
factor analysis that derive from long-available knowledge in the
field.
The correlations analyzed by factor analysis reflect what is
traditionally and honorifically called communal or common
variance. But communal variance may also be called, less conventionally and more pejoratively, redundant variance. To the
extent a variable correlates with other variables, it has communal or redundant variance: it is said to be "explainable" by these
other variables and conveys no unique information. To the extent a variable does not correlate with other variables but is
itself highly reliable, that variable is indexing some quality not
otherwise being captured.
Redundancy has its usefulness. In factor analysis, redundancy provides multiple indicators and, therefore, more reliable
indices of underlying factor dimensions. However, the communal or redundant variance observed to exist within the particular set of variables subjected to factor analysis may or may not
be important in other domains or other settings. Therefore, the
factors summarizing this communal or redundant variance
may or may not be important when they are brought out into
the larger world.
As a contrived but instructive example, consider a correlation
matrix based on 100 variables, 99 of which intercorrelate because they are all fully reliable manifestations of the latent variable, shoelace-tying competency, in one form or another. Variable 100 is a fully reliable, fully valid but anonymous or unrecognized measure of the latent variable, general intelligence,
which is presumed here to not correlate at all with the first 99
measures. Factor analysis of this matrix will issue one general
factor explaining all of the communal variance. An unsophisticated enthusiast of factor analysis would be impressed by the
finding of so powerful a factor "explaining" so much of the covariance. Further, the lonely, nameless, intelligence measure
would likely be viewed as a "residual" and dismissively consigned to obscurity. Yet, relating these factor analytic findings
to the wider, external world beyond the mathematical solipsism
of the particular factor analysis would reveal that the "powerful" general factor has no or trivial implications, whereas
the ignored residual variable has momentous behavioral
significance.
Of course, this example is a contrived extreme. But the point
made applies to the real world of data to an unrecognized or
unacknowledged extent: the amount of variance "explained"
internally by a factor need not testify to the external psychological importance of the factor.
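The contrived example can be simulated. The sketch below is only an illustration under stated assumptions: the 99 shoelace indicators carry a little noise (so that the matrix is not degenerate), the external criterion `outcome` is hypothetical, and a principal component stands in for the general factor.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
shoelace = rng.normal(size=n)        # latent shoelace-tying competency
intelligence = rng.normal(size=n)    # latent general intelligence, independent of it

# 99 redundant indicators of shoelace tying (the small noise term is an assumption)
X = shoelace[:, None] + 0.3 * rng.normal(size=(n, 99))
X = np.column_stack([X, intelligence])   # variable 100: the lone intelligence measure

R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)     # ascending order, so the last is the largest
share_first = eigvals[-1] / eigvals.sum()

# Scores on the dominant component, computed from standardized variables
Z = (X - X.mean(0)) / X.std(0)
scores = Z @ eigvecs[:, -1]

# Hypothetical external criterion that depends on intelligence only
outcome = intelligence + 0.5 * rng.normal(size=n)
r_factor = np.corrcoef(scores, outcome)[0, 1]
r_var100 = np.corrcoef(X[:, 99], outcome)[0, 1]

print(f"variance share of the general factor: {share_first:.2f}")
print(f"general factor vs. criterion:  r = {r_factor:.2f}")
print(f"variable 100 vs. criterion:    r = {r_var100:.2f}")
```

Under these assumptions the general factor absorbs most of the variance in the matrix yet is essentially unrelated to the criterion, whereas the "residual" variable predicts it strongly.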
The mix of variables submitted to factor analysis can be varied with the consequence that the factors then obtained will, in
effect, be entailed rather than represent substantive findings. A
small factor can be made large, a large factor can be made small,
residuals can be made into "factors" (or discarded as
irrelevant). One can make for a "simple structure" by dropping
factorially complex ("interstitial") variables or prevent "simple
structure" (and even create a circumplex configuration) by
adding factorially complex variables that "blend" different
sources of variance. If the locations of axes in one "simple structure" are not satisfying, it is possible by judicious selection and
deletion of variables to relocate the axes and have quite another
"simple structure." One can achieve such ends by just varying
the degree and the location of redundancy within the variable
set. In particular, a set of variables can be "prestructured." That
is, wittingly or unwittingly, the set of variables subjected to factor analysis may have been previously selected so as to contain
several quite different subsets or "clusters" of redundant variables. Such "prestructuring" preordains the "factors" subsequently "found" by the algorithmic operations of factor
analysis.
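A minimal sketch of how redundancy alone can decide which factor is "large" follows. It assumes two independent latent dimensions, here called A and B, each measured by equally noisy indicators. The names, the noise level, and the use of principal components in place of a full factor analysis are all choices made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3000
a = rng.normal(size=n)   # latent dimension A
b = rng.normal(size=n)   # latent dimension B, independent of A

def indicators(latent, count):
    """Return `count` noisy measures of one latent dimension."""
    return latent[:, None] + rng.normal(size=(n, count))

def largest_component(n_a, n_b):
    """Variance share of the largest component and its squared weight on A's indicators."""
    X = np.column_stack([indicators(a, n_a), indicators(b, n_b)])
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)            # last column is the largest component
    share = eigvals[-1] / eigvals.sum()
    weight_on_a = np.sum(eigvecs[:n_a, -1] ** 2)    # eigenvectors have unit length
    return share, weight_on_a

share_1, a_weight_1 = largest_component(8, 2)   # variable set stacked with A indicators
share_2, a_weight_2 = largest_component(2, 8)   # variable set stacked with B indicators

print(f"8 A + 2 B indicators: share = {share_1:.2f}, weight on A = {a_weight_1:.2f}")
print(f"2 A + 8 B indicators: share = {share_2:.2f}, weight on A = {a_weight_2:.2f}")
```

The same two latent dimensions are present in both runs. Only the number of redundant indicators changes, and with it the identity of the dominant "factor."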
It is my view that the frequent presence and the powerful
effects of "prestructuring" are often not sufficiently recognized
by those using the method of exploratory factor analysis. In particular, it will be suggested later that, in crucial ways, influential
demonstrations of the sufficiency of the FFA may have been unduly influenced by prior prestructuring of the personality variables used in these analyses.1 If so, then the "recurrence" and
"robustness" over diverse samples of factor structures may be
attributable more to the sameness of the variable sets used than
to the intrinsic structure of the personality-descriptive domain.
1 Protagonists of the FFA recently have been acknowledging the crucial influence on factor solutions of the particular sets of measures used.
For example, Costa and McCrae observe that "the axes chosen by a
varimax rotation will depend completely on the selection of variables"
(1992a, p. 661). Goldberg (1993) also acknowledges this problem.

Although the method of factor analysis has been used for almost a century, there is still not a clear, unequivocal basis for
deciding on the number of "factors" to extract in a factor analysis or how to obtain an "optimum" rotation of the particular
set of factors settled upon. Various rules exist for these consequential decisions, based on various arguable assumptions or
aesthetic preferences. One can debate the proper number of factors; is a matrix "under-factored" or "over-factored?" What are
the dangers of underfactoring as compared to overfactoring?
How should one rotate to achieve "psychological meaning" and
who is to be the judge? Should a rotational method be used that
destroys a general factor (varimax), emphasizes a general factor, or seeks to conform to a priori, perhaps theoretical, expectations? Should one impose mathematical orthogonality on the
rotated factors, or is this cognitively economical way of representing factors psychologically inappropriate? Is an oblique rotation, which inevitably fits the data points better, truly a fairer
representation of the underlying reality, or is it a better fit only
because of "the fitting of error?" Should one inject unanchored,
spurious variance into the analysis by considering all variables
to be fully reliable, or should the analysis employ more realistic
communality estimates that set more constraints on subsequent
rotations and produce less striking factor structures? When rotating factors, should one build up a weak factor or residualize
it? Is a given "factor" really only "a bloated specific" (Cattell,
1973) or "tautological" (Eysenck & Eysenck, 1969) factor warranting residualization? Or is the factor a "primary" factor or a
merging of primary factors into a "complex" or "combinatorial" factor with broader but conflated behavioral significance?
Since complex, primary, and tautological factors can all emerge
simultaneously in a given factor analysis (Comrey, 1978; Guilford, 1975; Lykken, 1971), how does one recognize and respond to their different natures and implications? These issues
are of foundational importance because, with the method of
factor analysis, the psychological "nature" of the factors obtained very frequently changes fundamentally as the number of
factors changes and as the rotational criteria are varied. Intuitively and conceptually, these transmogrifications should not
occur: A "real" factor should not change in meaning when another factor is introduced or when rotations are shifted. But in
fact, this happens in the arbitrary hyperspace of factor analysis
and is unsettling as one tries to impute psychological meaning
to the factors robotically issued by the workings of the method.
As Cliff (1983) remarks in a more general context, "There are
typically an infinity of alternative sets of parameters [e.g., factor
loadings] which are equally consistent with the data, many of
which would lead to entirely different conclusions concerning
the nature of the latent variables" (pp. 122-123). The method
of factor analysis cannot choose among the infinity of possibilities. The decision requires conceptual argument and empirical
work; one must return to the task of being a psychologist.
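The indeterminacy Cliff describes can be shown in a few lines. The sketch below uses a hypothetical loading matrix for eight variables on two orthogonal factors and rotates it by an arbitrary angle. The loadings, the 30-degree angle, and the helper name `rotate` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical loadings: 8 variables on 2 orthogonal factors
L = rng.uniform(-0.8, 0.8, size=(8, 2))

def rotate(loadings, degrees):
    """Apply an orthogonal rotation of the two factor axes."""
    t = np.deg2rad(degrees)
    T = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return loadings @ T

L30 = rotate(L, 30.0)

# Both loading matrices imply exactly the same reproduced correlations ...
same_fit = np.allclose(L @ L.T, L30 @ L30.T)
# ... although the individual loadings, and hence the factor "meanings", differ
changed = not np.allclose(L, L30)

print("same reproduced matrix:", same_fit)
print("different loadings:    ", changed)
```

Because every rotation angle fits the data equally well, the choice among them cannot come from the data reproduced by the factors; it has to come from a rotational criterion or from substantive argument.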
To further encourage a wary perspective on the results issued
by factor analysis, it is also useful to consider the starting basis
or grist for the method. Because factor analysis processes correlation matrices, all the psychometric problems that afflict correlation coefficients have effects on the subsequent results
afforded by the method. Thus, computed correlation coefficients are seriously influenced by the degree of relation linearity
that is present; by scaling considerations; by the validities of the
measures used; by the reliabilities of the measures being correlated; by truncations, restrictions, or extensions of range on the
variables used; by the forms of the score distributions for the
variables being correlated; by maverick observations that may
improperly raise or lower correlations; by method variance; by
merging different kinds of samples manifesting different covariance patterns (e.g., males and females) into one large, hodgepodge sample wherein the resultant correlations are an irrelevant function of the relative sizes of the samples being inappropriately combined; by heterogeneity with respect to a third
variable (e.g., measures x and y may be unrelated but because
each correlates with age, x and y will correlate and appear on
the same factor unless age is partialed); by whether the variables
being correlated are represented by differential scores or by cumulative scores (Loevinger, 1948); and much more. The
"downstream findings" issued by factor analysis may be fundamentally affected by these often unevaluated "upstream
influences."
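The third-variable case mentioned above (x and y unrelated except through age) can be sketched numerically. The simulated data and the helper `partial_corr`, which applies the standard first-order partial correlation formula, are assumptions made for this illustration.

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation of x and y with z held constant."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

rng = np.random.default_rng(3)
n = 5000
age = rng.normal(size=n)        # standardized age (hypothetical)
x = age + rng.normal(size=n)    # x depends on age but not on y
y = age + rng.normal(size=n)    # y depends on age but not on x

r_raw = np.corrcoef(x, y)[0, 1]
r_partial = partial_corr(x, y, age)

print(f"raw correlation of x and y:     {r_raw:.2f}")
print(f"correlation with age partialed: {r_partial:.2f}")
```

In this simulation x and y correlate substantially in the raw data and would be placed on a common factor, yet the correlation essentially disappears once age is partialed.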
For example, it appears that the Big Five factors are reasonably orthogonal when data from a homogeneous sample of subjects are evaluated, but when data from a more heterogeneous
sample are factored, the consequent factor structure may appreciably lose its orthogonality (see, e.g., Costa & McCrae,
1992c; Goldberg, 1992, p. 37; Mroczek, 1992). For another example, consider that "introspectiveness" and "impulsivity"
correlate quite negatively in a sample of Air Force jet pilots but
quite positively in a sample of male graduate students, with the
consequence that a factor analysis will locate these two important variables differently, depending on whether the data being evaluated are from the first sample, the second sample, the
samples combined, or the two samples in various proportions
(Block, 1955). Because of the many kinds of unrecognized or
unattended-to influences on correlation coefficients (notably,
in my view, the usage of psychologically different subject samples), the findings of factor analysis, especially with regard to
the factors emerging later in the factoring sequence, often do not
replicate from one analysis to another or will appear in different
form. In particular, one must guard against easy and convenient
dismissal of factors that may not replicate or have not yet been
replicated. The factors being dismissed may well be reliable, important within each of the analyses, and may well be crucial to
recognize when the aspiration is to develop a comprehensive set
of constructs for the description and understanding of
personality.
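The dependence of a pooled correlation on the mix of samples can be sketched as follows. The correlations of -.60 and +.60 and the sample sizes are hypothetical values chosen for illustration; they are not the figures reported by Block (1955).

```python
import numpy as np

rng = np.random.default_rng(4)

def sample(r, n):
    """Draw n cases on two standardized variables correlating r in the population."""
    cov = [[1.0, r], [r, 1.0]]
    return rng.multivariate_normal([0.0, 0.0], cov, size=n)

def pooled_r(n_pilots, n_students):
    """Correlation after merging a 'pilot' sample (r = -.60) with a 'student' sample (r = +.60)."""
    pooled = np.vstack([sample(-0.6, n_pilots), sample(0.6, n_students)])
    return np.corrcoef(pooled, rowvar=False)[0, 1]

r_mostly_pilots = pooled_r(4000, 1000)
r_balanced = pooled_r(2500, 2500)
r_mostly_students = pooled_r(1000, 4000)

print(f"80% pilots:   r = {r_mostly_pilots:.2f}")
print(f"50/50 mix:    r = {r_balanced:.2f}")
print(f"80% students: r = {r_mostly_students:.2f}")
```

The pooled coefficient, and therefore the location of the two variables in any subsequent factor solution, tracks the mixing proportions and not any single psychological relation.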
At a deeper, conceptual level, there is the question of whether
the ubiquitously used correlation coefficient is, by its nature,
able to represent pivotal features of personality functioning. For
example, correlation coefficients fail to adequately represent
asymmetricality of relations (e.g., although "wittiness" and
"intelligence" correlate, and wittiness necessarily entails intelligence, intelligence does not entail wittiness; although a "talkative" person is necessarily "gregarious," a gregarious person
need not be talkative). There is also the complication of conditionality of relations (e.g., "agreeableness" with other individuals may be manifested if and only if the individuals involved are
at the same or higher status or social level; spontaneity may be
manifested only in "safe" contexts). Correlation coefficients
imperfectly reflect such relations. To the extent these relations
exist, the psychological interpretation of subsequent factor analysis findings will be misdirected. Of course, recognition of the
insufficiencies of correlation coefficients should not cause their
abandonment. Rather, there needs to be more awareness of their
inadequacy for various conceptual purposes.
For reasons such as these, and others, a new look in factor and
covariance structure analysis may be emerging. MacCallum
and Tucker (1991) have presented a more realistic framework
for the common-factor model, one that recognizes the difference between measured variables and modeled variables. Their
approach, which awaits further development, respects "the nonlinear influences of common factors on measured variables or
the presence of large numbers of minor common factors" (p.
503). Cudeck and Henly (1991) discuss various sources of discrepancy between models of covariance structures and the sample matrices from which these models are derived. They present
sobering remarks on "the model selection problem" and remind us that such models are only descriptive or summarizing
devices. Their concluding paragraph is worth quoting here:
Yet even with models that are intended to describe or summarize,
the problem of making comparisons inevitably arises when two or
more structures apply to the same data and one wishes to evaluate
their relative performance. Often the best that can be done, which
is actually a useful accomplishment whose value should not be
minimized, is to state clearly the criteria that are used in the comparison, in conjunction with descriptions of the models, characteristics of the data, and the purpose for which the models were constructed. Evaluations of this kind inevitably include elements of
prior experience and individual preference. These personal points
of view should be articulated. Many believe such subjectivity in
model development is somehow unscientific and undesirable, as if
it could be avoided by pretending it does not exist. (p. 518)

Finally, as an anthropological observation, an unremarked
oddity in the field of factor analysis warrants noting: Regardless
of the content domain being evaluated, the great majority of
published factor analyses report relatively few factors: three
through perhaps six or seven factors. Rarely does the factor analyst settle on more factors in reporting substantive results.
There can be valid reasons for such findings: There may indeed
be only a small number of latent dimensions underlying the
starting correlation matrices. But the underlying motivation of
many factor analysts should be considered as well. They wish
to achieve latent dimensions explaining large amounts of the
covariance present. Otherwise, the method could be said to lose
its raison d'être. Unfortunately, however, this ambition of the
factor analyst is opposed by logical constraints residing in the
nature of the procedure. To have many factors would necessarily diminish the "explanatory" power of each and result in a
cognitively complicated picture of the world. To enhance the
"importance" of the factors in an analysis, it is therefore pragmatically preferable to restrict their number and to discourage
the existence of numerous, small factors. If one allows for the
level of communality usually existing in empirical matrices
(attenuated by unreliability and by the specific variance of the
measures being analyzed), each of five factors would be constrained to "explain" on the average not more than about 8% or
10% of the total variance. This is on the verge of "explaining"
too little variance to be satisfying to many factor analysts. The
human preference for schematizing the world at a certain level
of complexity (Miller, 1956) may be operative here. Such cognitive preferences, subsequently firmed into commitments, may
underlie the motivation of some factor analysts toward sparse
solutions rather than solutions involving more numerous and,
therefore, more complicating latent dimensions.
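The figure of 8% to 10% per factor can be reconstructed with simple arithmetic. The sketch below assumes a mean communality of roughly .40 to .50, a range chosen here for illustration and not stated in the text. The symbols $\bar{h}^2$ (mean communality) and $k$ (number of factors) are introduced only for this example. With standardized variables, the common variance available to all factors together is $\bar{h}^2$ of the total, so the average share per factor is:

```latex
\[
\text{average share of total variance per factor} \;=\; \frac{\bar{h}^2}{k},
\qquad
\frac{0.40}{5} = 0.08,
\qquad
\frac{0.50}{5} = 0.10 .
\]
```

On the same assumption, ten factors would average only 4% to 5% each, which illustrates why solutions with many factors can look unimpressive to the analyst.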
The preceding disquisition is by way of saying that factor
analysis is a highly useful technique for the study of personality,
but it is not a method for all reasons. As Costa and McCrae
(1992a) say, "Used intelligently, it can yield valuable insights"
(p. 654). But abject deference to the method and the results
it issues is not warranted (see, especially, Lykken, 1971). It is
unlikely indeed that the logical structure and sequence of operations underlying the method of factor analysis and correlational data can reflect the way individuals evolve, articulate, and
conditionally use descriptive terms to characterize themselves
and others. To the extent that the method of factor analysis, per
se, is said to provide sufficient or strong justification for the five-factor approach to the description of individual differences in
personality, we must remember to look for and evaluate the unacknowledged assumptions or restrictions that in fact undergird
the data used, the correlation coefficients subsequently generated, the factor analytic logic, the factor analytic heuristics, and
the interpretations of the consequent factor analytic findings.
The faith of FFA adherents (their premise and their promise) that the field of personality psychology can confidently rely
on the factor analytic algorithm as an appropriate and sufficient
basis for objectively deciding on the theoretical constructs to be
used by personologists is, in my view, unwarranted, naive, and
limiting. Certainly, the results afforded by factor analysis
thoughtfully and fairly applied should often be influential. But
also influential should be understandings gained from experimental investigations, from intimate and prolonged observations of other people, from neurophysiological recognitions,
from psychiatric insights, from personal introspections, from
formal cognitive efforts to create a theoretical system that encompasses a chosen domain of phenomena, from the thoughts
flicking through one's mind as one drifts toward sleep, from any
and all possible "contexts of discovery" (Reichenbach, 1951).
None of these sources will be fully dependable in the seeking of
incisive, generative, and coherently related concepts, but all can
contribute to a "context of theory construction" that then
warrants the "context of justification (i.e., validation)"
(Reichenbach, 1951) and, in a helical process, takes the science
onward to a new "context of discovery."

A Revisionist History of the "Discovery" of the Big Five Approach to Personality Description
Previous historical accounts of the FFA have come only from
proponents of the FFA. Therefore, it may be useful to provide a
more skeptical rendering of the chronology. In focusing on the
several foundational studies of the FFA, I will suggest that, if the
footings of the FFA are less than secure, the structure subsequently erected may not be a house all will wish to enter.

Role of Allport
The FFA approach in America may be said to have begun
with Allport and Odbert's (1936) onerous compilation of all
the terms in the 1925 unabridged, 400,000-word edition of
Webster's New International Dictionary they judged as usable
"to distinguish the behavior of one human being from that of
another" (p. 24). They came up with 17,953 single-word descriptor terms. To this enormous collection, Allport and Odbert
applied their definition of "trait" as "generalized and personalized determining tendencies, consistent and stable modes of
an individual's adjustment to his environment" (p. 26) and
came up with a primary list of only (sic) 4,504 nonjudgmental
"trait-names," still a very large set. They suggested that their
alphabetical listing of trait-names might prove to be a useful
resource for psychologists developing rating scales and the like.
However, Allport warned that "common speech is a poor guide
to psychological subtleties" (Allport, 1961, p. 356).

Role of Cattell
Next on the scene was Raymond Cattell (1943a, 1943b,
1947). Cattell subscribed to what subsequently has become
known as the lexical hypothesis: "All aspects of human personality which are or have been of importance, interest, or utility
have already become recorded in the substance of language"
(1943b, p. 483). He began with the Allport and Odbert trait-name listing but, noteworthy and often not noted (but see John,
Angleitner, & Ostendorf, 1988), he deemed it insufficient.
To make the list of traits as complete as possible. . . in addition to
all that could be obtained from the dictionary, the substance of
all syndromes and types which psychologists have observed and
described in the past century or so [were added]. (Cattell, 1943a,
p. 491)

Thus, he made sure that his starting list included terms reflecting aspects of personality
he deemed to be important: introversion/extraversion, emotional maturity, his construct of cyclothymia/schizothymia
(the essence of the agreeableness factor, according to French,
1953, p. 222), ascendance/submission, Thurstone's radicalism/conservatism variable, McDougall's "temper" variables,
and many more.
Cattell's considered additions to the dictionary listing provided by Allport and Odbert may well have been heuristically
beneficial. But a consequence of his decision to introduce terms
to represent accumulated psychological insights was to depart
significantly from the dictionary offerings of Allport and Odbert; he was no longer entitled to claim that his selected set of
variables was "a truly representative list. . . derived from language" (1945, p. 70).
Cattell then proceeded through an ambitious, retrospectively
unspecifiable sequence of semantic decisions to abbreviate this
list. Applying his personal judgment at various stages in the
elimination sequence, he first proffered 171 terms (4%) of the
Allport and Odbert adjectives as sufficiently representing "the
personality sphere." But because 171 "traits" were still too
many at the time for his computational capabilities, the terms
were further clustered via correlational analysis aided by Cattell's semantic understanding of the terms, producing about 60
clusters (1943a, 1945). These were still too many, and so a further reduction was enforced to achieve a computationally manageable set of about 35 bipolar rating scale dimensions based
on clusters of variables and termed by Cattell as the standard
reduced personality sphere (1945, 1947). Cattell went on to
factor analyze some peer ratings based on these 35 variables
and concluded that 12 "primary" factors underlay personality.
These factors do not match well with the factors subsequently
known as the Big Five. For a fuller description of the sequence
by which Cattell evolved his lexical reductions, consult John,
Angleitner, and Ostendorf (1988).
More than 99% of the 4,504 trait-name adjectives of Allport
and Odbert and those first added by Cattell had been eliminated
along Cattell's way. This final set of 35 distilled variables proved
to be unusually consequential because of its acceptance and frequent use, in one form or another, by later researchers. Although Cattell's 35 variables certainly represented his own best
judgment at the time of what the most important variables of
personality were, he also acknowledged that his reductions
could well have eliminated personality features of relevance
(Cattell, 1945, p. 71).
It is no denial of Cattell's brilliance, psychological acumen,
and many scientific contributions to recognize, along with
John, Angleitner, and Ostendorf, that an ultimate set of variables for a scientific theory of personality should not be so dependent on any one person functioning in a private, convenience-emphasizing, and subsequently unevaluated way. In particular, one must wonder to what extent Cattell's prior
theoretical notions regarding personality, however valid, influenced his sequential narrowing of possibilities and his construction of variables by clustering terms. Did he introduce semantic
structures that subsequently would underlie and even entail the
results afforded by subsequent empirical usage of his variable
set? I believe so and illustrate this possibility in my discussion
of the work of Tupes and Christal (1961/1992).

Role of Tupes and Christal
Tupes and Christal, personnel selection psychologists employed by the Air Force to improve officer selection and promotion procedures, reported a set of analyses subsequently hailed
by FFA advocates as "the discovery" of the Big Five dimensions
and as the "pivotal work that. . . laid the foundations for. . .
the five-factor model" (McCrae, 1992, p. 217). Their internal
technical report has been often cited over the years, but, because
it was not published in an archival journal until recently, it was
difficult to find. Because their study has been so often and
widely hailed, it is useful to consider the Tupes and Christal
contribution in closer detail than it usually has received.
In their very brief account (a bit more than eight pages of
text in the original), Tupes and Christal described eight factor
analyses of Cattell's rating variables. All the subjects rated were
in their early 20s. Six of the analyses involved peer ratings, two
were based on ratings of subjects provided by older, status-superior raters. There were five male samples and one female sample. Four of the analyses derived from Air Force data, two analyses were of data earlier analyzed by Cattell (1947, 1948), and
two analyses employed data previously factored by Fiske
(1949).
The analyses of the Air Force samples initiated and were central to the Tupes and Christal report. Therefore, details of data
gathering in the Air Force context assume special importance.
For example, the individuals in one sample knew each other as
little as 3 days; in three other Air Force samples, the length of
acquaintance was 2, 3, 8, or 26 weeks.
The raters were untrained, psychologically naive subjects, and little
time could be made available for training them as raters. Therefore
the rating procedure must be simple, requiring a minimum of
trained judgment. Only about two hours [in one sample, only one
hour] were available for the rating. . . [of] 25 to 30 subjects on 30
or 35 traits. (Tupes, 1957, p. 4)

An issue must be raised here that will be more fully considered later in this article. Can the psychological perceptiveness of
Air Force officers and officer candidates, as quickly expressed
by 3-point ratings on 30 or so scales in an officially required
research program regarding 12 to 30 of their peers known for
such short periods, provide a fundamental data basis for discerning the essential dimensions for the scientifically sufficient
description of personality?
In six of the analyses, 8 orthogonal factors were extracted; 5
factors were extracted in another analysis and 12 factors were
extracted in the analysis of the female sample. Summarizing
their findings, Tupes and Christal concluded that in each of the
eight analyses based upon the Cattell variables, "five fairly
strong and recurrent factors emerged." They were impressed,
and later other psychologists were impressed, by the marked
congruence among the five-factor patterns derived in so many
analyses.
I suggest that, for technical reasons, the degree of recurrence
of this five-factor structure over their eight analyses may not be
so striking as has been assumed. Although the analysis of their
first Air Force sample used the centroid method of factor analysis prevalent at the time, followed by subjective rotation, the
remaining seven factor analyses all employed the multiple-group method of factor analysis. The multiple-group method of
factor analysis (see, e.g., Harman, 1967, chap. 11), now long-abandoned and mostly forgotten, was frequently used in the
precomputer era to lessen the laborious calculation that centroid
factor analysis otherwise required. The multiple-group method
permitted the extraction of a number of factors in one labor-saving operation. The centroid method, on the other hand, extracted factors serially and involved protracted computations
of successive residual matrices. To lessen the subsequent computations, the multiple-group factor analyst first had to partition the set of variables into a number of groups (i.e., anticipated factors) according to preconceived hypotheses. All these
preconceived factors then were extracted simultaneously, followed by the calculation of a residual matrix. This residual matrix, if its elements were sizable, might then be subjected to further analysis, usually by the centroid method.
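The grouping step just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reconstruction of the Tupes and Christal computations: the function name `multiple_group_structure` is hypothetical, unities are left on the diagonal of the correlation matrix (historical applications generally worked with communality estimates instead), and the later steps of orthogonalization and residual-matrix calculation are omitted. The sketch simply correlates each variable with the unit-weighted sum of every preassigned group.

```python
import numpy as np

def multiple_group_structure(corr, groups):
    """Correlate every variable with the unit-weighted sum of each preassigned group.

    corr   : (p, p) correlation matrix of the rating variables (unit diagonal assumed)
    groups : length-p sequence of integer labels, one anticipated factor per variable
    """
    corr = np.asarray(corr, dtype=float)
    labels = np.asarray(groups)
    # Indicator matrix: entry (i, k) is 1.0 when variable i was assigned to group k.
    indicator = (labels[:, None] == np.unique(labels)[None, :]).astype(float)
    # Covariance of each standardized variable with each group composite.
    cross = corr @ indicator
    # Variance of each group composite (sum of all entries in its diagonal block).
    composite_var = np.einsum("pk,pk->k", indicator, cross)
    return cross / np.sqrt(composite_var)

# Degenerate check: 20 mutually uncorrelated variables, preassigned to 5 groups of 4.
groups = np.repeat(np.arange(5), 4)
structure = multiple_group_structure(np.eye(20), groups)
print(structure[:4].round(2))
```

In this degenerate case every variable correlates 0.5 with the composite to which it was assigned and 0 with the others, only because each composite contains the variable itself. The figure depends entirely on the unit-diagonal assumption made here and should not be read as an estimate of what happened in the historical analyses; it shows only that the partition chosen in advance shapes the pattern of loadings that comes out.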
In the first of the Tupes and Christal factor analyses, eight
factors were arduously extracted via the centroid method.
These factors were then subjectively rotated toward orthogonal
simple structure so as to residualize three of them (i.e., via rotation, the loadings of variables on three factors were deliberately made generally low enough so that these factors could be
said to be unimportant, thus justifying their elimination from
subsequent consideration). However, in the subsequent seven
factor analyses, wherein the multiple-group method was used,
the variables were grouped into five subsets prestructured so as
to correspond to the five rotated factors decided upon in the first
study. That is, the orthogonal factor structures of the last seven
analyses were created to conform insofar as possible with the
factor structure solution settled upon in the first analysis. In
effect, Tupes and Christal used their first factor solution as the
target matrix for all their subsequent analyses. Factors representing other than these five pregrouped sets of variables were
residualized, except in the female sample in which the fifth factor was split into two subfactors. As Horn (1967) has compellingly observed, Procrustean rotations to fit a target matrix can
show seemingly impressive congruence with the target even
when random variables are involved. Inevitably, then, the prestructured solutions used in the last seven Tupes and Christal
multiple-group factor analyses fitted the target matrix at least in
part because of "fitting error." The extent to which capitalization on chance was involved has never been evaluated, but it
certainly underlies a portion of the factor "recurrence"
observed.
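The point about fitting random variables to a target can be illustrated with a small simulation. The sketch below is an assumption-laden stand-in, not Horn's (1967) procedure and not the multiple-group method: it uses an orthogonal Procrustes rotation (computed from a singular value decomposition), purely random "loadings" for 20 variables on 8 extracted factors, and a hypothetical five-factor simple-structure target with four variables per factor. All function names are illustrative.

```python
import numpy as np

def simple_structure_target(n_vars=20, n_factors=5):
    """Hypothetical target: each variable loads 1.0 on exactly one factor."""
    target = np.zeros((n_vars, n_factors))
    per_factor = n_vars // n_factors
    for j in range(n_factors):
        target[j * per_factor:(j + 1) * per_factor, j] = 1.0
    return target

def congruence(a, b):
    """Tucker congruence coefficient for each pair of matching columns."""
    num = (a * b).sum(axis=0)
    den = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return num / den

def procrustes_fit(loadings, target):
    """Rotate `loadings` toward `target` using a transformation with orthonormal columns."""
    u, _, vt = np.linalg.svd(loadings.T @ target, full_matrices=False)
    return loadings @ (u @ vt)

def simulate(n_reps=200, n_vars=20, n_extracted=8, n_factors=5, seed=0):
    """Average congruence with the target before and after rotation, for pure noise."""
    rng = np.random.default_rng(seed)
    target = simple_structure_target(n_vars, n_factors)
    before, after = [], []
    for _ in range(n_reps):
        noise = rng.standard_normal((n_vars, n_extracted))
        # Baseline: first five unrotated columns, sign ignored.
        before.append(np.abs(congruence(noise[:, :n_factors], target)).mean())
        # Same noise after being rotated toward the target.
        after.append(congruence(procrustes_fit(noise, target), target).mean())
    return float(np.mean(before)), float(np.mean(after))

before, after = simulate()
print(f"mean congruence before rotation: {before:.2f}, after: {after:.2f}")
```

Because the input contains no structure at all, any gain in congruence after rotation is attributable to the fitting itself. How large that gain is depends on the assumed numbers of variables and factors and on the kind of rotation allowed, so the output should be taken as a qualitative demonstration only.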
A more important concern to register, however, is that the
semantic structure underlying the Cattellian variables used in
all the analyses by Tupes and Christal may have intrinsically
predestined the factor solutions subsequently observed. For example, their first factor was defined by variables labeled as "secretive," "silent," "self-contained or reclusive" as opposed to
"sociable," and "talkative." Referring to Soule's Dictionary of
English Synonyms (Sheffield, 1959), under the term "secretive," there is the following entry (cited completely): "reserved,
reticent, close, uncommunicative, cautious, wary, taciturn" (p.
470). As these synonyms attest, the variables identifying the
first factor of Tupes and Christal were highly redundant. Therefore, if the subjects providing the basic data were not responding
incoherently, there is no problem in understanding how such
redundant rating scales will, in a factor analytic context, generate a factor dimension. A second factor was defined by such
variables labeled as "composed," "calm," "placid," and
"poised." Again referring to Soule, under the heading "composed" is the following: "calm, quiet, unruffled, undisturbed,
unmoved, tranquil, placid, sedate, collected, self-possessed, imperturbable, cool" (p. 109). Although not included under
"composed," the term "poised" when referenced identifies
"composure" as a synonym. It is obvious how such virtually
synonymous rating scales will generate a mathematical factor.
A third factor was based on such ratings as "good-natured,"
"cooperative," and "mild"; a fourth factor stemmed from ratings of "artistic," "imaginative," and "intellectual"; a fifth factor emerged from ratings of the equivalent scales "responsible,"
"scrupulous," and "seeing a job through in spite of difficulties
or temptations." It is not surprising when semantically related
variable sets prove to load on the same factor; as these terms are
used by often inarticulate or language-insensitive raters, their
redundancies are great. Consequently, their factorial equivalencies may only testify to the reliability and coherence of the ratings made of the subjects. Rather than representing truly substantive findings, the Tupes and Christal factor findings may
simply reflect the prestructuring that somehow crept into and
characterizes the Cattell-specified variables. Therefore, although five factors may well characterize the data sets Tupes
and Christal created or used, this finding may not be of great
moment.2
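The argument about redundant scales can be made concrete with a toy simulation. The sketch below rests on assumptions that are not in the source: a rater is taken to form one independent impression per cluster of near-synonymous scales, and each scale in a cluster is that impression plus independent noise. The cluster count, noise level, sample size, and function names are all hypothetical.

```python
import numpy as np

def simulate_redundant_ratings(n_subjects=500, n_clusters=5, scales_per_cluster=4,
                               noise_sd=0.7, seed=0):
    """Ratings in which every scale merely re-expresses its cluster's impression."""
    rng = np.random.default_rng(seed)
    # One independent impression per synonym cluster for each rated subject.
    impressions = rng.standard_normal((n_subjects, n_clusters))
    # Repeat each impression across the scales of its cluster, then add rater noise.
    ratings = np.repeat(impressions, scales_per_cluster, axis=1)
    return ratings + noise_sd * rng.standard_normal(ratings.shape)

def eigenvalues_of_correlations(ratings):
    """Eigenvalues of the scale intercorrelation matrix, largest first."""
    corr = np.corrcoef(ratings, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

eigvals = eigenvalues_of_correlations(simulate_redundant_ratings())
# Eigenvalues above 1.0 serve here only as a rough count of sizable dimensions.
n_large = int((eigvals > 1.0).sum())
print(eigvals.round(2), n_large)
```

Under these assumptions the number of sizable dimensions equals the number of synonym clusters written into the scale set, whatever that number happens to be. The simulation therefore says nothing about personality; it shows only that a set of scales built from clusters of near-synonyms will return those clusters as factors.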
2. The preceding discussion does not reintroduce the Shweder (1975)
argument that personality judgments are "no more than statements
about how respondents classify things as alike in meaning" (p. 482). As
Block, Weiss, and Thorne (1979) observed, "in no way whatsoever is it
possible to proceed from specific semantic similarity judgments. . . to
a specification of just which individuals are rated high or low on a particular rating dimension" (p. 1057).

Factor interpretation and factor naming is always a difficult
problem and usually cannot be done quickly or satisfyingly (but
see Meehl, Lykken, Schofield, & Tellegen, 1971). As personnel
selection psychologists working in the unfamiliar field of personality, Tupes and Christal did not elaborate conceptual understandings of their five factors. Instead, guided by earlier work
of French (1953), they simply entitled their five factors, without
any elaboration of their meaning whatsoever, as Surgency or
Extroversion, Agreeableness, Dependability, Emotional Stability or the Opposite of Emotionality, and Culture.
These terse Tupes and Christal labels have been extraordinarily durable over the years and have shaped thinking and research
interpretations. It is of historical and evidential interest to examine more closely the original basis in French's work for one
of these perduring and portentous factor labels.
French, a psychometrician at the Educational Testing Service, had reviewed 68 factor analyses of personality variables
and issued a laconic, cautious report presented as "a reference
source on factorial studies rather than as a theoretical exposition . . ." (p. 8). Abstracting from these 68 studies, French
offered a total of 49 factors, presented alphabetically from
"Agreeableness" to "Will Control," as having been observed in
at least two studies. The French review was a careful evaluation
of a body of literature he recognized as deficient in many ways.
Often, the factor analyses available to French were based on
data conceptually or methodologically unacceptable even by
the prevailing standards of the time. Consequently, the resulting
factors were of ambiguous meaning or implication and French's
effort to identify or summarize reproducible factors was severely limited, as he was fully aware.
The very first factor presented by French as perhaps replicable, "Agreeableness" (one of the Big Five), was predicated on
10 factor analyses, as follows: Lovell (1945) factor analyzed 13
Guilford questionnaire variables. Thurstone (1951) reanalyzed
Lovell's matrix. Baehr (1952) factored a variation on the Guilford variables. Bolanovich (1946) factored ratings by supervisors of field engineers on such variables as "personality," "sales
ability," and "desire for self-improvement." Brogden's (1944)
paper, "A Multiple-Factor Analysis of the Character Trait Intercorrelations Published by Sister Mary McDonough" was
based on religiosity ratings of early adolescents attending parochial school during the 1920s (McDonough, 1929). Tschechtelin (1944) factored ratings of children by their classmates
with regard to such variables as "good sport," "punctuality,"
and "sense of humor." Reyburn and Taylor (1939) factored rating data previously published by Webb (1915). And three factor analyses by Cattell (1945, 1947, 1948) were scrutinized as
well.
I suggest that, although these several studies were respectable
products of their times and although French was diligent in
seeking common denominators across analyses, an everlasting,
overarching, theoretically useful psychological concept cannot
be extracted from this empirical melange. Instead, we have a
broad, bland, impressively unincisive umbrella of a label,
Agreeableness, under which any number of importantly distinguishable personality qualities may escape analysis. I further
suggest, on similar grounds, that the other four far-flung, allusive labels taken over by Tupes and Christal should not have
entered the personality firmament on the basis of this early factor analytic review.

Role of Norman
The unpublished Air Force technical report by Tupes and
Christal might well have languished, unattended and without
consequence, had not Norman (1963) picked up the baton.
Norman had taxonomic concerns. He noted that "the construction of more effective theories of the development, structure, and functioning of personality will be facilitated by having
available an extensive and well-organized vocabulary by means
of which to denote the phenotypic attributes of persons" (p.
574). And he accepted the Allport and Odbert argument that
"perceptible differences between persons in their characteristic
appearance or manner of behaving or changes over time and
situations of single individuals in these regards become codified
as a subset of the descriptive predicates of the natural language
in the course of its development" (p. 574). Norman's discussion
of the issues involved is sober and sophisticated: "It is explicitly
not assumed that complete theories of personality will simply
emerge automatically from such taxonomic efforts. . . . There
is a good deal more to theory construction . . . than the
development of an observation language, even a good one"
(p. 574).
However, his empirical offering consisted only of a replication
of the study of Tupes and Christal. On the basis of the factors
identified by Tupes and Christal, Norman selected 20 of their
Cattellian variables (the four variables best representing each of
the five factors). By his variable selection procedure, he limited
himself to the variables most likely to demonstrate simple structure. He then had several groups of undergraduates offer peer
ratings using this restricted set of variables. By this narrowed
selection from among what has already been suggested may be a
prestructured set of variables, it follows that the factor structure
subsequently observed may have been a foregone conclusion.
Norman's findings, however, were subsequently viewed as further empirical support for the existence, primacy, and perhaps
sufficiency of the five orthogonal personality factors reported
by Tupes and Christal (Norman renamed the "Dependability"
factor as Conscientiousness, a replacement label that subsequently achieved preference).
It should be noted that in collecting peer ratings by laypersons
on a restricted set of 20 variables and in emphasizing the basic
importance of the five personality factors subsequently extracted from this kind of data, Norman would seem to have
shifted from his announced focus on the development of a scientifically oriented language for the description of personalities
to a study of the way laypersons use a constrained language to
characterize other laypeople. There are connections (important, useful, even crucial connections) between these two emphases, but they are not the same. Psychologists certainly can
learn a great deal for their science by studying the nature of lay
observations. But it does not follow that lay usages should be
taken over to provide the basic concepts of the field of personality psychology.
In contemplating his 1963 study, and prompted by the suggestion of Tupes and Christal that the Cattellian variables might
have omitted some of the personality concepts residing in the
Allport-Odbert adjectives, Norman decided that "it was time to
return to the total pool of trait names in the natural language
there to search for additional personality indicators not easily
subsumed under one or another of these five recurrent factors"
(1963, p. 582).
To locate personality terms that were new or previously omitted, Norman (1967) searched a later unabridged edition of
Webster and found 175 single-word descriptors to add to the list
compiled by Allport and Odbert. Then, applying his own set
of decision rules for sorting through this descriptor lexicon, he
excluded terms judged to be "purely evaluative and mere quantifiers," "ambiguous, vague, and . . . metaphorical," "very
difficult, obscure, and little-known," and "anatomical, physical
and grooming characteristics." The remaining terms he further
sorted into the categories of "stable traits," "temporary states,"
and "social roles." Excluding the latter two categories as inappropriate for the purpose of personality description, Norman
was left with 2,800 single-word descriptors deemed to represent
"stable traits." These remaining terms were then administered
to undergraduates to empirically assess their understandability
by laypersons, their social desirability, and the degree to which
undergraduates believed these terms were descriptive of themselves and their peers. After Norman removed terms he judged
to be ambiguous or unfamiliar to the typical college student or
redundant, 1,431 terms remained. With this last culling, Norman had a pool of descriptors he believed approached suitability for "the development of a structured taxonomy."
Norman then proceeded to a further, semantic sorting of his
"stable trait" terms that, although never formally reported and
published by him (but described by Briggs, 1992; Goldberg,
1981, 1990; John et al., 1988; John, 1990), may have influenced later research findings. Impressed by the factor structure
that he, Tupes, and Christal had identified earlier, and using his
understanding of the psychological meaning of the factors, Norman personally sorted his 1,431 terms according to their judged
fit into his five dimensions and assigned the terms to the positive
or negative pole of each dimension. As a final step, he examined
the terms at each pole and further formulated what he judged
to be semantic clusters within each pole. In all, he sorted his
1,431 terms into 75 semantic clusters specifying one or the other
end of the five-factor dimensions he viewed as paramount.
Fewer than 25 terms were left unclassified by Norman. The
John, Angleitner, and Ostendorf (1988) historical review may
be consulted for closer information regarding this Norman
effort. Although Norman did not proceed further with his terms
and trait clusters, his delineation and structuring of the trait
lexicon provided an important starting point for subsequent
efforts to advance a lexically based trait taxonomy.
By the mid-1960s, the initial phase of the FFA may be said
to have ended. Although a number of subsequent articles (e.g.,
Borgatta, 1964; Digman & Takemoto-Chock, 1981; Smith,
1967) also reported similar five-factor solutions of rating data,
these studies all were based on various versions of the Cattellian
or Norman variable sets and so these later studies can be classified
as further manifestations of this early stage of the FFA.
Given that prestructured sets of variables may have been used,
it follows that the "fiveness" of the factors emerging in these
later studies and their appreciable (but not full) similarity of
factor interpretation need not be viewed as persuasive.
The current FFA frequently memorializes and cites as foundational the empiricism of this first phase of the FFA. Therefore, to the extent the concerns expressed previously regarding
this first phase are cogent, a muting of these observances may
be in order. However, it is on the basis of contemporary accomplishments rather than its early history that the FFA must
achieve its claimed deservedness. And so, the later work that
has zoomed the FFA into its present prominence must be
considered.
The second, and current, phase of the FFA began with the
work of Goldberg (e.g., 1977, 1981, 1982, 1990, 1992), who
sought to go beyond the constraints set by Cattell's choice of
variables. Working with larger numbers of adjectival descriptors, he has presented a refinding and refining of the five factors
reported by earlier investigators. The consequent claims for this
lexical approach merit alternative analysis. Further, the team of
Costa and McCrae (e.g., 1985, 1992c; McCrae & Costa, 1985,
1987) has brought the FFA, which had been focused exclusively
on adjective-based ratings of self or others, into the questionnaire realm. Their adopting and adapting of the Big Five framework also warrants close scrutiny.

Five-Factor Lexical Approach
Goldberg (1971) had long been concerned with the question,
"Why measure that trait?" With close knowledge of Norman's
(1963) work on "an adequate taxonomy of personality attributes" (p. 574) and heuristically using the lexical hypothesis,
during the 1970s Goldberg began and has sustained a meticulous lexical and taxonomic effort. Among his many studies, he
evaluated the role of the evaluation component in adjective usage (Peabody & Goldberg, 1989), their frequency in various
categories (Hampson, John, & Goldberg, 1986), the consistency with which adjectives are used by laypersons (Goldberg &
Kilkowski, 1985), the level of abstractness of adjectives (John,
Hampson, & Goldberg, 1991), the influence of unipolar and
bipolar contexts for the way adjectives are employed (Goldberg,
1992), and, not least, the factor structure underlying lay usage of adjective-descriptors.
For many years, Goldberg's taxonomic work was communicated only informally to members of an "invisible college" of
assessment psychologists at conferences or via technical reports
(e.g., Goldberg, 1977, 1980, 1983). Increasingly, over the years,
his views and arguments became persuasive. Two publications
summarizing his lexical work appeared relatively early and
proved especially influential (Goldberg, 1981, wherein he
coined the phrase "the Big-Five"; 1982). More recently, he has
presented a major, integrative report "intended to alleviate any
qualms about the generality of the Big-Five structure" of personality descriptors (Goldberg, 1990, p. 1223). Along the way,
he has offered a variety of sets of adjectives as "marker variables" so that individuals can be assessed in terms of the lexical
Big Five factors (Goldberg, 1980, 1983, 1992).
Goldberg's program of thinking and research has been predicated
on the assumption, questioned earlier, that the method of
factor analysis is suitable and sufficient for achieving a scientifically compelling personality taxonomy. In addition, the lexical approach involves premises, procedures, and findings that
often admit alternative positions or interpretations. Four issues
will be discussed: (a) the lexical hypothesis, (b) the limits on
the linguistic and descriptive capabilities of laypersons, (c) the
method of factor analysis as specifically applied in lexical analyses, and (d) the restricted context in which the lexical five factors are found.

Regarding the Lexical Hypothesis
In his own version of the lexical hypothesis, Goldberg presumes that everything necessary and important for the description of personality is expressible via the single-word descriptors
available in the English (and other elaborated) languages:
Those individual differences that are the most significant in the
daily transactions of persons with each other will eventually become encoded into their language. The more important such a
difference is, the more people will notice it and wish to talk of it,
with the result that eventually they will invent a word for it.
(Goldberg, 1982, p. 204)

By beginning with an unabridged and unstructured universe of
single-word descriptors, it follows from this premise that one
can be assured of an inclusive set of personality-descriptive
terms.
I am not alone in my uneasiness about the psychological
sufficiency of this presumption. John, Goldberg, and Angleitner
(1984) have remarked:
Although much can be learned about personality from language
. . . we must not confuse the language which people use in their
constructions of social experience with the scientific constructs
that are supposed to describe, explain, and predict human behavior. . . . There is reason to distrust the 'accumulated wisdom' of
any language. . . . Thus, we view a natural language taxonomy of
personality terms as a rich source, . . . as a useful starting place
for a scientific terminology. (p. 86)

McCrae and Costa (1985) also have expressed reservations
about the lexical hypothesis. Reasoning from conceptions of
personality based on theory, clinical observation, laboratory research, and a variety of assessment procedures, they observe:
The argument that personality is exhaustively captured by the evolution of natural language is appealing, but by no means compelling. One might argue that the research of psychologists in the past
century has uncovered important aspects of personality that were
not encoded in the language. No one would imagine that an analysis of common English terms for parts of the body would provide
an adequate basis for the science of anatomy; why should personality be different? (p. 711; see also McCrae, 1990)

It is my belief that, for scientific purposes, single-word descriptors, although useful for many purposes, cannot convey
crucial features of personality, its dynamic functioning, the conditionalities of behavior consequent upon character structure,
the relations among phenotypical behavioral characteristics
that underlie "equifinality" (Bertalanffy, 1952) or permit
"multipotentiality." We need to use sentences, paragraphs,
pages, chapters, and books to begin to do justice to the understandings we have or must develop.
For example, how does one convey with a single adjective or
a number of separate, unlinked adjectives what may be called
the "pecking order personality," the kind of person who is affable with peers, deferent to superiors, and nasty to individuals of
lower rank? How does one convey in a single word or in a number of separate unlinked words the nature of the hysteric personality: its rigidity conjoined with impulsivity and with "la
belle indifference"? How does one convey the kind of individual
who is so disorganized or capturable by a compelling social surround as to be negligent in fulfilling responsibilities but who
subsequently is racked by guilt? How does one convey with suitable single-word descriptors the person who, confronted with
an anxiety-inducing decision situation, is quickly decisive, not
with the confidence that rapid decision is so often interpreted to
imply but only to get past the stress of the situation? How does
one convey the kind of person who, in desperate circumstances,
becomes unnaturally calm and poised? As an exercise, the
reader is invited to read or re-read Shapiro's penetrating volume on Neurotic Styles (1965) to evaluate how well single-word
descriptors, unconnected, uncontextualized, unconditioned,
can represent the complexities and the complications of personalities. My own belief is that they cannot do the scientific job
the field of personality psychology requires.
Regarding Reliance on Laypersons to Specify Personality
Descriptors and to Provide Personality Descriptions
Throughout his work, Goldberg has used "stable trait" descriptors identified by Norman or by himself as understandable
by laypersons. Trait-adjectives not understandable by laypeople
have been excluded from subsequent consideration. Typically,
undergraduates have served as criterion groups for assessing understandability. Using various sets of adjectives selected as being
understandable to laypersons, lay people have rated themselves
or others. The Big Five factor structures have derived from these
descriptions offered by laypersons based on trait-adjectives
judged to be understandable to laypersons. Several aspects of
this dependence on lay judgments concern me.
Given the aspirations of the FFA to provide a scientifically
compelling representation of the dimensions of personality
differences, consider the filtering of the set of "stable traits" to
exclude those terms unfamiliar or unclear to college undergraduates. Why were undergraduates used as the criterion group to
determine the set of personality descriptors to be used? Why
not 12-year-olds? Why not 5-year-olds? Why not 45-year-old
psychological clinicians?
Reflexively, we reject the idea of basing our personality language solely on the terms familiar to and understandable by
young children or by preadolescents. The reason why, when
some thought brings it forward explicitly, is that we believe a
certain level of cognitive development has to be reached before
one can talk sensibly about personality functioning and personality description. The work of Peevers and Secord (1973) and
Livesley and Bromley (1973), among others, is relevant here.
These developmental studies of how language is used for interpersonal description show that, over time, the words used to
describe people progress through a predictable sequence of
stages. In the early years, a young child will describe others in
undifferentiated, egocentric terms ("Billy is nice") and with
reference to its own needs and wants ("Billy gave me candy").
A few years later, people are described more specifically, in behavioral terms but not with so narrowly focused a self-reference
("Billy gets into lots of fights"). With more life experience, the
developing individual begins to characterize people in dispositional terms but in unqualified ways ("Billy is aggressive"). Still
later, individuals may introduce conditionalities and complexities into their characterizations of people ("Billy becomes aggressive when he is threatened").
Although this sequence of stages correlates well with the
number of years the individual has lived in this world, it is not
age per se that is influential but rather the cumulative effect of
the extensity of the individual's interpersonal experiences and,
particularly, the individual's reflections on that experience.
These reflections are better understood in terms of ego development (Loevinger, 1976) than age, because not everyone moves
all the way through the several developmental stages. Some individuals remain egocentrically fixated on evaluating others in
terms of their immediate personal relevance; others achieve
phenotypical description of personality-relevant behaviors but
remain obtuse and do not think in motivational terms; still others, although evaluating the motivations of others, overgeneralize and fail to recognize the conjoint effects of context. Over
the years, given the necessary intelligence, perceptiveness, introspectiveness, and absence of intrusive personal psychopathology, it can be expected that individuals will tend to ripen
toward psychological-mindedness, sophistication, nuance, and
complexity in thinking about people, but not necessarily.
Have undergraduates, or enough undergraduates, reached
that mature stage of development where they can function as
criterion judges for the ultimate selection of personality discriminanda and serve as expert evaluators and describers of personality? I think not. In many respects, college students preoccupied with their own immediate, compelling and impelling developmental problems may be no more than adolescents. They
can provide very many kinds of consequential information on
their self-images or on the social stimulus value of their peers.
One can learn a lot about them from them. But their modal
level of articulation of the domain of personality should not be
accepted as the standard for scientific study within the field.
Not enough undergraduates (and indeed not enough human
beings) have reached the stage of cognitive and ego development required before one can talk sensibly and with the necessary perspicacity about personality functioning. It is well recognized that novices in various cognitive domains do not make
the discriminations that experts in these domains do; the "basic
level" of categorization or distinction-making is a function of
one's level of expertise (cf., e.g., Tanaka & Taylor, 1991). If only
laypersons or naive, unworldly observers are studied, the basic
level at which information will be represented will be that of a
novice. Therefore, limiting the discriminative vocabulary for
personality description to trait-adjectives familiar to and understandable by modal undergraduates would limit as well the discriminations that scientifically oriented personality psychologists might wish to depict.
For example, such concept-representing adjectives as agentic
or communal, machiavellian, sadomasochistic, augmenting or
reducing, field-independent, repressing, eroticizing, levelling or
sharpening, narcissistic, and so on would not pass the test of
familiarity and understandability by the typical undergraduate.
Many personality psychologists would wish to have such descriptive possibilities available to them.
Yet another kind of difficulty with the usage of adjectives selected to be suitable for undergraduates is that, even after a series of siftings to identify familiar and understandable adjectives, such terms may remain unclear to many individuals and
therefore be incoherently used. It appears that the lay use of
language (especially adjectives) is relatively gross, casual, unreflective, and surprisingly nonconsensual. Laypersons sometimes use the term "aggressive" to mean assertiveness or opportunity-seizing and sometimes to mean the expression of hostility. The word "deep" may refer to someone who is introspective
or it may refer to someone judged to be psychologically complicated. The word "critical" may refer to someone who is analytic
or someone who is hostile. Studies by Goldberg and Kilkowski
(1985) and by Kilkowski (1976) have identified the scope and
dimensions of this problem. The specific thrust of their research
was on the "consistency" of adjective usage, using an already
quite selected set of adjectives. The reader should note that consistency, if achieved, can only be achieved because the adjectives
are understood by the describing individuals in meaningful and
corresponding ways.
Kilkowski and Goldberg were surprised by the amount of inconsistency they encountered.
Subjects do not agree about the meaning of [single-word] trait descriptors. . . . The vocabulary skills of undergraduate populations
seem untrustworthy, and the meaning of trait attributions cannot
be assumed unless the sense in which a trait descriptor is to be
understood is explicitly stated. (Kilkowski, 1976, p. 83)

Unless definitions are provided, it is difficult to know precisely
what a response signifies, and this seems true for all but the most
simple adjectives. . . . It has long been known that trait-descriptive
terms function in our lexicons as 'fuzzy sets.' Perhaps the major
implication . . . of this study is that such terms are even more
fuzzy than we have ever thought. (Goldberg & Kilkowski, 1985,
pp. 97-98)

Graziano (1992) adds evidence that many of the standard adjectives used in personality assessment are unfamiliar to college
students; a study by Beck, McCauley, Segal, and Hershey
(1988) further illustrates the great individual differences in the
way trait-adjectives are understood.
These findings are perturbing for those who would use adjectives with laypersons for self-ratings and other-ratings. It,
therefore, is of special interest to observe what positively influences the consistency (implying an underlying, consensual
meaningfulness) of adjective usage. Goldberg and Kilkowski
find that verbal intelligence is associated with consistent usage
(see also Hampson, John, & Goldberg, 1986). Willingness to
work on an assigned task also contributed to meaningful adjective ratings. An index of "general adjustment" was strongly
related to consistency of adjective usage (r = .71). Elaboration
on the single-word descriptor to provide several phrases to define, clarify, and calibrate its meaning greatly improved the
internal consistency and, implicitly, the validity of subjects'
ratings, thus further demonstrating the shortcomings of single-word adjectival descriptors.

In summary, personality descriptions have a better chance to
be valid if they are made by smart, conscientious, generally adjusted individuals who, through the availability of elaborated
definitions, have calibrated their understandings of the descriptive language they are to use. Not a surprising set of results,
viewed retrospectively. Unfortunately, the use of undergraduates and laypersons will typically involve many who
are not so smart, not so mature, and not so involved or motivated as to do the often
drudging task well, and who will be working with
adjectives often only imprecisely or differently understood.
In my view, then, selecting personality adjectival descriptors
on the basis of their familiarity to college students and the further use of college students to provide the ingredients for the
technical description of personality is rife with conceptual and
practical problems. It will continue to be of appreciable linguistic and social-psychological interest to study lay language usages and the "natural" or "folk" categories that have evolved
for personality description. But a scientifically oriented approach to the formulation of an individual's personality should
not be constrained. For psychologists, who are experts or should
be, to vest lay principles of personality attribution as the basis
of their science would sadly limit their subsequent efforts to
achieve and represent deeper, more articulated and theoretical
understandings. Rather than a language derived from and applied by personality "novices," what is scientifically required is
a language suitable for personality "experts." Assessment psychologists should not settle for less.

Regarding Factor Analyses of Adjective Descriptor
Clusters and the Creation of Marker Variables
In the lexical studies, for each subject, the single-descriptor
ratings typically are aggregated in accord with previously established cluster definitions. The resulting cluster scores are then
factor analyzed. Repeatedly, five factors have emerged from the
matrices being analyzed. To encourage wider research usage of
the lexical FFA, various sets of personality descriptors have been
constructed (some of them quite short, easily administered to
subjects, and conveniently scored by the psychologist-researcher) to serve as "markers" of the orthogonal Big Five factor structure.
The recurrent finding of an orthogonal five-factor structure
when synonym clusters of adjectives are analyzed merits particular scrutiny. Consider first a recent analysis wherein the Big
Five factor structure emerges from a far larger set of single-word
descriptors than previously had been used in lexical analyses
(Goldberg, 1990, Study 1; see Goldberg, 1980, for an earlier
version of this analysis).
The use of an especially large set of single-word descriptors
was an attempt to respond to the concern that many earlier
demonstrations of the Big Five factor structure were dependent
on Cattell's original and somewhat special variable set (see, e.g.,
Waller & Ben-Porath, 1987; John, 1990, among others). In addition, the effects of different methods of factor analysis and
rotation and the effects of the number of factors rotated were
studied. The data were college student self-descriptions using
the 1,431 single-word descriptors earlier surviving the winnowing procedures of Norman (1967; unpublished research, as
noted earlier).

The responses of each subject to the Norman descriptors
were aggregated to generate scores for each of the 75 clusters
that Norman had rationally created from his descriptors. Five
factors were extracted using five different methods of factor
analysis and two different methods of rotation (orthogonal and
oblique). A Big Five factor structure quite clearly emerged, with
the fifth factor being renamed Intellect instead of Culture. In
none of the analyses did the structure change substantially as a
function of the factoring or rotational procedure used. When
six or seven factors were rotated, a couple of earlier factors split;
when more than seven factors were rotated, subsequent factors
were defined by only one or two variables and were considered
unimportant. I must note that many psychologists might well
view some of the factors defined by only one or two variables
(e.g., degree of sensuality) as having appreciable characterological significance.
There are a number of reasons why these Goldberg findings
may not be compelling in their implication. It will be remembered that Norman earlier had prestructured these 75 clusters
"from the top down" in terms of the Big Five bipolar dimensions. Therefore, the finding of the same five factors emerging
"from the bottom up" may have been constrained. As suggested
by John, Angleitner, and Ostendorf (1988), "because the construction of the 75 categories had been guided by the Big Five,
the emergence of these factors is not very surprising" (p. 189).
Inspection of the factor loadings of the 75 cluster-variables indicates the tenability of this interpretation. To an impressive extent, the loadings produced by the factor analysis are in correspondence with the factor assignments previously structured by
Norman. Therefore, the extent to which the factor structure
obtained represents a substantive rather than an entailed finding is uncertain.
The various methods of factor extraction used, although they
involve somewhat different mathematical rationales, have long
been known not to differ appreciably in the factor results they
afford, as Goldberg seems to acknowledge (1990, p. 1219). For
example, Corulla (1987) analyzed the revised version of the
Eysenck Personality Questionnaire by six different factor analytic methods and found the various methods of factoring
offered equivalent results. Further, if it is the case that the five
factors being rotated were based on variables previously fitted
into orthogonally related categories, it may not be surprising
that rotations freed to be oblique will find little to be oblique
about and will issue results very much like those issued by orthogonal rotations (see Corulla, 1987).
As Goldberg notes, although the anticipated five-factor structure certainly is appreciably present, the 75 Norman categories
showed a number of unexpected and anomalous factor loadings
in disagreement with the proposed Big Five factor structure and
in disagreement with other analyses.
Although one should expect some differences in factor structures
from study to study, the basic meaning of the factors must remain
constant if they are to be given the same labels. . . . To demonstrate
the robustness of the Big-Five factor structure, it is necessary to
show that the core variables associated with each factor . . . play
the same role when analyzed within other subsets of variables.
(Goldberg, 1990, p. 1222)

Accordingly, Goldberg wished to improve upon this study, to
develop cleaner and stronger "core variables." Also, he was
aware that, because Study 1 (1990) was based on clusterings of
adjectives judged only by Norman and involved as many as
1,431 descriptors, studies based on fewer descriptors formed
into semantic clusters on a more consensual basis were
desirable.
In Studies 2 and 3 of Goldberg's report, by complicated, iterative judgmental procedures that it is understandably impossible to
truly specify, Goldberg formed semantically more homogeneous adjective-clusters while eliminating adjectives and adjective-clusters that previously had performed anomalously. In
Study 2, using dictionaries, thesauri, and synonym-finders,
Goldberg constituted 133 synonym clusters from 479 commonly used adjectives. In Study 3, these clusters and adjectives
were further refined by internal consistency analyses, and 100
even tighter synonym clusters were produced from a set of 339
adjectives. Factor analyses of these different sets of synonym
clusters, with data based on self-descriptions and peer descriptions, provided clear versions of an orthogonal Big Five factor
structure, the structure emanating from Study 3 being termed
"nearly perfect."3 Goldberg has suggested that these 100 synonym clusters provide a good enough Big Five structure to be
widely usable as "markers" of these dimensions.
However, the emphasis on attaining ever better "core variables" or adjective clusters may well have predestined the improved five-factor structures subsequently observed. "Core
variables," by definition, are variables that, in an analysis, have
shown themselves to strongly and univocally represent a particularly positioned factor axis. Noncore variables are adjectives
and adjective clusters that had been or could be expected to be
factorially complex. When, in a subsequent study, core variables are increasingly used and factorially complex variables are
increasingly excluded, it may not be surprising if more "nearly
perfect" simple factor structures emerge.
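The circularity at issue can be made concrete with a small numerical sketch. The loadings below are hypothetical (they are not taken from Goldberg's studies), the complexity index used, (Σa²)²/Σa⁴ for each variable, is only one of several ways to quantify departure from factor-univocality, and the cutoff of 1.5 is an arbitrary assumption. The sketch shows only that culling the factorially complex variables necessarily yields a cleaner simple structure on the next pass.

```python
def complexity(row):
    """Row complexity: 1.0 for a factor-univocal variable, larger when
    the loadings are spread over more than one factor."""
    squares = [a * a for a in row]
    return sum(squares) ** 2 / sum(s * s for s in squares)

def mean_complexity(loadings):
    """Average row complexity over a set of variables."""
    return sum(complexity(row) for row in loadings.values()) / len(loadings)

# Hypothetical loadings on two factors; names and values are illustrative only.
loadings = {
    "core_A1": (0.8, 0.0),
    "core_A2": (0.7, 0.1),
    "core_B1": (0.0, 0.8),
    "core_B2": (0.1, 0.7),
    "interstitial_1": (0.5, 0.5),
    "interstitial_2": (0.6, 0.4),
}

CUTOFF = 1.5  # arbitrary threshold, assumed for illustration

# "Refinement" step: keep only the variables that already look univocal.
culled = {name: row for name, row in loadings.items()
          if complexity(row) < CUTOFF}

print("retained:", sorted(culled))
print("mean complexity before:", round(mean_complexity(loadings), 3))
print("mean complexity after: ", round(mean_complexity(culled), 3))
```

The improvement in mean complexity follows from the selection rule itself and says nothing about the structure of the full descriptor domain.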
An article devoted to the development of markers for the Big
Five (Goldberg, 1992) illustrates more clearly the methodological bootstrapping that may be involved. Three strategies for
developing factor markers are discussed:
(a) representative sampling of the total domain [of adjectives]. . .
(b) uniform sampling, which implies oversampling sparsely populated regions [of adjectives] and undersampling dense ones . . .
and (c) cluster sampling, which aims for factor-univocal variables
by the systematic omission of those located in interstitial regions
between the clusters, as used by Norman (1963). (p. 28)

Goldberg argues for the "inherent advantages" of the cluster-sampling method on the ground that "cluster sampling provides
a simple-structured set of variables" (p. 28) that can be expected to issue orthogonal factor markers.
However, the use of the cluster sampling approach necessarily
presumes that one knows which are the interstitial variables,
that is, one already has firmly fixed the locations of reference
axes. The mathematical method of factor analysis can only indicate the number of dimensions in the factor space; it offers no
guidance whatsoever as to where reference axes within the n-dimensional space should be positioned. For this fundamental
decision, conceptual and substantive considerations external to
the factor analytic method ordinarily must come into play. In
factor analyses, core variables and interstitial variables in the
conceptual context of one placement of the reference axes
would reverse their positions of primacy within an equally tenable but alternative placement of reference axes. An illustration
of such a reversal is the striking 45° divergence, on crucial theoretical grounds, of Gray (1981) and Eysenck (1967) in their
preferred positionings of two reference axes. The dimensions
Eysenck calls Neuroticism and Extroversion are conceptualized
after Gray's rotation as Anxiety and Impulsivity. Another example of a change in interpretation by a difference in factor
positioning is the preference of McCrae and Costa (1989b) to
reinterpret the Wiggins (1982) dimensions of Dominance and
Nurturance as exemplifying their own dimensions of Extraversion and Agreeableness.
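The indeterminacy of axis placement can be illustrated with a minimal sketch. The two-factor loadings below are hypothetical and chosen only for clarity; the sketch rotates the reference axes by 45° and checks that the correlations implied by the factors are unchanged while the roles of "core" and "interstitial" variables are exchanged.

```python
import math

def rotate(loadings, degrees):
    """Orthogonally rotate a two-factor loading matrix by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]

def reproduced(loadings):
    """Correlations implied by the common factors: L times L-transpose."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in loadings]
            for r1 in loadings]

# Hypothetical loadings, for illustration only.
loadings = [
    (0.8, 0.0),  # "core" marker of the first axis
    (0.0, 0.8),  # "core" marker of the second axis
    (0.5, 0.5),  # "interstitial" variable
]
rotated = rotate(loadings, 45)

for before, after in zip(loadings, rotated):
    print(before, "->", tuple(round(x, 3) for x in after))

# Largest discrepancy between the two sets of implied correlations.
gap = max(abs(p - q)
          for row_p, row_q in zip(reproduced(loadings), reproduced(rotated))
          for p, q in zip(row_p, row_q))
print("largest discrepancy:", gap)
```

After the rotation, the former interstitial variable loads on a single axis and the two former core markers load equally (in absolute value) on both, yet the implied correlations agree up to rounding error. Nothing in the arithmetic favors one placement over the other.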
A way to fix axis locations is to prevent competing possibilities from arising. Then, of course, only one possibility of locating the reference axes can be found. As Goldberg notes, Norman (1963), by selecting the strongest factor representatives reported by Tupes and Christal, directly minimized interstitial
variables and almost guaranteed a simple factor structure with
factor-univocal variables. It may be that in the sequence of adjective culling, cluster culling, and cluster forming procedures
by Norman and by Goldberg, which eventuated in very much
the same five factors, adjectives and cluster-based variables interstitial to these factors were unknowingly de-emphasized. Unfortunately, one cannot be sure that this thinning out of structure-complicating adjectives and synonym clusters has occurred. It, therefore, would be helpful if the immense task of
adjective and cluster culling and cluster forming were to be replicated by disinterested investigators. Until an independent verification has been established, the proposal that the lexically
based five factors be accepted as the conceptual framework for
the scientific study of personality would seem premature.4

3 Anomalous findings persist, however. For example, in Study 3, "passionless" is a central component of the cluster, "placidity," which in turn
is a definer of the positive pole of Factor 3, Emotional Stability. In Study
1, the cluster, "passionlessness," is a definer of the positive pole of Factor
3, Conscientiousness.
4 A recent monograph by Ostendorf (1990) warrants mention. He
reports, based on a complicated, arduous, and methodologically sophisticated study of lay self- and other-ratings on 430 German single-word
descriptors, that five factors similar to the English lexical Big Five
emerge. I agree but note that he obtained at least eight highly replicable
factors, according to the well-regarded criterion of Everett (1983) (see
Ostendorf's Tables 46 and 47).

Regarding the Restricted Context in Which Five Factors
Emerge and Are Orthogonal
Repeatedly, the lexical Big Five factors have been described
as orthogonal or "nearly orthogonal" to each other (e.g., Goldberg, 1992). However, the empirical research findings indicate
that the five factors are frequently importantly correlated
with each other, usually to reflect an overriding evaluative
component.
Thus, consider the serially developed, highly refined 100-item
five-factor marker set recently presented by Goldberg (1992,
Study 4). A "nearly orthogonal" Big Five factor structure
emerges from the markers when data restricted to self-descriptions or the descriptions of liked others are analyzed. However,
when data based on ratings of a subject pool including disliked
individuals are factored, the five-factor markers issue a factor
structure that is impressively nonorthogonal (see Goldberg,
1992, Table 5). The marker subsets representing Factors II, III,
IV, and V (Agreeableness, Conscientiousness, Emotional Stability, and Culture/Intellect/Openness to Experience) show
appreciable intercorrelations, ranging from the .30s into the .50s,
uncorrected for attenuation. A factoring of this 5 × 5 matrix of
intercorrelations and varimaxing verifies that a strong, broad,
evaluative (and also substantive) factor may be said to be present along with the Surgency factor; the "fiveness" of the factor
structure seems to have been lost within this more heterogeneous sample of subjects.
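The kind of check described above can be sketched numerically. The 5 × 5 matrix below is hypothetical: it merely mimics the reported pattern (Factors II through V intercorrelating in the .30s to .50s, Surgency largely independent) and does not reproduce the values in Goldberg's (1992) Table 5. As a simplification, the sketch extracts only the first principal component by power iteration rather than performing a full factoring with varimax rotation.

```python
import math

def dominant_component(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n))
                     for i in range(n))
    return eigenvalue, v

labels = ["I Surgency", "II Agreeableness", "III Conscientiousness",
          "IV Emotional Stability", "V Intellect"]

# Hypothetical intercorrelations among the five marker scales.
R = [
    [1.00, 0.10, 0.05, 0.10, 0.10],
    [0.10, 1.00, 0.40, 0.45, 0.35],
    [0.05, 0.40, 1.00, 0.35, 0.40],
    [0.10, 0.45, 0.35, 1.00, 0.50],
    [0.10, 0.35, 0.40, 0.50, 1.00],
]

eigenvalue, vector = dominant_component(R)
print("share of total variance:", round(eigenvalue / len(R), 2))
for label, weight in zip(labels, vector):
    # Loading on the general component = weight * sqrt(eigenvalue).
    print(label, round(weight * math.sqrt(eigenvalue), 2))
```

With intercorrelations of this size, a single broad component absorbs a large share of the variance of Factors II through V while Surgency stands largely apart, which is the pattern the text describes as a loss of "fiveness."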
This failure of orthogonality is not a one-time, ephemeral
finding; it happens often (e.g., Mroczek, 1992) and has received
acknowledgement and discussion (Goldberg, 1993; Peabody &
Goldberg, 1989). The changeableness of the Big Five factor
structure as a function of the nature of the sample and data
being evaluated should be troubling for the adjectival FFA. The
study of personality differences cannot be limited to data from
restricted, homogeneous samples of laypersons describing
themselves or their friends.
In summary, doubts may be expressed regarding the lexical
approach on four counts: there is no guarantee that the essential
lexical hypothesis will permit the expression of scientifically
crucial aspects of personality; the use of laypersons, often psychologically obtuse, to specify the personality language for use
by scientific psychologists can be said to be unwise; the sequence
of empirical procedures that repeatedly issued similar five-factor structures may have been constrained to produce the results
obtained; and the offered five-factor structure is unstable and
seems to exist primarily within homogeneous and somewhat
special populations.
Five-Factor Questionnaire Approach

Costa and McCrae came together in the 1970s and began an
astonishingly fruitful research collaboration. Their work in the
last 15 years or so has been remarkable in its fecundity; its reach
in so many directions; its conceptual and scholarly knowledge;
its methodological sophistication and ingenuity; its energy and
planfulness; and its substantive empirical contributions made
in such diverse fields as personality assessment, behavioral medicine, and gerontology. Their longitudinal analyses in adulthood
and old age (e.g., McCrae & Costa, 1990) have contributed additional powerful and implacable evidence for the surprising degree of characterological continuity to be seen in lives through
time (see, e.g., Backteman & Magnusson, 1981; Block, 1971,
1977, 1981, 1993; Block & Block, 1980; Conley, 1984; Douglas
& Arenberg, 1978; Leon, Gillum, Gillum, & Gouze, 1979; Pitkanen-Pulkkinen, 1981; Woodruff, 1983; Woodruff & Birren,
1972; see Moss & Susman, 1980, for an early review). Although
in admiration of much of their research accomplishments, I
must nevertheless take issue with their prescription of the FFA
to shape the field of personality psychology. It seems to me their
quite remarkable body of work has not demonstrated that, in
number and in kind, their five factors are uniquely and optimally positioned to provide comprehensive and penetrating descriptions of personality. To convey the basis for my reservations, the steps along Costa and McCrae's path of Big Five advocacy must be retraced.
Beginnings of the Costa and McCrae FFA
In their first study prefiguring what became their particular
FFA, Costa and McCrae (1976) applied cluster analysis (Tryon
& Bailey, 1970) to the Sixteen Personality Factor (16PF) Questionnaire of Cattell (Cattell, Eber, & Tatsuoka, 1970), using
data from 3 groups of subjects. Three clusters were extracted. A
first cluster, reflecting about 21% of the variance, was specifically likened by them to the Neuroticism concept of Eysenck
(1970). A second cluster, encompassing about 14% of the variance, was specifically likened to Eysenck's concept of Extraversion-Introversion.
As Costa and McCrae duly noted, versions of both of these
cluster dimensions had been "consistently observed in the personality literature for over 50 years" (p. 569) and had long been
known to have many and diverse behavioral implications (see,
for only one example, Block, 1965). Indeed, Wiggins (1968)
had already designated these two omnipresent questionnaire dimensions as "the Big Two." However, although the Big Two had
long been recognized, large differences of opinion had long existed, and still exist, as to their latent or conceptual meaning.
The Eysenckian meaning subscribed to at this time by Costa
and McCrae has by no means been uniformly accepted (see,
e.g., Block, 1965, chap. 8; Block & Block, 1980, pp. 44-47, 49-50; Guilford, 1975, 1977; Tellegen, 1985; Tellegen & Waller, in
press; Watson & Tellegen, 1985). Some personologists have
viewed these dimensions broadly, and others have viewed them
more narrowly. By these differences in scope or emphasis, the
psychological flavor of the Big Two changes appreciably.
The third cluster in this Costa and McCrae study was not well
represented; it accounted for only about 6% of the variance and
was inconsistently and unreliably present in the data analyzed.
Only Cattell's "imaginativeness" variable consistently defined
this cluster, and its estimated reliability averaged only .47. Nevertheless, Costa and McCrae were intrigued by this third cluster
and suggested that it intimated, albeit inadequately, a dimension of "openness to experience."
In a subsequent study using the same samples of veterans
(Costa & McCrae, 1978), Costa and McCrae sought to amplify
the measurement of this third cluster. Twelve 16PF variables
were joined with six additional scales specifically "intended as
a replacement for the unstable third cluster of the 16PF" (p.
128). Three of these new scales came from Coan's (1972) prior
inventory to measure openness to experience, and three more
were rationally constructed by Costa and McCrae who were influenced by Tellegen and Atkinson's (1974) report on "absorption." These latter psychologists had developed an unusual questionnaire scale, not related to "the Big Two," that in its content
and correlates seemed to reflect a susceptibility or openness to
environmental surrounds.
Applying factor analysis and varimax rotation to these 18
variables, the desired three-factor solution was obtained. The
third factor was now fattened because of the content redundancy provided by the six newly introduced openness-to-experience scales. However, although the third factor had become
more of a presence, it still was not clearly definable. While the
variables defining Neuroticism and Extraversion had mean factor loadings of .76 and .69, respectively, the mean factor loading
of the variables defining Openness was only .49. Nevertheless,
this last dimension was conceptually attractive to Costa and
McCrae and, as measured, Openness seemed to be essentially
unrelated to the indubitable Big Two of Neuroticism and
Extraversion.
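
The effect of content redundancy on the apparent size of a factor can be illustrated numerically. What follows is a minimal sketch in Python (using NumPy), not a reanalysis of the Costa and McCrae data. The cluster sizes, the within-cluster correlations, and the function names are hypothetical and chosen only for illustration. The sketch compares the leading eigenvalues of a correlation matrix when a third cluster is marked by two scales and when it is marked by eight.

```python
import numpy as np

def cluster_correlations(sizes, within_r):
    """Correlation matrix for mutually independent clusters of equally intercorrelated scales."""
    n = sum(sizes)
    corr = np.eye(n)
    start = 0
    for size, r in zip(sizes, within_r):
        block = slice(start, start + size)
        corr[block, block] = r          # every pair inside a cluster correlates r
        start += size
    np.fill_diagonal(corr, 1.0)         # restore unit diagonal
    return corr

def largest_eigenvalues(corr, k=3):
    """Return the k largest eigenvalues of a correlation matrix, in descending order."""
    return np.sort(np.linalg.eigvalsh(corr))[::-1][:k]

# Hypothetical set-up: two well-marked clusters (6 scales each, r = .50)
# and a third cluster marked first by 2 scales, then by 8 scales (r = .25).
thin = largest_eigenvalues(cluster_correlations([6, 6, 2], [0.50, 0.50, 0.25]))
fat = largest_eigenvalues(cluster_correlations([6, 6, 8], [0.50, 0.50, 0.25]))

print("third eigenvalue with 2 markers:", round(float(thin[2]), 2))
print("third eigenvalue with 8 markers:", round(float(fat[2]), 2))
```

Under these assumed values the third eigenvalue rises from 1.25 to 2.75, although the correlation among the third cluster's markers (.25) is unchanged; only their number has grown.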
At this juncture, and with no anticipations of a subsequent
connection with the still inconspicuous FFA, Costa and
McCrae made a strong, unwavering, conceptual and research
decision to focus their research attention on three broad
constructs:
Whereas the 'true' number of dimensions of human personality is
a metaphysical rather than a scientific question, a long history of
fact finding shows that at least the two dimensions of Extraversion
and Neuroticism must be reckoned with in any personality model.
To these two, our research suggests the addition of a third broad
domain, which we call O, Openness to Experience. (1980, p. 69)

In choosing to concentrate on Neuroticism (N) and Extraversion (E), Costa and McCrae took no intellectual risks: these
two broad dimensions in their various operationalizations already had been studied and had demonstrated in many ways
their pervasive influences on behavior. In adding an emphasis
on Openness to Experience (O), they were including a dimension less well specified and investigated but one they believed to
be psychologically consequential. They titled this decision on
their subsequent focus as "the NEO trait model" (Costa &
McCrae, 1980).
Although Costa and McCrae viewed existing measures of N
and E as "serviceable," they also believed N and E measures
could be considerably improved. In addition, they believed that
O, as they conceptualized and had measured the domain, was
only poorly represented by existing questionnaires. So, they
embarked on the process of creating and validating a new personality inventory . . . devoting considerable attention to the specification of more focused facets of each of the three domains. Global
estimates . . . do not allow much precision in showing which
forms of (a) trait domain are most characteristic of a person. . . .
An inventory providing measures of a half dozen forms or facets of
Extraversion or Neuroticism (or Openness) would allow a more
fine-grained analysis. (1980, p. 92)

In constructing their questionnaire, Costa and McCrae recognized that specifying the components, or facets, or subdimensions of each domain could not be done logically or theoretically
but instead required an intelligent arbitrariness; they expressly
hoped to provide a good and interesting sample of the traits
within each of their three domains (1980, p. 93).
In designing their questionnaire, Costa and McCrae distinguished and permanently fixed upon a half dozen facets each
for their broad constructs of Neuroticism, Extraversion, and
Openness. The facet distinctions they offered were not rooted
in factor analysis, formal theorizing, or ineluctable empirical
findings. Rather, the facets derived from their personal thinking
about how the three domains could be further articulated. The
six facets Costa and McCrae nominated to represent the Neuroticism domain were Depression, Impulsiveness, Anxiety,
Hostility, Self-consciousness, and Vulnerability. For the Extraversion domain, they posited the facets of Warmth, Gregariousness, Assertiveness, Activity, Excitement Seeking, and Positive
Emotions. For the Openness domain, they specified Fantasy,
Aesthetics, Feelings, Actions, Ideas, and Values.
The eighteen facets Costa and McCrae rationally and unalterably designated at this time for their three domains I believe
were thoughtfully chosen, clinically attentive, and generally insightful. Inevitably, however, these choices are debatable by
other thoughtful, clinically attentive, and generally insightful
psychologists. On conceptual, experiential, and empirical
grounds, a number of personality psychologists have expressed
disagreement with this Costa and McCrae faceting (e.g., Ben-Porath & Waller, 1992a, 1992b; Caprara, Barbaranelli, & Comrey, in press; Glisky, Tataryn, Tobias, Kihlstrom, & McConkey,
1991; Goldberg, 1993; Hahn & Comrey, 1993; Hogan & Hogan,
1992; Tellegen, 1993).
By this time, the subjects in the Baltimore Longitudinal
Study of Aging (BLSA; Shock et al., 1984), a medically and
psychologically assessed, longitudinally followed panel, had
become available to Costa and McCrae.5 It was with this sample
that they developed and began using a 144-item questionnaire.
The questionnaire was formed through the sequential application of factor analysis to maximally fit their three-dimensional,
18-facet conception of personality (McCrae & Costa, 1983). A
sophisticated approach to the construction of inventory scales
was used.
First, items intended to express each of their 18 facets were
written and were administered to the augmented BLSA
(ABLSA) sample. Inspection of the interitem correlation matrix moved Costa and McCrae to eliminate some misbehaving
items. After factoring the remaining items by the principal
components method, the three desired N, E, and O factors were
identifiable. Then, within each domain, the remaining items
(about 70) were again factored. This time, however, the analyses
focused on maximally representing the six facets posited to exist
within the domain.
Because Varimax rotation of six factors would not necessarily have
produced factors corresponding to the rational facet-scales, an orthogonal Procrustes rotation was used instead. Item factors were
rotated to a maximum fit with a target matrix defined by rational
item assignment. The eight items from each factor with the highest
loading on the intended factor were selected (McCrae & Costa,
1983, p. 255)

to form each facet scale. A score on each of the three broad
dimensions (N, E, and O) could then be generated simply by
summing the scores of the six relevant eight-item facet scales.
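
The logic of rotating item factors toward a rationally defined target can be sketched in a few lines. The example below is an illustrative assumption, not the procedure McCrae and Costa actually ran: it uses Python with NumPy, a made-up set of 12 items assigned to 3 facets, simulated loadings, and the standard singular-value solution to the orthogonal Procrustes problem. All names (`orthogonal_procrustes_rotation`, `target`, `unrotated`) are hypothetical.

```python
import numpy as np

def orthogonal_procrustes_rotation(loadings, target):
    """Orthogonal matrix R that minimizes the Frobenius norm of loadings @ R - target."""
    u, _, vt = np.linalg.svd(loadings.T @ target)
    return u @ vt

rng = np.random.default_rng(0)

# Hypothetical target matrix: 12 items, 3 facets, each item rationally assigned to one facet.
target = np.zeros((12, 3))
for item in range(12):
    target[item, item // 4] = 1.0

# Simulated unrotated loadings: the target pattern in an arbitrary orientation, plus noise.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
noise = 0.05 * rng.normal(size=(12, 3))
unrotated = target @ q.T + noise

rotation = orthogonal_procrustes_rotation(unrotated, target)
rotated = unrotated @ rotation

# For each facet, keep the two items loading highest on the intended factor.
selected = {facet: sorted(np.argsort(rotated[:, facet])[::-1][:2].tolist())
            for facet in range(3)}
print(selected)
```

Because the rotation is chosen to fit the target, the rotated loadings reproduce the rational item assignment about as well as the data permit. This is the sense in which such a solution is fitted to, rather than discovered in, the item set.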

5
The BLSA sample began being recruited in 1958 and consisted of
about 425 men and 130 women. Subjects were volunteers, generally
healthy, predominantly white, with superior educations (almost 25%
held doctoral degrees, 71% were college graduates), and worked in scientific, professional, or managerial positions. In many Costa and
McCrae reports, data are derived from the augmented BLSA (ABLSA)
sample which included about 180 wives and 16 husbands who had not
been BLSA participants. A study of BLSA sample attrition (Douglas &
Arenberg, 1978) indicates that, on the Guilford-Zimmerman Temperament Survey, subjects continuing their participation in the Baltimore
study were higher on Emotional Stability, Objectivity, and Friendliness,
and lower on Ascendance than the subjects who dropped out of the
sample.

Costa and McCrae named their questionnaire the NEO Inventory. They duly noted that their facet scale reliabilities and factor loadings were inflated to an unknown extent by being based
on much the same sample, the ABLSA sample, as that on which
item selection was based.
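
The inflation acknowledged here is an instance of capitalization on chance. The simulation below is a minimal sketch in plain Python under strong simplifying assumptions: the "items" and the "target factor score" are pure noise, the sample sizes are invented, and the item-target correlation stands in for a factor loading. It is not a model of the NEO data. It selects the items that correlate most highly with the target in one sample and then evaluates the same item positions in a fresh sample.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def draw_sample(rng, n_people, n_items):
    """Pure-noise data: a target factor score and candidate items unrelated to it."""
    target = [rng.gauss(0, 1) for _ in range(n_people)]
    items = [[rng.gauss(0, 1) for _ in range(n_people)] for _ in range(n_items)]
    return target, items

def mean_loading(target, items, keep):
    """Average item-target correlation over the selected items."""
    return sum(pearson(items[i], target) for i in keep) / len(keep)

rng = random.Random(1)
n_people, n_items, n_keep = 100, 60, 8   # hypothetical sizes

# Select the 8 items that correlate most highly with the target in sample A.
target_a, items_a = draw_sample(rng, n_people, n_items)
keep = sorted(range(n_items),
              key=lambda i: pearson(items_a[i], target_a),
              reverse=True)[:n_keep]

same_sample = mean_loading(target_a, items_a, keep)    # evaluated where items were chosen
target_b, items_b = draw_sample(rng, n_people, n_items)
fresh_sample = mean_loading(target_b, items_b, keep)   # same item positions, new data

print(round(same_sample, 2), round(fresh_sample, 2))
```

Even though no item is truly related to the target, the selected items look related in the selection sample and lose that appearance in new data. With real items the inflation would be smaller, but its size cannot be known without an independent sample.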
What had Costa and McCrae achieved at this stage in the
early 1980s? They had an inventory, the NEO, carefully tailored
to fit their considered decision to focus on three particular personality domains: the previously well-established and well-researched dimensions called N and E and the less well-studied O
dimension. By their particular choice of underlying facets, their
inventory scales operationalizing N and E may have taken on a
somewhat different psychological coloration or meaning than
these concepts held for Eysenck and for others. There were no
intimations as yet of the FFA. In commenting on their approach
at this time, Costa and McCrae were attractively modest albeit
somewhat ahistoric: "We do not wish to give the impression
that the NEO model exhausts the 'personality sphere.'. . . The
NEO model is provisional, but it seems to us to cover enough
important traits to form a useful starting point" (Costa &
McCrae, 1980, pp. 94-95).

Connecting the NEO Inventory to the Lexical FFA


Costa and McCrae used the NEO Inventory in their research
for several years, but they became increasingly influenced by
early reports on the lexically based five-factor structure of ratings (Digman & Takemoto-Chock, 1981; Goldberg, 1980,
1981, 1982, 1983). They decided to see if this five-factor structure of adjective ratings could be connected with their three-dimensional NEO Inventory which used questionnaire statements. They anticipated that these two somewhat different approaches to personality assessment (via ratings of single-word
descriptors and via ratings of longer inventory statements)
would prove connectable. Further, they recognized that in many
assessment contexts personality inventories were easier, more
conventional, and therefore more attractive to employ than
nine-point ratings of numerous adjectives. Also, they had observed that the lexical FFA, because it had focused primarily
on the linguistic properties per se of adjectives, had remained
"empirically isolated" (McCrae & Costa, 1985, p. 711). Norman and Goldberg had been preoccupied with the interrelations among adjectives, and they had relatively little energy or
resources left for research on the independent, external, behavioral implications or correlates of their five lexical factors. Development of an inventory version of the adjectival Big Five,
therefore, was an alluring goal: Such a procedure could more
readily be used and, with this wider usage, the factors could be
surrounded with a diversity of behavioral, real-world relations.
From their prior study (McCrae & Costa, 1983) of the
ABLSA longitudinal panel in which the NEO Inventory had
been developed, McCrae and Costa already had available scores
for their subjects on the three NEO dimensions. They also had
spouse ratings of these subjects on the NEO dimensions gathered via the NEO Rating Form (a third person version of the
NEO Inventory). To make a connection with the lexical FFA,
McCrae and Costa (1985) required five-factor oriented adjectival data from their ABLSA subjects. Their subjects therefore
were asked to respond to two intermixed adjective lists. One of
these adjective checklists consisted of 40 adjective pairs (8 for
each of the five factors) provided by Goldberg as markers of
his lexical Big Five. The second adjective checklist, added by
McCrae and Costa "in order to increase breadth and reliability," consisted of 40 additional adjective pairs (8 for each of the
five factors). In extending the original set of 40 Goldberg markers, McCrae and Costa chose adjectives to represent Neuroticism and Extraversion as they earlier had expressed these concepts within their own NEO formulation. To represent Agreeableness and Conscientiousness, factors they previously had not
conceptualized, they picked items from long lists of adjectives
provided by Goldberg (1980) as relevant. With respect to the
fifth factor, designated Culture by Tupes and Christal and by
Norman, and Intellect by Goldberg, McCrae and Costa "deliberately tried to include adjectives that reflect aspects of openness to experience, because these were underrepresented in the
original [Goldberg] set" (1985, p. 719 [italics added]).
From data based on their ABLSA subject sample, McCrae
and Costa concluded that the 40 Goldberg adjectives issued six
factors that could not be well reconciled with the factor structure Goldberg would have expected. However, when the extended set of 80 adjective items (the 40 of Goldberg conjoined
with the 40 selected and added by McCrae and Costa) were
factored, five factors were concluded to be sufficient, and the
adjective factor structure obtained was judged to closely resemble the lexical five-factor structure expected except for the fifth
factor. This last factor, by its content within the expanded set of
adjectives, now looked less like the Intellect dimension of Goldberg than like the Openness-to-Experience dimension McCrae
and Costa earlier had decided to imbed within their NEO Inventory. It was in these latter terms that McCrae and Costa
chose to interpret it. However, chary of methodological criticisms of their effort, they acknowledged that "one could argue
that an openness factor emerges in the analysis of the extended
set only because the selection of items had been biased in that
direction" (1985, p. 719).
The sensitivity of McCrae and Costa about their methodology in regard to their lexical "openness" factor seems well warranted. It is unusual empirically and illogical intuitively for the
doubling of the number of variables in a factor analysis to simplify a factor structure and to lessen its dimensionality unless
the added variables have an especially controlling influence on
the structure subsequently achieved. As earlier noted, the results afforded by factor analysis are extremely susceptible to the
way variables are selected and to their redundancy. By invoking
their own personal definitions of underrepresentation and
proper representation of adjective content, Costa and McCrae
could have entailed or fundamentally shaped the findings they
obtained.
From this study, billed as "updating" Norman's work,
McCrae and Costa concluded that
it appears that Norman's (1963) factors do represent an adequate
taxonomy of personality. . . . Three of the major factors from
questionnaire studies of personality (neuroticism, extraversion,
and openness to experience) are easily subsumed by the five-factor
model. In addition, agreeableness and conscientiousness are orthogonal to all three dimensions of the NEO model, and thus appear to add two new dimensions. (1985, p. 718)

It is unclear how McCrae and Costa's conclusion regarding
"an adequate taxonomy of personality" follows from their
study. They were able to link reasonably well the Big Two of
their three NEO factors to two nominally equivalent and predictably related lexical factors (Emotional Stability, reversed,
and Surgency/Extraversion). This connection may not be surprising. For example, Funder and his colleagues (Funder & Colvin, 1988; Funder & Dobroth, 1987) have demonstrated in detail, with commendable methodology, how visible or observable
the broad dimensions that Costa and McCrae call N and E are
to laypersons in everyday life.
Further, Costa and McCrae claimed their third inventory factor could be said to correspond with a (reconceptualized and
redefined by them) third linguistic factor. Because they were
using the previously established linguistic factor structure as
their criterion, they further demonstrated the expectable finding that their NEO Inventory did not assess the remaining two
linguistic factors. The reader should note that, by this logic and
empiricism, McCrae and Costa did not confront the paramount issue of whether the lexical FFA serving as their criterion
is itself fully adequate; they only connected their three-dimensional NEO Inventory to the five-dimensional linguistic factor
structure. In making this connection, they preempted the previous names of the three connected linguistic factors: Culture/
Intellect was changed to Openness to Experience, Emotional
Stability was turned on its end and relabeled Neuroticism, and
Surgency/Extraversion became only Extraversion. The acronym, NEO, thus was preserved.

Adopting the Five-Factor Schema for the NEO Questionnaire

Encouraged by their finding of linkage between the lexical
five-factors and their a priori NEO Inventory dimensions,
McCrae and Costa (1987) decided to adopt fully the five-factor
structure for personality description. This decision required
augmenting their original three-dimensional NEO structure.
Consequently, questionnaire measures of "agreeableness-antagonism" and "conscientiousness-undirectedness" were developed and "grafted" (Goldberg, 1993, p. 31) onto the NEO Inventory. In developing these new measures, McCrae and Costa
again based their analyses on data available from their ABLSA
sample.
Given their decision to develop Agreeableness (A) and Conscientiousness (C) scales to round out the NEO Inventory,
McCrae and Costa's methodological strategy was exemplary in
the way it sought to achieve their desired factor structure, with
scales that had decent internal consistency reliabilities, that
were distinguishable from each other, and that also corresponded with adjective-based A and C measures. The details of
the complicated series of analyses underlying their questionnaire scales to globally (i.e., without facets) measure the broad
constructs of A and C are too involved to report and evaluate
here; McCrae and Costa's procedures (1987, pp. 83-84) can
only be characterized.
McCrae and Costa began by writing inventory items they expected to relate in self-reports to Agreeableness and to Conscientiousness. These items were mailed to subjects who 3 years
earlier had responded to the NEO Inventory. A joint factor analysis of these items then indicated the two new sets of questionnaire items represented two additional factors beyond the initial NEO three factors. To create tentative questionnaire scales
for A and C, McCrae and Costa referenced the 80-item adjective
rating data already available (see earlier discussion). These lexical data, strongly structured to represent the Goldberg lexical
factors as further modified by McCrae and Costa (1985), provided factor scores on A and C. The controlling criterion invoked for selection of A and C inventory items was that a rationally anticipated A or C questionnaire item correlates more
highly with the appropriate adjective factor than with the other
adjective factors. In this way, items were chosen to constitute
preliminary questionnaire measures of the A and C dimensions. These items, interspersed with the original NEO items in
the NEO Rating Form, were administered to peers of the subjects. A joint factor analysis indicated two new questionnaire
factors had indeed been introduced beyond the initial three factors in the NEO. For ratings by peers, inventory scales for A
and C were constructed by a bootstrapping procedure, selecting
items that also correlated well with the scales measuring A and
C stemming from the self-report analyses. In this manner, questionnaire scales were created in both self-report and peer rating
forms to represent the A and C dimensions.
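
The controlling selection criterion described above can be expressed as a simple rule. The sketch below, in plain Python, is a hedged illustration only: the item labels and correlation values are invented, and the comparison of absolute correlations is an assumption, since the source does not say whether signed or absolute values were compared.

```python
def assign_items(item_factor_r, intended):
    """Keep an item only if it correlates more highly (in absolute value, by assumption)
    with its intended adjective factor than with every other adjective factor."""
    kept = {}
    for item, correlations in item_factor_r.items():
        target = intended[item]
        others = [abs(r) for factor, r in correlations.items() if factor != target]
        if abs(correlations[target]) > max(others):
            kept.setdefault(target, []).append(item)
    return kept

# Hypothetical correlations of four candidate items with five adjective factor scores.
item_factor_r = {
    "a1": {"N": -0.10, "E": 0.05, "O": 0.02, "A": 0.45, "C": 0.12},
    "a2": {"N": -0.30, "E": 0.35, "O": 0.00, "A": 0.28, "C": 0.05},
    "c1": {"N": -0.05, "E": 0.02, "O": 0.08, "A": 0.10, "C": 0.50},
    "c2": {"N": -0.12, "E": 0.04, "O": 0.06, "A": 0.33, "C": 0.31},
}
intended = {"a1": "A", "a2": "A", "c1": "C", "c2": "C"}

kept = assign_items(item_factor_r, intended)
print(kept)  # {'A': ['a1'], 'C': ['c1']}
```

Items retained by such a rule are, by construction, the ones that agree with the adjective factors, so later agreement between the resulting scales and those factors is partly built in.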
This sequence of interlocking analyses for assuring correspondence between A and C questionnaire measures and the
previous A and C adjectival measures is unusual and astute.
The new (global) A and C inventory scales were added to the
previous three (faceted) NEO scales and, together, were published as the NEO Personality Inventory (NEO-PI; Costa &
McCrae, 1985).
Subsequently, McCrae and Costa (1987) reported analyses of
the NEO-PI and adjectival data as providing "validation of the
five-factor model of personality across instruments and observers" (p. 81). This paper has frequently been cited by them and
by others as foundational support for their particular five factors. It, therefore, warrants particular analytic attention.
McCrae and Costa report that, whether peer ratings of subjects culminate in five adjective factor scores or culminate in
five NEO scale scores from the enlarged NEO Inventory, the
result is much the same: There is appreciable correspondence
between what peers say about a target person via single-word
descriptors and what they say via longer, similarly oriented sentences. Further, McCrae and Costa find that different peers display an attractive degree of consensuality in the way they rate a
subject on the five factors, whether they describe a subject by
adjectives or by the longer NEO items. Finally, they report that
there is appreciable congruence between a subject's self-evaluation with respect to the five factors and the way peers evaluate
the subject on the five factors, whether the descriptive medium
involves adjectives or the more elaborated NEO items.
The overall structure of these findings provides impressive
evidence for the coherence, and therefore meaningfulness, of
personality assessment. Doubts regarding such coherence and
meaningfulness earlier had been seriously raised (see, e.g., Mischel, 1968; Shweder, 1975) and had received appreciable acceptance. It is, therefore, helpful to have the McCrae and Costa
results demonstrating the reliable distinguishability of individuals. However, these findings are by no means unique nor are
they specifically validating of the NEO five-factor structure.
Similar findings with regard to adjectival and sentence correspondence in meaning, consensuality among raters in their descriptions of subjects, and agreement of self-evaluations with
rater evaluations had been and have been reported frequently
albeit not within the specific terms of the five-factor framework.
A recent integrating article concludes that, as measured by various procedures, "different judges of the same personality, including the person in question, tend to agree with one another to
an impressive degree on a wide variety of personality attributes"
(Funder & Dobroth, 1987, p. 417). Thus, for example, at the
Institute of Personality Assessment and Research in Berkeley
where personality raters have used both adjectival personality
descriptions, sentence-long Q-sort personality descriptions, and
questionnaire scales, appreciable and consistent meaning correspondence between the several approaches has been found
(see, e.g., Block & Petersen, 1955; Gough, McKee, & Yandell,
1955). Any of the thousands of routine mentions in journal
articles of appreciable interrater agreement in evaluations of
subjects on various personality dimensions, using diverse procedures, is an instance of consensuality among raters. For example, Funder and Dobroth report that 87 of the 100 personality variables they evaluated show significant interjudge
agreement (1987, p. 416). Demonstrations of a congruence between self-evaluations and evaluations by peers or other raters
are remarkably abundant.6 To partially illustrate these several
kinds of convergence, consider the studies by Andersen (1984),
Block (1965, 1971), Block and Block (1980), Cheek (1982),
Edwards and Klockars (1981), Funder (1980), Funder and
Colvin (1988), Funder and Dobroth (1987), Hase and Goldberg (1967), Kenrick and Stringfield (1980), Monson, Tanke,
and Lund (1980), Moskowitz (1990), Plomin (1974), and
Woodruffe (1985), among others.
Thus, let us suppose that a quite different set of reliable variables (an alternative personality framework) had been assessed by McCrae and Costa within their ABLSA sample: a
taxonomy consisting, perhaps, of measures of ego control
(overcontrol versus undercontrol), ego resiliency, agency-communion, introspectiveness, energy level, and liberalism-conservatism.7 These particular variables are highly reliable, are of
conceptual interest, and are empirically relatively independent.
There is certainly reason to believe from so many prior studies
of so many different variables in so many different samples that
McCrae and Costa in their own sample would have duplicated
in pattern and strength the kind of findings they invoked as specific "validation" of their own Big Five. Correspondence between single-word descriptors and longer statements would have
been observed in their ABLSA sample, the well-acquainted
peers of the ABLSA subjects would have shown equivalent consensuality in their ratings, and there would have been a comparable degree of congruence in the ABLSA sample between self-evaluations and peer ratings. But these findings would in no way
have provided evidence of the special merits and descriptive inclusiveness of this alternative dimension set. Such results would
have provided only further testimony of the reliability and consensuality that generally underlies seriously attempted personality assessment by human beings of reasonably meaningful
dimensions.
With regard to the specific correspondence between instruments reported by McCrae and Costa, a possible bias may be
noted. Guided by their a priori conceptual orientation, McCrae
and Costa may have shaped their empirical analyses so as to
enhance and even guarantee connection between the adjective-based Big Five factors of Goldberg and their own preferred
questionnaire approach to personality assessment. There was
no large problem in establishing a correspondence between the
two approaches in their respective representations of the long-known and inescapably evident Big Two. And, as already noted,
by surrounding the adjectives defining the lexical Culture factor
with their own specially selected, openness-reflecting adjectives,
the revised adjective factor may have been coopted to become a
better match with the NEO Openness measure earlier enunciated by Costa and McCrae. The questionnaire items subsequently selected to constitute new NEO scales reflecting Agreeableness and Conscientiousness were selected in large measure
on the basis of their correlations with adjective factor scores formulated to mark A and C. So, these NEO scales could be expected afterward to exhibit the correspondence with adjective-based A and C scores on which their construction had been
predicated. Given these recognitions, it may not be surprising
that the adjectival and questionnaire measures proved to be
related.
It is crucial to recognize that establishing a correspondence
between the lexical Big Five and the augmented NEO Inventory
does not speak to the fundamental question of the uniqueness
and the sufficiency of the FFA. McCrae and Costa did not, in
their efforts via questionnaire items or adjective ratings, evaluate comparatively the empirical or predictive adequacy of the
Big Five prescription. Nor did they, from a theoretical position,
contend that the Big Five broad dimensions are uniquely advantageous. Rather, having premised the prior lexical work as a
fundamentally incisive "discovery," they showed that their augmented NEO Inventory provides what may be a more convenient way of evaluating individuals within their somewhat
different version of the five-factor framework. However, if, as
earlier suggested, the lexically based Big Five dimensions are
not represented as a "discovery" but are viewed as a construction, arguably useful, emanating from the factor analysis of
perhaps prestructured sets of variables, then the empirical connections of the augmented NEO Inventory with the lexical Big
Five may not have grand import.
In summary, the particular five-factor orientation advocated
by McCrae and Costa is not required to achieve the pattern of
results they underscore as "validation of the [italics added] five-factor model." Evidence was not offered in their studies for the
primacy, exhaustiveness, and incremental validity of their preferred Big Five over other personality dimension sets of varying
kind and number. We are, therefore, returned to the central concern motivating this essay: Are the five broad, "global" domains
or dimensions adopted and focused upon indeed uniquely comprehensive and sufficiently incisive to provide a satisfactory
framework for personality research and assessment?

6
However, there is also appreciable evidence of consequential discrepancies between self-evaluations and evaluations by others (Block
& Thomas, 1955; Colvin & Block, 1993; John & Robins, 1993, 1994;
Shedler, Mayman, & Manis, 1993).
7
I do not offer this set of variables as definitive or as without problems
similar to those that beset the FFA.

Recent Developments in the Five-Factor Questionnaire Approach

Following their "validation" study, an avalanche of diverse
publications has come forward from Costa and McCrae using
and advancing the NEO five-factor questionnaire as the criterion framework for personality assessment. Three lines of effort
may be briefly characterized and evaluated.
In one series of studies, Costa and McCrae programmatically
sought to reinterpret other questionnaires and assessment approaches in terms of their five-factor NEO Inventory. This work
has been instructive but also less than satisfying. Some methodological problems surround their use of joint factor analysis
rather than interbattery analysis (Browne, 1979, 1980) or redundancy analysis (Lambert, Wildt, & Durand, 1988; van den
Wollenberg, 1977). More troubling, the McCrae and Costa interpretative approach was consistently asymmetrical. It has
long been recognized that different assessment procedures overlap considerably in their coverage of the personality domain.
However, the emphasis in Costa and McCrae's studies was primarily on evaluating the extent to which the NEO five factors
can be "recovered" from other assessment procedures, for example, Jackson's Personality Research Form (PRF; Costa &
McCrae, 1988) or Block's California Q-Set (CQS; McCrae,
Costa, & Busch, 1986). They did not evaluate the extent to
which the dimensions identifiable within other assessment procedures, such as the PRF or the CQS, can be "recovered" from
the NEO Inventory. This latter question, when asked and answered, will often provide a different perspective on the sufficiencies of the NEO Inventory and the FFA. Thus, in their analysis, Costa and McCrae acknowledge but do not emphasize that
the PRF contains two reliable, psychologically meaningful factors defined only by the PRF and not recoverable from their
NEO (1988, p. 263).
The CQS (Block, 1961/1978) was expressly intended to permit the comprehensive psychodynamic description by psychologists and psychiatrists of the personality of any kind of individual. It has proven empirically useful in a wide variety of research endeavors and, in addition, has been used over the years
for the conceptual or prototypical description of many personality and psychiatric constructs. In their analyses of the CQS,
McCrae et al. (1986) extracted five factors, rotated these factors
preferentially "after considering item content and external
correlates" (p. 438), and concluded that "the agreement between the two systems [the NEO five-factor approach and the
personality domain covered by the CQS] is remarkable in its
detail" (p. 443).
Given the aspiration of comprehensiveness for the CQS language and its manifold configurational possibilities, it may not
be surprising that the CQS items, separately and in conjunction
or configuration, are able to represent or can be made to represent the five broad NEO dimensions; the procedure would have
been disappointing otherwise. Additionally, it should be noted
that the McCrae et al. (1986) factor analysis indicated 32 eigenvalues were greater than unity, testifying to the presence of
many singlets or doublets (nonredundant variables or factors)
within the CQS. Certainly, some of these may be expressing only
unreliable variance. However, in my own many factor analyses
of this Q set, more than 20 reliable factors, many of them small
because they are not redundantly represented, are regularly
found. In the McCrae et al. analysis, if the scree test they report
is accepted as a sufficient guide, then the presence of seven or
nine factors was indicated. Their subsequent eight-factor CQS
solution resulted in three new, meaningful CQS factors "along
with versions [italics added]" of the five factors they had first
extracted and chosen to emphasize. The three additional factors
within the extended McCrae et al. analysis are said to relate
to introspectiveness, narcissism, and forcefulness of behavior. I
suggest that many, indeed, most personality psychologists would
want to be able to invoke these latter attributes when offering
descriptions of character.
Thus, both the PRF and the CQS seem to reflect aspects of
personality not encompassed by the foreclosed NEO five-factor
approach. Further, the efforts by McCrae and Costa to interpret
other assessment approaches in terms of the NEO Inventory do
not illustrate the impartial scientific competition required if the
relative merits of alternative approaches are to be evaluated.
In another line of endeavor, Costa and McCrae have proposed
that the Big Five as represented by their NEO questionnaire
provides a useful and advancing way of thinking about psychiatric disorders (see, e.g., Costa & McCrae, 1990, 1992b, 1992d;
McCrae & Costa, 1986). In reaction, some conceptual and
practical problems surrounding the NEO Inventory in clinical
contexts have been brought forward by Waller and Ben-Porath
(1987), Ben-Porath and Waller (1992a, 1992b), and Tellegen
(1993), among others. It merits mention that studies by Clark
(1990), Harkness (1992), and Livesley, Jackson, and
Schroeder (1989, 1992) of the constructs underlying personality disorders find 23, 39, and 15 components, respectively, to be
essential. The content of these several sets of symptom clusters
suggests that the five-factor questionnaire approach may be impoverished and not additionally helpful in psychiatric contexts.
Clark (1993) has reported that the NEO Five-Factor Inventory
(NEO-FFI) provided no additional contribution to the questionnaire prediction of any of 11 psychiatric disorders.
Perhaps the most important recent development is the publication of the further revised and extended version of the NEO
questionnaire, now called the Revised NEO Personality Inventory (NEO-PI-R) (Costa & McCrae, 1992c). This revision
presents faceted versions of the Agreeableness and Conscientiousness scales. The revision was prompted because psychometric scrutiny of the NEO-PI had revealed that the N and E
factors as they were previously faceted did not cohere as had
been posited (McCrae & Costa, 1989a). In this latest version of
the NEO Inventory, there were several sequences of sophisticated psychometric analyses and measure honing, including use
of the Procrustean-like "validimax" approach (McCrae &
Costa, 1989a), to develop NEO factor and facet scales that
would better and better fit the posited structure of NEO
interrelations.
As with earlier NEO versions, and differing from other personality inventories (e.g., Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989; Jackson, 1984; Tellegen, 1982, in
press), the NEO-PI-R scale construction approach deliberately made little provision for the detection and control of response proclivities that tended to distort the meaning of the
scores obtained. In the opinion of Costa and McCrae, such concerns have been overblown. They view the NEO-PI-R as "an
instrument that relies on candor and cooperation between
[inventory] administrator and respondent" (in press-a, p. 20).
For this new NEO edition, Costa and McCrae created new A
and C scales each with, again, six-facet scales because "more
than six scales would tax the user's ability to learn and remember the facets" (Costa, McCrae, & Dye, 1991, p. 888). The six
facets they designated as underlying Agreeableness were Trust,
Straightforwardness, Altruism, Compliance, Modesty, and Tender-mindedness. The six facets named as underlying Conscientiousness were Competence, Order, Dutifulness, Achievement
Striving, Self-Discipline, and Deliberation. In addition, 10
items were changed from the earlier NEO-PI in an effort to
further strengthen the facet and domain correlational structure
desired for the NEO.
The inventory is normed on 500 men and 500 women. To
achieve a reasonably diverse normative sample, Costa and
McCrae selected less well-educated subjects from among the
generally well-educated individuals in their ABSLA sample and
in another, larger sample of employees from a major national
organization (1992c, p. 43). What are the current properties of
this revised product after its prolonged, recursive development?
Some uneasiness immediately arises regarding the NEO-PI-R, given its interpretational surround. The scales representing
the five domains were repeatedly and strongly represented earlier as orthogonal conceptually and as nearly so empirically.
Therefore, in the NEO-PI-R, such near orthogonality is again
to be expected. However, the NEO-PI-R N and C scales now
correlate -.53 and the E and O scales now correlate .40, both
of these figures being uncorrected for attenuation (Costa &
McCrae, 1992c, p. 100). Reasonable allowance for the unreliability of these measures would raise these figures to about -.62
and .47, respectively. These are unusually high values, corrected
or uncorrected, and should be bothersome, even unacceptable,
to the orthogonality-emphasizing NEO five-factor position.
Overall, only 4 of the 10 domain intercorrelations are below
.20, indicating an appreciable failure of the previously claimed
orthogonality in this supposed five-factor measure. These NEO
scale intercorrelations may not be unusual. The NEO-FFI, a
shorter, 60-item version of the NEO-PI-R (but said to account
for 85% of the criterial variance of the longer inventory; Costa
& McCrae, 1992c, p. 54) has been used within my ongoing longitudinal study (see, e.g., Block & Block, 1980). In our sample
of young women, N and C correlate, uncorrected, -.61; in our
sample of young men, N and C correlate -.49. There are additional appreciable correlations among the five factors. Eysenck
(1992) also has noted the existence of bothersome high
correlations between supposedly orthogonal domain scales of
the NEO.
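The disattenuation arithmetic in the preceding paragraph can be sketched with the classical correction for attenuation, which divides an observed correlation by the square root of the product of the two scale reliabilities. The reliability of .85 used below is an assumption, chosen only because it approximately reproduces the corrected magnitudes quoted in the text; the article does not state which reliabilities were assumed, and the function name is hypothetical. Correlations are entered in absolute magnitude.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation: r / sqrt(r_xx * r_yy)."""
    return r_observed / math.sqrt(reliability_x * reliability_y)


# ASSUMPTION: both scales have a reliability of .85 (not given in the article).
ASSUMED_RELIABILITY = 0.85

# Magnitudes of the observed domain intercorrelations cited in the text.
observed = [("N with C", 0.53), ("E with O", 0.40)]

for label, r in observed:
    corrected = disattenuate(r, ASSUMED_RELIABILITY, ASSUMED_RELIABILITY)
    print(f"{label}: observed {r:.2f}, corrected {corrected:.2f}")
```

Under this assumed reliability the corrected magnitudes come out near .62 and .47; a higher assumed reliability would give a smaller correction.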
Furthermore, within the NEO-PI-R, many of the facets continue to misbehave; their correlations with other facets indicate
that the model of factors and facets earlier postulated by Costa
and McCrae and iteratively pursued by their scale construction
procedures in many ways does not obtain. Parker, Bagby, and
Summerfeldt (1993) have applied confirmatory maximum
likelihood factor analysis with LISREL (Jöreskog & Sörbom,
1989) to the normative sample data provided by Costa and
McCrae in their NEO-PI-R manual and tested for the existence of the claimed five-factor, 30-facet model. They find that
there is "poor fit between the obtained factor structure and the
hypothesized dimensions corresponding to the 5-factor model"
(p. 463). Rather than displaying the posited hierarchical arrangement, there are many lateral connections between facets
from different factors. Two recent studies by Comrey's group
(Caprara et al., in press; Hahn & Comrey, 1993) provide further and impressive evidence of the insufficiency of the NEO five
factors when referenced to a well-developed and broad gauge
personality inventory (Comrey, 1993). Eight or nine factors are
found to be required in the inventory domain, and the NEO five
factors are not well identified when placed in this larger context.
After many years of research and measure development, there
still appears to be marked deviation empirically from the posited orthogonal five-factor structure.
As already noted, a serious failure of orthogonality also besets the most recently offered lexical markers for the Big Five.
Moreover, the pattern of troublesome NEO intercorrelations is
impressively different from the pattern of troublesome intercorrelations characterizing the lexical factors (Goldberg, 1992, Table 5). It is difficult to accommodate these findings to the repeated assertions of an empirically driven convergence on five
distinct and agreed-on dimensions as underlying and sufficient
for the scientific assessment of personality (see, e.g., Costa &
McCrae, 1992a; Goldberg, 1993). After so many years, such
marked structural disagreement between the lexical and questionnaire five-factor correlation matrices is bewildering.
A recent chapter by Costa and McCrae (in press-a) describing the history and evolution of their NEO approach is illuminating about these discrepancies. Several consistent guiding
principles are acknowledged therein:
1. Costa and McCrae "emphasize the theoretical and conceptual aspects of the [factor analytic] solution rather than statistical and mathematical criteria. [They] have allowed [their]
model, rather than eigenvalues or scree tests, to determine the
number of factors" (p. 10).
2. Costa and McCrae "prefer orthogonally-rotated principal
components [i.e., simple factor structures] to more elaborate
and sometimes more difficult to understand oblique and common factor solutions" (p. 10).
3. Costa and McCrae "have occasionally used theoretical
considerations to rotate factors when there are reasons to think
that varimax-rotations are not optimal" (p. 11).
4. However, in the work leading up to the NEO-PI-R, Costa
and McCrae
faced a dilemma: If we wished to preserve simple structure in a
five-dimensional instrument, we would need to abandon some of
our facet scales; if we wished to retain the facets, we would need to
abandon simple structure. We opted for the second solution. Simple structure provides guidance in exploratory analyses. . .; other
things being equal, simple structure should be preferred. But other
things are not equal: There are excellent reasons to adopt a five-factor model and also excellent reasons to measure traits that are
related to more than one of the factors. . . . This position may
seem unorthodox, but it has solid precedent in personality research. In particular, many writers have argued that a circular [i.e.,
circumplex] arrangement is necessary to describe the structure of
. . . traits.. . . We view both simple and circumplex structures as
useful models . . . but we do not wish to impose either structure
on the data. . . . Given appropriate facets, the question of their
structure is a purely empirical matter. (pp. 14-15)
This recently expressed NEO rationale, although clarifying,
also raises profound concerns given prior representations and
understandings. It would appear that a reason why the three,
subsequently augmented to five, faceted factors of Costa and
McCrae differ importantly from other similar approaches is
that empirical analyses were importantly shaped by a priori
conceptual commitments. Rather than being empirically inevitable and data-driven, the "fiveness" of the factor solutions presented, their earlier orthogonality, and the factor rotations settled upon all appear instead to have been conditioned by preexisting theoretical preferences. Thoughtful though Costa and
McCrae have been, and acknowledging their prodigious
amount of useful empirical work, it must be recognized that
their NEO model has been imposed on the hyperspace of personality attributes. It now appears that stubborn realities devolving from the nature of personality are in substantial disagreement with the posited NEO constructs and facets. It is also
not surprising that the Costa and McCrae five-factor structure
is different from other factor structures subject to alternative
influences.
The decision by Costa and McCrae to now abandon their previously emphasized orthogonal factor structure in order to
maintain inviolate the sets of facets initially postulated has
many retroactive and prospective consequences. Many previous
reactions to the NEO approach created on the basis of earlier
understandings may have to be recontextualized; future claims
for the NEO will necessarily be received in a different interpretational setting, one that emphasizes disputable theoretical
preferences rather than apodictic empirical "discoveries."
Certainly, the NEO approach, by virtue of the many correlates with which it is surrounded, cannot be dismissed as entirely wrong. Instead, the question must be asked: Is the NEO
approach sufficiently right as the framework for personality assessment? In placing their empiricism only within their constrained NEO Big Five mold, have Costa and McCrae demonstrated that their NEO dimensions are uniquely or sufficiently
penetrating? I think not.
On conceptual grounds no less worthy than those now being
offered as underlying the NEO approach, it can be argued that
the NEO dimensions happened to achieve their present level of
empirical connections because they conflate or are correlated
imperfectly with constructs and other dimensions providing
deeper and more comprehensive understandings (e.g., earlier
and/or alternative conceptual and operational versions of the
Big Two and a number of other personality dimensions that also
have demonstrated widely ranging empirical connections). Regrettably, the present essay cannot provide an extended version
of this argument, but for illustrations of the kind of case that can
be made, see Ben-Porath and Waller (1992a, 1992b), Eysenck
(1992), Hough (1992), and Tellegen (1993).
Support for any taxonomy must be based on a number of
considerations. There must be empirical and conceptual competition between alternative dimensional offerings to see which,
predictively, best carves nature at its joints or which, theoretically, provides "surface complexity arising from deep simplicity" (a line due to Murray Gell-Mann). To what extent do other,
importantly different, previous or new, correlated or alternative
constructs also or better provide a basis for interpretation?
Without such scientific contests, fairly waged and evaluated, repeated "appeals for the adoption of the five-factor model in personality research and assessment" (McCrae & Costa, 1987, p.
81) must be viewed as premature advocacy.
An Appraisal of the Current Status of the FFA
This article has been preoccupied with issues of method,
logic, and interpretation of the research on which the FFA has
been based. This strategy of argument was adopted because
FFA advocates have so strongly emphasized the inexorably empirical nature of the Big Five. The "fiveness" of the factors has
been delivered as a "discovery" of a fundamental phenomenon
of nature: "It is probably not meaningful or profitable to ask
why there happens to be just five such dimensions" (McCrae &
John, 1992, p. 194). Hence, the analytical line taken in this
paper has been to show how the method, logic, and interpretation of findings by FFA boosters may not be incontrovertibly
compelling and that a psychological construction or imputation
rather than a fact of nature may be involved.
Nevertheless, it might be maintained that whatever the uncertainties or criticisms surrounding the evolution, and the
shaping, of the FFA over the years, the contemporary Big Five
represents a clarifying and advancing framework that can provide needed integration for the anarchic field of personality assessment. However we may have arrived at this juncture in personality assessment, the supreme question therefore becomes,
conjointly, a conceptual and empirical one: Given that we are
here, how "good" are the five factors for the scientific tasks
ahead? Will adopting the FFA help or hinder the understanding
of personality? Although arguably based upon various forms of
atheoretical prestructuring, can the FFA now justify itself by its
theoretical implicativeness and empirical incisiveness?
It is certainly (and trivially) true that by accepting the five-factor framework, the Tower of Babel that has afflicted personality psychology would be quieted. However, as preceding pages
have argued, the proffered replacement monolith, the Big Five,
may not permit us to reach higher or high enough into the psychological heavens to warrant acceptance of this conformity. To
recapitulate, the reasons are primarily conceptual but also empirical and involve a sequence of considerations.
There is first the issue of which five-factor framework? This
question has been frequently raised (e.g., Briggs, 1992; John,
1990) and continues to nag. Despite protestations, there are important differences among FFA advocates in regard to the psychological meanings they ascribe to their respective sets of five
factors. These disagreements, usually passed over lightly by five-factor advocates, cannot readily be reconciled. For example,
Costa and McCrae have positioned warmth as a facet of Extraversion while Goldberg places it under Agreeableness. Impulsivity is usually a facet of Neuroticism for Costa and McCrae,
but it is sometimes considered by them (1993) to be a cardinal
attribute of low Conscientiousness, whereas Goldberg places
impulsivity within Extraversion. Costa and McCrae differ from
Goldberg even more in their respective views of the fifth factor.
Trapnell and Wiggins (1990), although adopting a version of
the Big Five, prefer to retain their long-held conceptualizations
of Dominance and Nurturance factors rather than accept the
Costa and McCrae further rotation and relabeling of these dimensions as Extraversion and Agreeableness. Hogan and Hogan
(1992), unable to accept the broadness of the Extraversion factor, have split it into two large dimensions, Sociability and Ambition, and have six, not five factors, named differently and with
psychologically different connotations. Tellegen and Waller (in
press) have revamped the meanings of the five factors and have
added two more. Krug and Johns (1986), Noller, Law, and
Comrey (1987), and Boyle (1989), in analyses emphasizing the
Cattell Sixteen Personality Factor Questionnaire (Cattell et al.,
1970), all come up with "five robust factors," more or less, but
not the lexical or NEO Big Five factors. Johnson and Ostendorf
(1993), who accept that fiveness has been demonstrated to be
warranted, are troubled by the "inherent promiscuity" (i.e.,
myriad lateral lexical connections) of personality terms and
offer a differently labeled five factors (e.g., Agreeableness is renamed "Softness," Conscientiousness is renamed "Constraint,"
etc.). Caprara, Barbaranelli, Borgogni, and Perugini (1993)
have developed an Italian five-factor questionnaire wherein Extraversion has become Energy and Neuroticism has become
Control of Impulse and Emotions, importantly different labels.
Zuckerman (in press; Zuckerman, Kuhlman, Joireman, Teta,
& Kraft, 1993) provides "an alternative five factor model for
personality," but there are fundamental differences between his
broad dimensions and those more customarily called the Big
Five. How are the differences among these factor-oriented personality psychologists to be stilled, to escape intramural cacophony? By assertions? By the differential frequency of publication
of these differing views? By a convening of the interested parties
and the subsequent issuance of a conceptual treaty? By the adventitiously influenced direction the assessment community
happens to take?⁸
Furthermore, the acknowledged breadth of each of the five
factors poses a serious problem for the FFA that has not been
confronted by its advocates. There seems to be a continuing
indefiniteness or inconsistency or oscillation in the way five-factor advocates represent the Big Five or say the factors are to be
understood. We are told by FFA advocates that the factors are
the five basic, pervasively important dimensions of personality
"having an explanatory power that specific traits lack" (Costa
& McCrae, in press-b, p. 30). When questions are raised regarding the descriptive coarseness and psychological confoundings of the Big Five, we are told by FFA advocates that the five
factors only represent "domains," defined as "sphere[s] of concern or function" (Costa & McCrae, in press-b, p. 5) and are
nothing more than abstract, broadly inclusive, global categories, behavioral themes, initial rough distinctions, wide bandwidth ways of schematizing personality qualities. "Proponents
of the five-factor model have never intended to reduce the rich
tapestry of personality to a mere five traits" (Goldberg, 1993, p.
27). I submit that FFA advocates should not have it both ways;
the Big Five cannot simultaneously serve as both basic, broadly
useful factors and initial rough distinctions.
More importantly, it has been primarily in terms of the five
global factors, without further articulation, that psychological
interpretation has most often been delivered. However, it is now
being acknowledged (e.g., Costa & McCrae, in press-b) that, for
an adequate understanding of personality, it is necessary to
think and measure more specifically than at this global level if
behaviors and their mediating variables are to be sufficiently,
incisively represented. "A better acquaintance with the individual comes from a consideration of the facets. . . . The information facets offer is more specific, more easily tied to the client's
problems in living" (Costa & McCrae, in press-b, pp. 27, 29-30). As Wiggins (1992) observes, increasingly, "subordinate
qualities appear to be generally regarded as more scientifically
desirable than the superordinate qualities" (p. 529). McAdams
(1992) also notes that
the Big Five are in no way akin to the basic 'elements' of personality. They are not pure elemental types, basic ingredients, as it
were, of personality. Instead, they exist as polyglot generic arenas
with fuzzy, overlapping boundaries. Adequate prediction and description in personality studies will usually require a judicious and
informed selection of many different constructs within the various
arenas. (pp. 339-340)

It follows, then, that if we are to have a personality descriptive
system that is scientifically compelling, the global domains require immediate articulation; initial rough distinctions must be
promptly abandoned for more considered and finer differentiations; wide bandwidth/low fidelity categories quickly must be
narrowed to move toward the higher fidelity available through
more discriminating classifications.
However, as soon as it is agreed that it is necessary and fruitful
to function at a level more specific and refined than is afforded
by the five grand, global, summating factors, the FFA per se becomes largely irrelevant. If the global domains require articulation into more differentiating dimensions, it is because the summary labels and measures of the five broad, amalgamated factors are obscure, confounded, and perhaps even unacceptable
in meaning. Happily, it logically and psychometrically follows
that, if reliabilities are equal, whatever can be accomplished
predictively by the set of five global domain measures is no more
than and will generally be less, even much less, than the predictiveness available from using the set of "subordinate," more
specific dimensions. That is, the Big Five domains are nothing
more than linear combinations of the more specific dimensions
said to lie within them, and therefore there is no need to measure them grossly or at all. The commingling of these specific
dimensions under one of the Big Five summary labels entails
appreciable and irretrievable loss of important, even crucial information. Indeed, as Schönemann (1990) has algebraically observed, a factor score based on several variables can correlate
zero with a criterion while each of the several variables is highly
correlated with the criterion. What then is to be gained scientifically by sticking with the "rough" and confounded, uncertainly understandable five factor labels and measures?
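The algebraic point about information lost in a summary score can be illustrated with a small numerical sketch. The data below are hypothetical toy values constructed for the purpose, not drawn from any study; a unit-weighted sum of two facets stands in for a factor score (an assumption, since real factor scores use estimated weights), and the helper name `pearson` is likewise illustrative.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL toy scores for four persons on two facets of one "domain".
facet_1 = [2, 0, 0, -2]
facet_2 = [2, 0, -2, 0]

# The criterion depends on the facets in opposite directions.
criterion = [a - b for a, b in zip(facet_1, facet_2)]

# Unit-weighted domain score, standing in for a factor score.
domain_score = [a + b for a, b in zip(facet_1, facet_2)]

print("facet 1 with facet 2:  ", pearson(facet_1, facet_2))
print("facet 1 with criterion:", pearson(facet_1, criterion))
print("facet 2 with criterion:", pearson(facet_2, criterion))
print("domain with criterion: ", pearson(domain_score, criterion))
```

In this constructed case the two facets correlate positively with each other and each correlates with the criterion (in opposite directions), yet their sum correlates zero with it. Because the criterion is an exact linear function of the two facets, using them separately predicts it perfectly, whereas the summed domain score predicts nothing.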
If it is agreed that, for the purposes of personality assessment,
the Big Five dimensions are too global to be scientifically useful,
we must then face the problem of deciding just what is "the
optimal level of measurement" (Briggs, 1989, 1992) and, at
that appropriate level, what specific variables or facets are important. Here, a conceptual donnybrook starts, because a variety of psychologists have a variety of views as to the desirable
level of specificity and the specific subdimensions it is fruitful to
use. There is no ordained set of variables or "facets," if we move
away from "spheres of concern" or "global domains"; there are
only preferred lists, and these are highly debatable. Sadly for the
progress of personality assessment, once the five-factor "domains" are delved into and subdimensions or facets are offered,
all kinds of complications, untidiness, and disagreements ensue. Angleitner, Ostendorf, and John (1990) have remarked
that "convergence on a set of specific 'middle level' categories
or facets is nowhere in sight" (p. 115). John (1990) and Briggs
(1992) make the same point. Costa and McCrae also acknowledge that, with regard to subdimensions or facets, there is appreciable tumult as to how many, which ones, and where to assign them (in press-b).

⁸ Moreover, there are many psychologists who abjure reliance on the
factor analytic method as the royal road to truth and who offer alternative conceptualizations, empirically consequential, of what is important to study in the field of personality (e.g., Block & Block, 1980; Epstein, 1973, 1990; Eysenck, 1970; Gray, 1981; McAdams, 1992; among
others).

In short, there is appreciable disagreement as to which Big
Five "rough distinctions" to use, but there is agreement that,
for scientific purposes, it is necessary to abandon such "roughness" for a much more articulated approach to personality description. Unfortunately, however, at this more differentiating
level of analysis, there again is little agreement on how the nature of personality is to be "carved." This situation remains unresolved while the five-factor approach, per se, continues to be
enthusiastically promulgated by its adherents. Beyond headscratching, what is the personality psychologist to do, given this
state of affairs?
Some Suggestions
Offerings at this juncture are not and cannot be transforming
of the field of personality assessment; rather, my suggestions are
mild, obvious, and entail scientific sobriety coupled with slow,
hard work aiming to educe order from the present disparate,
jumbled empiricism characterizing personality psychology.
A first suggestion is that the bandwagon created by repeated
advocacy of the FFA simply be halted while the contents of the
wagon are examined and its direction is considered. The study
of personality has been revitalized in the quarter century since
the criticisms offered by Mischel (1968). Much has been
learned and much is underway. The field requires time for wider
reflection on its conceptual and empirical requirements and on
its past attainments and deficiencies. Various ways of construing
and studying personality abound and should abound; I suggest
that there is no need yet for nor is there a special advantage to
be gained by restricting ourselves to the special and arguable
outlook posited by the FFA.
A second suggestion is that it be more generally recognized
that the extraordinarily useful method of factor analysis by itself cannot be empowered to make paramount and controlling
decisions regarding the concepts to be used in the field of personality assessment. As Meehl (1992) has remarked, "No statistical procedure should be treated as a mechanical truth generator" (p. 152). We shall also have to use other and diverse
psychological resources involving close conceptualizing, perceptive observation, and unconfounding empiricism.
A third suggestion is that personality psychology use a conceptual language suitable for "experts" rather than one provided by and comfortable for "novices." The scientific concepts
of personality psychology need not be constrained to the terms
used by laypersons to characterize themselves and their acquaintances. "It is generally acknowledged that the Big Five
[lexical] factors are too broad to be cohesive" (Hofstee, de
Raad, & Goldberg, 1992, p. 147). It is perhaps precisely because of their origins in rapidly offered lay perceptions that the
five factors lack cohesiveness.
A fourth suggestion is that personality psychologists not limit
their thinking and research by considering only what is or will
be convenient to index via simple self-report or layperson-report measures. Certainly, straightforward self-report or layperson-report measures can be immensely useful, because most
subjects voluntarily subjecting themselves to psychological inquiry or reporting as peers of subjects are willing or able to be
self-revealing or honest in their appraisals. But often enough to
pose serious theoretical and empirical problems, lay subjects
are not willing or are not able to validly present themselves or to
report on others. So, to study certain crucial phenomena lying
within the domain of personality psychology, personality psychologists often will need to turn, or return, to more complicated and complex ways of studying persons: for example, behavioral observations, psychophysiological measures, individual differences in various standardized situational contexts,
the garnering of life facts about the persons studied, truly
intimate interviews, and the longitudinal study of personality
development.
Also, and still, self-report and peer-report approaches should
continue to be used but in more sophisticated, more subtle, and
therefore more penetrating ways. Personality assessment has
progressed far enough to rise above patent and blatant item
content and to permit clever identification of interpretation-misleading responses of which the individual is aware and interpretation-misleading responses of a subject who simply is
lacking in self-insight.
My last two suggestions are not so easily conveyed or implemented. I urge the field of personality psychology to resolutely
confront its severe, even crippling, terminological problems.
Many of the difficulties that beset assessment derive from the
hasty, hazy, lazy use of language. Psychologists have tended to
be sloppy with words. We need to become more intimate with
their meanings, denotatively and connotatively, because summary labels and shorthand chosen quickly will control, often
in unrecognized ways, the way we think subsequently. In part,
this problem is inevitable, but we can do much better. Language
provides a medium for expressing, refining, and calibrating recognitions; we should use this medium with greater sensitivity
and responsibility.
Almost a century ago, it was remarked by a Professor Aikins
that the science of psychology frequently manifests what he
called "the jingle fallacy," a circumstance wherein two things
that are quite different may be labeled equivalently, and thus
the unwary may consider them interchangeable: "unthinking
acceptance of verbal equality as proof of real equality"
(Thorndike, 1904, p. 11). Within the field of personality psychology, the Big Five factors, as they have evolved and become
differently understood while remaining similarly labeled by
different Big Fivers, represent striking instances of the jingle
fallacy.
Somewhat later, Kelley (1927) added the "jangle fallacy" by
which he meant that two things in psychology carrying different
names or labels were often the same: "two separate words or
expressions covering in fact the same basic situation, but sounding different, as though they were in truth different" (p. 64).
Within personality psychology, the jangle fallacy also abounds
and is exemplified when the NEO five global dimensions are
put forward without recognition of the earlier or alternative,
differently named personality constructs to which they are intrinsically linked or which they blend and confound.
The jingle and jangle fallacies, by no means limited to personality psychology, waste scientific time. The one suggests
agreements that do not exist; the other involves useless redundancies, sometimes because of the absence of historical knowledge, that lead to the "reinvent[ion of] constructs under new
labels" (Holroyd & Coyne, 1987, p. 367). Together, these errors
work to prevent the recognition of correspondences that could
help build cumulative knowledge.
Because of these prevailing difficulties, serious theorizing and
the construction of an improved set of scientifically useful and
consensually understood personality dimensions should be
taken up actively, sustainedly, and systematically by more personologists. It is not enough to say that the FFA is insufficiently
justified and fails to make crucial distinctions; the further responsibility devolves to provide or reintroduce integrating recognitions that are empirically consequential. This necessary
task remains a daunting one, not to be assumed lightly. We
might fruitfully recommence by deconstructing the myriad, often disparate, meanings that have been ascribed to the Big Five.
Alternative conceptual and empirical analyses of these and related dimensions are required, followed by constructive disputation and efforts at concept calibration. Such close conceptual
reflection, further informed by focused empiricism, will spiral
our field forward.
My final, most ambitious suggestion is that, in the
conceptual/empirical arguments to be made for specific dimensions, of whatever number, these constructs be situated
within a coherent, intraindividual theoretical framework. As
earlier noted, the study of personality psychology has been
overly preoccupied with the study of interindividual differences;
indeed, the field is often defined simply as "the study of
[inter]individual differences." And, of differences between individuals, there is no end. An infinite number of sets of descriptive variables can be formulated, each being preferred by its
progenitor and contestable otherwise. What is needed is a basis
for choosing among these alternative sets. Efforts to study or
conceptualize the dynamics underlying intraindividual functioning might well move the study of personality toward such a
basis.
The integral connection between a theory of the individual
on the one hand and a theory of individual differences on the
other is crucial to recognize (Cronbach, 1957; Lewin, 1946;
Underwood, 1975). Lewin has expressed well the relation between these two approaches: The "problems of individual
differences, of age levels, of personality, of specific situations,
and of general laws are closely interwoven. A law is expressed in
an equation which relates certain variables. Individual differences have to be conceived of as various specific values which
these variables have in a particular case. In other words, general
laws and individual differences are merely two aspects of one
problem; they are mutually dependent on each other and the
study of one cannot proceed without the study of the other"
(Lewin, 1946, p. 794).
From this recognition, it follows that once the parameters
that define the personality system⁹ of a generic individual have
been conceptually posited and empirically identified, these parameters become the essential, nonarbitrary, overarching variables or concepts for a personality of interindividual differences.
The differences between individuals would then be understandable as due to the different specific values these parameters take
in different individual personality systems. It would be a sweet
intellectual accomplishment if the theoretical constructs required for dynamically understanding within-individual functioning could also be used for understanding the differences between individuals. Personality psychologists might aspire to this
goal. In that effort, our science, in its theoretical reach and
empirical grasp, may better realize its deep aspirations: to provide a bio-social theory of intraindividual development, the moment-to-moment psychodynamic functioning within the individual, the life themes coherently reflected in the adaptive
changes of individuals over time and context, and the personality differences among individuals.

Postscript
This article has sought to present an alternative view on what
may be the crucial issue for personality psychology in this decade. It is my wish that the concerns expressed and the questions
raised be viewed within the recognition that the scientific understanding of personality functioning and personality development is intellectually fascinating and of fundamental importance for general psychology. In the last generation, we have
come far in illuminating the coherence of personality. It is my
hope that the present disquisition may encourage constructive
dialogue leading to further conceptual/methodological/
substantive recognitions and the next stage of the scientific
study of personality.
⁹ In a generative, dynamic system theory of the individual, the laws
or production rules or "if . . . then" relations on which experience and
behavior are predicated would be specified. In a system theory, there are
system goals (functions to be maximized or minimized or "satisficed") and parameters influencing system functioning. "For systems
that change through time, explanation takes the form of laws acting
on the current state of the system to produce a new state – endlessly"
(Simon, 1992, p. 160).

References
Allport, G. W. (1961). Pattern and growth in personality. New York:
Holt, Rinehart & Winston.
Allport, G. W., & Odbert, H. S. (1936). Trait names: A psycho-lexical
study. Psychological Monographs, 47(1, Whole No. 211).
Andersen, S. (1984). Self-knowledge and social inference: II. The diagnosticity of cognitive/affective and behavioral data. Journal of Personality and Social Psychology, 46, 294-307.
Angleitner, A., Ostendorf, F., & John, O. P. (1990). Towards a taxonomy of personality descriptors in German: A psycho-lexical study.
European Journal of Personality, 4, 89-118.
Armstrong, J. S. (1967). Derivation of theory by means of factor analysis or Tom Swift and his electric factor analysis machine. American
Statistician, 21, 17-21.

Backteman, G., & Magnusson, D. (1981). Longitudinal stability of personality characteristics. Journal of Personality, 49, 148-160.
Baehr, M. E. (1952). A factorial study of temperament. Psychometrika,
17, 107-126.
Beck, L., McCauley, C., Segal, M., & Hershey, L. (1988). Individual
differences in prototypicality judgments about trait categories.
Journal of Personality and Social Psychology, 55, 286-292.
Ben-Porath, Y. S., & Waller, N. G. (1992a). "Normal" personality inventories in clinical assessment: General requirements and the potential for using the NEO Personality Inventory. Psychological Assessment, 4, 14-19.
Ben-Porath, Y. S., & Waller, N. G. (1992b). Five big issues in clinical
personality assessment: A rejoinder to Costa and McCrae. Psychological Assessment, 4, 23-25.
Bertalanffy, L. von. (1952). Problems of life: An evaluation of modern
biological thought. New York: Wiley.
Block, J. (1955). The difference between Q and R. Psychological Review, 62, 356-358.
Block, J. (1961). The Q-sort method in personality assessment and psychological research. Springfield, IL: Charles C Thomas (Reprinted
1978, Palo Alto, CA: Consulting Psychologists Press).
Block, J. (1965). The challenge of response sets: Unconfounding meaning, acquiescence, and social desirability in the MMPI. New York:
Appleton-Century-Crofts.
Block, J. (1971). Lives through time. Berkeley, CA: Bancroft Books.
Block, J. (1977). Advancing the psychology of personality: Paradigmatic shift or improving the quality of research? In D. Magnusson
& N. S. Endler (Eds.), Personality at the crossroads (pp. 37-63).
Hillsdale, NJ: Erlbaum.
Block, J. (1981). Some enduring and consequential structures of personality. In A. I. Rabin, J. Aronoff, A. M. Barclay, & R. A. Zucker
(Eds.), Further explorations in personality (pp. 27-43). New York:
Wiley-Interscience.
Block, J. (1993). Studying personality the long way. In D. Funder, R.
Parke, C. Tomlinson-Keasey, & K. Widaman (Eds.), Studying lives
through time: Approaches to personality and development (pp. 9-41).
Washington, DC: American Psychological Association.
Block, J. H., & Block, J. (1980). The role of ego-control and ego-resiliency in the organization of behavior. In W. A. Collins (Ed.), The
Minnesota Symposia on Child Psychology: Vol. 13 (pp. 39-101).
Hillsdale, NJ: Erlbaum.
Block, J., & Petersen, P. (1955). Some personality correlates of confidence, caution, and speed in a decision situation. Journal of Abnormal and Social Psychology, 51, 34-41.
Block, J., & Thomas, H. (1955). Is satisfaction with self a measure of
adjustment? Journal of Abnormal and Social Psychology, 51, 254-259.
Block, J., Weiss, D. S., & Thorne, A. (1979). How relevant is a semantic
similarity interpretation of personality ratings? Journal of Personality and Social Psychology, 37, 1055-1074.
Bolanovich, D. J. (1946). Statistical analysis of an industrial rating
chart. Journal of Applied Psychology, 30, 23-31.
Borgatta, E. F. (1964). The structure of personality characteristics. Behavioral Science, 9, 8-17.
Boyle, G. J. (1989). Re-examination of the major personality-type factors in the Cattell, Comrey, and Eysenck scales: Were the factor solutions by Noller et al. optimal? Personality and Individual Differences,
10, 1289-1299.
Briggs, S. R. (1989). The optimal level of measurement for personality
constructs. In D. M. Buss & N. Cantor (Eds.), Personality psychology: Recent trends and emerging directions (pp. 246-260). New York:
Springer.
Briggs, S. R. (1992). Assessing the five-factor model of personality description. Journal of Personality, 60, 253-293.

Brogden, H. E. (1944). A multiple factor analysis of the character trait
intercorrelations published by Sister Mary McDonough. Journal of
Educational Psychology, 35, 397-410.
Browne, M. W. (1979). The maximum-likelihood solution in interbattery analysis. British Journal of Mathematical and Statistical Psychology, 32, 75-86.
Browne, M. W. (1980). Factor analysis of multiple batteries by the maximum likelihood method. British Journal of Mathematical and Statistical Psychology, 33, 184-199.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (1989). Minnesota Multiphasic Personality Inventory-2
(MMPI-2): Manual for administration and scoring. Minneapolis:
University of Minnesota Press.
Caprara, G. V., Barbaranelli, C., Borgogni, L., & Perugini, M. (1993).
The "Big Five Questionnaire": A new questionnaire to assess the five
factor model. Personality and Individual Differences, 15, 281-288.
Caprara, G. V., Barbaranelli, C., & Comrey, A. L. (in press). Factor
analysis of the NEO-PI Inventory and the Comrey Personality Scales
in an Italian sample. Personality and Individual Differences.
Cattell, R. B. (1943a). The description of personality: Basic traits resolved into clusters. Journal of Abnormal and Social Psychology, 38,
476-506.
Cattell, R. B. (1943b). The description of personality: I. Foundations
of trait measurement. Psychological Review, 50, 559-594.
Cattell, R. B. (1945). The description of personality: Principles [sic]
findings in a factor analysis. American Journal of Psychology, 58, 69-90.
Cattell, R. B. (1946). Description and measurement of personality. Yonkers, NY: World.
Cattell, R. B. (1947). Confirmation and clarification of primary personality factors. Psychometrika, 12, 197-220.
Cattell, R. B. (1948). The primary personality factors in women compared with those in men. British Journal of Psychology, 1, 114-130.
Cattell, R. B. (1973). Personality and mood by questionnaire. San Francisco: Jossey-Bass.
Cattell, R. B., Eber, H. W., & Tatsuoka, M. M. (1970). The handbook
for the Sixteen Personality Factor (16PF) Questionnaire. Champaign, IL: Institute for Personality and Ability Testing.
Cheek, J. A. (1982). Aggregation, moderator variables, and the validity
of personality tests: A peer rating study. Journal of Personality and
Social Psychology, 43, 1254-1269.
Clark, L. A. (1990). Toward a consensual set of symptom clusters for
assessment of personality disorder. In J. N. Butcher & C. D. Spielberger (Eds.), Advances in personality assessment (Vol. 8, pp. 243-266). Hillsdale, NJ: Erlbaum.
Clark, L. A. (1993). Personality disorder diagnosis: Limitations of the
five-factor model. Psychological Inquiry, 4, 100-104.
Cliff, N. (1983). Some cautions concerning the application of causal
modeling methods. Multivariate Behavioral Research, 18, 115-126.
Coan, R. W. (1972). Measurable components of openness to experience. Journal of Consulting and Clinical Psychology, 39, 346.
Colvin, C. R., & Block, J. (1993). Personality implications of overly
positive self-evaluations: They do not imply mental health. Unpublished manuscript, University of California, Berkeley.
Comrey, A. L. (1978). Common methodological problems in factor
analytic studies. Journal of Consulting and Clinical Psychology, 46,
648-659.
Comrey, A. L. (1993). Revised manual and handbook of interpretation
for the Comrey Personality Scales. San Diego: EDITS.
Conley, J. J. (1984). Longitudinal consistency of adult personality: Self-reported psychological characteristics across 45 years. Journal of Personality and Social Psychology, 47, 1325-1333.
Corulla, W. J. (1987). A psychometric investigation of the Eysenck Personality Questionnaire (revised) and its relationship to the I.7 Impulsiveness Questionnaire. Personality and Individual Differences, 8,
651-658.
Costa, P. T., Jr., & McCrae, R. R. (1976). Age differences in personality
structure: A cluster analytic approach. Journal of Gerontology, 31,
564-570.
Costa, P. T., Jr., & McCrae, R. R. (1978). Objective personality assessment. In M. Storandt, I. C. Siegler, & M. F. Elias (Eds.), The clinical
psychology of aging (pp. 119-143). New York: Plenum.
Costa, P. T., Jr., & McCrae, R. R. (1980). Still stable after all these years:
Personality as a key to some issues in adulthood and old age. In P. B.
Baltes & O. G. Brim (Eds.), Life span development and behavior
(Vol. 3, pp. 65-102). New York: Academic Press.
Costa, P. T., Jr., & McCrae, R. R. (1985). The NEO Personality Inventory manual. Odessa, FL: Psychological Assessment Resources.
Costa, P. T., Jr., & McCrae, R. R. (1988). From catalog to classification:
Murray's needs and the five-factor model. Journal of Personality and
Social Psychology, 54, 258-265.
Costa, P. T., Jr., & McCrae, R. R. (1990). Personality disorders and the
five-factor model of personality. Journal of Personality Disorders, 4,
362-371.
Costa, P. T., Jr., & McCrae, R. R. (1992a). Four ways five factors are
basic. Personality and Individual Differences, 13, 653-665.
Costa, P. T., Jr., & McCrae, R. R. (1992b). Normal personality assessment in clinical practice: The NEO Personality Inventory. Psychological Assessment, 4, 5-13.
Costa, P. T., Jr., & McCrae, R. R. (1992c). Revised NEO Personality
Inventory (NEO-PI-R) and NEO Five-Factor Inventory (NEO-FFI)
professional manual. Odessa, FL: Psychological Assessment
Resources.
Costa, P. T., Jr., & McCrae, R. R. (1992d). The five-factor model of
personality and its relevance to personality disorders. Journal of Personality Disorders, 6, 343-359.
Costa, P. T., Jr., & McCrae, R. R. (1993). Ego development and trait
models of personality. Psychological Inquiry, 4, 20-23.
Costa, P. T., Jr., & McCrae, R. R. (in press-a). The Revised NEO Personality Inventory (NEO-PI-R). In S. R. Briggs & J. Cheek (Eds.),
Personality measures (Vol. 1). Greenwich, CT: JAI Press.
Costa, P. T., Jr., & McCrae, R. R. (in press-b). Domains and facets:
Hierarchical personality assessment using the Revised NEO Personality Inventory. Journal of Personality Assessment.
Costa, P. T., Jr., McCrae, R. R., & Dye, D. A. (1991). Facet scales for
Agreeableness and Conscientiousness: A revision of the NEO Personality Inventory. Personality and Individual Differences, 12, 887-898.
Cronbach, L. J. (1957). The two disciplines of scientific psychology.
American Psychologist, 12, 671-684.
Cudeck, R., & Henly, S. J. (1991). Model selection in covariance structure analyses and the "problem" of sample size: A clarification. Psychological Bulletin, 109, 512-519.
Digman, J. M. (1990). Personality structure: Emergence of the five-factor model. Annual Review of Psychology, 41, 417-440.
Digman, J. M., & Takemoto-Chock, N. K. (1981). Factors in the natural language of personality: Re-analysis, comparison, and interpretation of six major studies. Multivariate Behavioral Research, 16, 149-170.
Douglas, K., & Arenberg, D. (1978). Age changes, cohort differences,
and cultural change on the Guilford-Zimmerman Temperament Survey. Journal of Gerontology, 33, 737-747.
Edwards, A. L., & Klockars, A. J. (1981). Significant others and self-evaluation: Relationships between perceived and actual evaluations.
Personality and Social Psychology Bulletin, 7, 244-251.
Epstein, S. (1973). The self-concept revisited, or a theory of a theory.
American Psychologist, 28, 404-416.
Epstein, S. (1990). Cognitive-experiential self-theory. In L. Pervin
(Ed.), Handbook of personality theory and research (pp. 165-192).
New York: Guilford Press.
Everett, J. E. (1983). Factor comparability as a means of determining
the number of factors and their rotation. Multivariate Behavioral Research, 18, 197-218.
Eysenck, H. J. (1967). The biological basis of personality. Springfield,
IL: Charles C Thomas.
Eysenck, H. J. (1970). The structure of personality (3rd ed.). London:
Methuen.
Eysenck, H. J. (1992). Four ways five factors are not basic. Personality
and Individual Differences, 13, 667-673.
Eysenck, H. J., & Eysenck, S. B. G. (1969). Personality structure and
measurement. London: Routledge & Kegan Paul.
Fiske, D. W. (1949). Consistency of the factorial structures of personality ratings from different sources. Journal of Abnormal and Social
Psychology, 44, 329-344.
French, J. W. (1953). The description of personality measurements in
terms of rotated factors. Princeton, NJ: Educational Testing Service.
Funder, D. C. (1980). The "trait" of ascribing traits: Individual differences in the tendency to trait ascription. Journal of Research in Personality, 14, 376-385.
Funder, D. C., & Colvin, C. R. (1988). Friends and strangers: Acquaintanceship, agreement, and the accuracy of personality judgment.
Journal of Personality and Social Psychology, 55, 149-158.
Funder, D. C., & Dobroth, K. M. (1987). Differences between traits:
Properties associated with interjudge agreement. Journal of Personality and Social Psychology, 52, 409-418.
Glisky, M. L., Tataryn, D. J., Tobias, B. A., Kihlstrom, J. F., & McConkey, K. M. (1991). Absorption, openness to experience, and hypnotizability. Journal of Personality and Social Psychology, 60, 263-272.
Goldberg, L. R. (1971). A historical survey of personality scales and
inventories. In P. McReynolds (Ed.), Advances in psychological assessment (Vol. 2, pp. 293-336). Palo Alto, CA: Science & Behavior
Books.
Goldberg, L. R. (1977, September). Language and personality: Developing a taxonomy of trait descriptive terms. Invited address to the
Division of Evaluation and Measurement at the Annual Meeting of
the American Psychological Association, San Francisco.
Goldberg, L. R. (1980, May). Some ruminations about the structure
of individual differences: Developing a common lexicon for the major
characteristics of personality. Symposium presentation at the meeting of the Western Psychological Association, Honolulu.
Goldberg, L. R. (1981). Language and individual differences: The
search for universals in personality lexicons. In L. Wheeler (Ed.), Review of personality and social psychology (Vol. 2, pp. 141-165). Beverly Hills, CA: Sage.
Goldberg, L. R. (1982). From Ace to Zombie: Some explorations in the
language of personality. In C. D. Spielberger & J. N. Butcher (Eds.),
Advances in personality assessment (Vol. 1, pp. 203-234). Hillsdale,
NJ: Erlbaum.
Goldberg, L. R. (1983, June). The magical number five, plus or minus
two: Some conjectures on the dimensionality of personality descriptions. Paper presented at a research seminar, Gerontology Research
Center, Baltimore, MD.
Goldberg, L. R. (1990). An alternative "Description of Personality":
The Big-Five factor structure. Journal of Personality and Social Psychology, 59, 1216-1229.
Goldberg, L. R. (1992). The development of marker variables for the
Big-Five factor structure. Psychological Assessment, 4, 26-42.
Goldberg, L. R. (1993). The structure of phenotypic personality traits.
American Psychologist, 48, 26-34.
Goldberg, L. R., & Kilkowski, J. M. (1985). The prediction of semantic
consistency in self-descriptions: Characteristics of persons and of
terms that affect the consistency of response to synonym and antonym pairs. Journal of Personality and Social Psychology, 48, 82-98.
Gough, H. G., McKee, M. G., & Yandell, R. J. (1955). Adjective check
list analyses of a number of selected psychometric and assessment
variables (USAF OERL Tech. Memorandum No. 55-10). Maxwell
Air Force Base, AL: U.S. Air Force.
Gray, J. A. (1981). A critique of Eysenck's theory of personality. In
H. J. Eysenck (Ed.), A model for personality (pp. 246-276). New
York: Springer-Verlag.
Graziano, W. G. (1992). Unknown words in assessing the five factor
model of personality: Provincial and lethargic in Texas? Unpublished
manuscript, Texas A & M University, College Station.
Guilford, J. P. (1975). Factors and factors of personality. Psychological
Bulletin, 82, 802-814.
Guilford, J. P. (1977). Will the real factor of extraversion-introversion
please stand up? A reply to Eysenck. Psychological Bulletin, 84, 412-416.
Hahn, R., & Comrey, A. L. (1993). Factor analysis of the NEO-PI and
the Comrey Personality Scales. Manuscript submitted for
publication.
Hampson, S. E., John, O. P., & Goldberg, L. R. (1986). Category
breadth and hierarchical structure in personality: Studies of asymmetries in judgments of trait implications. Journal of Personality and
Social Psychology, 51, 37-54.
Harkness, A. R. (1992). Fundamental topics in the personality disorders: Candidate trait dimensions from lower regions of the hierarchy.
Psychological Assessment, 4, 251-259.
Harman, H. H. (1967). Modern factor analysis. Chicago: University of
Chicago Press.
Hase, H. D., & Goldberg, L. R. (1967). Comparative validity of different strategies of constructing personality inventory scales. Psychological Bulletin, 67, 231-248.
Hofstee, W. K. B., de Raad, B., & Goldberg, L. R. (1992). Integration
of the Big Five and circumplex approaches to trait structure. Journal
of Personality and Social Psychology, 63, 146-163.
Hogan, R., & Hogan, J. (1992). Hogan Personality Inventory Manual.
Tulsa, OK: Hogan Assessment Systems.
Holroyd, K. A., & Coyne, J. (1987). Personality and health in the
1980s: Psychosomatic medicine revisited? Journal of Personality, 55,
359-375.
Horn, J. L. (1967). On subjectivity in factor analysis. Educational and
Psychological Measurement, 27, 811-820.
Hough, L. (1992). The "Big Five" personality variables – Construct
confusion: Description versus prediction. Human Performance, 5,
139-155.
Jackson, D. N. (1984). Personality Research Form manual (3rd ed.).
Port Huron, MI: Research Psychologists Press.
John, O. P. (1990). The "Big Five" factor taxonomy: Dimensions of
personality in the natural language and in questionnaires. In L. Pervin (Ed.), Handbook of personality: Theory and research (pp. 66-100). New York: Guilford Press.
John, O. P., Angleitner, A., & Ostendorf, F. (1988). The lexical approach to personality: A historical review of trait taxonomic research.
European Journal of Personality, 2, 171-203.
John, O. P., Goldberg, L. R., & Angleitner, A. (1984). Better than the
alphabet: Taxonomies of personality-descriptive terms in English,
Dutch, and German. In H. Bonarius, G. van Heck, & N. Smid (Eds.),
Personality psychology in Europe: Theoretical and empirical developments (pp. 83-100). Lisse, The Netherlands: Swets and Zeitlinger.
John, O. P., Hampson, S. E., & Goldberg, L. R. (1991). The basic level
in personality-trait hierarchies: Studies of trait use and accessibility
in different contexts. Journal of Personality and Social Psychology,
60, 348-361.
John, O. P., & Robins, R. W. (1993). Determinants of interjudge
agreement on personality traits: The Big Five domains, observability,
evaluativeness, and the unique perspective of the self. Journal of Personality, 61, 521-551.
John, O. P., & Robins, R. W. (1994). Accuracy and bias in self-perception: Individual differences in self-enhancement and the role of narcissism. Journal of Personality and Social Psychology, 66, 206-219.
Johnson, J. A., & Ostendorf, F. (1993). Clarification of the five-factor
model with the Abridged Big Five Dimensional Circumplex. Journal
of Personality and Social Psychology, 65, 563-576.
Jöreskog, K. G., & Sörbom, D. (1989). LISREL 7.16: A guide to the
program and application (2nd ed.). Chicago: SPSS Inc.
Kelley, T. L. (1927). Interpretation of educational measurements. Yonkers, NY: World.
Kenrick, D. T., & Stringfield, D. O. (1980). Personality traits and the
eye of the beholder: Crossing some traditional philosophical boundaries in the search for consistency in all the people. Psychological Review, 87, 88-104.
Kilkowski, J. M. (1976). An empirical investigation of the consistency
of self-ratings. Unpublished doctoral dissertation, University of Minnesota, Minneapolis.
Krug, S. E., & Johns, E. F. (1986). A large scale cross-validation of
second-order personality structure defined by the 16PF. Psychological
Reports, 59, 683-693.
Lambert, Z. V., Wildt, A. R., & Durand, R. M. (1988). Redundancy
analysis: An alternative to canonical correlation and multivariate
multiple regression in exploring interset associations. Psychological
Bulletin, 104, 282-289.
Leon, G. R., Gillum, B., Gillum, R., & Gouze, M. (1979). Personality
stability and change over a 30-year period – middle age to old age.
Journal of Consulting and Clinical Psychology, 47, 517-524.
Lewin, K. (1946). Behavior and development as a function of the total
situation. In L. Carmichael (Ed.), Manual of Child Psychology (pp.
918-970). New York: Wiley.
Livesley, W. J., & Bromley, D. B. (1973). Person perception in childhood
and adolescence. London: Wiley.
Livesley, W. J., Jackson, D. N., & Schroeder, M. L. (1989). A study of
the factorial structure of personality pathology. Journal of Personality
Disorders, 3, 292-306.
Livesley, W. J., Jackson, D. N., & Schroeder, M. L. (1992). Factorial
structure of traits delineating personality disorders in clinical and
general population samples. Journal of Abnormal Psychology, 101,
432-440.
Loevinger, J. (1948). The technic of homogeneous tests compared with
some aspects of "scale analysis" and factor analysis. Psychological
Bulletin, 45, 507-529.
Loevinger, J. (1976). Ego development: Conceptions and theories. San
Francisco: Jossey-Bass.
Lovell, C. (1945). A study of the factor structure of thirteen personality
variables. Educational and Psychological Measurement, 5, 335-350.
Lykken, D. T. (1971). Multiple factor analysis and personality research.
Journal of Experimental Research in Personality, 5, 161-170.
MacCallum, R. C., & Tucker, L. R. (1991). Representing sources of
error in the common-factor model: Implications for theory and practice. Psychological Bulletin, 109, 502-511.
McAdams, D. P. (1992). The five-factor model in personality: A critical
appraisal. Journal of Personality, 60, 329-361.
McCrae, R. R. (1989). Why I advocate the five-factor model: Joint factor analyses of the NEO-PI with other instruments. In D. M. Buss &
N. Cantor (Eds.), Personality psychology: Recent trends and emerging
directions (pp. 237-245). New York: Springer-Verlag.
McCrae, R. R. (1990). Traits and trait names: How well is Openness
represented in natural languages? European Journal of Personality, 4,
119-129.

McCrae, R. R. (1992). Editor's introduction to Tupes and Christal.
Journal of Personality, 60, 217-219.
McCrae, R. R., & Costa, P. T., Jr. (1983). Joint factors in self-reports
and ratings: Neuroticism, extraversion, and openness to experience.
Personality and Individual Differences, 4, 245-255.
McCrae, R. R., & Costa, P. T., Jr. (1985). Updating Norman's "Adequate taxonomy": Intelligence and personality dimensions in natural
language and in questionnaires. Journal of Personality and Social
Psychology, 49, 710-721.
McCrae, R. R., & Costa, P. T., Jr. (1986). Clinical assessment can benefit from recent advances in personality psychology. American Psychologist, 41, 1001-1003.
McCrae, R. R., & Costa, P. T., Jr. (1987). Validation of the five-factor
model across instruments and observers. Journal of Personality and
Social Psychology, 52, 81-90.
McCrae, R. R., & Costa, P. T., Jr. (1989a). Rotation to maximize the
construct validity of factors in the NEO Personality Inventory. Multivariate Behavioral Research, 24, 107-124.
McCrae, R. R., & Costa, P. T., Jr. (1989b). The structure of interpersonal traits: Wiggins' circumplex and the five-factor model. Journal
of Personality and Social Psychology, 56, 586-595.
McCrae, R. R., & Costa, P. T., Jr. (1990). Personality in adulthood. New
York: Guilford Press.
McCrae, R. R., Costa, P. T., Jr., & Busch, C. M. (1986). Evaluating
comprehensiveness in personality systems: The California Q-Set and
the five-factor model. Journal of Personality, 54, 430-446.
McCrae, R. R., & John, O. P. (1992). An introduction to the Five Factor Model and its applications. Journal of Personality, 60, 175-215.
McDonough, M. (1929). The empirical study of character. Washington, DC: The Catholic University of America.
Meehl, P. E. (1992). Factors and taxa, traits and types, differences of
degree and differences in kind. Journal of Personality, 60, 117-174.
Meehl, P. E., Lykken, D. T., Schofield, W., & Tellegen, A. (1971). Recaptured-item technique (RIT): A method for reducing somewhat
the subjective element in factor naming. Journal of Experimental Research in Personality, 5, 171-190.
Mershon, B., & Gorsuch, R. L. (1988). Number of factors in the personality sphere: Does increase in factors increase predictability of
real-life criteria? Journal of Personality and Social Psychology, 55,
675-680.
Miller, G. A. (1956). The magical number seven, plus or minus two:
Some limits on our capacity for processing information. Psychological Review, 63, 81-97.
Mischel, W. (1968). Personality and assessment. New York: Wiley.
Monson, T. C., Tanke, E. D., & Lund, J. (1980). Determinants of social
perception in a naturalistic setting. Journal of Research in Personality, 14, 104-120.
Moskowitz, D. S. (1990). Convergence of self-reports and independent
observers: Dominance and friendliness. Journal of Personality and
Social Psychology, 58, 1096-1106.
Moss, H. A., & Susman, E. J. (1980). Longitudinal study of personality
development. In O. G. Brim & J. Kagan (Eds.), Constancy and change
in human development (pp. 530-595). Cambridge, MA: Harvard
University Press.
Mroczek, D. K. (1992). Personality and psychopathology in older men:
The five factor model and the MMPI-2. Unpublished doctoral dissertation, Boston University, Boston.
Noller, P., Law, H., & Comrey, A. L. (1987). Cattell, Comrey, and
Eysenck personality factors compared: More evidence for the five robust factors? Journal of Personality and Social Psychology, 53, 775-782.
Norman, W. T. (1963). Toward an adequate taxonomy of personality
attributes: Replicated factor structure in peer nomination personality
ratings. Journal of Abnormal and Social Psychology, 66, 574-583.

Norman, W. T. (1967). 2800 personality trait descriptors: Normative
operating characteristics for a university population. Ann Arbor: University of Michigan, Department of Psychological Sciences.
Ostendorf, F. (1990). Sprache und Persönlichkeitsstruktur: Zur Validität des Fünf-Faktoren-Modells der Persönlichkeit [Language and
personality structure: On the validity of the five-factor model of
personality]. Regensburg, Germany: S. Roderer Verlag.
Parker, J. D. A., Bagby, R. M., & Summerfeldt, L. J. (1993). Confirmatory factor analysis of the Revised NEO Personality Inventory.
Personality and Individual Differences, 15, 463-466.
Peabody, D., & Goldberg, L. R. (1989). Some determinants of factor
structures from personality-trait descriptors. Journal of Personality
and Social Psychology, 57, 552-567.
Peevers, B. H., & Secord, P. F. (1973). Developmental changes in attribution of descriptive concepts to persons. Journal of Personality and
Social Psychology, 27, 120-128.
Pitkanen-Pulkkinen, L. (1981). Long-term studies of the characteristics of aggressive and non-aggressive juveniles. In P. F. Brain & D.
Benton (Eds.), Multidisciplinary approaches to aggression research
(pp. 225-243). Amsterdam: Elsevier.
Plomin, R. A. (1974). A temperament theory of personality development: Parent-child interactions. Unpublished doctoral dissertation,
University of Texas, Austin.
Reichenbach, H. (1951). The rise of scientific philosophy. Berkeley and
Los Angeles: University of California Press.
Reyburn, H. A., & Taylor, J. G. (1939). Some factors of personality: A
further analysis of some of Webb's data. British Journal of Psychology, 30, 151-165.
Schönemann, P. H. (1990). Facts, fictions, and common sense about
factors and components. Multivariate Behavioral Research, 25, 47-51.
Shapiro, D. (1965). Neurotic styles. New York: Basic Books.
Shedler, J., Mayman, M., & Manis, M. (1993). The illusion of mental
health. American Psychologist, 48, 1117-1131.
Sheffield, A. D. (Ed.). (1959). Soule's dictionary of English synonyms.
Boston: Little, Brown.
Shock, N. W., Greulich, R. C., Andres, R., Arenberg, D., Costa, P. T.,
Jr., Lakatta, E. G., & Tobin, J. D. (1984). Normal human aging: The
Baltimore Longitudinal Study of Aging (NIH Publication No. 84-2450). Bethesda, MD: National Institutes of Health.
Shweder, R. A. (1975). How relevant is an individual difference theory
of personality? Journal of Personality, 43, 455-484.
Simon, H. A. (1992). What is an "explanation" of behavior? Psychological Science, 3, 150-161.
Smith, G. M. (1967). Usefulness of peer ratings of personality in educational research. Educational and Psychological Measurement, 27,
967-984.
Tanaka, J. W., & Taylor, M. (1991). Object categories and expertise: Is
the basic level in the eye of the beholder? Cognitive Psychology, 23,
457-482.
Tellegen, A. (1982). A brief manual for the Differential Personality
Questionnaire. Unpublished manuscript, University of Minnesota,
Minneapolis.
Tellegen, A. (1985). Structures of mood and personality and their relevance to assessing anxiety, with an emphasis on self-report. In A. H.
Tuma & J. D. Maser (Eds.), Anxiety and the anxiety disorders (pp.
681-706). Hillsdale, NJ: Erlbaum.
Tellegen, A. (1993). Folk concepts and psychological concepts of personality and personality disorder. Psychological Inquiry, 4, 122-130.
Tellegen, A. (in press). Multidimensional Personality Questionnaire.
Minneapolis: University of Minnesota Press.
Tellegen, A., & Atkinson, G. (1974). Openness to absorbing and self-altering experiences ("absorption"), a trait related to hypnotic susceptibility. Journal of Abnormal Psychology, 83, 268-277.

Tellegen, A., & Waller, N. G. (in press). Exploring personality through
test construction: Development of the Multidimensional Personality
Questionnaire. In S. R. Briggs & J. M. Cheek (Eds.), Personality measures: Development and evaluation (Vol. 1). Greenwich, CT: JAI
Press.
Thorndike, E. L. (1904). An introduction to the theory of mental and
social measurements. New York: Teachers College, Columbia
University.
Thurstone, L. L. (1951). The dimensions of temperament: Analysis of
Guilford's thirteen personality scores. Psychometrika, 16, 11-20.
Trapnell, P. D., & Wiggins, J. S. (1990). Extension of the Interpersonal
Adjective Scales to include the Big Five dimensions of personality.
Journal of Personality and Social Psychology, 59, 781-790.
Tryon, R. C., & Bailey, D. E. (1970). Cluster analysis. New York:
McGraw-Hill.
Tschechtelin, S. M. A. (1944). Factor analysis of children's personality
rating scale. Journal of Psychology, 18, 197-200.
Tupes, E. C. (1957). Relationships between behavior trait ratings by
peers and later officer performance of USAF Officer Candidate School
graduates (USAF PTRC Tech. Note No. 57-125). Lackland Air
Force Base, TX: U.S. Air Force.
Tupes, E. C., & Christal, R. E. (1992). Recurrent personality factors
based on trait ratings. Journal of Personality, 60, 225-251.
(Reprinted from USAF ASD Tech. Rep. No. 61-97, 1961, Lackland
Air Force Base, TX: U.S. Air Force)
Underwood, B. J. (1975). Individual differences as a crucible in theory
construction. Psychological Review, 66, 297-333.
van den Wollenberg, A. L. (1977). Redundancy analysis: An alternative
for canonical analysis. Psychometrika, 42, 207-219.
Waller, N. G., & Ben-Porath, Y. S. (1987). Is it time for clinical psychology to embrace the five-factor model of personality? American Psychologist, 42, 887-889.
Watson, D., & Tellegen, A. (1985). Toward a consensual structure of
mood. Psychological Bulletin, 98, 219-235.
Webb, E. (1915). Character and intelligence. British Journal of Psychology, 1, (Suppl. 3).
Wiggins, J. S. (1968). Personality structure. Annual Review of Psychology, 19, 293-350.
Wiggins, J. S. (1982). Circumplex models of interpersonal behavior in
clinical psychology. In P. C. Kendall & J. N. Butcher (Eds.), Handbook of research methods in clinical psychology (pp. 183-221). New
York: Wiley.
Wiggins, J. S. (1992). Have model, will travel. Journal of Personality,
60, 527-532.
Woodruff, D. S. (1983). The role of memory in personality continuity:
A 25 year follow-up. Experimental Aging Research, 9, 31-34.
Woodruff, D. S., & Birren, J. E. (1972). Age changes and cohort differences in personality. Developmental Psychology, 6, 252-259.
Woodruffe, C. (1985). Consensual validation of personality traits: Additional evidence and individual differences. Journal of Personality
and Social Psychology, 48, 1240-1252.
Zuckerman, M. (in press). An alternative five factor model for personality. In C. F. Halverson, G. A. Kohnstamm, & R. P. Martin (Eds.),
The developing structure of temperament and personality from infancy to adulthood. Hillsdale, NJ: Erlbaum.
Zuckerman, M., Kuhlman, D. M., Joireman, J., Teta, P., & Kraft, M.
(1993). A comparison of three structural models for personality: The
Big Three, the Big Five, and the Alternative Five. Journal of Personality and Social Psychology, 65, 757-768.
Received December 30, 1993
Revision received April 27, 1994
Accepted April 29, 1994
