Cluster Analysis

British Journal of Health Psychology (2005), 10, 329-358 © 2005 The British Psychological Society www.bpsjournals.co.uk
The use and reporting of cluster analysis in health psychology: A review

Jane Clatworthy*, Deanna Buick, Matthew Hankins, John Weinman and Robert Horne
¹Centre for Health Care Research, University of Brighton, UK; ²Macmillan Cancer Relief, London, UK; ³Department of Psychology, Institute of Psychiatry, King's College, London, UK
Purpose. Cluster analysis is a collection of relatively simple descriptive statistical techniques with potential value in health psychology, addressing both theoretical and practical problems. There are many methods of cluster analysis from which to choose, with no clear guidelines to aid researchers. In the absence of guidelines it is likely that methods already reported by published researchers will be adopted, and so clear reporting of statistical methodology, while always important, is particularly crucial with cluster analysis. The aim of this review is to describe and evaluate the reporting of cluster analysis in health psychology publications.
Methods. Electronic searches of 18 health psychology journals identified 59 articles using cluster analysis published between 1984 and 2002. Articles were submitted to systematic evaluation against published criteria for the reporting of cluster analysis.
Results. Just 27% of the papers reviewed met all five criteria, although 61% met at least four. Details of the similarity measure and the computer program used were most frequently omitted. Furthermore, while researchers usually reported the procedures employed to determine the number of clusters and to validate the clusters, these procedures were often lacking in rigour, and were reported in insufficient detail for replication.
Conclusions. The reporting of cluster analysis was found to be generally unsatisfactory, with many studies failing to provide enough information to allow replication or the evaluation of the quality of the research. Clear guidelines for conducting and reporting cluster analyses in health psychology are needed.
* Correspondence should be addressed to Jane Clatworthy, Centre for Health Care Research, University of Brighton, 2-3 Turnpike Piece, Falmer Site, Brighton BN1 9PH, UK (e-mail: j.clatworthy@bton.ac.uk). DOI: 10.1348/135910705X25697
It is human nature to classify. Classification is an essential means of organizing information, without which we would be unable to make sense of the world around us. Hence, classification systems are fundamental to all sciences: chemists use the periodic table; zoologists, the classification of animals; botanists, the classification of plants; astronomers, the classification of stars; geologists, rocks and so on. Empirically-based classifications are known as taxonomies and are most commonly identified through the use of cluster analysis (Bailey, 1994). There are many different types of cluster analysis but most share the same basic principle: to group entities on the basis of their similarity with respect to selected variables, so that members of the resulting groups are as similar as possible to others within their group (high within-group homogeneity) and as different as possible to those in other groups (low between-group homogeneity).
Cluster analysis has the potential to address both theoretical and practical problems in health psychology. From a theoretical perspective, cluster analysis is a means of bridging the gap between nomothetic and idiographic approaches. This is elegantly summarized by Zubin (1938), one of the first psychologists to use clustering techniques:
. . . on the one hand, we have a group of abstract formulations about the relationship between variables emanating from research laboratories. On the other hand, we have individual case histories written by the practitioner. And there is no way of closing the gap between the two approaches to the problems of the individual. Perhaps the research workers might be likened to those who cannot see the trees for the forest, and the individual case workers to those who cannot see the forest for the trees. Some middle ground must be found. It may be that the middle ground will partake of the nature of a grouping of the trees into families, such as oak, birch, etc. That is, the research workers will come to grips with the single individual in the form of a group of like-minded or like-structured individuals. The case worker will be able to transcend his limited interest in the specific individual when the latter's relationship to his like-structured sub-group is established. (p. 509)
From a practical perspective, cluster analysis can perform a number of useful functions. The process of taking a heterogeneous sample of entities (usually people in health psychology) and forming relatively homogeneous groups serves to organize large quantities of multivariate information. Labels can be assigned to the subgroups, making the data more manageable for the individual researcher, and facilitating communication between researchers. Moreover, cluster analysis has the potential to make a major contribution to applied health psychology research through the identification of groups that might best benefit from interventions or further research.
The BPS Division of Health Psychology aims to apply psychology to three areas: the promotion and maintenance of health, improvement to the health care system, prevention of illness and disability, and enhanced outcomes of those who are ill or disabled (BPS Division of Health Psychology, 2000). Cluster analysis can help to identify key targets for these applied initiatives, enabling resources to be allocated in an optimally effective manner. Indeed, cluster analysis has been applied to each of these areas of health psychology.
In health promotion, cluster analysis has led to the selection of target groups most likely to benefit from specific health promotion campaigns, while also providing information about the types of beliefs or behaviour held by people in these groups, facilitating the development of relevant health promotion material (Abel, Plumridge, & Graham, 2002; Martin, Wyllie, & Casswell, 1992; Maschewsky-Schneider & Greiser, 1989; Wyllie & Casswell, 1993, 1997).
In terms of health care development, health care systems need to evolve and develop to meet communities' changing needs while achieving 'best value'. Cluster analysis can assist by identifying groups of people that may benefit from specific services.
For example, Hodges and Wotring (2000) used cluster analysis to group 4,758 young people referred to community mental health services in Michigan according to level of function in various domains, with the explicit aim of matching services to the specific needs of each group. The authors then developed an algorithm to enable newly-assessed patients to be allocated to the group expected to offer the most appropriate treatment.
The major use of cluster analysis in health psychology has been to identify groups of people at risk of developing medical conditions (e.g. Houston, Babyak, Chesney, & Black, 1997; Shaffer, Graves, Swank, & Pearson, 1987) and at risk of poor outcomes (e.g. Brandwin, Trask, Schwartz, & Clifford, 2000; Buick, 1997; Chan, 2000; Chan, Lee, & Lieh-Mak, 2000). Identifying these high-risk groups again enables interventions to be developed and targeted appropriately, in a cost-effective manner. For example, Weir et al. (2000) used cluster analysis to group hypertensive patients according to adherence to lifestyle change and medication. They identified four patient groups in their sample and considered the type of interventions that might be best suited to each group in order to maximize blood pressure control.
While the benefits of classification are clear, appropriate means of forming classifications are not. Although the techniques themselves are fairly simple, there are many methods and procedures from which to choose, with no guidelines to assist researchers in the selection process. In the absence of guidelines, researchers are likely to adopt the methods used by published researchers. As with any research, it is necessary to report the methodology clearly so that others could replicate the study. This is perhaps particularly important with cluster analyses where the potential for variation in methodology is so great. In a popular introductory cluster analysis book, Aldenderfer and Blashfield (1984) list five basic types of information that they believe should be reported following cluster analysis:
The computer program
It is common practice to state the computer software used for any statistical analysis, and cluster analysis should be no exception. Blashfield (1977) found that given the same data set and the same method of cluster analysis, different computer packages produced different results. Further research is needed to ascertain whether these inconsistencies in computer programs remain today.
The similarity measure
Clustering involves grouping people on the basis of their similarity on the chosen variables. There are many methods of assessing similarity, but two methods predominate: squared Euclidean distance and Pearson's correlation. The choice of similarity measure can have a profound impact on the cluster analysis results, as it determines whether the classification is based solely on the pattern of people's scores on the variables of interest, or whether elevation of scores is also taken into account. In health psychology, it is common for the difference in elevation of scores to be considered important to ensure grouping like-minded individuals (e.g. to group people with high anxiety scores separately from people with low anxiety scores). Squared Euclidean distance would be used for this purpose. There are, however, occasions when elevation of scores is not important. Consider a researcher who wished to cluster people according to their response to an anxiety-reducing intervention. Anxiety levels could be assessed before and after the intervention, and people could be grouped according to the similarity in the pattern of their anxiety scores over time. This would enable identification of groups of people showing similar responses to the intervention (e.g. reduction in anxiety, increase in anxiety, no change), regardless of their actual anxiety scores. In this instance, a correlation coefficient would be used.
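The contrast between the two similarity measures can be illustrated with a minimal sketch. The scores, the variable names and the use of three measurement occasions (rather than two) are assumptions made purely for illustration; they do not come from any of the reviewed studies.

```python
from math import sqrt

def squared_euclidean(a, b):
    """Sum of squared differences: reflects both pattern and elevation."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pearson(a, b):
    """Pearson correlation: reflects the pattern of scores, not their elevation."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical anxiety scores on three occasions (illustrative values only).
falling_high = [30, 24, 20]  # high anxiety that falls over time
falling_low = [12, 6, 2]     # low anxiety that falls in the same way
rising_high = [28, 29, 30]   # high anxiety that rises slightly

# Distance pairs the two high scorers; correlation pairs the two "fallers".
nearest_by_distance = min(
    [falling_low, rising_high], key=lambda p: squared_euclidean(falling_high, p)
)
nearest_by_correlation = max(
    [falling_low, rising_high], key=lambda p: pearson(falling_high, p)
)
```

With these made-up scores, the distance measure treats the two high-anxiety profiles as most alike, whereas the correlation treats the two falling profiles as most alike, which is the distinction drawn in the text.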
The cluster method
Although many different methods of cluster analysis have been developed, the literature focuses almost exclusively on two types: hierarchical agglomerative methods and iterative partitioning methods. Hierarchical agglomerative cluster analysis involves a series of steps, whereby individual cases (people) begin as individual clusters and step-by-step the most similar clusters are joined together, eventually resulting in one cluster containing all cases. Each step is irreversible, so clusters joined at one step cannot be separated later in the clustering process. Hierarchical clustering procedures result in the same number of cluster solutions as there are entities to cluster. Through examination of the computer output the researcher is required to decide on the most appropriate number of clusters to describe their data set. There are four main types of hierarchical agglomerative cluster analysis: single linkage (Florek, Lukaszewicz, Perkal, Steinhaus, & Zubrzycki, 1951; Sneath, 1957), complete linkage (Sokal & Michener, 1958), average linkage (Sokal & Michener, 1958) and Ward's method (Ward, 1963).
Iterative partitioning methods (e.g. K-means cluster analysis) begin by dividing the entities into the required number of clusters, calculating the cluster centroids and relocating the entities to their nearest cluster centroid. The process of calculating the new cluster centroids and relocating entities continues until all the entities are closer to their own cluster centroid than any other and the solution is therefore stable. Iterative partitioning techniques differ from hierarchical methods in two key ways. First, the number of clusters is specified by the researcher before the analysis takes place, and therefore only one cluster solution is given. Second, cases can be moved from one cluster to another during the clustering process to optimize the cluster solution.
Milligan (1980) suggested performing a hierarchical method first to determine the number of clusters and the cluster centroids, followed by a K-means cluster analysis to optimize the results.
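The two-stage approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any particular package: Ward's criterion is implemented through the standard identity that merging clusters a and b raises the within-cluster sum of squares by n_a n_b / (n_a + n_b) times the squared distance between their centroids, and the data, function names and choice of two clusters are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def ward_clusters(points, k):
    """Hierarchical agglomerative clustering (Ward's criterion), stopped at k clusters."""
    clusters = [[p] for p in points]  # every case starts as its own cluster
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Increase in within-cluster sum of squares if a and b were merged.
                cost = len(a) * len(b) / (len(a) + len(b)) * sq_dist(centroid(a), centroid(b))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merges are never undone
        del clusters[j]
    return clusters

def k_means(points, seeds, max_iter=100):
    """Iterative partitioning: relocate cases to the nearest centroid until stable."""
    centroids = list(seeds)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no case changed cluster, so the solution is stable
        assignment = new_assignment
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = centroid(members)
    return assignment, centroids

# Hypothetical scores for eight people on two variables (illustrative values only).
data = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
seeds = [centroid(c) for c in ward_clusters(data, k=2)]  # stage 1: hierarchical
labels, centres = k_means(data, seeds)                   # stage 2: K-means
```

Passing the hierarchical centroids as starting seeds is the step that distinguishes this from simply running two unrelated analyses one after the other.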
The procedure used to determine the number of clusters
When conducting a hierarchical cluster analysis, there will be as many cluster solutions as there are cases to be clustered. The researcher needs to make a decision about which cluster solution to use, using a selection of rules based on heuristics. These rules are sometimes known as 'stopping rules' as they indicate when the process of clustering from N to one cluster should be stopped. The most straightforward method of deciding on the number of clusters is to examine both the agglomeration schedule and the dendrogram (part of the cluster analysis output). An inconsistent increase in the dissimilarity measure indicates that the clusters joined at that stage were quite distinct, and that the clustering process should be stopped at one stage prior. This method is fairly subjective since the researcher decides what constitutes an inconsistently large increase in the dissimilarity measure. Everitt (1993) warns that researchers' prior expectations could influence the decision. In an effort to make the decision more objective, formal rules and equations have been derived that can be applied to determine the number of clusters in a sample. Two that were found to perform particularly well in a review of such procedures (Milligan & Cooper, 1985) are the pseudo-F statistic (Caliński & Harabasz, 1974) and the cubic clustering criterion (Sarle, 1983). These are not, however, uniformly available across different software packages.
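As an illustration of a formal stopping rule, the pseudo-F statistic can be computed directly from a candidate partition as (B / (k - 1)) / (W / (n - k)), where B and W are the between- and within-cluster sums of squares, k is the number of clusters and n the number of cases. The sketch below assumes hypothetical data and compares three hand-made candidate solutions; the solution with the largest value would be preferred.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def pseudo_f(clusters):
    """Pseudo-F statistic: (B / (k - 1)) / (W / (n - k))."""
    k = len(clusters)
    all_points = [p for cluster in clusters for p in cluster]
    n = len(all_points)
    grand = centroid(all_points)
    between = sum(len(c) * sq_dist(centroid(c), grand) for c in clusters)
    within = sum(sq_dist(p, centroid(c)) for c in clusters for p in c)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical data: three compact groups of four cases on two variables.
group_a = [(0, 0), (0, 2), (2, 0), (2, 2)]
group_b = [(10, 0), (10, 2), (12, 0), (12, 2)]
group_c = [(5, 10), (5, 12), (7, 10), (7, 12)]

two = [group_a + group_b, group_c]                    # too few clusters
three = [group_a, group_b, group_c]                   # the built-in structure
four = [group_a, group_b, group_c[:2], group_c[2:]]   # one group split in two

scores = {len(solution): pseudo_f(solution) for solution in (two, three, four)}
best_k = max(scores, key=scores.get)
```

In practice the candidate partitions would come from cutting the hierarchical solution at successive stages rather than being written out by hand.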
Evidence for the validity of the clusters
The cluster analysis process will identify clusters in all data sets, even if the data are homogeneous. As Breckenridge (2000) reported, 'cluster analysis can create as well as reveal structure' (p. 261). There is a need to show first that the clusters are stable, and second, that they are of value to the field of study. There are various methods of determining the stability of the clusters. The most common method is to randomly divide the study sample into two halves and repeat the cluster analysis on each. If the clusters are stable, a similar cluster structure should be found in each half of the sample. An alternative is to repeat the cluster analysis in a different sample drawn from the same population. Jolliffe, Jones, and Morgan (1982) used a process of removing a few variables and rerunning the analysis to see if the clusters were robust. Kos and Psenicka (2000) advocate repeating the cluster analysis with a different clustering method to see if they replicate the same findings, although this approach may be questionable given that the different methods were designed to cluster in different ways. Perhaps the best advice is from Ketchen and Hult (2000) who suggest employing multiple techniques to assess the stability of the clusters.
Stability of the clusters is a necessary but not sufficient determinant of validity (Aldenderfer & Blashfield, 1984). The validation of the clusters must include some evidence of their value to the field of study, for example, by using inferential statistics to compare the groups on variables that were not included in the clustering process. In health psychology, the aim of the cluster analysis is often not simply to form a taxonomy, but to see if the groups identified are associated with an external variable (predictive validity). For example, Shaffer et al. (1987) grouped medical students on the basis of personality traits to see if the personality clusters would predict incidence of cancer.
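The split-half check of stability described above can be sketched as follows. Everything here is an assumption made for illustration: the sample is hypothetical, K-means stands in for whichever cluster method a study actually used, the farthest-point seeding is simply a deterministic convenience, and matching the two sets of centroids by sorting them only works because the toy groups are well separated.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def farthest_point_seeds(points, k):
    """Deterministic starting seeds (a convenience of this sketch only)."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(sq_dist(p, s) for s in seeds)))
    return seeds

def k_means(points, k, max_iter=100):
    centroids = farthest_point_seeds(points, k)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
    return sorted(centroids)

def split_half_centroids(points, k, seed=0):
    """Randomly divide the sample in two and cluster each half separately."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return k_means(shuffled[:half], k), k_means(shuffled[half:], k)

# Hypothetical sample: three groups of nine people on two variables.
offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
sample = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (20, 0), (10, 20)] for dx, dy in offsets]

first_half, second_half = split_half_centroids(sample, k=3)
# Largest squared distance between matched centroids from the two halves.
largest_shift = max(sq_dist(a, b) for a, b in zip(first_half, second_half))
```

A small shift relative to the distances between clusters suggests that a similar structure was recovered in each half; a real analysis would need a principled way of matching clusters across halves and of judging how much shift is acceptable.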
The provision of these five basic pieces of information might seem straightforward. Preliminary evidence, however, indicates that reporting may not be satisfactory. Blashfield (1980a) examined six cluster analysis articles published in the Journal of Consulting and Clinical Psychology in 1979 to see how much of this information they provided. He found that two (33%) failed to report the cluster analytic method used, three (50%) failed to report the similarity measure, five (83%) failed to report the computer program, and four (66%) did not adequately describe the procedure used for determining the number of clusters. Not only did the study reveal the poor standards of reporting of cluster analysis, but it also gave insight into the types and appropriateness of methods used. For example, while all but one of the studies attempted to provide some evidence of the validity of the clusters, inadequate procedures for assessing validity were often selected. Punj and Stewart (1983) also encountered weak reporting of cluster analysis when conducting a review of clustering techniques in marketing research. They stated: 'The lack of detailed reporting suggests either an ignorance or a lack of concern for the important parameters of the clustering method used' (p. 136).
The aim of this review is to use Aldenderfer and Blashfield's (1984) reporting criteria as a framework to describe and evaluate the reporting of cluster analysis in health psychology. The review serves to identify the cluster analysis methods that are used in health psychology and to highlight any particular areas of weakness in conducting and reporting cluster analyses.
Method
In order to assess the reporting of cluster analysis in health psychology it was decided to conduct a systematic computer database search for peer reviewed health psychology journal articles reporting cluster analysis.
Journal selection
Health psychology is a broad discipline that shares common ground with many other fields (e.g. clinical psychology, occupational psychology, developmental psychology, medicine). Initially electronic databases were searched for any cluster analysis articles that could conceivably fall into the realm of health psychology, but it proved difficult to restrict the search to a manageable number of relevant abstracts. It was, therefore, decided to limit the review to cluster analysis articles published in 'key' health psychology journals. There were three stages to determining key health psychology journals. First, an Ingenta publication search was conducted for any journals containing the words health and psychology in the title (5 journals). Second, the ISI Journal Citation Report was examined for any additional journals believed to be primarily health psychology journals with an impact rating >1 (15 journals). Finally, six health psychologists were independently presented with the list of possible journals and were asked to decide whether or not they believed each journal was a key health psychology journal. Where reviewers were unfamiliar with the journals they were provided with sample abstracts from the publications to aid their decision. With a majority consensus, 18 journals were selected, listed in Table 1.
Article selection
Articles were identified through electronic searches of the PsycINFO, Medline, CINAHL and ISI Web of Science databases. The chosen search terms consisted of the keyword cluster* in combination with each of the 18 journal titles (e.g. cluster* in Keyword AND British Journal of Health Psychology in Periodical). The searches were restricted to the period since Aldenderfer and Blashfield's (1984) published recommendations (1984-2002). The number of articles retrieved from each database is presented in Table 1. In total, 278 articles were generated from the search.
Abstract review
Three psychology researchers reviewed all of the abstracts to assess whether they were (a) health psychology and (b) cluster analysis. Following an initial meeting to establish consensus over what 'health psychology' and 'cluster analysis' encompassed, the three psychology researchers independently assessed the abstracts and recorded their decision on the two criteria on a proforma. Full articles were obtained when at least two of the reviewers considered both criteria met, or where it was unclear from the abstract whether the criteria had been met.
Data extraction
A data extraction sheet was compiled to reflect the research questions. The five necessary pieces of information as recommended by Aldenderfer and Blashfield (1984) were recorded (i.e. the computer program, the similarity measure, the cluster method, the process for deciding on the number of clusters and evidence for the validity of the clusters). In addition, general information about the type of study was recorded (e.g. area of health psychology, research aims, types of variables used). One researcher performed the data extraction. Any ambiguous information was discussed with two independent researchers. Information from the data extraction sheets was entered into an SPSS 11 database for computing descriptive statistics.

Table 1. Number of articles retrieved from each database for each journal
Columns: PsycINFO; Medline; ISI Web of Science; CINAHL; No. replications; Total
Journals: American Journal of Health Promotion; Annals of Behavioral Medicine; Behavioral Medicine; British Journal of Health Psychology; Health Education and Behavior; Health Promotion International; Health Psychology; Journal of Behavioral Medicine; Journal of Health and Social Behavior; Journal of Health Psychology; Journal of Psychosomatic Research; Psychology and Health; Psychology, Health and Medicine; Psychosomatic Medicine; Psychosomatics; Psychotherapy and Psychosomatics; Social Science and Medicine; Sociology of Health and Illness
Total: 278
Results
Of the 278 articles generated by the computer search, 31 were not considered to be health psychology papers, 119 did not use cluster analysis, and 57 were neither health psychology nor used cluster analysis. A further 12 cluster analysis papers were excluded as they used cluster analysis to group variables rather than people.¹ Brief descriptions of the remaining 59 articles included in the study are presented in Table 2.
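The exclusion counts reported above can be tallied as a quick consistency check. The variable names below are hypothetical; the figures are those given in the paragraph.

```python
# Counts as reported in the Results; the names are illustrative only.
retrieved = 278
excluded = {
    "not health psychology": 31,
    "did not use cluster analysis": 119,
    "neither health psychology nor cluster analysis": 57,
    "clustered variables rather than people": 12,
}
included = retrieved - sum(excluded.values())
```

The result matches the 59 articles carried forward into the review.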
How is cluster analysis being used in health psychology?
The 59 studies were divided into the three BPS Division of Health Psychology areas of health psychology research (BPS Division of Health Psychology, 2000). The majority of the studies (80%) fell into the broad category of 'prevention of illness and disability and enhancement of outcomes for those who are ill and disabled'. Three medical areas predominated: cardiac disorders/hypertension (27%), pain (12%) and cancer (12%).
¹ Although some authors use the term cluster analysis in a broad sense to mean both the grouping of entities and variables (e.g. Tryon & Bailey, 1970), it is more commonly used to mean just the grouping of entities (Blashfield, 1980b). This review is adopting the latter, more popular, definition of cluster analysis.
[Table 2. Brief descriptions of the 59 articles included in the review.]
Of the 10 'promotion and maintenance of health' studies, four were concerned with general health promotion, two with alcohol consumption, one with diet, one with smoking, one with stress management and one with safe sex behaviour. The two studies that fell into the category 'analysis and improvement of the health care system and health policy formation' were both concerned with the roles of GPs.
For some studies, the goal of the cluster analysis was clearly to advance theory by forming a taxonomy. For example, Wyllie and Casswell (1993) formed a taxonomy of male drinkers, with the hope of informing health promotion efforts in the future. Others used cluster analysis for practical purposes. For example, Griffiths (1999) performed a cluster analysis in order to draw a cluster sample for the main study, while Janman, Jones, Payne, and Rick (1988) chose to use cluster analysis as they had insufficient power to use ANOVA with the number of independent variables in their study. By grouping the participants according to the numerous variables, they believed that their results would also be 'psychologically clearer' (p. 19). Many studies addressed both theoretical and practical issues. For example, Brandwin et al. (2000) formed a taxonomy of cardiac transplant candidates based on scores on personality measures, and then compared the groups on mortality.
All studies, whether as a specific goal or by default, formed a taxonomy. The variables used to form these taxonomies are listed in Table 2. Unfortunately it was not possible to judge the worth of the taxonomies as many of the cluster analyses were reported in insufficient detail for evaluation.
To what extent do the studies meet Aldenderfer and Blashfield's (1984) recommendations for reporting cluster analysis?
Details of the cluster analysis methodology reported are presented in Table 3. Seventy-three percent of studies failed to meet all of Aldenderfer and Blashfield's criteria (see Fig. 1). The criteria least often met were reporting the similarity measure and computer program used.
The computer program
Almost half of the studies failed to report the program used. Of those that did report the software, SPSS was used more than twice as often as any other type of statistical software.
The similarity measure
Fifty-three percent of the studies failed to report the similarity measure used. The remaining studies all reported using Euclidean distance or squared Euclidean distance.
The cluster method
Four studies failed to report the cluster method. A further nine studies only broadly defined the clustering method as a type of hierarchical method. Hierarchical methods were used in 45 studies, whilst iterative partitioning techniques (K-means) were used in 32 studies (see Fig. 2). The most frequently reported hierarchical method was Ward's method, used in 24 studies. Fourteen studies used a hierarchical method followed by K-means, but only six of these appeared to have used the recommended technique of using the cluster centroids from the hierarchical clustering as the starting seeds for the K-means analysis (Milligan, 1980). Instead, the
[Table 3. Details of the cluster analysis methodology reported.]
o .0 > s
Oi
i
O 0) O P
in t-t L.
rt Q.
O U
o
.t!
OJ
a. E 0 u
EM
OOP U U M
lit
Q U
E o U
a.
O
0>
p
*J
01
E .t:
IJO
U Q
E U feb
u
00
a.
01
a. tated
Q-
rity m
: dust
sisten
c 0
c
'i~ 0>
E .^,
1-1
3 VI
a.
3_
CL r
ra
ith
iste rch
OJ
in
sist
t!
3
J < > cin
o;
C O J
.9 m u
OJ
E
rt
ao c rt
la 3
rt
VI
kJ
Q "o
o u
E c m
ra
B
o Z
OJ N
8 E
in
o
S 01
3 O "w
L.
M
c;
01
01
E rt
ra OJ
2 3
<-i
3 ::^
0) T) 0)
2
0
Q
in >-
3 o Z
"^ OJ 3
M
LU
-D 0 1
1^
01
XI Qi
3
"C
LLJ
rt 3
u-i
o Z
s
in
<-
a.
" S
" 2 -S 3
o ^ D Z S Z CO
987)
(/I I/)
o 3 Z 987)
>-
Rocti (199
Nou (199
Rose
RapF
Raja
Sch (20
0> CO 00

c
X 01 O-
o .2
Q- 0
a. o
at CL X at U
a.
at
o p A E U u &
g at
d.
o U
E o U c
0
E o U
a.
o U
o o U 00
Ups/
rt
-^ ji
in S _
rt
rt
rt
o o U 00
Ql
E
OJ
sta sti
QL_
3 in
t-t
at
u.
gi
o S! 1 ^ U < < >
^ ^
Dp rt at ( at E
c rt 0)
0 1
I
o
1/1
Q
c
3 m
0 Z
n Vi
IS
(1997
Whit
SA:
sta Sti
rt OJ
ncur ups/ exp u

c
mpa red
red
08
350
jane Clatwortby et al.
majority had only used the hierarchical cluster analysis to determine the number of clusters for the K-means analysis. Inconsistencies in cluster analysis terminology were readily apparent. For example, six of the studies used the average linkage method but it was reported under four different names (average linkage, average linkage between-groups, between-group average and unweighted pair-group method). Similarly, five researchers referred to K-means cluster analysis by the computer program's algorithm name (e.g. FASTCLUS, Quick Cluster), necessitating further investigation to decipher the actual method used.

Figure 1. The percentage of studies meeting each of Aldenderfer and Blashfield's reporting criteria. [Bar chart; bars: 1. Computer software; 2. Similarity measure; 3. Cluster method; 4. Method of selecting no. of clusters; 5. Evidence of cluster validity; Met all five criteria. Horizontal axis: percentage of studies, 0 to 100.]
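The two-stage procedure described above (a hierarchical analysis whose cluster centroids seed a K-means analysis) can be sketched as follows. This is a minimal illustration under stated assumptions: the data points are hypothetical, average linkage stands in for whichever hierarchical method a researcher prefers, and the function names are invented for this example rather than taken from any statistical package.

```python
import math
from itertools import combinations

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Mean of a list of points, one coordinate at a time."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def average_linkage(points, k):
    """Agglomerative clustering (average linkage), stopped at k clusters."""
    clusters = [[p] for p in points]

    def mean_pairwise(pair):
        a, b = clusters[pair[0]], clusters[pair[1]]
        return sum(dist(p, q) for p in a for q in b) / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the two clusters with the smallest mean pairwise distance.
        i, j = min(combinations(range(len(clusters)), 2), key=mean_pairwise)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters

def k_means(points, seeds, max_iter=100):
    """K-means started from the supplied seeds instead of random centres."""
    centres = list(seeds)
    for _ in range(max_iter):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda c: dist(p, centres[c]))
            groups[nearest].append(p)
        new_centres = [centroid(g) if g else centres[c] for c, g in enumerate(groups)]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, groups

# Hypothetical two-variable scores for nine participants.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]

hierarchical = average_linkage(points, k=3)
seeds = [centroid(c) for c in hierarchical]   # centroids become starting seeds
centres, groups = k_means(points, seeds)
print([len(g) for g in groups])
```

The key design point is that the K-means stage receives the hierarchical centroids as its starting seeds, rather than using the hierarchical stage only to choose the number of clusters.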
The procedures used to determine the number of groups in the data
Ten studies (17%) did not report how the number of clusters was established. In addition, many studies selected questionable methods of determining the number of clusters. For example, eight studies chose cluster solutions that had sufficiently large numbers of people in each group, seven studies automatically chose the same number of clusters as found in previous similar research, and seven studies chose the cluster solution that facilitated interpretation. One study (Barsky & Wyshak, 1989) reported that the number of clusters was chosen 'arbitrarily' (p. 416). The most common method, however, was to look for an inconsistently large jump in the similarity measure between joined clusters (37% of studies), either by examining the agglomeration schedule, the dendrogram or both. None of the authors provided the relevant section of the agglomeration schedule or dendrogram for the reader to assess. Eight studies used the more formal rules of the cubic clustering criterion and/or the F statistic. Again though, minimal information was provided, with no studies giving the actual cubic clustering criterion or F statistic figures.
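The 'large jump' rule described above can be made explicit, and reportable, with a few lines of code. The sketch below assumes a hypothetical agglomeration schedule (the fusion coefficient recorded at each merge, in merge order) for eight cases; the values and the function name are illustrative only. It prints the schedule so that a reader could assess the decision, and returns the number of clusters remaining immediately before the largest jump.

```python
def suggest_clusters(fusion_coefficients):
    """Return (number of clusters, size of jump) for the largest increase
    between successive fusion coefficients in an agglomeration schedule."""
    n_cases = len(fusion_coefficients) + 1          # n cases give n - 1 merges
    jumps = [later - earlier
             for earlier, later in zip(fusion_coefficients, fusion_coefficients[1:])]
    biggest = max(range(len(jumps)), key=jumps.__getitem__)
    merges_kept = biggest + 1                       # stop before the costly merge
    return n_cases - merges_kept, jumps[biggest]

# Hypothetical fusion coefficients for a hierarchical analysis of 8 cases.
schedule = [0.5, 0.7, 0.9, 1.2, 1.6, 6.3, 7.1]

# Print the schedule so the decision can be inspected rather than taken on trust.
for step, coefficient in enumerate(schedule, start=1):
    clusters_left = len(schedule) + 1 - step
    print(f"step {step}: {clusters_left} clusters remain, coefficient {coefficient}")

n_clusters, jump = suggest_clusters(schedule)
print(n_clusters, round(jump, 2))
```

Reporting the printed section of the schedule alongside the chosen solution addresses the omission noted above; the rule itself remains a heuristic and does not replace the formal stopping rules.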
The validation procedures
Five studies failed to show any evidence of validating the cluster solution. Several studies made unsuccessful attempts to validate the cluster solution (e.g. attempting to repeat the cluster structure a few years later or to repeat the cluster structure in a different sample). Fewer than half of the studies (36%) addressed both the stability and external validity of the clusters.
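When stability is assessed by comparing two cluster solutions for the same cases (for example, solutions from two clustering methods), the degree of overlap can be stated as a number rather than described informally. The sketch below uses the Rand index, chosen here purely as an illustration; it is not a measure the reviewed studies are reported to have used, and the cluster labels are hypothetical. The index is the proportion of pairs of cases on which the two solutions agree (both place the pair together, or both place it apart).

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Proportion of case pairs treated the same way by two cluster solutions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both solutions must classify the same cases")
    agreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        together_a = labels_a[i] == labels_a[j]
        together_b = labels_b[i] == labels_b[j]
        agreements += together_a == together_b
        pairs += 1
    return agreements / pairs

# Hypothetical cluster memberships of eight cases under two methods.
# Label names are arbitrary; only the groupings matter.
method_1 = [0, 0, 0, 1, 1, 1, 2, 2]
method_2 = ["b", "b", "b", "a", "a", "c", "c", "c"]

agreement = rand_index(method_1, method_2)
print(round(agreement, 3))
```

The Rand index is not corrected for chance agreement, so values should be interpreted cautiously when the number of clusters is small.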

Figure 2. [Cluster methods used in the reviewed studies.]
In terms of cluster stability, 17 of the studies (29%) repeated the analysis in split halves of the sample or in a different sample drawn from the same population. Five studies (8%) attempted to display stability by finding consistent results across different hierarchical clustering methods, although, with the exception of Shapiro, Rodrigue, Boggs, and Robinson (1994), this was poorly reported (perhaps because there was limited overlap between the methods). In terms of validity, nine of the studies (15%) reported concurrent validity as the clusters were in line with the findings of previous research or expert opinion. Forty-nine of the studies (83%) used external variables to validate the cluster solution, either by comparing the groups on external variables (usually ANOVA) or predicting group membership from external variables (discriminant function analysis). For many this was the primary goal of their investigation (e.g. to see if group membership could predict another variable) while for others the comparison was purely to validate the cluster structure. Of concern was the number of studies conducting ANOVA (17 studies) or discriminant function analysis (10 studies) using the variables that were used to form the clusters. Although these can give an indication of the relative contributions of the different variables to the cluster solution or the degree of separation of the clusters, the actual statistics are meaningless. The clusters have been purposely created to be as different as possible on the selected variables, hence it is unsurprising that ANOVA tests indicate significant differences between the groups on those variables or that group membership can be predicted from the variable scores. Aldenderfer and Blashfield (1984) state, '...the performance of these tests is useless at best and misleading at worst' (p. 65).
One of the few papers that gave a clear explanation of the limited value of tests was that of Mulry, Kalichman, Kelly, Ostrow, and Heckman (1997), stating that, 'Significance tests on the variables used to create the clusters could not be used to validate the cluster solution; such analyses merely indicate whether (and how) clusters differ on each clustering variable' (p. 62). Most studies, however, gave no explanation of how the results of these tests should be interpreted, while some made ambiguous or incorrect statements about their value; for example, Kaluza (2000) wrote, 'ANOVAs indicated that the three clusters differed reliably in terms of mean scores for each coping scale' (p. 427).
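The circularity described above can be demonstrated with simulated data. The sketch below is an illustration under stated assumptions: it draws 200 scores from a single normal population (so no true groups exist), forms two clusters with a simple one-dimensional K-means, and then computes the one-way ANOVA F statistic on the clustering variable itself. The seed, sample size and function names are arbitrary choices made for this example.

```python
import random
from statistics import fmean

def two_means_1d(values, max_iter=100):
    """Split one-dimensional scores into two clusters with K-means (k = 2)."""
    low, high = min(values), max(values)          # starting centres
    for _ in range(max_iter):
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        new_low = fmean(v for v, lab in zip(values, labels) if lab == 0)
        new_high = fmean(v for v, lab in zip(values, labels) if lab == 1)
        if (new_low, new_high) == (low, high):
            break
        low, high = new_low, new_high
    return labels

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of scores."""
    scores = [v for g in groups for v in g]
    grand_mean = fmean(scores)
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

random.seed(0)
# One homogeneous population: there is no real cluster structure to find.
scores = [random.gauss(0, 1) for _ in range(200)]

labels = two_means_1d(scores)
clusters = [[v for v, lab in zip(scores, labels) if lab == c] for c in (0, 1)]
f_value = f_statistic(clusters)
print(f"F on the clustering variable: {f_value:.1f}")
```

The F statistic is very large even though the data contain no groups, because the clusters were constructed to differ on this variable. A significant result of this kind therefore says nothing about whether the clusters are valid.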
Discussion
This study has revealed a need for improvement in the way cluster analysis is reported in health psychology publications. Using a strict interpretation of Aldenderfer and Blashfield's (1984) five basic criteria, only 27% of the studies reported the cluster analyses in sufficient detail; 61%, however, met at least four of the criteria. Reporting the similarity measure used was the criterion least often met. Where it was stated, Euclidean distance and squared Euclidean distance were exclusively the measures of choice. One would expect Euclidean distance to be popular as psychologists are often interested in differences in elevation between people's scores. A few studies, however, conducted cluster analysis on repeated measurements of variables over different time points (e.g. smoking frequency over 7 years, Soldz & Cui, 2002; anxiety and depression over 10 years, Arefjord, Hallarakeri, Havik, & Maeland, 1998; change in physical health over 1 year, Dew et al., 1998). In these situations a correlation measure of similarity, which focuses on the pattern of the variables (i.e. change over time), might have been useful. It is unclear whether Euclidean distance/squared Euclidean distance are consistently being selected because they are indeed the most appropriate measures of similarity, because
researchers look to what has been conducted in the past, or because they are the default setting for SPSS. Four studies failed to report the cluster analysis method, the most fundamental detail of the cluster analysis. Nine more studies only broadly defined the clustering method as a type of hierarchical method. Different hierarchical methods can produce profoundly different results (Romesburg, 1984); it is therefore essential that more detail is given if the study is to be replicable. Amongst those who did specify the cluster method used, two methods dominated: K-means cluster analysis and Ward's method. It is not clear why these methods were used so much more than the others. Monte Carlo studies* have generally found that they perform well (Milligan, 1981) but also report similar results for other methods (e.g. average linkage, complete linkage). Unfortunately, no single cluster analysis method can be recommended as Monte Carlo studies have generated conflicting results depending on the type of data used. Further research is needed to determine which method of cluster analysis is most appropriate for health psychology research. It was interesting, however, that Milligan's (1980) recommendation of conducting a hierarchical cluster analysis to determine the number of clusters and appropriate starting seeds, followed by a K-means analysis to 'fine-tune' the results, has so rarely been followed. Indeed, just six studies used this procedure, which is perhaps a reflection of the difficulty of obtaining such an analysis. The review has highlighted the lack of consistency in cluster analysis terminology used within health psychology, with the same cluster methods being reported under different names. There is a need to establish a standard terminology to avoid confusion.
As reported by Punj and Stewart (1983), the problem lies not only in cluster analysis texts using different terms for the same methods, but also in computer programs developing their own names for methods (e.g. 'Quick Cluster' in SPSS and 'FASTCLUS' in SAS). It is important that researchers report the actual clustering method, not just the computer software's name, to enable comparison between studies and to facilitate replication. Although the majority of studies stated how the number of clusters had been determined (83%), the quality of this reporting was generally poor. Details of procedures used were usually reduced to simplistic statements (e.g. 'the number of clusters was determined by looking for a large increase in the similarity measure'), with no description of the criteria used. No study included a dendrogram or section of the agglomeration schedule to support the decision. Similarly, none of the eight studies using the F statistic or cubic clustering criterion reported the statistical values for these measures. It is unclear why researchers have not included such important information. It may simply be because they are following the lead of previous researchers or that publishers consider the information superfluous. This issue needs to be addressed: when reporting other statistical tests it would be unacceptable for researchers to state the overall finding but present no figures. In some studies, when the method for selecting the number of groups was stated, it was of questionable value. For example, eight studies chose the number of clusters on the basis of avoiding small groups. There is little point in continuing clustering until small distinct groups are merged with larger ones. The small groups may represent important groups of individuals with extreme beliefs that warrant attention. Alternatively, they may be outliers that can be excluded from the analysis. Edelbrock (1979) has argued that it is not essential to classify everybody in psychology.
He suggests that an algorithm that classifies fewer people but does it accurately is more desirable than one that classifies everybody but makes misclassifications.

* Monte Carlo studies of cluster analysis involve generating artificial data sets with a known cluster structure and applying different methods of cluster analysis to see which method recovers the predetermined cluster structure most accurately.

It is acceptable to select, for example, a six cluster solution and then choose only to focus on the four largest groups if justification is given. Seven studies attempting to replicate a cluster structure found in previous research automatically selected the same number of clusters that were found previously, without considering if a more appropriate clustering solution existed in the data. This biases their results in favour of replicating previous findings. It would be more appropriate to independently select the number of clusters on the basis of one of the recommended stopping rules and only then compare the findings with those found in previous research. Seven studies chose the number of clusters that allowed for the most meaningful explanation. With so many cluster solutions to choose from it is probable that some will be conducive to more desirable explanations than others. There is little point in conducting research if the results are to be biased in favour of researchers' expectations. It is essential that more objective methods of determining the number of clusters are used. While all but five studies reported some form of validation, less than half addressed both cluster stability and external validity. All cluster analyses will provide clusters, whether true groups exist in the data or not; thorough validation is therefore the most important part of the cluster analysis procedure. On a more positive note, most studies did validate the clusters against external variables and reported a wide range of applications of cluster analysis in health psychology: for example, to help identify predictors of anxiety and depression in the partners of those with myocardial infarction (Arefjord et al., 1998), to explore the relationship between dental fear and response to fear-reducing interventions (Litt, Kalinowski, & Shafer,
1999) and to explore the relationship between coping style and impact of rheumatoid arthritis (Newman, Fitzpatrick, Lamb, & Shipley, 1990). It is clear that cluster analysis has immense potential in health psychology as a means of addressing both theoretical and practical problems. The restriction of this review to key health psychology journals was a major but necessary limitation of this study. The review has certainly not identified all cluster analysis articles in the field of health psychology since 1984 because many articles would have been published in journals not included in the review. It will also undoubtedly have missed some cluster analysis articles published in the selected journals because the statistical methodology used is often not reported in the key words or abstract and therefore the term cluster might not have been detected in the electronic search. This review has, however, served to gain insight into the current use of cluster analysis in health psychology. Although over half of the studies met four of the five reporting criteria, it is important to remember that unless all five criteria are met, a given study is not reported in sufficient detail for replication. Moreover, evaluation of the quality of a study may not be possible. This study has not only identified some problems with the way cluster analysis is reported but also potentially serious problems with the underlying methods used. There is a need for clear guidelines to assist health psychologists wishing to conduct cluster analysis, since the contradictory and bewildering literature may deter researchers from using what might be a valuable technique. Moreover, without such guidelines, a tautologous situation may arise whereby researchers try many different methods. If the theoretical and practical value of cluster analysis is to be realized, it is clear that a more principled approach to conducting and reporting these techniques is required.
Acknowledgements
We wish to thank Richard Jenkins for assistance with the journal selection and reviewing the abstracts. In addition we would like to thank Vanessa Cooper and Sue Hall for assistance with selecting the journals used in the review. Finally, thanks to the anonymous reviewers for their invaluable advice.
References
Abel, G., Plumridge, L., & Graham, F. (2002). Peers, networks or relationships: Strategies for understanding social dynamics as determinants of smoking behaviour. Drugs-Education Prevention and Policy, 9, 325-338.
Abel, T. (1991). Measuring health lifestyles in a comparative analysis: Theoretical issues and empirical findings. Social Science and Medicine, 32, 899-908.
Ahlberg, J., Suvinen, T. I., Rantala, M., Lindholm, H., Nikkilae, H., Savolainen, A., Nissinen, M., Kaarento, K., Sarna, S., & Kononen, M. (2002). Distinct biopsychosocial profiles emerge among nonpatients. Journal of Psychosomatic Research, 53, 1077-1081.
Aldenderfer, M. S., & Blashfield, R. K. (1984). Cluster analysis. Newbury Park, CA: Sage Publishing.
Allen, M. T., Boquet, A. J., & Shelley, K. S. (1990). Cluster analyses of cardiovascular responsivity to three laboratory stressors. Psychosomatic Medicine, 53, 272-288.
Arefjord, K., Hallarakeri, E., Havik, O. E., & Maeland, J. G. (1998). Myocardial infarction: Emotional consequences for the wife. Psychology and Health, 13, 135-146.
Bailey, K. D. (1994). Typologies and taxonomies: An introduction to classification techniques. Thousand Oaks, CA: Sage.
Barsky, A. J., & Wyshak, G. (1989). Hypochondriasis and related health attitudes. Psychosomatics, 30, 412-420.
Berghuis, J. P., Heinman, J. R., Rothman, I., & Berger, R. E. (1996). Psychological and physical factors involved in chronic idiopathic prostatitis. Journal of Psychosomatic Research, 41, 313-325.
Blashfield, R. K. (1977). The equivalence of three statistical packages for performing hierarchical cluster analysis. Psychometrika, 42, 429-431.
Blashfield, R. K. (1980a). Propositions regarding the use of cluster analysis in clinical research. Journal of Consulting and Clinical Psychology, 48, 456-459.
Blashfield, R. K. (1980b). The growth of cluster analysis: Tryon, Ward and Johnson. Multivariate Behavioral Research, 15, 439-458.
Bombardier, C. H., Divine, G. W., Jordan, J. S., & Brooks, W. B. (1993). Minnesota Multiphasic Personality Inventory (MMPI) cluster groups among chronically ill patients: Relationship to illness adjustment and treatment outcome. Journal of Behavioral Medicine, 16, 467-484.
BPS Division of Health Psychology (2000). Aims and Objectives of the DHP as agreed at the Annual General Meeting, September 2000. Retrieved 22nd July, 2003, from http://www.health-psychology.org.uk/aimsdw.htm
Bradley, L. A., & Van der Heide, L. H. (1984). Pain-related correlates of MMPI profile subgroups among back pain patients. Health Psychology, 3, 157-174.
Brandwin, M., Trask, P. C., Schwartz, S. M., & Clifford, M. (2000). Personality predictors of mortality in cardiac transplant candidates and recipients. Journal of Psychosomatic Research, 49, 141-147.
Breckenridge, J. N. (2000). Validating cluster analysis: Consistent replication and symmetry. Multivariate Behavioral Research, 35, 261-285.
Bucks, R. S., Williams, A., Whitfield, M. J., & Routh, D. A. (1990). Towards a typology of general practitioners' attitudes to general practice. Social Science and Medicine, 30, 537-547.
Buick, D. (1997). Illness representation and breast cancer: Coping with radiation and chemotherapy. In K. J. Petrie & J. A. Weinman (Eds.), Perceptions of Health and Illness. London: Harwood Academic.
Calinski, T., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics, 3, 1-27.
Chan, R. C. K., Lee, P. W. J., & Lieh-Mak, F. (2000). The pattern of coping in persons with spinal cord injuries. Disability and Rehabilitation: An International Multidisciplinary Journal, 22, 501-507.
De Bourdeaudhuij, I., & Van Oost, P. (1998). Family characteristics and health behaviours of adolescents and families. Psychology and Health, 13, 785-803.
Denollet, J. (1993). Biobehavioral research on coronary heart disease: Where is the person? Journal of Behavioral Medicine, 16, 115-141.
Deter, H. C., & Schepank, H. (1991). Patterns of self-definition of asthma patients and normal persons in the Freiburg Personality Inventory. Psychotherapy and Psychosomatics, 55, 47-56.
Dew, M. A., Goycoolea, J. M., Stukas, A. A., Switzer, G. E., Simmons, R. G., Roth, L. H., & Dimartini, A. (1998). Temporal profiles of physical health in family members of heart transplant recipients: Predictors of health change during caregiving. Health Psychology, 17, 138-151.
Edelbrock, C. (1979). Mixture model tests of hierarchical clustering algorithms: The problem of classifying everybody. Multivariate Behavioral Research, 14, 367-384.
Edinger, J. D., Stout, A. L., & Hoelscher, T. J. (1988). Cluster analysis of insomniacs' MMPI profiles: Relation of subtypes to sleep history and treatment outcome. Psychosomatic Medicine, 50, 77-87.
Everitt, B. S. (1993). Cluster analysis (3rd ed.). London: Arnold.
Fisher, L., Chesla, C. A., Skaff, M. A., Gilliss, C., Kanter, R. A., Lutz, C. P., & Bartz, R. J. (2000). Disease management status: A typology of Latino and Euro-American patients with type 2 diabetes. Behavioral Medicine, 26, 53-66.
Fisher, L., Soubhi, H., Mansi, O., Paradis, G., Gauvin, L., & Potvin, L. (1998). Family process in health research: Extending a family typology to a new cultural context. Health Psychology, 17, 358-366.
Florek, K., Lukaszewicz, J., Perkal, J., Steinhaus, H., & Zubrzycki, S. (1951). Sur la liaison et la division des points d'un ensemble fini. Colloquium Mathematicum, 2, 282-285.
Griffiths, F. (1999). Women's control and choice regarding HRT. Social Science and Medicine, 49, 469-481.
Guck, T. P., Meilman, P. W., Skultety, F. M., & Poloni, L. D. (1988). Pain-patient Minnesota Multiphasic Personality Inventory (MMPI) subgroups: Evaluation of long-term treatment outcome. Journal of Behavioral Medicine, 11, 159-169.
Hagoel, L., Ore, L., Neter, E., Silman, Z., & Rennert, G. (2002). Clustering women's health behaviors. Health Education and Behavior, 29, 170-182.
Havik, O. E., & Maeland, J. G. (1990). Patterns of emotional reactions after a myocardial infarction. Journal of Psychosomatic Research, 34, 271-285.
Hiatt, D. P., Peglar, M., & Borgen, F. H. (1984). Patterns of perception of health in cardiac patients. Journal of Psychosomatic Research, 28, 87-92.
Hillhouse, J. J., & Adler, C. M. (1997). Investigating stress effect patterns in hospital staff nurses: Results of a cluster analysis. Social Science and Medicine, 45, 1781-1788.
Hodges, K., & Wotring, J. (2000). Client typology based on functioning across domains using the CAFAS: Implications for service planning. Journal of Behavioral Health Services and Research, 27, 257-270.
Houston, B. K., Babyak, M. A., Chesney, M. A., & Black, G. (1997). Social dominance and 22-year all-cause mortality in men. Psychosomatic Medicine, 59, 5-12.
Houston, B. K., Chesney, M. A., Black, G. W., & Cates, D. S. (1992). Behavioral clusters and coronary heart disease risk. Psychosomatic Medicine, 54, 447-461.
Iezzi, T., Archibald, Y., Barnett, P., Klinck, A., & Duckworth, M. (1999). Neurocognitive performance and emotional status in chronic pain patients. Journal of Behavioral Medicine, 22, 205-216.
Jamison, R. N., Rock, D. L., & Parris, W. C. (1988). Empirically derived Symptom Checklist 90 subgroups of chronic pain patients: A cluster analysis. Journal of Behavioral Medicine, 11, 147-158.
Janman, K., Jones, J. G., Payne, R. L., & Rick, J. T. (1988). Clustering individuals as a way of dealing with multiple predictors in occupational stress research. Behavioral Medicine, 14, 17-29.
Jason, L. A., & Taylor, R. R. (2002). Applying cluster analysis to define a typology of chronic fatigue syndrome in a medically-evaluated random community sample. Psychology and Health, 17, 323-337.
Jenkins, R. A., & Burish, T. G. (1995). Health locus of control, chemotherapy-related distress, and response to behavioral intervention in cancer patients. Psychology and Health, 10, 463-475.
Jolliffe, I. T., Jones, B., & Morgan, B. J. T. (1982). Utilizing clusters: A case-study involving the elderly. Journal of the Royal Statistical Society, 145, 224-236.
Jorgensen, R. S., Gelling, P. D., & Kliner, L. (1992). Patterns of social desirability and anger in young men with a parental history of hypertension: Association with cardiovascular activity. Health Psychology, 11, 403-412.
Jorgensen, R. S., & Houston, B. K. (1986). Family history of hypertension, personality patterns and cardiovascular reactivity to stress. Psychosomatic Medicine, 48, 102-117.
Kaluza, G. (2000). Changing unbalanced coping profiles: A prospective controlled intervention trial in worksite health promotion. Psychology and Health, 15, 423-433.
Ketchen, D. J., & Hult, T. M. (2000). Validating cluster assignments. Psychological Reports, 87, 1057-1058.
Kirschbaum, C., Pruessner, J. C., Stone, A. A., & Federenko, I. (1995). Persistent high cortisol responses to repeated psychological stress in a subpopulation of healthy men. Psychosomatic Medicine, 57, 468-474.
Kos, A. J., & Psenicka, C. (2000). Measuring cluster similarity across methods. Psychological Reports, 86, 858-862.
Litt, M. D., Kalinowski, L., & Shafer, D. (1999). A dental fears typology of oral surgery patients: Matching patients to anxiety interventions. Health Psychology, 18, 614-624.
Ma, J., Betts, N. M., Horacek, T., Georgiou, C., White, A., & Nitzke, S. (2002). The importance of decisional balance and self-efficacy in relation to stages of change for fruit and vegetable intakes by young adults. American Journal of Health Promotion, 16, 157-166.
Martin, C., Wyllie, A., & Casswell, S. (1992). Types of New Zealand drinkers and their associated alcohol related problems. Journal of Drug Issues, 22, 773-796.
Maschewsky-Schneider, U., & Greiser, E. (1989). Primary prevention of coronary heart disease versus health promotion - a contradiction? Annals of Medicine, 21, 215-218.
Miller, S. B. (1993). Cardiovascular reactivity in anger-defensive individuals: The influence of task demands. Psychosomatic Medicine, 55, 78-85.
Milligan, G. W. (1980). An examination of the effect of six types of error perturbation on fifteen clustering algorithms. Psychometrika, 45, 325-342.
Milligan, G. W. (1981). A review of Monte Carlo tests of cluster analysis. Multivariate Behavioral Research, 16, 379-407.
Milligan, G. W., & Cooper, M. C. (1985). An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50, 159-179.
Mills, P. J., Dimsdale, J. E., Nelesen, R. A., & Jasiewicz, J. (1994). Patterns of adrenergic receptors and adrenergic agonists underlying cardiovascular responses to a psychological challenge. Psychosomatic Medicine, 56, 70-76.
Mulry, G., Kalichman, S. C., Kelly, J. A., Ostrow, D. G., & Heckman, T. (1997). Grouping gay men on dimensions reflecting sexual behavior preferences: Implications for HIV-AIDS prevention. Psychology and Health, 12, 405-415.
Nelson, D. V., Friedman, L. C., Baer, P. E., Lane, M., & Smith, F. E. (1994). Subtypes of psychosocial adjustment to breast cancer. Journal of Behavioral Medicine, 17, 127-141.
Newman, S., Fitzpatrick, R., Lamb, R., & Shipley, M. (1990). Patterns of coping in rheumatoid arthritis. Psychology and Health, 4, 187-200.
Nolan, R. P., & Wielgosz, A. T. (1991). Assessing adaptive and maladaptive coping in the early phase of acute myocardial infarction. Journal of Behavioral Medicine, 14, 111-124.
Nouwen, A., Gingras, J., Talbot, F., & Bouchard, S. (1997). The development of an empirical psychosocial taxonomy for patients with diabetes. Health Psychology, 16, 263-271.
Punj, G., & Stewart, D. (1983). Cluster analysis in marketing research: Review and suggestions for application. Journal of Marketing Research, 20, 134-148.
Radley, A., & Green, R. (1988). Bearing illness: Study of couples where husband awaits coronary graft surgery. Social Science and Medicine, 23, 577-585.
Raja, S. N., Williams, S., & McGee, R. (1994). Multidimensional health locus of control beliefs and psychological health for a sample of mothers. Social Science and Medicine, 39, 213-220.
Rappaport, N. B., McAnulty, D. F., Waggoner, C. D., & Brantley, P. J. (1987). Cluster analysis of Minnesota Multiphasic Personality Inventory (MMPI) profiles in a chronic headache population. Journal of Behavioral Medicine, 10, 49-60.
Robinson, M. E., Greene, A. F., & Geisser, M. E. (1993). Specificity of MMPI cluster types to chronic illness. Psychology and Health, 8, 285-294.
Roche, A. M., & Richard, G. P. (1991). Doctors' willingness to intervene in patients' drug and alcohol problems. Social Science and Medicine, 33, 1053-1061.
Romesburg, H. C. (1984). Cluster analysis for researchers. Belmont: Lifetime Learning Publications.
Rosen, J. C., Grubman, J. A., Bevins, T., & Frymoyer, J. W. (1987). Musculoskeletal status and disability of MMPI profile subgroups among patients with low back pain. Health Psychology, 6, 581-598.
Sarle, W. S. (1983). Cubic clustering criterion. SAS Technical Report A-108. Cary, NC: SAS Institute Inc.
Schnoll, R. A., & Harlow, L. L. (2001). Using disease-related and demographic variables to form cancer-distress risk groups. Journal of Behavioral Medicine, 24, 57-74.
Shaffer, J. W., Graves, P. I., Swank, R. T., & Pearson, T. A. (1987). Clustering of personality traits in youth and subsequent development of cancer among physicians. Journal of Behavioral Medicine, 10, 441-447.
Shapiro, D. E., Boggs, S. R., Rodrigue, J. R., & Urry, H. L. (1997). Stage II breast cancer: Differences between four coping patterns in side effects during adjuvant chemotherapy. Journal of Psychosomatic Research, 43, 143-157.
Shapiro, D. E., Rodrigue, J. R., Boggs, S. R., & Robinson, M. E. (1994). Cluster analysis of the Medical Coping Modes Questionnaire: Evidence for coping with cancer styles. Journal of Psychosomatic Research, 38, 151-159.
Skevington, S. M., & White, A. (1998). Is laughter the best medicine? Psychology and Health, 13, 157-169.
Sneath, P. H. A. (1957). The application of computers to taxonomy. Journal of General Microbiology, 17, 201-226.
Sokal, R. R., & Michener, (1958). A statistical method for evaluating systematic relationships. Kansas Scientific Bulletin, 38, 1409-1438.
Soldz, S., & Cui, X. (2002). Pathways through adolescent smoking: A 7-year longitudinal grouping analysis. Health Psychology, 21, 495-504.
Sollner, W., Zschocke, L., Zingg-Schir, M., Stein, B., Rumpold, G., Fritsch, P., & Augustin, M. (1999). Interactive patterns of social support and individual coping strategies in melanoma patients and their correlations with adjustment to illness. Psychosomatics, 40, 239-250.
Trask, P. C., Iezzi, T., & Kreeft, J. (2001). Comparison of headache parameters using headache type and emotional status. Journal of Psychosomatic Research, 51, 529-536.
Tryon, R. C., & Bailey, D. E. (1970). Cluster analysis. New York: McGraw-Hill.
Ward, J. H. J. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58, 236-244.
Weir, M. R., Maibach, E. W., Bakris, G. L., Black, H. R., Chawla, P., Messerli, F. H., Neutel, J. M., & Weber, M. A. (2000). Implications of a health lifestyle and medication analysis for improving hypertension control. Archives of Internal Medicine, 160, 481-490.
Wyllie, A., & Casswell, S. (1993). Identifying target segments of male drinkers for health promotion. Health Promotion International, 8, 249-261.
Wyllie, A., & Casswell, S. (1997). Gender focus of target groups for alcohol health promotion strategies in New Zealand. Health Promotion International, 12, 141-149.
Zubin, J. (1938). A technique for measuring likemindedness. Journal of Abnormal and Social Psychology, 33, 508-516.

Received 23 July 2003; revised version received 19 April 2004
