JEA
51,2
126
Philip Hallinger
Asia Pacific Centre for Leadership and Change,
Hong Kong Institute of Education, Hong Kong, SAR, China
Abstract
Purpose – The purpose of this paper is to present a framework for scholars carrying out reviews of
research that meet international standards for publication.
Design/methodology/approach – This is primarily a conceptual paper focusing on the
methodology of conducting systematic reviews of research. However, the paper draws on a database
of reviews of research previously conducted in educational leadership and management. In a separate
effort, the author identified 40 reviews of research that had been published in educational leadership
conducted over the past five decades. The paper draws upon narrative examples from the empirical
review as a means of clarifying and elaborating on the elements of the conceptual framework. The paper
also refers to specific findings from the earlier paper in order to illustrate broader trends with respect to
how the various elements of the framework have been employed in exemplary reviews.
Findings – As scholars working across a broad range of scientific fields suggest, high quality reviews
of research represent a potentially powerful means of reducing the gap between research and
practice. Yet, the quality of research reviews conducted in educational leadership and management
remains highly variable in methodological rigor. This paper provides a conceptual framework and
language that scholars might use to guide the conduct and evaluation of future research reviews in
educational leadership and management.
Research limitations/implications – The contribution of this paper lies first in highlighting the
need for scholars to employ systematic methods when conducting research reviews in educational
leadership and management. Beyond this broad purpose, the paper provides a framework for
decision-making at different points in the review process, and a set of criteria or standards by which authors,
readers and reviewers can judge the quality of a research review. It is hoped that this conceptual
framework can provide useful methodological guidance that will enhance longstanding efforts in our
field to advance knowledge in a more systematic and coherent fashion.
Originality/value – The originality of this paper lies in its adaptation and application of recent
methodological advances in conducting reviews of research across the natural and social sciences to
the field of educational leadership and management. A search of core journals in educational
leadership and management found not a single paper that discussed methods of conducting reviews of
research. The paper offers a clear framework that will allow future scholars in educational leadership
and management to improve the quality of their research reviews.
Keywords Educational administration, Educational research, Methods, Methodology,
Research methodology, Literature, Education, Research, Leadership
Paper type Conceptual paper
Journal of Educational
Administration
Vol. 51 No. 2, 2013
pp. 126-149
© Emerald Group Publishing Limited
0957-8234
DOI 10.1108/09578231311304670
The author wishes to acknowledge the useful comments on this paper offered by Edwin M.
Bridges, Kenneth Leithwood, Ronald H. Heck, and Joseph Murphy. The author wishes to
acknowledge the funding support of the Research Grant Council (RGC) of Hong Kong for its
support through the General Research Fund (GRF ILEA 841512).
Systematic
reviews of
research
127
2007; Hallinger, 2012; Hunter and Schmidt, 2004; Jackson, 1980; Light and Pillemer,
1984; Lipsey and Wilson, 2001; Sandelowski and Barroso, 2007; Weed, 2005) as well as
lessons drawn from our earlier study of reviews of research in educational leadership
and management (Hallinger, 2012).
We begin the paper by examining the evolution of reviews of research in
educational leadership and management. Next, we clarify the methodological approach
adopted in this paper. Then we present the conceptual framework for conducting
systematic reviews of research. This section draws examples from exemplary research
reviews in educational leadership and management (Hallinger, 2012) in order to
illustrate elements of the conceptual framework. The paper concludes with a critical
assessment of the state-of-the-art in reviewing research in educational leadership and
management and recommendations for future directions in this domain.
The evolution of reviews of research in educational leadership and
management
Reviews of research in educational leadership began to appear in the published
literature during the 1960s, concurrent with the inception of the theory movement in
educational administration in the USA (Briner and Campbell, 1964; Campbell and
Faber, 1961; Erickson, 1967; Lipham, 1964). The normative approach adopted in these
reviews was consistent with what Gough (2007) termed an ad hoc method of reviewing
research. In ad hoc reviews the author begins with a very broad purpose, often without
stating specific questions or goals that will guide the review. Similarly, ad hoc reviews
often omit information on the basis for selecting studies, and procedures for extracting,
evaluating and synthesizing information (Gough, 2007; Hallinger, 2012).
For example, Campbell and Faber (1961) began their early review by stating the
following purposes and approach:
This chapter is concerned chiefly with theoretical and empirical studies of administrative
behavior and of training programs in administration. Writings which seemed to
present significant conceptual formulations or empirical data were reviewed; texts in
educational administration were omitted. Reference was made to studies in educational
administration and to those in the general field of administration which appeared relevant
to education (p. 353).
This ad hoc approach to reviewing research was consistent across multiple authors
and journals during the first 20 years of the growth of educational leadership
and management as a formal field of study. The first systematic reviews of research
in educational leadership only began to appear in our journals around 1980 (e.g.
Bridges, 1982 in Educational Administration Quarterly (EAQ); Campbell, 1979 in EAQ;
Leithwood and Montgomery, 1982 in Review of Educational Research (RER)). Yet, even
then, the broader methodological trend in reviewing the literature in educational
leadership continued to be mixed. More specifically, the author found that reviews of
research in educational leadership and management conducted over the succeeding
decades (i.e. 1980 to the present) have continued to evidence a combination of both ad
hoc and systematic reviews (Hallinger, 2012).
It should also be noted that these trends do not pertain only to reviews of research in
educational leadership and management (Gough, 2007). It is only in the past decade
that a substantial array of scholars in the natural and social sciences has sought to
build on earlier efforts (e.g. Cooper, 1982; Jackson, 1980; Light and Pillemer, 1984)
to elaborate the methodologies for conducting systematic reviews of research
(e.g. Dixon-Woods et al., 2006; Fehrmann and Thomas, 2011; Gough, 2007; Lipsey and
Wilson, 2001; Lucas et al., 2007; Sandelowski and Barroso, 2007). As research reviews
have been employed increasingly to inform public policy (EPPI, 2012; DeGeest and
Schmidt, 2010; Hattie, 2009; Lorenc et al., 2012; Shemilt et al., 2010; Valentine et al.,
2010), scholars have sought to identify a commonly accepted set of methods, criteria
and standards for conducting and assessing reviews of research (e.g. Cooper and
Hedges, 2009; Gough, 2007; Lipsey and Wilson, 2001; Lucas et al., 2007; Sandelowski
and Barroso, 2007; Thomas and Harden, 2008; Weed, 2005). The EPPI at the University
of London sums up the rationale for making reviews more systematic:
Most reviews of research take the form of traditional literature reviews, which usually
examine the results of only a small part of the research evidence, and take the claims of report
authors at face value. The key features of a systematic review or systematic research
synthesis are that:
• there is a requirement of user involvement to ensure reports are relevant and useful.
Systematic reviews aim to find as much as possible of the research relevant to the research
questions, and use explicit methods to draw conclusions from the body of studies. Methods
should not only be explicit but systematic with the aim of producing valid and reliable
results (http://eppi.ioe.ac.uk/cms/Default.aspx?tabid=67).
The assumptions and procedures underlying systematic reviews are derived from
broadly accepted standards of scientific methods and reporting (Gough, 2007).
This paper is located among these efforts to make the features of systematic reviews of
research more transparent and accessible to those who engage in this important
type of research activity. It should be noted that although the paper focusses on
published reviews of research, in most cases the procedures outlined here will also
strengthen the quality of literature reviews conducted in other types of
research-related reports (e.g. policy documents, research proposals, research reports,
master's and doctoral dissertations).
Method
This is primarily a conceptual paper that examines the methodology of conducting
systematic reviews of research. However, the paper draws on a database of reviews
of research previously conducted in educational leadership and management (see
Hallinger, 2012). As indicated earlier, in a separate effort, the author identified 40
reviews of research that had been published in a comprehensive set of relevant
journals over the past five decades (Hallinger, 2012).
The reviews were sourced from eight well-recognized, core international journals
specializing in educational leadership and management, and one general education
journal: EAQ, Journal of Educational Administration (JEA), Educational Management
Administration and Leadership (EMAL), International Journal of Leadership in
Education (IJLE), Leadership and Policy in Schools (LPS), School Leadership and
Management (SLAM), School Effectiveness and School Improvement (SESI),
International Journal of Educational Management (IJEM) and RER. It should be
noted that this particular set of journals is largely consistent with the set of core
journals selected in Leithwood and Jantzi's (2005b) review of research. An empirical
analysis of these reviews of research resulted in the identification of 17 exemplary
reviews (see Table I). These reviews were labeled exemplary in the sense that they
met all or most of the criteria that are incorporated into this paper's conceptual
framework for conducting systematic reviews of research (Hallinger, 2012).
The current paper seeks to provide greater detail on this conceptual framework for
conducting systematic reviews of research. We do, however, draw upon narrative
examples from the empirical review as a means of clarifying and elaborating on
the elements of the conceptual framework. We also refer to specific findings from the
earlier paper in order to illustrate broader trends with respect to how the various
elements of the framework have been employed in exemplary reviews conducted
in our field.
A conceptual framework for systematic reviews of research
A review of research can be organized around a set of questions that guide the
execution of the study. Taken together, these questions comprise a conceptual
framework for conducting systematic reviews of research. The questions that form this
framework include the following:
(1) What are the central topics of interest, guiding questions and goals?
(2) What conceptual framework guides the review?
(3) What are the sources and types of data employed for the review?
(4) How are data evaluated, analyzed and synthesized in the review?
(5) What are the major results, limitations and implications of the review?
Table I. Exemplary reviews of research in educational leadership and management (1960-2012)
Author                          Year      Locus          Journal  Total cites  Cites/year
Campbell                        1979      USA            EAQ        25
Leithwood and Montgomery        1982      USA/Can        RER       371
Bridges                         1982      USA            EAQ       193
Leithwood, Begley and Cousins   1990      International  JEA       145            6
Eagly, Karau and Johnson        1992      USA            EAQ       121            6
Hallinger and Heck              1996      International  EAQ       963           52
Hallinger and Heck              1998      International  SESI      778           51
Witziers, Bosker and Kruger     2003      International  EAQ       370           34
Leithwood and Jantzi            2005      International  LPS       276           22
Murphy, Vriesenga and Storey    2007      USA            EAQ        12            2
Robinson, Lloyd and Rowe        2008      International  EAQ       305           61
Murphy                          2008      International  JEA         7            1
Hallinger                       2011      International  EAQ        11           11
Leithwood and Sun               2012      International  EAQ         1            1
Walker, Hu and Qian             2012      China          SESI        1
Hallinger, Wong and Chen        In press  International  EAQ        19
Hallinger and Bryant            In press  East Asia      JEA         6
database that the author analyzes. However, instead of collecting primary data, the
reviewer collects, evaluates and synthesizes information from a particular set of studies.
Searching for sources. With this in mind, the identification of suitable studies
represents a critical condition bearing on the interpretation of findings from the review.
It is no exaggeration to assert that the reviewer's conclusions are inextricably linked to
the nature of the sample of studies that is gathered. This raises the importance of
ensuring that methods of search are comprehensive, systematic and justifiable.
Consequently, the reviewer must make both search criteria and procedures, as well as
the nature of the resulting sample of studies explicit.
In some domains of inquiry, the challenge is to identify sources from a relatively
small population of studies. In other cases, the available set of studies may be
extensive; then the challenge is to reduce the total number of studies down to a
manageable size. Thus the exemplary reviews referred to in this paper evidenced a
wide range in the sample size of primary research studies (e.g. Robinson et al. (2008),
22 studies; Witziers et al. (2003), 37 studies; Hallinger and Heck (1996), 40 studies;
Hallinger and Bryant (in press), 184 studies; Bridges (1982), 322 studies). There is no magic
number that defines the optimal number of papers to be included in a review.
The variables addressed in the research goals provide the first condition to be
considered in determining the search strategy. For example, the Hallinger and Heck
(1998), Witziers et al. (2003) and Robinson et al. (2008) studies of leadership effects all
required research reports that included, at a minimum, measures of school leadership
and student learning that had been analyzed quantitatively. Thus, inspection of
the research questions will generally point toward the domains and types of studies to
be included in the review.
The author must determine and describe the types of sources that will be included
in the review. A review may include any one or a combination of journal articles,
dissertations, books, book chapters, conference papers, etc. Again, there is no rule to
determine which combination is best. It depends largely upon the density and quality
of relevant literature identified in the domain. Exemplary reviews in educational
leadership have employed mixed source types (e.g. Bridges, 1982; Hallinger and Heck,
1996, 1998; Robinson et al., 2008) as well as single source types (e.g. Hallinger, 2011a;
Leithwood and Jantzi, 2005b; Leithwood and Sun, 2012; Murphy et al., 2007).
The author may further delimit the scope of sources in the review by specifying a
particular subset of journals (Hallinger, 2012).
Reviews can also be delimited by specification of a time period for the review. The
time period selected for each review will have its own logic. Hallinger and Bryant
(in press) stated the rationale for determining the time period for their review of
research on educational leadership and management in East Asia:
Our rationale for choosing this particular period was both historical and pragmatic.
Early commentary on the need for more research on educational leadership and management
from non-Western cultural contexts first emerged and gathered headway during the
mid-1990s […] However, it would take several years for research stimulated by this
commentary to appear in journals. Thus, we felt that there was reasonable justification for
beginning our search in 2000.
Often the logic is grounded in the evolution of the literature related to the review's
guiding questions. This reprises the notion of lineage-linked reviews. For example,
Bridges (1982) set the starting point for his review (i.e. 1967) at the end date of
Erickson's (1967) earlier review of research on the school administrator. In sum, it is
incumbent upon the reviewer to explicate the rationale for the selected search criteria
since they determine the composition of the database under review and the
information that will be synthesized.
We can further classify search procedures as selective, bounded or exhaustive.
In selective searches the criteria for inclusion in the review are based on the author's
judgment, but the criteria are never stated clearly (e.g. Bossert et al., 1982; Briner and
Campbell, 1964; Campbell and Faber, 1961; Erickson, 1967, 1979; Hallinger, 2005,
2011b; Leithwood et al., 2008; Lipham, 1964; Riehl, 2000). Selective searches do not
meet the standard for systematic reviews of research.
In a bounded search the reviewer either samples from a population of studies
(e.g. Bridges, 1982), or delimits the review through the use of explicitly stated criteria
(e.g. dates of the sources reviewed, set of journals or types of sources (e.g. Hallinger,
2012, Hallinger and Bryant, in press; Leithwood and Jantzi, 2005b; Leithwood and
Montgomery, 1982)). Bounded reviews meet the standard for systematic reviews when
the criteria are both explicit and defensible. For example, Bridges' (1982) review
specified a particular period (1967-1980). This period was bounded by the date of a
review conducted by Erickson in 1967 up to the time of Bridges' own effort. Bridges
included doctoral dissertations as well as published studies in order to achieve a broad
view of research in the field. However, his search revealed an unmanageable number of
doctoral studies. This required a creative strategy in order to reach a pragmatic but
defensible database of studies: "In light of the huge volume of research contained in
this single source [i.e. doctoral dissertations], these studies (n = 168) were selected from
each monthly issue of Dissertation Abstracts" (Bridges, 1982, p. 12). Bounded reviews
meet the standard for a systematic review when the selection of search criteria and
subsequent procedures are explicit and defensible in light of the study's goals.
In an exhaustive search the reviewer combs a wide range of possible sources in an
attempt to identify potentially relevant studies (Bossert et al., 1982; Eagly et al., 1992;
Hallinger and Heck, 1996, 1998; Murphy, 2008; Robinson et al., 2008; Witziers et al.,
2003). Exhaustive reviews place a premium on the authors ability to search for sources
efficiently and effectively. Thus, scholars are paying increased attention to the methods
of searching for relevant studies (Gough, 2007). Computer search tools (Fehrmann and
Thomas, 2011) as well as analytical tools (e.g. Harzing, 2008) can assist in making
searches more systematic and comprehensive.
Exhaustive reviews meet the standard for a systematic review when the description
of search criteria and procedures are explicit and defensible in light of the study's
goals. A common exhaustive search approach was described by Witziers et al. (2003):
A systematic search of documentary databases containing abstracts of empirical studies was
conducted. Of particular importance were Educational Resources Information Center (ERIC)
documents and database, School Organization and Management Abstracts, Educational
Administration Abstracts, and the Sociology of Education Abstracts. Although these abstracts
cover the most important scholarly journals, they do not cover all. Therefore, we paged
through volumes of relevant educational peer-reviewed journals not covered by these
(e.g. Journal of School Effectiveness and School Improvement, School Leadership and
Management, Journal of Educational Administration, etc.). Moreover, reviews and handbooks
were examined for references to empirical studies. Finally, all selected studies were examined
for references to studies as yet not uncovered (p. 404).
Data extraction. After a body of literature has been identified, the next step involves
reading the studies and extracting relevant data for analysis and synthesis (Gough,
2007). Although in-depth discussion of how to read research goes beyond the scope of
this paper, we wish to highlight the fact that all research sources (e.g. master's theses,
doctoral dissertations, blind-reviewed published research) should not be treated
as equal in quality. Therefore during the process of extracting information from
the individual studies, it is important to keep notes concerning the strengths and
weaknesses of the individual studies. Of course the types of information to be extracted
from each study will vary based upon the thematic focus, goal orientation and research
questions that are guiding the review.
The author should describe the steps taken in extracting information from the
constituent studies. The nature of the information being extracted will vary depending
upon the review's methodology. In quantitatively oriented reviews the extracted
data may be numerical (e.g. sample sizes, effect sizes, correlations, reliability
coefficients, etc.). In qualitatively oriented reviews the extracted data may consist of
narrative text, idea units, descriptions of studies or summaries of findings. In all
instances, a clear and explicit description of the data extraction procedures is
essential. This pertains to the standard of replicability of a high quality research
review (EPPI, 2012; Gough, 2007).
Tracking these data across studies is a challenging yet critically important task.
While keeping notes (e.g. in MS Word) is a necessity, in many literature reviews data
can also be coded and tracked in a MS Excel spreadsheet. Information entered into the
spreadsheet can be raw or coded, numerical or raw text (see descriptions in Hallinger,
2011a; Hallinger and Bryant, in press).
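The spreadsheet-based tracking described above can also be scripted rather than maintained entirely by hand. The following is a minimal sketch under stated assumptions: the field names and example rows are hypothetical illustrations, and the in-memory buffer stands in for a real file that would open directly in Excel.

```python
import csv
from io import StringIO

# Sketch: tracking extracted study data in a flat, spreadsheet-compatible table.
# Field names and rows are hypothetical; swap the StringIO buffer for
# open("review_data.csv", "w", newline="") to write an actual file.

FIELDS = ["author", "year", "design", "sample_n", "effect_size", "quality_notes"]

extracted = [
    {"author": "Smith", "year": 2004, "design": "survey", "sample_n": 120,
     "effect_size": 0.21, "quality_notes": "no control variables"},
    {"author": "Lee", "year": 2009, "design": "case study", "sample_n": 3,
     "effect_size": "", "quality_notes": "rich qualitative detail"},
]

buf = StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()          # one row of coded column headings
writer.writerows(extracted)   # one coded record per reviewed study
print(buf.getvalue().splitlines()[0])
```

Keeping one row per study and one column per coded attribute makes later analysis and synthesis steps (counts, cross-tabulations, effect-size pooling) straightforward.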
Murphy's (2008) review of the literature on turnaround leadership provides a useful
description of the process of data extraction. He provides a detailed list of the steps
involved as the reviewer moves from reading studies, extracting information,
generating thematic categories and coding the information prior to data analysis.
Murphy's description is too extensive to include here. However, the ten-step process of
data extraction and transformation that he followed offers a practical example of one
type of systematic approach to preparing information for data analysis and synthesis
(see Murphy, 2008, pp. 78-9).
In sum, systematic reviews place a premium on describing the nature of the
database of studies being reviewed and highlighting the means by which the data
presented to the reader have been extracted. Both should be grounded in a logic
that reflects the research questions and conceptual framework guiding the review.
In the absence of this type of explication of procedures, the reader of the review is
unable to gauge the quality of evidence (Gough, 2007) and weigh potential biases that
frame subsequent findings and conclusions.
How are data evaluated, analyzed and synthesized in the review?
All reviews of research involve the evaluation, analysis and synthesis of data.
The nature of the data gleaned from the review database will determine the types
of data analysis and synthesis that will be employed in the course of the review.
As Gough (2007) asserts:
Just as there are many methods of primary research there are a myriad of methods for
synthesizing research which have different implications for quality and relevance criteria […]
synthesis can range from statistical meta-analysis to various forms of narrative synthesis
which may aim to synthesize conceptual understandings (as in meta ethnography) or both
empirical and conceptual as in some mixed methods reviews (Harden and Thomas, 2005). In
this way, the rich diversity of research traditions in primary research is reflected in research
reviews that can vary on such basic dimensions as the nature of the questions being asked;
Possibly the most significant contributions to the literature on reviewing research over
the past two decades are found in the elaboration of methods of data synthesis.
The procedures used to synthesize findings from both qualitative (Barnett-Page and
Thomas, 2009; Dixon-Woods et al., 2006; Lorenc et al., 2012; Paterson et al., 2001;
Sandelowski and Barroso, 2007; Thomas and Harden, 2008; Weed, 2005) and
quantitative studies (Hunter and Schmidt, 2004; Lipsey and Wilson, 2001; Lucas et al.,
2007; Shemilt et al., 2010; Valentine et al., 2010) have undergone increased scrutiny and
development in recent years. The author notes the signal contribution made by the
launch of a new journal, Research Synthesis Methods, in 2010 by Schmidt and
Lipsey[3]. This journal is an invaluable resource for scholars interested in fine-tuning
the methods of their research reviews.
Evaluation of data. Evaluation refers first to an assessment of the quality
of information contained in the studies. Although the need for careful evaluation of
studies applies to all research reviews, its importance has been especially highlighted
by those engaged in meta-analysis, where the phrase "garbage in, garbage out" reaches
its ultimate application. Kyriakides et al. (2010) made this point explicitly in their
meta-analysis of the educational effectiveness literature:
These reviews, however, were usually based on a collection of studies that were subjectively
seen by the narrative review authors as good examples of research (e.g. Creemers and
Reezigt, 1996; Sammons et al., 1995) and the authors' judgments of methodological merit were
often based on idiosyncratic ideas. On the other hand, some reviews were not selective at all,
leading to a huge number of factors under consideration for which little empirical support
was provided (Levine and Lezotte, 1990). As a consequence, the results of these reviews were
questionable (p. 2).
Within the review process, the evaluation of information entails several related tasks.
First, studies must be screened for relevance to the goals of the review. On closer inspection
the researcher will often find that some studies which appeared to meet the criteria for
inclusion are inappropriate. For example, the actual sample size could be too small or
comprised of the wrong population. Other features of the study that were not apparent
on the surface could also render the study inappropriate for inclusion.
As suggested above, evaluation of the quality of studies is a separate but critically
important step. In some cases, quality concerns could lead a scholar to eliminate a study
from the review. In the case of a quantitative study this could imply the need to run
quantitative analyses both with and without that particular study (i.e. sensitivity analysis).
Alternatively, the researcher could employ other methods to compare the trend of the study
with the general trend of other studies (e.g. Gough, 2007; Hallinger and Bryant, in press).
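The with-and-without comparison mentioned above (sensitivity analysis) amounts to recomputing the pooled estimate as each study is dropped in turn. A minimal sketch, assuming hypothetical study labels, effect sizes and sample sizes:

```python
# Sketch: leave-one-out sensitivity analysis on a sample-size-weighted mean
# effect size. Labels, effect sizes (d) and sample sizes (n) are hypothetical.

studies = [("A", 0.30, 50), ("B", 0.25, 200), ("C", 0.90, 20)]  # (id, d, n)

def weighted_mean(data):
    """Mean effect size with each study weighted by its sample size."""
    total_n = sum(n for _, _, n in data)
    return sum(d * n for _, d, n in data) / total_n

print(f"all studies: {weighted_mean(studies):.3f}")
for study_id, _, _ in studies:
    rest = [s for s in studies if s[0] != study_id]
    print(f"without {study_id}: {weighted_mean(rest):.3f}")
# A pooled estimate that shifts sharply when a single study (here the small,
# extreme study C) is removed flags that study for closer quality scrutiny.
```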
Qualitative studies deserve equally stringent examination on the grounds of quality
standards (Dixon-Woods et al., 2006; Gough, 2007). Of course, the researcher cannot use
the same analytical techniques to assess the quality of qualitative data. However,
scholars are increasingly engaged in defining standards and procedures that can be
applied when working with qualitative data (Gough, 2007; Sandelowski and Barroso,
2007; Thomas and Harden, 2008; Weed, 2005). Gough (2007) describes a useful
approach to assessing the weight of evidence that is based on multiple criteria (e.g.
generic quality, research design, evidence focus, overall quality).
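Multi-criteria judgments of this kind become transparent when each study is scored on the separate criteria and the scores are combined by a stated rule. In the sketch below the criterion labels follow Gough's description, but the 1-3 scale and the conservative minimum rule are hypothetical assumptions, not Gough's own procedure.

```python
# Sketch: combining separate "weight of evidence" judgments for one study.
# Criterion labels follow Gough's (2007) description; the 1-3 scale and the
# rule that overall weight equals the weakest criterion score are hypothetical.

CRITERIA = ("generic_quality", "research_design", "evidence_focus")

def weight_of_evidence(scores):
    """Overall weight = lowest criterion score (a deliberately conservative rule)."""
    return min(scores[c] for c in CRITERIA)

study_scores = {"generic_quality": 3, "research_design": 2, "evidence_focus": 3}
print(weight_of_evidence(study_scores))  # one weak criterion caps the overall weight
```

Whatever combination rule is chosen, stating it explicitly lets readers judge, and if necessary contest, how quality assessments shaped the synthesis.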
With both qualitative and quantitative data, however, the goal at this stage is the
same: to generate a body of information that meets the requirements of the research
review in terms of both relevance and quality. As Gough (2007) discusses, relevance
and quality are interactive. Upon close inspection, a high quality study might not
be relevant due to its definition or operationalization of variables. This requires the
researcher to exercise judgment and also to articulate the decision-making process in
presentation of the review method and findings.
As noted earlier, ad hoc reviews typically skip the explicit description of evaluative
and analytic procedures applied to information extracted from the sample of studies.
This does not meet the standard of a systematic review. As in any empirical study,
systematic reviews outline and justify the analytic procedures applied to the data.
Analysis of data. The process of reviewing a body of literature can involve a
considerable amount of data analysis using tools of quantitative and/or qualitative
inquiry. As suggested earlier, at its heart, a research review is trying to make sense
of findings from a set of studies. Scholars may choose to incorporate a variety of
quantitative information into their literature reviews: effect sizes, reliability estimates,
number of members of a role group studied and sample sizes of studies. They may
use descriptive statistics to quantify trends in study characteristics and findings
across studies. For example, Bridges (1982) reported the following statistical trends
in his review:
The bulk of the research on school administrators uses either description (60%) or a single
factor/correlational without control approach (25%) in data analysis. Those approaches
that enable the investigator to render rival explanations implausible are used in less than
16% of the studies (p. 16).
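Tallies like Bridges' fall out directly once study characteristics have been coded. A minimal sketch, with purely hypothetical design codes standing in for a coded review database:

```python
from collections import Counter

# Sketch: quantifying trends in study characteristics across a coded review
# database, in the spirit of the percentages Bridges reports. Codes are hypothetical.

designs = ["descriptive", "descriptive", "correlational", "descriptive",
           "experimental", "correlational", "descriptive", "descriptive"]

counts = Counter(designs)
for design, k in counts.most_common():
    print(f"{design}: {100 * k / len(designs):.1f}%")
```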
During the course of a research review, patterns of findings may emerge that call for
more definitive explanation. Sometimes these questions can be resolved through the
reanalysis of data reported by studies identified in the review. This occurred during
Hallinger and Heck's (1998) review of the school leadership effects literature:
As noted at the outset, one purpose of this review was to explore possible explanations for the
ambiguity and inconsistency in findings of principals' effects. The conceptual organization of
the studies […] began to offer clues for the discrepant findings […] The contrasting findings
between mediated- and direct-effects studies led us to re-analyze one of the direct-effects
studies to see if formulating a different theoretical model might affect the nature of the
findings concerning leadership and school outcomes. We used available data (through
inclusion of a correlation matrix) that had employed direct-effects models and found no
principal effects on student outcomes (Braughton and Riley, 1991) […] We formulated
antecedent with mediated-effect models using their available observed variables and, in the
case of the Braughton and Riley (1991) study, applied a different analytic method. For this
analysis, we used structural equation modeling (p. 183).
In this case, the authors used stronger inferential statistical methods and a more
sophisticated conceptual model in the reanalysis of secondary data identified during
the course of their review. This enabled them to draw firmer conclusions than had been
possible from the original data analysis. Reanalysis of the original data not only
strengthened the conclusions they were able to draw concerning the substantive
research question that guided the review, but also served to illustrate an important
methodological finding from the review. Moreover, it supported their contention that
progress in research on school leadership effects had been held back by the use of
overly simplified conceptual models and statistical methods.
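The contrast between direct-effects and mediated-effects models that motivated this reanalysis can be illustrated with a minimal mediation sketch. This is a hypothetical simulation, not the authors' actual data, model or software (they used structural equation modeling); the variable names and parameter values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: principal leadership shapes school climate (path a),
# and climate in turn shapes student outcomes (path b); leadership has
# no direct path to outcomes (c' = 0) -- a purely mediated-effects model.
leadership = rng.normal(size=n)
climate = 0.6 * leadership + rng.normal(scale=0.8, size=n)
outcomes = 0.5 * climate + rng.normal(scale=0.8, size=n)

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Direct-effects model: regress outcomes on leadership alone.
total = slope(leadership, outcomes)

# Mediated-effects model: leadership -> climate (a), then climate ->
# outcomes controlling for leadership (b and c'); indirect effect = a * b.
a = slope(leadership, climate)
X = np.column_stack([np.ones(n), leadership, climate])
coef = np.linalg.lstsq(X, outcomes, rcond=None)[0]
direct, b = coef[1], coef[2]
indirect = a * b

print(f"total={total:.2f} direct={direct:.2f} indirect={indirect:.2f}")
```

In this simulation virtually all of the total effect is indirect: a direct-effects analysis that omits the mediator would attribute the leadership-outcome relationship to a single path, while the mediated model decomposes it (for OLS, total = direct + indirect holds exactly). This mirrors how reformulating a direct-effects study as an antecedent-with-mediated-effects model can change the substantive conclusion.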
The selective inclusion of quantitative methods of data analysis within a review of
research that relied primarily on description enabled the review to cross over from an
exploratory toward an explanatory level of analysis. Meta-analysis carries this logic
further, applying statistical methods to the task of understanding the trend of
substantive findings across studies (Glass, 1977; Hunter and Schmidt, 2004; Lipsey and
Wilson, 2001).
The contribution of meta-analysis to the advancement of knowledge cannot be
overstated. Scholars in the field of educational leadership and management (e.g. Bridges,
1982) were not alone in decrying the lack of knowledge accumulation and evincing
skepticism toward the potential for future progress. Scholars in organizational psychology
shared this perspective on knowledge advancement. DeGeest and Schmidt
(2010) summarize the change that meta-analysis has effected in their field of inquiry:
Researchers mourned their seemingly fundamental inability to create replicable results:
different studies produced different results, both in terms of statistical significance and the
size of relationships. It was difficult for researchers in I/O psychology to answer basic
questions important to social programs and policy […] The adoption of research synthesis
in the form of psychometric meta-analysis […] produced important ramifications for
how future research was to be conducted and how individual studies were viewed.
Meta-analysis has allowed researchers to demonstrate generalizable results across
situations for relationships between variables and to identify replicable moderators, and
has revealed other information that was obscured, distorted or unclear in the previous
primary studies (pp. 186-7).
Meta-analysis provides a weighted average effect size that adjusts for the sample size
of the particular studies, giving greater weight to studies with larger samples (Glass,
1977). The resulting generalization of effect sizes across the body of studies is more
accurate than the effect size obtained from any single research study (Hunter and
Schmidt, 2004; Lipsey and Wilson, 2001). This approach represents a significant
improvement over counting and categorizing the results obtained from a set of studies
by providing a higher level of precision and certainty concerning the pattern of findings.
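The weighting logic described above can be sketched in a few lines. The effect sizes and sample sizes below are illustrative values, not drawn from any actual review, and the sketch uses simple sample-size weights rather than the inverse-variance weights of a full meta-analysis:

```python
# Fixed-effect-style weighted mean: each study's effect size (e.g. a
# correlation or standardized mean difference) is weighted by its sample
# size, so larger studies pull the pooled average more.
studies = [
    {"effect": 0.40, "n": 50},    # small study, large effect
    {"effect": 0.10, "n": 1000},  # large study, small effect
    {"effect": 0.25, "n": 200},
]

total_n = sum(s["n"] for s in studies)
weighted_mean = sum(s["effect"] * s["n"] for s in studies) / total_n

# The simple (unweighted) average would overstate the pooled effect here,
# because it treats the small study as equal in evidential weight.
unweighted_mean = sum(s["effect"] for s in studies) / len(studies)

print(round(weighted_mean, 3), round(unweighted_mean, 3))  # prints 0.136 0.25
```

The gap between the two averages shows why weighting by sample size yields a more accurate generalization than simply counting or averaging results across studies.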
(2) Does the reviewer discuss how the design of the research review (e.g. search
criteria, sample composition, method of analysis) impacts interpretation of the
findings?
(3) Does the reviewer identify implications of the findings for relevant audiences
and clarify future directions for theory, research, policy and/or practice?
These criteria hold the reviewer accountable for making clear what has and has not
been learned from the review. Since research reviews lay down markers on the path of
knowledge accumulation, it is incumbent upon the reviewer to label the signposts
clearly. High impact reviews communicate the findings effectively, and place the
findings in perspective for the reader. As indicated in the prior sections of the paper,
research syntheses involve the compilation and summarizing of large amounts of
information. Articulating the process of compiling, extracting, evaluating, analyzing
and synthesizing the data must be carried out just as systematically as the research
process itself. Communicating and interpreting the meaning of the findings are
essential components of high quality reviews.
By way of example, Hallinger and Heck (1998) clarified the limitations of their own
findings: "Even as a group, the studies do not resolve the most important theoretical
and practical issues entailed in understanding the principal's role in contributing to
school effectiveness. These concern the means by which principals achieve an impact
on school outcomes as well as the interplay" (p. 182). Witziers et al. (2003) concluded:
"The empirical evidence reported in these five studies supports the tenability of the
indirect effect model, and comparisons of the direct with the indirect model all favor
the idea of mediated effects" (p. 418).
As asserted throughout the elaboration of this conceptual framework, the findings
from any review of research are shaped and bounded by the nature of the studies
reviewed, as well as the methods of data extraction and analysis. Systematic reviews
treat these boundaries as conditions that influence interpretation of the findings, and
make those limitations explicit. Clarifying the limitations of the review will aid in
delineating the boundaries of the accumulating knowledge base.
Finally, elaboration on the meaning of the findings that emerge from a review of
research requires the reviewer to consider multiple audiences (e.g. researchers,
practitioners, policymakers) as well as domains of knowledge (e.g. empirical,
conceptual, practical). Again, the metaphor of clearly labeling the signposts best
conveys the underlying requirement for reviews of research on this element.
Systematic reviews should point the relevant stakeholder audiences toward productive
directions, and away from unproductive culs-de-sac.
Summary of criteria in the conceptual framework
Drawing upon the five questions that comprise the conceptual framework, it is possible
to identify a number of related criteria or standards for systematic reviews of research.
These include:
(1)
(2)
(3)
(4) the types of sources included in the review (mixed, journals, dissertations) are
explicitly communicated and defensible in light of the study's goals;
(5)
(6)
(7)
(8)
Currently, these criteria form a type of holistic rubric. That is, the criteria are simply
defined in terms of key attributes. The holistic rubric was, for example, used in the
author's earlier assessment of reviews of research in educational leadership and
management (Hallinger, 2012). In the future, the author plans to transform these
into an analytical rubric that can be used to assess levels of quality with greater
reliability (Arter and McTighe, 2001).
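The distinction between a holistic and an analytic rubric can be made concrete with a small sketch. The criterion names below are illustrative placeholders, not the author's actual rubric:

```python
# Holistic rubric: one overall judgment per review, defined only by a
# list of key attributes considered together.
holistic = {
    "attributes": ["explicit search criteria", "defensible sources",
                   "transparent analysis"],
    "judgment": "meets / partially meets / does not meet",
}

# Analytic rubric: each criterion is scored separately on a defined scale,
# which supports more reliable scoring across raters.
analytic_scales = {
    "explicit search criteria": [0, 1, 2, 3],
    "defensible sources": [0, 1, 2, 3],
    "transparent analysis": [0, 1, 2, 3],
}

def total_score(scores: dict) -> int:
    """Sum the per-criterion analytic-rubric scores for one review."""
    return sum(scores.values())

example = {"explicit search criteria": 2,
           "defensible sources": 3,
           "transparent analysis": 1}
print(total_score(example))  # prints 6
```

Because each criterion is rated on its own scale, an analytic rubric localizes disagreement between raters to specific criteria, which is what makes quality judgments more reliable than a single holistic verdict.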
Conclusion
This paper sought to provide a conceptual framework and language that scholars can
use to guide the conduct of research reviews in educational leadership and
management. As scholars working across a broad range of scientific fields suggest,
high quality reviews of research represent a potentially powerful means of reducing
the gap between research and practice (e.g. Bero et al., 1998; DeGeest and Schmidt,
2010; Gough, 2007; Hattie, 2009; Hunter and Schmidt, 2004; Light and Pillemer, 1984;
Lucas et al., 2007; Montori et al., 2003; Shemilt et al., 2010; Valentine et al., 2010). It is
hoped that the methodological guidance offered through this conceptual framework
will enhance longstanding efforts to advance knowledge in a more systematic
and coherent fashion (Bossert et al., 1982; Bridges, 1982; Campbell, 1979; Campbell and
Faber, 1961; Donmoyer et al., 1995; Erickson, 1967, 1979; Griffiths, 1979; Hallinger
and Heck, 1996; Lipham, 1964; Murphy et al., 2007; Ribbins and Gunter, 2002).
The term "systematic review of research" only came into currency during the past
decade, riding the wave of evidence-based decision making[5]. When viewed within
this context, both the rationale and procedures for making reviews of research more
systematic seem almost self-evident. Indeed, they simply mirror recommended
practice for the conduct of high quality research. However, as noted by Gough (2007),
some scholars have taken issue with the procedures employed in systematic reviews.
These scholars suggest that some forms of research review may fall outside of the
evidence-based paradigm.
Ribbins and Gunter (2002, pp. 373-7), for example, differentiated between five
different knowledge domains: conceptual, humanistic, critical, evaluative and
instrumental. They have suggested that although the methodology used in systematic
reviews of research is well suited to the latter two knowledge domains, it may have more
limited applicability for the other three. Their argument implies that some procedures
recommended for systematic reviews could actually dull the edge on the interpretive
tools used in reviews grounded in the other knowledge domains.
It is, of course, possible to construct a useful review of research that eschews some
of the methods advocated in this paper. Indeed, several highly cited reviews published
by well-respected scholars failed to address a majority of the elements of the
conceptual framework (see Hallinger, 2012). Does this mean, as suggested by Ribbins
and Gunter (2002), that the systematic review framework is only valid for reviews that
are grounded in specific knowledge domains?
In order to assess this possibility, let us examine Riehl's (2000) review of research
on educational leadership for inclusive education. In the review, Riehl explicitly
adopted a conceptual perspective from critical theory. This presumably informed
her selection of the sources included in the review, extraction of information from the
studies and the interpretation of findings. The word "presumably" was highlighted
because Riehl omitted any information concerning how the sources for the review
were obtained, the collective nature of her sources, or the means by which information
was selected, evaluated, analyzed and synthesized. As suggested above, this limits
the capacity of the reader to evaluate the author's conclusions, or even to assess
alternative explanations.
Riehl's review, along with several other highly cited reviews that aligned poorly
with this conceptual framework, has been influential. For example, as of September
2012, the Riehl review had amassed over 250 citations and Bossert's review more than
650 citations. Indeed, the author of this paper has also published a well-cited research
review that aligned poorly with the conceptual framework (Hallinger, 2005).
Nonetheless, the author contends that even these influential studies of the literature
would have benefitted from being more systematic and explicit about their methods of
review. At its heart, a review of research involves identifying, accessing, managing,
evaluating and synthesizing various forms of information. This is the case regardless
of whether the information consists of numbers, narratives, ideas or themes. Scholars
working in disciplines from education, social work and management to medicine,
engineering and economics increasingly agree that even reviews that rely primarily on
the synthesis of ideas (e.g. qualitative data such as discourse, interview data, etc.)
benefit from being more systematic and explicit (DeGeest and Schmidt, 2010; Gough,
2007; Lipsey and Wilson, 2001; Paterson et al., 2001; Shemilt et al., 2010; Valentine et al.,
2010). When reviewers depart from these standards, accepted scholarly practice
requires an explicit statement of the rationale.
In conclusion, we suggest that changes in our approach to reviewing research
mirror the evolution of qualitative research over the past 30 years. More explicit
standards of practice that emphasize transparency in the research process have
replaced personal interpretation over time (Barnett-Page and Thomas, 2009; Paterson
et al., 2001; Sandelowski and Barroso, 2007; Thomas and Harden, 2008; Weed, 2005). As
the field of educational leadership and management moves forward, reviews of
research will continue to offer influential guidance to both beginning and mature
scholars. It is therefore critical that these tools used in the knowledge production
enterprise meet standards that enable them to produce cutting-edge findings that can
reliably guide theory, research, policy and practice.
Postscript
I wish to close with some final thoughts that follow from this effort to more
systematically define the standards and practices involved in conducting reviews of
research in educational leadership and management. These comments concern the
relationship between theory development, empirical research and research reviews
as scholarly activities.
While reading the early reviews of research conducted by scholars in educational
leadership and management (e.g. Briner and Campbell, 1964; Campbell and Faber,
1961; Erickson, 1967; Lipham, 1964), I was struck by the challenge these scholars had
assumed in undertaking reviews of research in an immature field of inquiry. These
pioneers sought to make sense of a field that had yet to yield a substantial foundation
of empirical research. It was only during subsequent decades that a knowledge base of
greater breadth and depth emerged on which scholars could conduct more rigorous
reviews (see Donmoyer et al., 1995; Campbell, 1979; Griffiths, 1979; Hallinger, 2011a,
2012; Murphy et al., 2007; Ogawa et al., 2000; Ribbins and Gunter, 2002). This explains,
for example, why all of the reviews of research in educational leadership and
management conducted between 1960 and 1990 were exploratory in nature (Hallinger,
2012). It was only with the emergence of a more substantial empirical knowledge base
in the 1990s that scholars were able to begin to conduct explanatory reviews (e.g. Eagly
et al., 1992; Hallinger and Heck, 1998; Leithwood and Sun, 2012; Robinson et al., 2008;
Witziers et al., 2003).
This observation yields several related recommendations. First, the field
should acknowledge its debt to these pioneering scholars. Their reviews set the
stage for the empirical and theoretical efforts of future generations of scholars
in educational leadership and management. As a field, we should not forget the
roots of our scholarship.
Second, with this particular point in mind, I wish to register my personal distress
with the short-sighted perspective of scholars who are overly prone to critiquing
authors (e.g. of dissertations or manuscripts) for citing too many "out of date"
references. In a field that is distinguished by a very slow pace of knowledge
accumulation (Bridges, 1982; Donmoyer et al., 1995; Hallinger, 2011a; Ogawa et al.,
2000), high quality research retains an especially long shelf life. Perhaps even more
importantly, sound scholarship is built upon a firm understanding of the long-term
trend of knowledge accumulation. In my own scholarship, I cannot imagine writing
about leadership for learning in 2012 without drawing on earlier work from Lipham
(1961), Bridges (1967, 1982), March (1978), Erickson (1979), Bossert et al. (1982),
Cuban (1984), Murphy (1988) and Leithwood et al. (1990). Both as an experienced
reviewer of research and as a journal editor, I would exhort colleagues
internationally to replace "demonstrates understanding of recent literature" with
"demonstrates deep understanding of the literature" as the relevant criterion when
assessing the quality of scholarship.
A third related recommendation highlights the lineage that evolves among a set of
reviews as a field of study matures over time. I earlier asserted that the explanatory
power of the reviewer's conceptual lens can be magnified dramatically by linking the
questions, frameworks and measures employed in a review to those of previous
reviewers (see Bridges, 1982; Hallinger, 2011a; Murphy et al., 2007). By doing so,
the reviewer is able to trace the developing lineage of a field more clearly and make the
current review's contributions more explicit. Thus, reviewers should be explicit
in placing their reviews of research in historical context.
A fourth recommendation concerns the critical importance of high quality
empirical research as a pre-requisite for conducting high quality reviews. A persisting
finding from scholarship conducted over the past 50 years has been the highly
variable quality of research conducted in our field (e.g. Bridges, 1982; Campbell, 1979;
Griffiths, 1979; Haller, 1979; Leithwood et al., 1990; Hallinger, 2011a; Murphy, 1988;
Witziers et al., 2003). Nonetheless, progress over time has resulted from the hard
labor of scholars who have been willing to seek funding, manage research staff, obtain
the participation of school practitioners, deal with university bureaucracies, and
more generally live with the unpleasant tasks involved in conducting empirical
research projects. Notably, a relatively small set of international scholars have
contributed the empirical studies on which reviews of research in our field are based.
These scholars also deserve acknowledgement. Their names are found in the reference
lists of our research reviews.
Finally, on the heels of acknowledging the contribution of empirical researchers
who cut the individual stones on which the foundations of our field are based, I wish to
close by reasserting the importance of the research review as a form of research
activity. As suggested earlier, research reviews take the stones cut by individual
researchers and mold them into a coherent meaningful shape.
Perhaps because all scholars write literature reviews in the course of their
empirical studies, this form of scholarship comes to be taken for granted. Our field
must, however, take the practice of reviewing research more seriously, and accord
it a status equal to that of theoretical and empirical contributions (see also Murphy
et al., 2007). As suggested, all three forms of research activity make unique yet
complementary contributions to knowledge accumulation. Perhaps as our reviews
of research become more systematic, their value to the research enterprise will
be acknowledged more explicitly among scholars, not only implicitly through their
high citation rates.
Notes
1. This figure was obtained through a search of Google Scholar on November 10, 2012.
2. We examined citation trends in eight widely read international journals in the field of
educational leadership and management. Reviews of research held the position as
the most frequently cited article in six of the eight journals. We further note that the
education journal consistently among those with the highest impact factor in the Social
Science Citation Index is the Review of Educational Research (RER).
3. See the journal's website at http://onlinelibrary.wiley.com/journal/10.1002/%28ISSN%291759-2887
4. This may be less critical in a review that focusses on methodological characteristics
of studies.
5. Readers will note the explosive response to the publication of the Hattie (2009) meta-analytic
review of factors that impact achievement. Similarly, within educational leadership and
management, the reviews by Robinson et al. (2008) and Leithwood et al. (2008) have achieved
annual levels of citation impact only occasionally seen in the educational leadership and
management literature (i.e. 450 citations per year).
References
Arter, J. and McTighe, J. (2001), Scoring Rubrics in the Classroom, Corwin Press,
Thousand Oaks, CA.
Barnett-Page, E. and Thomas, J. (2009), Methods for the synthesis of qualitative research: a
critical review, NCRM Working Paper, NCRM, Social Science Research Unit, Institute of
Education, London.
Bero, L., Grilli, R., Grimshaw, J., Harvey, E., Oxman, J. and Thomson, M.A. (1998), Closing
the gap between research and practice: an overview of systematic reviews of interventions
to promote the implementation of research findings, British Medical Journal, Vol. 317
No. 7156, pp. 465-8.
Bossert, S., Dwyer, D., Rowan, B. and Lee, G. (1982), The instructional management role of the
principal, Educational Administration Quarterly, Vol. 18 No. 3, pp. 34-64.
Bridges, E. (1967), Instructional leadership: a concept reexamined, Journal of Educational
Administration, Vol. 5 No. 2, pp. 136-47.
Hallinger, P. (2011b), Leadership for learning: lessons from 40 years of empirical research,
Journal of Educational Administration, Vol. 49 No. 2, pp. 125-42.
Hallinger, P. (2012), Reviewing Reviews of Research in Educational Leadership: An Empirical
Analysis, Monograph, Asia Pacific Centre for Leadership and Change, Hong Kong Institute
of Education, Hong Kong, available at: www.ied.edu.hk/apclc/monographs.html
Hallinger, P. and Bryant, D.A. (in press), Mapping the terrain of research on educational
leadership and management in East Asia, Journal of Educational Administration.
Hallinger, P. and Heck, R.H. (1996), Reassessing the principals role in school effectiveness:
a review of empirical research, 1980-1995, Educational Administration Quarterly, Vol. 32
No. 1, pp. 5-44.
Hallinger, P. and Heck, R.H. (1998), Exploring the principal's contribution to school effectiveness:
1980-1995, School Effectiveness and School Improvement, Vol. 9 No. 2, pp. 157-91.
Hallinger, P., Wong, W.C. and Chen, C.W. (in press), Assessing the measurement properties of the
principal instructional management rating scale: a meta-analysis of reliability studies,
Educational Administration Quarterly.
Harzing, A.W. (2008), Google scholar: a new data source for citation analysis, available at:
www.harzing.com/pop_gs.htm (accessed February 22, 2008).
Hattie, J.A.C. (2009), Visible Learning: A Synthesis of over 800 Meta-Analyses Relating to
Achievement, Routledge, London.
Heck, R.H. (2012), personal communication, August 5.
Hunter, J.E. and Schmidt, F.L. (2004), Methods of Meta-Analysis: Correcting Error and Bias in
Research Findings, 2nd ed., Sage, Thousand Oaks, CA.
Ioannidis, J.P.A. (2010), Meta-research: the art of getting it wrong, Research Synthesis Methods,
Vol. 1, pp. 169-84.
Jackson, G.B. (1980), Methods for integrative reviews, Review of Educational Research, Vol. 50
No. 3, pp. 438-60.
Kyriakides, L., Creemers, B., Antoniou, P. and Demetriou, D. (2010), A synthesis of studies
searching for school factors: implications for theory and research, British Educational
Research Journal, Vol. 36 No. 5, pp. 807-30.
Leithwood, K. and Jantzi, D. (2005a), A review of empirical evidence about school size effects:
a policy perspective, Review of Educational Research, Vol. 79 No. 1, pp. 464-90.
Leithwood, K. and Jantzi, D. (2005b), A review of transformational school leadership research
1996-2005, Leadership and Policy in Schools, Vol. 4 No. 3, pp. 177-99.
Leithwood, K. and Montgomery, D. (1982), The role of the elementary principal in program
improvement, Review of Educational Research, Vol. 52 No. 3, pp. 309-39.
Leithwood, K. and Sun, J.P. (2012), The nature and effects of transformational school leadership:
a meta-analytic review of unpublished research, Educational Administration Quarterly,
Vol. 48 No. 3, pp. 387-423.
Leithwood, K., Begley, P. and Cousins, B. (1990), The nature, causes and consequences of
principals' practices: an agenda for future research, Journal of Educational
Administration, Vol. 28 No. 4, pp. 5-31.
Leithwood, K., Harris, A. and Hopkins, D. (2008), Seven strong claims about successful school
leadership, School Leadership and Management, Vol. 28 No. 1, pp. 27-42.
Light, R.J. and Pillemer, D.B. (1984), Summing Up: The Science of Reviewing Research, Harvard
University Press, Cambridge, MA.
Lipham, J. (1961), Effective Principal, Effective School, National Association of Secondary School
Principals, Reston, VA.
Further reading
Harzing, A.W. (2012), Publish or perish, available at: www.harzing.com.
Hass, E., Wilson, G., Cobb, C., Hyle, A. and Kearney, K. (2007), An analysis of citations to
Educational Administration Quarterly, Educational Administration Quarterly, Vol. 43
No. 4, pp. 494-513.