Paul Zachos, Thomas L. Hick, and Cynthia Sargent
Many science education programs in the United States are adopting the
goal of developing competence in the conduct of scientific inquiry in
response to recent recommendations from national advisory panels
(American Association for the Advancement of Science, 1990, 1993;
National Research Council, 1996). However, there are no generally
recognized, systematic methods for assessing whether this goal has been
achieved (Doran, Lawrenz, & Helgeson, 1994; Tamir, 1993). A validated
instrument for assessing students' competence in conducting scientific
inquiry would provide a basis for planning, monitoring, and facilitating
student learning. Such an instrument would also permit comparisons of
the relative efficacy of instructional programs dedicated to the
development of scientific inquiry capabilities. This paper sets
foundations for the development of such an instrument.
The present study deals exclusively with the attainment of the first goal, which traditionally has been referred to by terms such as science process skills, problem solving skills, the scientific method, scientific thinking, critical thinking, and reflexive thinking. Here the development of these intellectual characteristics will be referred to as the development of Scientific Inquiry Capabilities. Two broad reasons are typically given for developing these capabilities (Gauld, 1982):
1. Training for would-be scientists, and
2. Education for nonscientists in effective ways to deal with the
world.
and again, until the work of Galileo and Newton, has been a mathematical one. This is expressed in a radical form by Neumann (1963):

Figure 2.
The early use of the clinical method, however, was often limited by exclusive reliance on verbal reports from students. When Piaget began his collaboration with Barbel Inhelder, in studies such as The Growth of Logical Thinking from Childhood to Adolescence (1958), empirical "tasks" became a standard feature of his research. Participants' cognition was studied in the presence of an object of inquiry common to both the researcher and the participant. The virtue of having an empirical task as the focus of a clinical interview is that it provides a common referent for participant, researcher, and scientific audience. The protocol then becomes a "concurrent verbal report" (Ericsson & Simon, 1996) referring to an objective event rather than an attempt to make inferences simply from subjects' verbalizations.
In order to maximize the authenticity and generalizability of our findings, we decided to use tasks in which participants worked directly with natural phenomena instead of simulations. Working directly with natural phenomena is often considered "messy," since too many variables are involved to allow for the conventional methods of control needed to test highly specific hypotheses concerning student performance. In our experience, however, it is this very messiness that provides the rich environment necessary to elicit the full range of scientific inquiry capabilities and the participants' attempts to grasp and explore the parameters of the phenomenon. Capabilities such as Concern for Precision of Measurement, the Search for Necessary Underlying Principles, the application of Ratios and Proportions to the task, Goal Oriented Observation, and Consideration of the Relative Value of Empirical Evidence, among others, require a living interaction with natural phenomena to be elicited and engaged.
Studies of scientific inquiry capabilities such as those of Kuhn, Amsel, and O'Loughlin (1988) are composed of contrived tasks carefully structured to test hypotheses about specific scientific inquiry capabilities. While laudable in many respects, such studies are necessarily constrained in their generalizability because their focus is removed from the complexity of concerns and activities that characterize genuine scientific investigations. Furthermore, since participants in such studies are given the concepts they are to work with, preformed by the researcher, the opportunity for participants to build concepts is not available.
One protocol of particular value was found in Lochhead's study, "On learning to balance perceptions by conceptions: a dialogue between two science students" (1979). Lochhead presented a complete transcript of a student inquiry into a natural phenomenon mediated and made manifest through clinical interview. The conditions provide a rich context that allows the participant to build and test concepts, and allows the observer to study the emergence and engagement of the participant's scientific inquiry capabilities and conceptualizations.
Using Lochhead's study as a model, we developed an interview method
that we call
Structured Inquiry, which serves a number of functions:
1) Students are presented with a natural phenomenon and a task related to that phenomenon, but without being provided with the researcher/interviewer's conceptualization of the phenomenon.
2) The interviewer ensures that the student completely understands the
task.
3) The interviewer elicits the student's prior knowledge related to
the phenomenon as a baseline for assessing growth in
conceptualization.
4) The interviewer elicits the student's hypotheses, methods of inquiry, reasoning and knowledge, and dispositions related to the phenomenon during the course of the inquiry session.
Figure 1.
"For instance I may give you 5 weights to put on one side and 2 weights to put on the other."
Figure 4.
Figure 5.
Figure 6.
Figure 7.
Scoring Procedures
Raters. The structured inquiry sessions were videotaped. This provided a rich record of each inquiry session that could be reviewed repeatedly and used for both assessment and training. Judgments of participant levels of conceptualization and participant levels of performance on each of the inquiry capabilities were made, from the videotapes, by 13 different individuals referred to as Raters. Raters were former or current physical science teachers at the secondary school level, practicing scientists, a mathematics teacher, a philosopher, and a doctoral-level student of education.
Raters worked from a set of resource materials: (a) guidelines for the scoring process, (b) scales and scoring forms for each of the Scientific Inquiry Capabilities, and (c) scales and scoring forms for each of the three tasks. These resource materials are available on our website.
The first step in training was an introduction to the purpose of the
research and a description of the scales and scoring procedures. This
was followed by several intensive sessions in which the trainer and
prospective raters watched videotapes of guided inquiry sessions that
demonstrated varying levels of competence on the three tasks. Whenever a
viewer identified a scorable moment (i.e., an instance of evidence
concerning the absence or presence of a Scientific Inquiry Capability or
a Level of Conceptualization of the phenomenon) the videotape was
stopped and ratings were recorded. The viewers then compared their
ratings (or lack thereof) and discussed discrepancies in rating. The
videotape segment could be replayed to question or confirm ratings. In
this way a general understanding of the meaning of the scales was
achieved. Typically, those portions of the video in which participants were speaking contained scorable instances, but even periods of silence on the videotape could contain them (e.g., SIC #12, Consults Recorded Notes). Often there was a period of time when no new scoring could take place because a stable Level of Conceptualization had been reached and a particular Scientific Inquiry Capability was being constantly applied. Such a period provided ample opportunities for confirming the last rating that had been made. In some cases where several minutes passed without scorable instances, the videotape was stopped and the scoring guidelines were reviewed to see whether an opportunity for a scorable instance might have been missed.
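The product of such a scoring pass can be pictured as a simple record of scorable moments. The sketch below is our illustration of one possible record structure, not the authors' scoring forms; the field names and example values are invented.

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class ScorableMoment:
    """One rated instance observed while reviewing a videotaped session."""
    participant: str             # e.g., "P07" (identifier is invented)
    task: str                    # e.g., "balance beam"
    time_sec: float              # position on the videotape
    kind: Literal["SIC", "LOC"]  # capability rating or conceptualization level
    scale_id: int                # e.g., 12 for SIC #12, Consults Recorded Notes
    level: int                   # 0 = no appearance of the capability

# A rater's product for one session is a list of such moments; a session
# summary keeps the highest level reached on each SIC and the initial and
# final LOC.
```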
Scoring. Raters attended one or more training sessions to familiarize themselves with the scoring process. They took video copies of the inquiry sessions home to conduct their analyses, along with copies of the notes made by participants in the course of the inquiry sessions. The trained raters then viewed the videotapes, used the scales of Scientific Inquiry and Discovery to characterize the performance of the participants on the inquiry tasks, and recorded the initial and final levels of conceptualization on each of the criterion variables as well as the highest level of attainment demonstrated on each of the scientific inquiry capabilities.
To eliminate inadvertent bias on the part of the raters, Levels of Conceptualization (LOC) and Scientific Inquiry Capabilities (SIC) for each task were always scored by different individuals. In addition, no rater scored more than one LOC and one set of SICs for the same participant. This doubled the magnitude of the scoring task but greatly simplified the task of the raters by allowing them to focus either upon scientific inquiry processes or upon level of conceptualization, but not both. Thus no rater made judgments on the inquiry capabilities and levels of conceptualization for a participant on the same task, and no rater made judgments on more than one set of inquiry capabilities or one instance of levels of conceptualization for the same participant. This procedure protected against contamination of the Discovery variables by knowledge of the Scientific Inquiry variables and vice versa. Preventing such contamination is critical because the crux of the study was to establish the nature of the relationship between these variables. Thus, raters were assigned so that correlations between Scientific Inquiry Capabilities and Levels of Conceptualization would be free of rater bias.
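The assignment scheme can be made concrete with a small sketch. The code below is our illustration, not the authors' actual roster procedure; the participant, task, and rater identifiers are invented, and the task names simply echo Table 2. It enforces the two rules stated above with a greedy round-robin pass.

```python
# Sketch of a rater assignment satisfying the independence rules:
#   Rule 1: LOC and SICs for the same (participant, task) go to
#           different raters.
#   Rule 2: no rater scores more than one LOC and one set of SICs
#           for the same participant.
from itertools import cycle

PARTICIPANTS = [f"P{i:02d}" for i in range(1, 33)]   # 32 participants
TASKS = ["balance", "floating", "pendulum", "obj_density", "liq_density"]
RATERS = [f"R{i:02d}" for i in range(1, 14)]         # 13 raters

def permitted(assignment, rater, participant, task, measure):
    """True if giving this job to `rater` violates neither rule."""
    jobs = [(t, m) for (p, t, m), r in assignment.items()
            if r == rater and p == participant]
    if any(t == task for t, _ in jobs):       # Rule 1
        return False
    if any(m == measure for _, m in jobs):    # Rule 2
        return False
    return True

def assign_raters():
    """Round-robin assignment of (participant, task, measure) jobs."""
    assignment = {}
    pool = cycle(RATERS)
    for p in PARTICIPANTS:
        for t in TASKS:
            for m in ("LOC", "SIC"):
                for _ in range(len(RATERS)):  # try each rater at most once
                    r = next(pool)
                    if permitted(assignment, r, p, t, m):
                        assignment[(p, t, m)] = r
                        break
                else:
                    raise RuntimeError("no admissible rater for this job")
    return assignment
```

Feasibility is easy to see: each rater can take at most two jobs per participant (one LOC and one SIC on different tasks), so 13 raters comfortably cover the ten jobs a participant generates.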
Analysis
When scoring of the videotaped records of Structured Inquiries was completed, an analysis procedure was applied to establish the extent of the link between Discovery and competence in conducting scientific inquiry.
Table 1
Floating and Sinking Task: Cross-tabulation of Initial and Final Levels of Conceptualization
Table 2
Growth in Level of Conceptualization (LOC) on all tasks

                         Amount of change in LOC
Task                   -1    0    1    2    3    4    5    6   Mean    N
Balance beam            0    4    5    5    5   10    2    1   2.69   32
Floating & sinking      0   12   10    2    4    4    0    0   1.31   32
Pendulum                1   10    9    6    4    0    2    0   1.31   32
Object density          0   18    6    1    0    2    3    2   1.34   32
Liquid density          0   25    2    0    1    1    2    1   0.78   32
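As a worked check on the table, each row's mean is the count-weighted average of the change scores; for the balance beam:

```latex
\[
  \frac{(-1)(0) + (0)(4) + (1)(5) + (2)(5) + (3)(5) + (4)(10) + (5)(2) + (6)(1)}{32}
  \;=\; \frac{86}{32} \;\approx\; 2.69
\]
```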
Table 3
Values of r² for effects of Scientific Inquiry Capabilities on Discovery

SIC        BAL       FLO       OBJ DENS   LIQ DENS   PEN
SIC #1     0.07      0.04      0.06       0.02       0.00
SIC #2     0.07      0.07      0.06       0.05       0.02
SIC #3     0.01      0.00      0.03       0.00       0.00
SIC #4     0.22**    0.08*     0.11*      0.06       0.10
SIC #5     0.03      0.00      0.02       0.00       0.06
SIC #6     0.01      0.00      0.00       0.00       0.02
SIC #7     0.01      0.04      0.05       0.04       0.06
SIC #8     0.03      0.19***   0.57***    0.55***    0.03
SIC #9     0.00      0.00      0.01       0.03       0.00
SIC #10    0.06      0.05      0.23**     0.14*      0.00
SIC #11    0.13*     0.01      0.13*      0.05       0.00
SIC #12    0.24**    0.01      0.10*      0.05       0.00
SIC #13    0.04      0.01      0.04       0.01       0.01
SIC #14    0.28***   0.13**    0.22*      0.22**     0.19**
SIC #15    0.08      0.01      0.00       0.00       0.16*
SIC #16    0.02      0.03      0.08       0.05       0.03
SIC #17    0.08      0.03      0.01       0.01       0.00
SIC #18    0.12*     0.05      0.18**     0.12*      0.00
SIC #19    0.03      0.14**    0.28***    0.27***    0.00
SIC #20    0.20**    0.00      0.09       0.05       0.09
SIC #21    0.04      0.00      0.00       0.01       0.09
SIC #22    0.14*     0.02      0.12*      0.07       0.17*
SIC #23    0.09      0.10**    0.11*      0.09       0.01
SIC #24    0.01      0.03      0.15*      0.07       0.00
SIC #25    0.03      0.03      0.11*      0.06       0.00
SIC #26    0.05      0.01      0.12*      0.09       0.00
SIC #27    0.29***   0.08*     0.25***    0.24***    0.22**
SIC #28    0.05      0.01      0.10*      0.03       0.01
SIC #29    0.02      0.01      0.15*      0.06       0.01

Note. BAL = balance beam; FLO = floating & sinking; OBJ DENS = object density; LIQ DENS = liquid density; PEN = pendulum.
* = significant at .01 < p < .05
** = significant at .001 < p < .01
*** = significant at p < .001
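Values like those in Table 3 can be produced by regressing each participant's growth in Level of Conceptualization on each SIC score, one capability at a time, and reporting r² with its significance. The sketch below is our reconstruction of such a per-SIC analysis, not the authors' code; the DataFrame layout and the column names (loc_gain, sic_01, ...) are assumptions.

```python
# Sketch: per-SIC simple regressions of Discovery (LOC gain) on SIC level.
# Assumes a pandas DataFrame with one row per participant and columns
# "loc_gain" plus "sic_01" ... "sic_29" (names are invented).
import pandas as pd
from scipy import stats

def sic_effects(df: pd.DataFrame, n_sics: int = 29) -> pd.DataFrame:
    rows = []
    for i in range(1, n_sics + 1):
        res = stats.linregress(df[f"sic_{i:02d}"], df["loc_gain"])
        r2 = res.rvalue ** 2
        stars = ("***" if res.pvalue < .001 else
                 "**" if res.pvalue < .01 else
                 "*" if res.pvalue < .05 else "")
        rows.append({"SIC": f"SIC #{i}", "r2": round(r2, 2), "sig": stars})
    return pd.DataFrame(rows)
```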
Interpreting these results requires attention to the levels of the SIC scales. Recall that the scales are ordered to represent increasing competence with regard to some aspect of knowledge, skill, or disposition related to scientific inquiry. The lowest level (Level 0) indicates that no appearance of the capability was found over the course of the Structured Inquiry session. The higher levels represent increasing movement toward expert levels of performance. In many cases the highest level is recognized when the participant shows an awareness of the nature and importance of that capability for scientific inquiry as well as success in applying it. While the complete set of scales is available on the website, it may be valuable for illustrative purposes to consider performance on those Scientific Inquiry Capabilities that were most strongly related to Discovery:
SIC #14, The Search for a Necessary Underlying Principle. Participants who scored highly were those who gave evidence that they were seeking a rule, law, or formula that could account for variations in the phenomenon they were observing.
SIC #27, Proportional Reasoning. Participants had to successfully coordinate relationships between two or more ratios related to the task, for example, to coordinate the ratio of mass and distance on one side of the balance beam with the ratio of mass and distance on the other side.
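As a concrete illustration of the coordination this task demands, the law of the lever can be written as an equality of mass-distance products or, equivalently, as a proportion. This is the standard textbook relation, not a reproduction of the authors' scoring materials:

```latex
% The beam balances when the mass-distance products on the two
% sides agree; dividing through gives the proportion form.
\[
  m_1 d_1 \;=\; m_2 d_2
  \qquad\Longleftrightarrow\qquad
  \frac{m_1}{m_2} \;=\; \frac{d_2}{d_1}
\]
```

For example, five unit weights placed 2 units from the fulcrum balance two unit weights placed 5 units out, since 5 · 2 = 2 · 5.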
SIC #4, Coordinating Theory with Evidence. High performance on this
scale required the participant to use data (evidence) to evaluate their
concept (theory), or to use their concept to evaluate the data that they
had collected. The highest level of performance on this scale required
that they give evidence of doing both.
SIC #8, Formulates Composite Variables. Participants had to create a
variable which is a function of two or more other variables (e.g.,
torque, volume, density) and use this variable to investigate the
phenomenon under consideration.
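Density is the clearest case of such a composite. The relation below is the standard textbook definition and floating criterion, offered only to make the construct concrete, not as the authors' scale:

```latex
% Density as a composite of mass and volume, and the criterion a
% participant can test against floating-and-sinking data.
\[
  \rho \;=\; \frac{m}{V},
  \qquad
  \text{an object floats} \iff \rho_{\mathrm{object}} < \rho_{\mathrm{liquid}}
\]
```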
SIC #19, Identifies Sources of Error in Taking Measurements. This scale required participants to spontaneously (i.e., without prompting by the inquiry guide) suggest potential sources of error related to their measurements (e.g., inconsistency in application of a measurement instrument, inadequacies of the instrument, bias).
In general, students who were strong in the competencies measured by
the scales of Scientific Inquiry Capabilities were more successful in
making discoveries. All significant correlations between competency in
conducting scientific inquiry and growth in concept attainment were
positive.
Summary
We have presented methods for operationally defining Scientific
Inquiry Capabilities and Discovery for use in science education
programs. These methods include scales for assessing competence in
conducting disparate aspects of scientific inquiry and scales for
assessing progress in conceptualizing natural phenomena. A procedure
called Structured Inquiry was also presented. Structured Inquiry is a
method for administering tasks and eliciting participant performance
related to the targeted cognitive capabilities. This method has the
desirable features of facilitating a rich interaction between the
participant and the natural world, of eliciting the participant's
relevant cognitive characteristics, and of keeping the researcher's
judgment processes independent of the participant's.
A procedure was presented for empirically validating the proposed Scientific Inquiry Capabilities against the criterion of Discovery using multiple regression analysis. The procedure was able to identify a number of capabilities that were strongly related to Discovery, some on more than one discovery task. It was also demonstrated that progress in personal discovery of important scientific concepts, that is, success in building and testing concepts related to these phenomena, can be accomplished.
Educational Implications
In designing this study we constrained ourselves to be in harmony with the following principles in order to facilitate the applicability of the research to teaching situations:

• The constructs used in the study were designed to be understandable by and useful to teachers in classroom learning situations.
• The levels of competence on the scales of Scientific Inquiry Capabilities were stated as performance objectives so that they could translate immediately into intended learning outcomes.
• The study was constructed so that participants would be learning as their capabilities were being assessed. In fact, participants were actively constructing knowledge throughout the process of Structured Inquiry.
• All phases of the study were reviewed by a panel of teachers who regularly made suggestions which resulted in valuable improvements.
Through the present research, a method has been presented that can be used to empirically justify the inclusion of specific capabilities in a science curriculum.
References
American Association for the Advancement of Science. (1967). Science: A process approach. Washington, DC: AAAS.
American Association for the Advancement of Science. (1990). Science
for all Americans.
New York: Oxford University Press.
American Association for the Advancement of Science. (1993).
Benchmarks for science literacy. New York: Oxford University Press.
Bruner, J.S. (1960). The process of education. New York: Random
House.
Bruner, J.S., Goodnow, J., & Austin, G. (1956). A study of thinking.
New York: John Wiley & Sons.
Campbell, N.R. (1957). Foundations of science. New York: Dover.
Chi, M.T.H., Feltovich, P.J., & Glaser, R. (1981). Categorization and
representation of physics problems by experts and
novices. Cognitive Science, 5, 121 - 152.
Chi, M.T.H., Glaser, R., & Rees, E. (1982). Expertise in problem
solving. In R.J. Sternberg (Ed.), Advances in the psychology of human
intelligence (Vol. 1, pp. 7 - 75). Hillsdale, NJ: Lawrence Erlbaum
Associates.
Clement, J. (1988). Observed methods for generating analogies in
scientific problem solving. Cognitive Science, 12, 563 - 586.
Clement, J. (1991). Informal reasoning in experts and in science
students: The use of
analogies, extreme cases, and physical intuition. In J. Voss, D. Perkins,
& J. Siegel (Eds.), Informal reasoning and education. Hillsdale, NJ:
Lawrence Erlbaum
Associates.
Clement, J. (1989). Learning via model construction and criticism. In
G. Glover, R. Ronning, & C. Reynolds (Eds.), Handbook of creativity:
Assessment, theory & research (pp. 341 - 381). New York: Plenum.
Dawson, C.J., & Rowell, J.A. (1986). All other things being equal: A
study of science graduates solving control of variables problems.
Research in Science and Technological Education, 4, 49 - 60.
DeBoer, G.E. (1991). A history of ideas in science education:
Implications for practice. New York: Teachers College Press.
Dewey, J. (1910). How we think. Lexington, MA: D.C. Heath.
Dewey, J. (1933). The process and product of reflective activity: Psychological process and logical forms. In J. Boydston (Ed.), The later works of John Dewey (Vol. 8, pp. 171 - 186). Carbondale: Southern Illinois University Press.
Doran, R., Lawrenz, F., & Helgeson, S. (1994). Research on assessment in science. In D. Gabel (Ed.), Handbook of research on science teaching and learning. New York: Macmillan.
Easley, J.A. Jr. (1974). The structural paradigm in protocol analysis. Journal of Research in Science Teaching, 11, 281 - 290.
Einstein, A. (1935). The world as I see it. London: John Lane.
Ericsson, K.A., & Simon, H. (1996). Protocol analysis. Cambridge, MA: The MIT Press.
Feynman, R.P. (1993). Six easy pieces. New York: Addison-Wesley.
Gagne, R.M. (1963). The learning requirements for enquiry. Journal
of Research in Science Teaching, 144 - 153.
Gauld, C. (1982). The scientific attitude and science education: A
critical
reappraisal.
Science Education, 66, 109 - 121.
Giancoli, D. (1986). The ideas of physics. San Diego, CA: Harcourt
Brace
Jovanovich.
Ginsburg, H.P., Kossan, N.E., Schwartz, R., & Swanson, D. (1983).
Protocol methods in research on mathematical thinking. In H. Ginsburg
(Ed.), The development of mathematical thinking (pp. 7 - 47). New York:
Academic Press.
Helm, H., & Novak, J. (Eds.), (1983). Proceedings of the
international seminar: Misconceptions in science and mathematics.
Ithaca, NY: Department of Education, Cornell University.
Hesse, M. (1966). Models and analogies in science. Notre Dame, IN:
University of Notre Dame Press.
Hodson, D. (1990). A critical look at practical work in school science. School Science Review, 70(256), 33 - 40.
Holland, J.H., Holyoak, K.J., Nisbett, R.E., & Thagard, P.R. (1986).
Induction: Processes of inference, learning, and discovery. Cambridge,
MA: MIT Press.
Inhelder, B., & Piaget, J. (1958). The growth of logical thinking
from childhood to adolescence. New York: Basic Books.
Jones, M.L., & Rowsey, R.E. (1990). The effects on immediate achievement and retention of middle school students involved in a metric unit designed to promote the development of estimating skills. Journal of Research in Science Teaching, 27(9), 901 - 913.
Joram, E., Subrahmanyam, K., & Gelman, R. (1998). Measurement estimation: Learning to map the route from number to quantity and back. Review of Educational Research, 68(4), 413 - 449.
Karplus, R., Pulos, S., & Stage, E. (1983). Proportional reasoning of early adolescents. In R. Lesh & M. Landau (Eds.), Acquisition of mathematical concepts and processes (pp. 45 - 90). Orlando, FL: Academic Press.