Downloaded by [Tulane University] at 11:11 26 January 2015
Journal of Moral Education, Vol. 31, No. 4, 2002
ABSTRACT
The Test for Ethical Sensitivity in Science (TESS) described in this article is a
pen-and-paper measure for studying ethical sensitivity development in young adults. It was
developed to evaluate the impact of a short ethics discussion course for university science
students. TESS requires students to respond to an unstructured story and their responses are
scored according to the level of recognition of the ethical issues in the scenario provided. When
TESS was used in conjunction with ethics teaching it showed that university science education
seems to provide no inherent benefits in ethical sensitivity development but that a short course
in ethics can have a significant impact on students’ ability to recognise ethical problems.
1. People may block from their consciousness certain aspects because the cues
in the situation are ambiguous and it becomes difficult to interpret them.
2. Research shows that there are distinct individual differences in sensitivity to
needs and welfare.
3. Research has shown that there can be a strong affective response before
extensive cognitive encoding.
practising dentists and moral philosophers. DEST has proved to be reliable, with
inter-scorer agreement averaging 0.87 and test–retest correlation averaging 0.68
(Bebeau & Brabeck, 1987). The correlation between DEST scores and DIT was
found to be between 0.2 and 0.5.
DEST is very specific for measuring moral sensitivity in a professional context.
The research literature does not entertain considerations of whether professional
moral sensitivity can be understood as general moral sensitivity or whether moral
sensitivity can develop in relative isolation in different areas of life, and thus one
should not extrapolate these results to measures of general moral sensitivity. Fur-
ther, this approach is best applicable to professions where moral considerations are
situated in personal interactions and which have an agreed code of professional
ethics, as in medicine, teaching and law and, to a certain extent, in science (fraud,
whistle-blowing); but this is a less suitable approach for measuring moral sensitivity
interaction with the problem. We can construct morally demanding and complex
situations where students would need to face issues in animal welfare, whistle-
blowing, or use of human subjects, but many ethically demanding problems are not
easily captured in this manner: what are the limits of genetic research, what type of
research should we do, who makes the decisions in the direction of modern
bioscience, etc. Therefore, to study ethical sensitivity towards general scientific
problems in a large population of science students, it became necessary to develop
a new test.
problems. A moral problem is unstructured when it does not directly indicate the
moral issues involved, either by describing them in the problem narrative or by
giving moral statements to choose as possible solutions or considerations for arriving
at a solution. The problems used in DIT are “structured” moral problems, because
the narrative structure describes a particular moral dilemma (e.g. should Heinz steal
and save his wife, or should he not steal and not save his wife) and the consider-
ations for the decision are all part of the moral deliberation process. An
“unstructured” moral problem is thus a problem scenario which has moral compo-
nents, but where these components are not self-evident, and a solution to the
problem can be arrived at without ethical considerations (although that solution
would indicate low ethical sensitivity).
It is therefore impossible to measure ethical sensitivity with a “tick-a-box”
method. Any such method would have to include some level of pre-established
moral analysis, which would have taken place before any statements to choose from
could have been produced. For example, a test protocol which gives students an
unstructured moral problem and then offers several ethical and non-ethical elements
to choose to include in their deliberation, would not test the recognition of ethical
issues, but the importance students place on these issues. It has been found that
people can recognise and discriminate and thus prefer an idea before they can
paraphrase it or before they can spontaneously produce the idea in a response to a
story dilemma (Rest, 1976). An ethical sensitivity test needs to measure the spon-
taneous recognition of moral issues, the interpretation of a situation in moral terms,
if we wish it to represent the ethical sensitivity skills needed in real-life situations.
Therefore, the nature of moral sensitivity requires the test of moral sensitivity
to be qualitative, to allow subjects to respond to an unstructured problem with only
minimal guidelines or pre-established thought-patterns. This type of qualitative data
can be collected either verbally in an interview or in a written form. DEST used both
methods, which provided equally valid and reliable data (Bebeau et al., 1985), while
the interview scores yielded higher estimates of moral sensitivity, as judges felt they
had a better opportunity to confirm their judgement from verbal responses. Interviews
may produce more data, but they are also more laborious to administer. When
the need is to test large numbers, the appropriate choice is a written test-format.
444 H. Clarkeburn
could be used to treat patients with cystic fibrosis. Other pharmaceutical methods to produce this protein
have not been successful or they have been very expensive. The plan is to introduce a new gene from another
animal into the genetic sequence of the cow that directs the production of the mammary gland to change
it from producing normal milk into producing a pharmaceutical milk containing the desired proteins. The
new gene will be introduced by nuclear transfer, a technique also used in cloning. The group hopes to
develop its research findings into a commercial product.
Do you think the research should go ahead?
Pilot Studies
TESS was developed as part of a research programme aimed at designing and
evaluating an ethics programme for a large number of life sciences undergraduates
at the University of Glasgow (Clarkeburn et al., 2001).
Three different unstructured scenarios were piloted for TESS. Two of the
stories were based on realistic research proposals found in Bruce and Bruce (1999)
(genetic modification of a cow to produce pharmaceutical milk for cystic fibrosis
(Story 2) and genetically enhancing nutritious qualities of a plant (Story 3)). The
third story in the pilot study described a take-over offer made to a successful
research laboratory (Story 1). Each story finished with a question asking whether or
not some action should be taken (see Table I for details).
For each story students were asked to write down no more than five issues or
questions they believed should be considered before a decision on the topic could be
made. Students had 15 minutes to complete the task. Each story was piloted with
approximately 20 bioscience students from the first year (Level One/L1) and 20
from the third year of a four-year Honours course (Level Three/L3).
A Test for Ethical Sensitivity in Science 445
At the first stage the numbers of responses per student were collected and
themes of responses identified. Table II details the mean numbers of responses
made by each student group to each of the piloted stories.
Stories 1 and 3 did not generate significantly different numbers of responses by
L1 and L3 students, while L3 students made significantly more responses than L1
students to Story 2. There are three possible interpretations for the different
answering patterns: (1) either there is no spontaneous developmental advantage in 2
years of science study measurable by simple response frequencies and Story 2
provides a false impression of such advantage; or (2) the spontaneous advantage
occurs and Stories 1 and 3 fail to capture it; or (3) due to small sample sizes, there
is a possibility of pseudo-difference between levels which is coincidental rather than
representative.
The numbers of themes identified by students for each pilot story are shown in
Table III. At this stage, Story 1 was removed from further analysis and development
for two reasons: (1) it generated the fewest responses from both student
groups and thus provided the least material for further analysis; and (2) it generated
the highest number of themes, which complicates the design of a scoring guide.
At the next stage the responses to Stories 2 and 3 were categorised as either
ethical or non-ethical considerations. This was achieved by asking whether the
question (only 9% of the responses were not questions) could be answered sufficiently
by reference to scientific/technical/financial data alone. If the answer was yes, the
response was classified as non-ethical. An example of a non-ethical response is: how
much milk do the CF sufferers need to drink? In contrast, an example of an ethical
response is: will the benefits to patients be worthwhile enough to justify altering the
genetic composition of a cow? Table IV details the results.
TABLE III. Number of themes identified for each pilot story
            Number of themes
Story 1     13
Story 2     8
Story 3     11
% of ethical responses (columns: L1, L3)
Story 2 generated more ethical responses than Story 3 in both student populations.
This can be interpreted as either (1) that the ethical issues in Story 2 are more
accessible; or (2) that Story 2 contains more ethical issues per se. It is also worth
noting that the smaller number of L1 responses to Story 2 was more concentrated on
ethical issues than the larger number of L3 responses. It was assumed that it is most
likely that the ethical issues in Story 2 are more accessible to students than in Story
3 rather than there being an inherent difference of ethical concerns to be recognised.
The next stage of ethical sensitivity measure development was to look at the
ethical responses in more detail. A three-tier structure, similar to that developed for
DEST, was adopted (see Table VI for details of the approach). The lowest tier
represents a very general recognition of the issue, the second tier shows more
detailed understanding of the issue, and the third and last tier provides evidence of
a more extensive and mature understanding of the problems and stakeholders
involved. A three-tier scoring guide was developed for each theme in both stories,
and the responses were then analysed.
Story 3 proved harder to analyse, as many responses covered several themes and
in some thematic categories there were either no lowest or no highest tier responses,
casting doubt on the accuracy of the scoring guide. Because of these problems, which
were absent from the Story 2 analysis, further efforts were concentrated on improving
the scoring method for Story 2.
Scoring TESS
The scoring guide for Story 2 was developed from a total of 44 completed
questionnaires from L1 and L3 students. First, all the themes were submitted to
pre-established tests of logic, as suggested by Bebeau et al. (1985): is a criterion
logically independent of every other (i.e. could an individual score high in one, but
not the other)? Using this method the response themes for this story were reduced
from eight to four. In the remaining four main themes, there were altogether nine
different subthemes. See Table V for details.
Main themes (Table V): Risks; Cost and benefit; Basic values; Public opinion
For each theme/subtheme a four-tier scoring guide (tiers 0 (non-ethical
response) to 3 (highest level ethical response)) with sample entries was developed. To
ensure its validity in representing ethical sensitivity development, it was independently
evaluated by four academics at the University of Glasgow, representing
different disciplines: philosophy, education and science. A special effort was made to
describe each tier so that the length of a student’s answer was not a decisive element
in its allocation into a tier. As an example of the final scoring guide, for risks/
animals, the tiers can be seen in Table VI.
TABLE VI. Scoring guide tiers for risks/animals
Tier 0   Questions of risks for which an answer can be given on a purely factual basis, i.e. no moral
         consideration required
         Sample entries: How will the gene affect cow’s original genes?
                         Where do the genes come from?
Tier 1   First level recognition of risk, which might serve as a stepping stone for higher level
         considerations, but that is not apparent in the response
         Sample entries: Will the cow suffer from producing the milk?
                         Animals should not be subjected to any pain or distress
Tier 3   The responses include mature considerations about the role of decision-makers and
         what should influence the acceptance of different levels of risk. Justification of using
         animals is explicitly sought
         Sample entries: How much animal suffering can be justified for commercial profit?
Once the scoring guide was complete, a further three independent raters were
asked to use it to score 10 questionnaires consisting of altogether 36 responses.
If there was an inconsistency between ratings, the response was brought to a
meeting. There were eight responses where agreement needed to be sought: in
five cases the disagreement was about which subgroup the response belonged to and
in three cases about which tier was most appropriate. The guidelines in the scoring guide
that led to these inconsistencies were altered after consultation with the independent
raters. The raters also reported that the guide took some time to learn, but was
logical and simple to use thereafter.
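The rater check reported above (36 responses scored, eight of which needed discussion) implies a simple proportion of responses on which the raters agreed outright. The following is a minimal sketch of that arithmetic only. The function name `percent_agreement` is hypothetical, and the article does not itself report an agreement coefficient for TESS.

```python
def percent_agreement(total_responses: int, disputed_responses: int) -> float:
    """Return the share of responses on which all raters agreed without discussion."""
    if total_responses <= 0:
        raise ValueError("total_responses must be positive")
    if not 0 <= disputed_responses <= total_responses:
        raise ValueError("disputed_responses must lie between 0 and total_responses")
    return (total_responses - disputed_responses) / total_responses


# Figures from the text: 36 responses, 8 brought to a meeting.
agreement = percent_agreement(36, 8)
print(f"{agreement:.1%}")  # 28 of 36 responses were scored consistently
```

Simple percentage agreement does not correct for agreement expected by chance, so it should be read as a rough indication rather than a formal reliability estimate.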
To generate a TESS score, it was decided that each response would receive a
score equal to the tier it belonged to (tier 0 = 0 points, tier 1 = 1 point, tier
2 = 2 points and tier 3 = 3 points). If there was more than one response belonging
to the same subcategory, only the highest scoring response was included in the final
score to avoid high scores being generated by rephrasing essentially one item several
times. Also, if in doubt, the response was scored on a lower tier to remove the
possibility of the rater filling in gaps in the responses and thus scoring more
subjectively.
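The accumulation rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the author's scoring tool: the function name `tess_score` and the subcategory labels in the example are hypothetical, and the assignment of each response to a subcategory and tier is assumed to have been made already by a human rater (including the "if in doubt, score lower" judgement, which is not modelled here).

```python
from typing import Dict, Iterable, Tuple


def tess_score(rated_responses: Iterable[Tuple[str, int]]) -> int:
    """Sum the highest tier (0-3) awarded within each subcategory.

    Each item is (subcategory, tier). Repeated responses in the same
    subcategory count only once, at the highest tier reached.
    """
    best_tier: Dict[str, int] = {}
    for subcategory, tier in rated_responses:
        if tier not in (0, 1, 2, 3):
            raise ValueError(f"tier must be 0, 1, 2 or 3, got {tier!r}")
        best_tier[subcategory] = max(tier, best_tier.get(subcategory, 0))
    return sum(best_tier.values())


# Hypothetical rated questionnaire: two responses fall in the same subcategory,
# so only the higher of the two (tier 3) contributes to the total.
example = [
    ("risks/animals", 1),
    ("risks/animals", 3),
    ("cost and benefit", 2),
    ("public opinion", 0),
]
print(tess_score(example))  # 3 + 2 + 0 = 5
```

Keeping only the maximum per subcategory is what prevents a student from raising the score by rephrasing one idea several times.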
TABLE VIII. Mean numbers of responses for L3, pre- and post-TESS
4.275 and for L3 4.780. These scores were not statistically different (P = 0.097,
unpaired t-test). This suggests that 2 years of university studies in science provides
no advantage in ethical sensitivity development as measured by TESS.
In L3, students were randomly divided into test (n = 133) and control (n = 134)
groups. The test group participated in an ethics intervention which aimed at
increasing students’ awareness of ethical issues. The intervention consisted of three
structured group discussions on ethical themes introduced to the students by
preliminary reading, which was either a scientific paper or a short philosophical
extract. Discussion groups were never larger than 15 students, usually 12, and each
discussion lasted approximately 2 hours. The discussion themes were chosen in
collaboration with the course co-ordinators and students. The aim was to design the
discussions so that they would be interlinked with the existing science curriculum
and touch on topics relevant to student experience. All groups started with a
discussion on the use of animals in bioscience research, followed by a subject-specific
topic (drug trials for pharmacology, DDT use in malaria control for zoology,
etc.); the last discussion was on scientific integrity and misconduct. All groups were
facilitated by an ethicist (Clarkeburn et al., 2001).
TESS was administered to both test and control groups a second time
(post-test) a minimum of 3 weeks after the intervention at the end of term 2. The
control group students participated in the ethics discussion groups in term 3.
The number of responses made in the pre-questionnaire was not significantly
different between test and control groups (P > 0.05, unpaired t-test), while the mean
number of responses was significantly different in the pre- and post-administration
of TESS for both test and control groups (P < 0.001, paired t-test), with both groups
showing a significant increase in the response number in the post-test. However,
there was no significant difference in the number of responses between control and
test groups in the post-test (see Table VIII for details).
The pre-TESS scores were not significantly different between test and control
groups. However, the post-TESS scores, after the intervention, were significantly
different between test and control groups (P < 0.05, Wilcoxon’s test). Table IX
details the mean pre- and post-TESS scores.
When we look at the direction of change within both groups (Table X) and the
paired t-test results in Table VIII, we find that the ethical sensitivity score was not
number of students both regressing and progressing in the control group suggests
that TESS is sensitive to other elements than just development of ethical sensitivity.
This level of noise can make interpretation of small sample sizes difficult, but should
cast less doubt on analysis of larger sample sizes.
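The direction-of-change comparison discussed above amounts to classifying each student's paired pre- and post-TESS scores. A minimal sketch follows, assuming each student has exactly one pre-score and one post-score. The function names and the score pairs are hypothetical and are not data from the study; the labels follow the article's wording of students progressing and regressing.

```python
from collections import Counter
from typing import Iterable, Tuple


def direction_of_change(pre: int, post: int) -> str:
    """Label one student's change between the two TESS administrations."""
    if post > pre:
        return "progress"
    if post < pre:
        return "regress"
    return "no change"


def tally_directions(paired_scores: Iterable[Tuple[int, int]]) -> Counter:
    """Count how many students progressed, regressed or stayed the same."""
    return Counter(direction_of_change(pre, post) for pre, post in paired_scores)


# Hypothetical (pre, post) pairs, for illustration only.
hypothetical_pairs = [(4, 6), (5, 3), (4, 4), (2, 5)]
tally = tally_directions(hypothetical_pairs)
print(dict(tally))
```

A tally of this kind shows how much movement occurs in both directions within a group, which is the sort of noise the text cautions about when samples are small.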
The data on the control group ethical sensitivity scores and direction of change
indicate that there is no advantage in completing TESS twice, as there was no
significant increase in their TESS scores.
Finally, there were no significant gender differences in the pre- or post-TESS
scores (P > 0.05 for pre- and post-, unpaired t-test). However, there was an indicative
tendency of male students in the control group to regress more often than
female students, and similarly, the male students in the test group were more likely
to progress than female students. Table XI details the results.
Administering TESS takes less than 15 minutes and an experienced scorer can
score around 30 protocols an hour. This makes TESS relatively low in both labour
requirements and demands on student time. Students in this study had no difficulty
filling in TESS and, judging from the number of responses they wrote down, they also
took the test seriously; thus the results can be considered a fair approximation of
their ethical sensitivity developmental stage in relation to scientific problems. However,
it is possible that a more accurate picture of students’ ethical sensitivity could
be gained if they were to respond to more than one story at the same time, possibly
relating to different areas of science or representing ethical scenarios outside the
scientific realm. This would provide an opportunity to compare ethical sensitivity
across topics and reduce the risk of the chosen story being attractive/non-attractive to
the student. This approach was used in both DEST and McNeel’s measure, but not
employed here because of time pressures on the overall test time.
It is also worth noting that the scores were relatively low, given that the maximum
score is 15 and the pre-ethics education average was less than five. Students recognised
relatively few ethical issues and in most cases recognised them only at the most
superficial level. This can naturally be partly an effect of the structure of TESS,
where students might wish to write answers that are as short as possible in order to complete
the test as quickly as possible. However, when the lengths of the answers from
students scoring high and low were compared, there were no significant differences.
This seems to be an indication of relatively low levels of ethical sensitivity in science
students. Their ignorance of even very obvious ethical issues in the discussion
groups supports this interpretation (Clarkeburn et al., 2001). Education designed
specifically to raise students’ awareness and support the development of their ethical
sensitivity seems to be both possible and needed.
The evidence presented in this paper suggests that TESS provides a good
methodology for measuring ethical sensitivity development in large student and
young adult populations.
Acknowledgements
I would like to thank Dr Roger Downie for his support throughout the development
of TESS and The European Commission for the funding which allowed the work to
be carried out.
REFERENCES
BEBEAU, M.J. & BRABECK, M.M. (1987) Integrating care and justice issues in professional moral
education: a gender perspective, Journal of Moral Education, 16, pp. 189–203.
BEBEAU, M.J., REST, J.R. & YAMOOR, C.M. (1985) Measuring dental students’ ethical sensitivity, Journal
of Dental Education, 49, pp. 225–235.
BRUCE, D. & BRUCE, A. (1998) Engineering Genesis: the ethics of genetic engineering in non-human species
(London, Earthscan).
CALLAHAN, D. (1980) Goals in the teaching of ethics, in: D. CALLAHAN (Ed.) Ethics Teaching in Higher
Education, pp. 61–80 (New York, Plenum).
CLARKEBURN, H.M. (2000) How to Teach Science Ethics: strategies for encouraging moral development in
biology (and other) students through the design and use of structured exercises in bioethics, PhD thesis,
University of Glasgow.
CLARKEBURN, H.M., DOWNIE, J.R. & MATTHEW, B. (2002) Impact of minimal ethics intervention in a
science curriculum, Teaching in Higher Education, 7, pp. 65–79.
MCNEEL, S.P. (1994) College teaching and student moral development, in: J.R. REST & D. NARVAEZ
(Eds) Moral Development in the Professions: psychology and applied ethics, pp. 1–26 (Hillsdale, NJ,
Lawrence Erlbaum Associates).
REST, J.R. (1976) New approaches in the assessment of moral judgement, in: T. LICKONA (Ed.) Moral
Development and Behavior: theory, research, and social issues, pp. 198–218 (New York, Holt, Rinehart
and Winston).
REST, J.R. (1986) Programs and Interventions, in: J.R. REST (Ed.) Moral Development: advances in
research and theory (New York, Praeger).