Assessment & Evaluation in Higher Education, Vol. 27, No. 5, 2002

Does the Use of Student Feedback Questionnaires Improve the Overall Quality of Teaching?

DAVID KEMBER, DORIS Y. P. LEUNG & K. P. KWAN, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

ABSTRACT An investigation was conducted into 3- or 4-year departmental sets of student feedback questionnaire data from one university. Only four out of 25 departments had significant changes to any of the six dimensions in the 3- or 4-year period, and three of these significant changes were falls. There is, therefore, no evidence that the use of the questionnaire was making any contribution to improving the overall quality of teaching and learning of the departments, at least as perceived by the students. If it was, there should have been evidence of rising values. The following reasons why the use of the questionnaire might not have been conducive to improving teaching quality are discussed. The possibility that teaching quality is inherently stable is rejected. It is possible that feedback from the questionnaire was not used effectively. Related to this is whether instructors perceived that the university rewarded good teaching, and so felt there was an incentive to make use of the feedback. The emphasis of the system was on appraisal, which might negate any developmental effect. The standard questionnaire and the associated procedures may have lacked flexibility and been inappropriate for innovative forms of teaching. The study questions whether student feedback questionnaires are utilising resources effectively if they are administered in an environment similar to the university in question, which appears reasonably typical.

Student Feedback Questionnaires


Many universities now make use of student feedback questionnaires. They must be the most widely used form of teaching evaluation in higher education. Near the end of every semester millions of students throughout the world fill in questionnaire forms to give their ratings on their instructors and their courses.

Three justifications are commonly given for this exercise and these can be interrelated. The first is that the feedback obtained through the questionnaires contributes to improving the quality of teaching. Instructors take note of any weaknesses or areas for potential improvement revealed by the questionnaire data. In their subsequent teaching they make efforts to remediate weaknesses and improve their teaching. The logical outcome of this process would be an overall increase in the quality of teaching over time.
Ratings from student feedback questionnaires are also commonly used in appraisal exercises. Decisions about tenure, contract renewal and promotion now commonly require evidence of teaching ability as well as research output. In recent years a number of university systems have also instituted schemes for regular staff appraisal, which incorporate monitoring of teaching performance. Such exercises should also result in an enhancement of teaching quality, as those with poorer ratings have an inducement to improve their teaching and the worst teachers could be weeded out.

The final reason for having student feedback questionnaire schemes is that it is an explicit requirement, or felt by university administrations to be an implicit obligation. In Australia all universities are required to use the Course Experience Questionnaire to evaluate their programs (Ramsden, 1992). It has become common to subject universities to quality reviews in which they are required to demonstrate that they have in place adequate procedures for ensuring teaching quality. Having a system for regularly administering student feedback questionnaires would probably be the number one requirement of most review panels.

These three reasons for making use of student feedback questionnaires can obviously be interrelated, particularly if the final reason is evident. The requirements of a quality review process or implicit pressure can include some form of staff appraisal, linked to a requirement to utilise student feedback questionnaires. Even if such systems were introduced entirely because of external pressure, the respective university management would no doubt publicly cite teaching quality improvement as a rationale for their introduction.

Are the Schemes Effective?


Given that the use of student feedback questionnaires is now so widespread, there appears to have been remarkably little attempt to answer the question posed in the title of this article: does the exercise really have an impact by improving the quality of teaching? Widespread and frequent use of feedback questionnaires must consume considerable resources, particularly if staff and student time is taken into account, so an investigation of cost effectiveness is warranted (Kember, 2000).
The study reported in this article aimed to see whether the student feedback questionnaire system of a university in Hong Kong was resulting in an improvement in the overall quality of teaching. The criterion chosen for answering the question was based upon the logic of employing the questionnaires as a feedback mechanism. It is generally accepted that student feedback questionnaires are a valid and reliable means of gaining feedback on teaching and learning (Marsh, 1987). The large body of research reviewed by Marsh lends considerable confidence to the belief that the ratings given by students do correlate in a reliable way with the quality of teaching and learning. This is the basis of their use in comparing teachers, departments and even universities. If the questionnaires are accepted as reliable instruments for making these comparisons then it follows that similar year-by-year comparisons should also be accepted.
If teaching quality for an individual or a department is improving over time, there should be an accompanying rise in ratings. Finding that there was a rise in scores over time would not necessarily imply that it had resulted from the questionnaire feedback system, as in general correlation does not necessarily imply causality. In this specific case it is likely that other factors affect the quality of teaching over time. Nevertheless, the presence of a significant rise in ratings would provide evidence that the quality of teaching was improving, which is the object of the total quality assurance exercise. If it were found that scores remained static or even fell over time it would certainly raise questions over whether the resources devoted to the exercise of gathering feedback through questionnaires could be justified. Alternatively it might pose questions about the implementation of the system in the university in question, as it could be possible that appropriate conditions or associated processes are needed for quality improvement to occur.

Related Research
There have been some previous investigations of questionnaire ratings over time, though most differ from the present study by concentrating on individuals, conducting short-term experiments or examining the effectiveness of forms of counselling related to questionnaire feedback. Marsh and Hocevar (1991) found evidence of stability over a 13-year period when looking at individual instructors. Hativa (1996) found stability in both levels of ratings and the shape of strength/weakness profiles over four sets of evaluation data. However, there was improvement from teachers who undertook special improvement activities.

Investigations of changes in ratings after feedback have mostly been short-term studies, though they do indicate that change can and does occur. Cohen (1980) conducted a meta-analysis of studies that gave mid-term feedback and then examined end-of-term ratings. Those who received the mid-term feedback averaged end-of-term ratings one-third of a standard deviation higher than controls. Longer-term studies have been rare, but studies that coupled feedback with consultation have shown longer-term effects (e.g. Marsh & Roche, 1993; Piccinin et al., 1999; Stevens & Aleamoni, 1985).

It would appear that individuals' relative strengths and weaknesses tend to be reasonably consistent, but this does not imply that overall improvement is not possible. There seems to be tentative evidence that the level of ratings will tend to be fairly stable unless the feedback is accompanied by counselling or improvement activities. There are sufficient studies with evidence of change by individual teachers to suggest that teaching performance can improve over time. It seems safe to reject the notion that teaching performance is inherently stable and improvement not possible.
There is a surprising lack of studies similar to the one reported here in which data for a whole university have been investigated. One possible explanation is that most research into student feedback questionnaires has been conducted within a positivist framework, so the researchers prefer to have experimental designs so that effects can be attributed. Marsh (1987, p. 342) illustrates this concern:

   No research has examined the effects of continued feedback for student evaluations over a long period of time with a true experimental design, and such research will be very difficult to conduct. The long term effects of students' evaluations may be amenable to quasi-experimental designs, but the difficulties inherent in the intervention of such studies may preclude firm generalizations.

It is true that, without some form of control, an effect cannot be unequivocally attributed to a cause. Controlling feedback is not realistically feasible or ethical, though, at the whole-university level. Yet investigation is still important from a naturalistic perspective to see whether, in real situations, improvement in teaching quality does accompany the use of student feedback questionnaires.

The Student Feedback Questionnaire


The university in which the study was conducted started using student feedback questionnaires on a voluntary basis over 12 years ago. When staff appraisal was introduced in 1995, use was made compulsory and an instrument known as the Student Feedback Questionnaire (SFQ) was introduced. This instrument was developed from the one used previously in the voluntary scheme. The original instrument contained six scales derived initially from the extensive literature on the topic (e.g. Feldman, 1976; Marsh, 1987). The items and dimensions were subsequently modified in the light of feedback from teachers using the voluntary scheme about the type of student feedback that was most valuable.

On the introduction of staff appraisal the voluntary instrument was modified. The wording of items was changed so that the focus was upon the instructor, to reflect the appraisal orientation of the instrument. A review subsequently recommended some changes to the instrument, retaining the six dimensions but cutting the number of items per dimension from three to two. The six dimensions or subscales of the SFQ are: learning outcomes; interaction; individual help; organisation and presentation; motivation; and feedback. A copy of the SFQ instrument used to gather the data analysed in this study is included as Appendix 1.
A previous study examined the reliabilities of the six scales and reported high values ranging from 0.93 to 0.97 (Kwan, 1999). Each item required respondents to indicate the extent of their agreement with a particular statement on a 5-point Likert scale ranging from 'strongly agree' to 'strongly disagree'. The two items for each subscale were then summed to produce a measure of the dimension. Hence, the scores for the subscales ranged from 2 to 10. A high rating indicates a high level of student satisfaction with the particular aspect of teaching being evaluated.
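As a minimal illustration of this scoring rule, the sketch below sums each pair of 1-5 Likert items into a 2-10 subscale score. The item labels, the pairing of items with subscales (taken from Appendix 1) and the assumption that responses are coded so that 5 is the most favourable answer are ours, not details specified for the SFQ.

```python
# Sketch of the SFQ subscale scoring described above.
# Assumption: responses are coded 1-5 with 5 the most favourable answer,
# so that the summed 2-10 subscale scores read "higher = more satisfied".
SUBSCALES = {
    "learning_outcomes": ("q1", "q2"),
    "interaction": ("q3", "q4"),
    "individual_help": ("q5", "q6"),
    "organisation_presentation": ("q7", "q8"),
    "motivation": ("q9", "q10"),
    "feedback": ("q11", "q12"),
}

def subscale_scores(responses: dict) -> dict:
    """Sum the two 1-5 Likert items of each subscale into a 2-10 score."""
    return {name: responses[a] + responses[b] for name, (a, b) in SUBSCALES.items()}

# One hypothetical student's responses to the 12 items.
student = {f"q{i}": r for i, r in enumerate([4, 4, 5, 4, 3, 4, 4, 4, 3, 3, 2, 3], start=1)}
print(subscale_scores(student))  # {'learning_outcomes': 8, 'interaction': 9, ...}
```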
The SFQ also had two standard open-ended questions. Additional items, either closed or open-ended, could be added to the questionnaire depending on the needs of the individual department or staff.

It was university policy that a minimum of two classes per year were to be selected to fill in the SFQ for each member of the teaching staff. Some departments permitted or required extra classes to be evaluated. The administration of the questionnaires to the selected classes was handled by departmental administrative staff and the optical mark reader forms were processed by a central unit.
The teaching staff received a report on the means, standard deviations and percentage
distributions of the ratings of the six composite measures and the individual items for
each of the selected classes. The departmental averages of the ratings of the six subscales
and the individual items were reported to the department head, who also received a copy
of the data for each member of the department. Table 1 shows means and standard
deviations for the six scales of the questionnaire computed from the overall database
used for the study with department as the unit for analysis.

TABLE 1. Means and standard deviations of the six scales across departments over 3 years

                                   Year 1         Year 2         Year 3
Construct                        Mean    SD     Mean    SD     Mean    SD
Learning outcomes                7.3     0.31   7.3     0.36   7.2     0.39
Interaction                      7.6     0.38   7.5     0.38   7.5     0.41
Individual help                  7.4     0.30   7.4     0.28   7.3     0.32
Organisation and presentation    7.2     0.33   7.2     0.33   7.1     0.39
Motivation                       7.0     0.28   6.9     0.29   7.0     0.38
Feedback                         6.7     0.25   6.8     0.27   6.8     0.33

Method
Permission was obtained to make use of the SFQ database for purposes of evaluating the instrument itself and the quality assurance system of which it is a part. For this purpose the file was stripped of both individual and departmental identifiers.

Data from 25 departments of the university which made use of the SFQ were collected for consecutive years. Due to a change of questionnaire, 4-year data were available for 19 departments and 3-year data for six departments. For each department, the average scores for the six dimensions of the SFQ were calculated across the classes for each year. The mean SFQ scores were compared across years by multivariate analysis of variance (MANOVA) for each of the 25 departments.
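The article does not describe the software used, so the following is only a plausible reconstruction of the analysis: a one-way MANOVA of the six dimension scores against year, run separately for each department, here using statsmodels. The data layout and the column names ('dept', 'year' and the six dimension labels) are assumptions.

```python
# Hypothetical reconstruction of the per-department MANOVA; assumes a
# DataFrame with one row per evaluated class and columns for department,
# year and the six SFQ dimension scores (names are ours, not the SFQ's).
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

DIMENSIONS = ["learn", "interact", "ind_help", "organ", "motiv", "feedback"]

def department_manovas(df: pd.DataFrame) -> pd.DataFrame:
    """For each department, test the six dimension means across years and
    report Wilks' lambda with its F value and p-value, as in Table 2."""
    formula = " + ".join(DIMENSIONS) + " ~ C(year)"
    rows = []
    for dept, classes in df.groupby("dept"):
        test = MANOVA.from_formula(formula, data=classes).mv_test()
        wilks = test.results["C(year)"]["stat"].loc["Wilks' lambda"]
        rows.append({"dept": dept, "wilks_lambda": wilks["Value"],
                     "F": wilks["F Value"], "p": wilks["Pr > F"]})
    return pd.DataFrame(rows)
```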
The department was chosen as an appropriate unit for analysis in that it caters for two mechanisms for improving the quality of teaching. Firstly, enough individuals could improve their ratings by a sufficient margin that the department overall registered a significant increase. Alternatively, individuals with low ratings might not have contracts renewed and be replaced by others who subsequently have higher ratings. A combination of these two mechanisms is also possible. Had there been significant changes in departmental scores, the intention was to have looked at individual scores to determine the mechanism.
In our study the sample was limited to one university by the practical constraint of gaining access to such a wide body of sensitive data in a university other than one's own. Generalisability is clearly a relevant issue. It can be questioned whether it is possible to generalise from the finding in one university to suggest that the use of feedback questionnaires in other universities will lead to (or not lead to) an indication of improvement in teaching quality.

A study of one university clearly cannot lead to formal inference, as the sample was insufficient and not random. Eisner (1991, ch. 9) argues, though, that inferences from small samples and even single cases can be made through attribute analysis and image matching. In this particular case the process requires the reader to make a judgement as to whether the university, the questionnaire, the administration system and the use made of the feedback are sufficiently similar to those in other universities for there to be the possibility of similar findings.
For this reason we have tried, in various parts of the paper, to make transparent the situation in the university in question to provide the reader with the evidence to make a valid judgement. We argue that the questionnaire was closely related to those used by many universities in that it incorporated dimensions commonly accepted in the literature. The procedures for its use are described in the following sections so the reader can judge how similar they are to practices in other universities. If the level of attribute matching indicates that the context and procedures are related to those described in this study then there is the possibility that similar outcomes would be found if the same type of study were conducted. The study does seem to provide a justification for other universities to examine their own data.

Results
Results of the MANOVA are shown in Table 2, which reports Wilks' lambda, the corresponding F value and the associated p-value. Besides statistical significance, we also checked the practical difference among the mean scores for each department. We considered a department to have a practically significant change if its mean scores across years changed by more than a range of +/- 0.2, which is 5% of the feasible scale range of 2 to 10. The use of practical significance levels is appropriate because it is well known that the very large sample size would mean that even tiny differences could be statistically significant (e.g. Harris, 1998). The results for practically significant change are also reported in Table 2. For 14 out of 25 departments, the MANOVA results show that the changes in mean scores were too small to be statistically significant for any of the six dimensions of the SFQ at the 5% level of significance in the 3- or 4-year period. For the 11 departments that did show a statistical change on one or more of the six dimensions, only five showed practically significant changes. Three of those five departments had a sudden drop in the last year of observation and the other two had rises and falls during the period. The overall conclusion is that the SFQ evaluation process produces no evidence of an improvement in the quality of teaching during the 4-year period.
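One reading of the +/- 0.2 rule, sketched below, is that a dimension's yearly departmental means must span more than 0.4 points, i.e. 5% of the 8-point feasible range from 2 to 10, before the change counts as practically significant; the data structure and function name here are ours, not the authors'.

```python
# Sketch of the practical-significance rule described above, under the
# reading that the spread of a dimension's yearly means must exceed
# 0.05 * (10 - 2) = 0.4, i.e. a swing of +/- 0.2.
THRESHOLD = 0.05 * (10 - 2)  # 5% of the feasible 2-10 scale range

def practically_significant(yearly_means: dict) -> bool:
    """yearly_means maps each SFQ dimension to its departmental mean
    score for each year, e.g. {"learning_outcomes": [7.3, 7.3, 7.2], ...}."""
    return any(max(means) - min(means) > THRESHOLD for means in yearly_means.values())

# Illustrative only: the across-department averages from Table 1 move far
# less than 0.4 on every dimension, so no practical significance here.
table1 = {"learning_outcomes": [7.3, 7.3, 7.2], "interaction": [7.6, 7.5, 7.5],
          "individual_help": [7.4, 7.4, 7.3], "organisation": [7.2, 7.2, 7.1],
          "motivation": [7.0, 6.9, 7.0], "feedback": [6.7, 6.8, 6.8]}
print(practically_significant(table1))  # False
```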
To give some feel for the data, average scores for the six subscales by year are plotted for two typical departments in Figure 1.

FIG. 1. Average mean scores for the six scales by year for two departments. Key: Learn = Learning outcomes; Interact = Interaction; Ind. Help = Individual help; Organ = Organisation and presentation; Motiv = Motivation; Feedback = Feedback.

Supporting Qualitative Data


Evidence to back the conclusions from the questionnaire data comes from two qualitative studies that were initially separate from this investigation. Kwan (2000) conducted a qualitative study in the same university of the way students approached the completion of the student feedback questionnaire and their interpretation of the items. The other study conducted 53 interviews with students in seven universities in Hong Kong covering a broad range of topics including perceptions of the quality of teaching (Kember & Wong, 2000).

Both of these studies uncovered a significant number of statements from students complaining that providing feedback through completion of questionnaires did not result in any noticeable improvement in teaching performance. One typical quotation from each study is given below.

   I feel that the performance of the lecturers is more or less the same [after the evaluation]. I can see no differences at all. I think the students are quite frustrated because they could not see any effects of their evaluation, and that is why a lot of them are not interested in completing the evaluation forms.
   Their performance in teaching is obviously not acceptable. I won't blame their teaching method as they may not have received teacher training. However, they should write clearly, speak at a normal and attainable speed and lend the transparencies to the students. Although I have spoken up in the course evaluation, no improvement has been made.

TABLE 2. Results for MANOVA and practical significance of the mean SFQ scores across years for the 25 departments

Department    Wilks' lambda    F value    p-value    Practical significance
 1            0.76             1.80       0.02       Yes
 2            0.88             4.36       0.00       Yes
 3            0.68             3.01       0.00       Yes
 4            0.90             2.08       0.01       Yes
 5            0.93             2.17       0.00       Yes
 6            0.95             5.00       0.00       No
 7            0.87             7.01       0.00       No
 8            0.90             1.84       0.02       No
 9            0.91             2.19       0.01       No
10            0.92             1.79       0.05       No
11            0.27             2.13       0.01       No
12            0.97             1.32       0.17       No
13            0.86             1.39       0.13       No
14            0.91             1.26       0.21       No
15            0.94             1.04       0.42       No
16            0.92             1.04       0.41       No
17            0.91             1.41       0.12       No
18            0.92             1.02       0.43       No
19            0.93             0.97       0.49       No
20            0.96             1.21       0.24       No
21            0.84             1.16       0.29       No
22            0.94             1.03       0.42       No
23            0.98             0.86       0.58       No
24            0.86             1.07       0.37       No
25            0.94             1.52       0.07       No

Note: Practical significance: mean score differences are greater than 5% of the feasible range.
Both studies were qualitative and aimed for interpretation and understanding. It is not
therefore possible to give a precise measure of the extent of such sentiments, particularly
since these views emerged from indirect questioning. The students quoted were certainly
not isolated cases though, so this does seem to be quite a common belief. A search of
the transcripts of both studies produced no statements from students with evidence that
the student feedback questionnaires had made a positive impact on teaching.

Possible Explanations
As there had not been any significant changes in the student intake or the evaluation policies and procedures of the university over the period under investigation, it was highly unlikely that these factors would have negated any increase in ratings had there been any improvement in the overall teaching quality. If anything, given the importance of the ratings, departments and instructors tended to adapt to the evaluation system by choosing classes that would raise rather than lower their ratings.
The following discussion aims to explore the reasons why the SFQ may not have contributed to an improvement in the quality of teaching and learning. The feasibility of each potential reason is discussed in the light of evidence in the literature and other available contextual information.

Teaching Quality Has Reached a Stable Plateau


It seems plausible that if there is any tendency towards stability in teaching ratings it may be a differential effect. Early in a career improvement might be expected, but over time habits become ingrained and feedback ratings might level out to a plateau. Feldman (1983) reviewed studies on the relationship between instructors' years of experience and student ratings, and found that there were nil or slightly negative relations between the two. Similarly, the effect of quality assurance measures could have an early impact which would wane over time. This possibility of university-wide ratings reaching a stable plateau does not appear to have been investigated, but if there were such an effect it would clearly be most prevalent in the most stable situations.

However, there are a number of indicators that suggest that the university in which this study was conducted would be less likely than most to have reached a stable plateau. Until the recent Asian economic downturn there had been a higher staff turnover than in most comparable western universities, and the university had a younger staff profile than many. Recent years have also seen many innovations in teaching funded by teaching development grants (Kember, 2000). Overall, there is insufficient evidence to conclude that teaching quality cannot be improved and no compelling reason to suggest that the university in which the study was conducted might have reached a mature stable plateau.

Feedback Data Were Not Used Effectively


Of the short-term studies of the effect of feedback on ratings, those that augmented feedback with consultation resulted in substantially larger differences (Cohen, 1980). The small number of longer-term experiments that have demonstrated improvements have accompanied the return of questionnaire data with counselling (Brinko, 1993; Marsh, 1987). It is possible that insufficient attention was being paid to making use of SFQ data to identify areas in need of remediation. It is also possible either that counselling was not provided or, if it was being provided, that it was not in an effective form to help the teacher to develop his or her teaching ability.

Results from the SFQ went to each instructor with a copy to his or her department head. According to university policy, responsibility lay within individual departments to devise ways to make use of, or not make use of, the feedback. Anecdotal evidence suggested that in some departments there was no discussion of, or consultation about, the results. In other cases there does appear to have been some discussion between the instructor and the head or a nominee. Whether this type of discussion can be closely equated to the informed counselling discussed by Brinko (1993) is questionable.

It is also possible that the feedback provided by the SFQ was ignored by many staff members. With the frequency of data collection and the number of classes sampled, staff were provided with copious amounts of statistical data, so some may have stopped looking at it seriously.

There Was No Incentive to Use the Data


Whether or not the feedback data were used effectively is likely to be related to whether there was an incentive to make use of the data to try to improve teaching quality. In a positive sense, if staff felt that good teaching, or effort put in to improve teaching quality, had a significant chance of being rewarded, there would be an incentive to look carefully at the feedback and take steps to deal with revealed weaknesses. In a more negative sense, the feedback could also lead to better teaching if there was perceived to be a reasonable possibility of sanctions against those with poor ratings. The obvious examples in this respect are tenure not being granted or contracts not being renewed.

Evidence on this potential explanation comes from a survey of all academic staff in the university on forms of evaluation that were felt to be effective and appropriate. The return to the survey was only about 20%, but the results of interest here are quite unequivocal and entirely consistent with findings from elsewhere. One question asked staff to respond to the statement 'Good teaching is properly rewarded in the [University]' on the common 5-point Likert scale from strongly agree to strongly disagree. The responses to this question are given in Figure 2.

FIG. 2. Percentages of responses to the question 'Good teaching is properly rewarded in the University' (N = 201). Note: The 5-point Likert scale for responses: 1 = Strongly agree, 2 = Agree, 3 = Neutral, 4 = Disagree and 5 = Strongly disagree.

The majority of respondents either disagreed or strongly disagreed, with about a quarter in the neutral position. Only a small minority felt that good teaching was properly rewarded.
In this respect the university in question was in a common position, because a number of extensive surveys conducted in several parts of the world have come to the similar conclusion that academics do not feel that their universities value or reward teaching as well as they should. A large survey of academics in six Australian universities (Ramsden & Martin, 1996) found that although 95% thought that teaching should be highly valued by their institution, only 37% thought that it was. In Boyer's (1990) large survey of US academics, 68% agreed that their institutions needed 'better ways, beside publications, to evaluate scholarly performance' (p. 34).

Like most others, the university in the study had an official policy that teaching quality was taken into account in staff appraisal, contract renewal and promotion decisions. There was also an annual scheme for honouring excellent teachers. Clearly the survey results indicate that the academics perceived a mismatch between policy and practice, or felt that the measures did not go far enough in rewarding good teaching. Again, in this respect the university was certainly not unusual. Many, if not most, universities now have policy statements stressing the importance of teaching and schemes that are meant to put the policy into practice. However, the results of the international surveys cited above indicate high levels of cynicism among academics as to whether their universities really value teaching.


What is significant about these expressed views is that it is these perceptions that will determine what use is made of feedback data by individual academics. If they feel that teaching is valued then there is some motivation to make use of feedback. However, if their perception is that their university does not reward teaching or take it seriously, there is little incentive to take any action based upon the questionnaire data. If there is a perception that research is rewarded more than teaching then effort is more likely to be put into research activities than into making use of questionnaire feedback to improve teaching.

The Appraisal Emphasis Negated Teaching Improvement


The SFQ was introduced as part of a staff appraisal process to give a measure of an individual's teaching performance. The instrument was, therefore, specifically designed for appraisal and used as part of a process that focused upon appraisal. A system with an appraisal emphasis is unlikely to have much impact upon teaching quality unless staff perceive the appraisal process to reward good teaching. The results in the section above suggest that this was not the case.

It is possible that the judgemental emphasis was detrimental to the potential developmental role of evaluating teaching and learning. If this was the case then it is possible that redeveloping both the instruments and the process with a greater developmental emphasis might convince staff that the developmental aim is paramount, so that it is taken more seriously.

It is feasible to have different procedures for appraisal and developmental purposes. Under the procedures in force at the time of the study, appraisal data were gathered every year for all academic staff, even though only a small proportion were subject to contract renewal or promotion decisions in any given year.

The Questionnaire May Have Lacked Flexibility and Appropriate Focus


Recent writing has criticised the selection of dimensions in the most widely used US instruments as being based upon models and modes of teaching and learning that are too narrow. Marsh (1987) describes a typical approach to the selection of items for good instruments as being based upon a logical analysis of the content of effective teaching, supplemented by literature reviews of dimensions others have used. The most frequently cited set of dimensions is probably that of Feldman (1976), which appears to have influenced the design of a number of other instruments. A hint of the issue can be gained just by looking at Feldman's list of 19 instructional rating dimensions (Feldman, 1976). No fewer than 11 of the 19 begin with the word 'Teacher's', which adds credence to Centra's claim (1993, p. 47) that the typical student rating form is devised to reflect effectiveness in lecture, lecture and discussion, and other teacher-centred methods. D'Apollonia and Abrami (1997) argued that typical feedback questionnaires are based upon models of instruction focusing upon traditional didactic teaching. McKeachie (1997) pointed out that student rating forms gather information about conventional classroom teaching. Almost all ignore the learning that takes place outside the classroom, which is probably the majority for many students. Kolitch and Dean (1999) examined a typical US evaluation instrument against two models of teaching. They found it compatible with a transmission model but not with an engaged-critical one. The article went on to question the neutrality of instruments that did not acknowledge forms of education involving high levels of student involvement.


The extent to which questionnaires focus upon teacher-centred models of education
depends on whether they are designed primarily to provide feedback or are used
judgementally. Questionnaires for judgemental purposes are more likely to focus upon
the instructor and his or her teaching. Following the analysis of Kolitch and Dean (1999),
the end result is more likely to be a questionnaire compatible with a transmission model
of teaching than an engaged-critical one.
The items in the SFQ were possibly less teacher-centred than in many instructor rating
instruments, but the staff appraisal emphasis placed responsibility for teaching upon the
staff member. The corresponding model of teaching and learning implies a didactic one
primarily involving an instructor lecturing to a captive audience.
Evidence that this and similar standard questionnaires can be seen as too teacher-centred and inappropriate for innovative teaching comes from an initiative known as the Action Learning Project, which supported 90 action research projects in the universities in Hong Kong (Kember, 2000). In these, academics introduced a curriculum or teaching innovation and were then required to evaluate the outcomes. Virtually none of these made any use of the standard student feedback questionnaire used by their university. The standard questionnaires did not include questions applicable to the more innovative forms of teaching and learning introduced in the projects.

The custom-designed instruments and evaluation procedures used in the projects usually showed that the projects were achieving high quality in teaching and learning. A significant number of projects demonstrated quality improvement either between two action research cycles or using a pre- and post-test design (Kember et al., 1997). It is possible that permitting flexibility in the design of questionnaires might enable teachers to use instruments and procedures more suited to innovative forms of teaching than the SFQ, which assumes a teacher-centred model of teaching and learning. Utilising a questionnaire that presupposes a particular model of teaching may result in conservatism, particularly when the system is geared towards appraisal.

Making the staff member responsible for evaluation might result in greater interest in, and use made of, the resulting data. There are questions over credibility and ownership of data collected by an imposed instrument.

Conclusion
Employing a teaching evaluation system that does not appear to demonstrate any overall improvement in teaching quality cannot be considered satisfactory. Several potential reasons have been given, all of which may have played some part, but it was not clear which, if any, predominated. If it is not possible to discover systemic factors that are discouraging improvements, there has to be a question over the continued use of the student feedback questionnaires. Their regular use is expensive in terms of both funds and time. If a quality assurance system is not effective then it is hard to justify its continuation.

The study has been conducted in one university on one student feedback system. No formal generalisation is possible, but it is of interest to speculate whether similar results might be found in other universities. The questionnaire was similar to those used to rate instructors in many universities, and the results of Tagamori and Bishop (1995) and Kwan (1999) suggest that it was better designed than many. The questionnaire was used as part of a staff appraisal system, which again is a common situation.
Of the reasons suggested for the questionnaire not contributing to an improvement in teaching quality, it would appear that many would be widely applicable. The procedures for making use of feedback data were probably fairly common, as few institutions appear to offer widely available specialised counselling when feedback data are returned. The perception that teaching was insufficiently rewarded is certainly widespread, so there may well be other universities where there is a perceived lack of incentive to make use of the feedback.
Overall there is no obvious reason why the university in which this study was
conducted could be seen as differing markedly in evaluation practice from a wide range
of others. This does suggest that there is good reason for others to examine data from
their own universities to see whether their student feedback questionnaire systems are
contributing to an improvement in the quality of teaching and learning.
Studies that have shown improvement have either coupled the use of questionnaires with specialised counselling (Marsh & Roche, 1993; Piccinin et al., 1999; Stevens & Aleamoni, 1985) or encouraged teachers to devise their own ways of evaluating their teaching innovations (Kember, 2000; Kember et al., 1997). In both cases there is an implication that there is concern for improvement. In universities in which it is perceived that good teaching is not valued or adequately rewarded, there would appear to be a possibility of also finding a lack of improvement over time, as instructors lack incentive to make use of the feedback from the compulsory standard questionnaires.

Acknowledgement
In the period since the data for this article were gathered, the university in which the study was conducted has changed the policies and procedures associated with the evaluation of teaching, so the observations in the article may no longer be applicable. The first-named author was in the Educational Development Centre of the Hong Kong Polytechnic University at the time the study was conducted.

Note on Contributors
DAVID KEMBER is the Chief Educational Development Officer, DORIS Y. P. LEUNG is a Research Fellow, and K. P. KWAN is a Senior Educational Development Officer in the Educational Development Centre, The Hong Kong Polytechnic University. Correspondence: Dr K. P. Kwan, EDC, Hong Kong Polytechnic University, Hung Hom, Hong Kong.

REFERENCES
BOYER, E. L. (1990) Scholarship reconsidered: priorities of the professoriate (San Francisco, CA, The Carnegie Foundation for the Advancement of Teaching).
BRINKO, K. T. (1993) The practice of giving feedback to improve teaching: what is effective?, Journal of Higher Education, 64(5), pp. 575–593.
CENTRA, J. (1993) Reflective faculty evaluation (San Francisco, CA, Jossey-Bass).
COHEN, P. A. (1980) Effectiveness of student-rating feedback for improving college instruction: a meta-analysis, Research in Higher Education, 13, pp. 321–341.
D'APOLLONIA, S. & ABRAMI, P. C. (1997) Navigating student ratings of instruction, American Psychologist, 52(11), pp. 1198–1208.
EISNER, E. W. (1991) The enlightened eye: qualitative inquiry and the enhancement of educational practice (New York, Macmillan).
FELDMAN, K. A. (1976) The superior college teacher from the students' view, Research in Higher Education, 5, pp. 243–288.
FELDMAN, K. A. (1983) The seniority and instructional experience of college teachers as related to the evaluations they receive from their students, Research in Higher Education, 18, pp. 3–124.
HARRIS, M. B. (1998) Basic statistics for behavioural science research (Boston, MA, Allyn & Bacon).
HATIVA, N. (1996) University instructors' ratings profiles: stability over time, and disciplinary differences, Research in Higher Education, 37(3), pp. 341–365.
KEMBER, D. (2000) Action learning and action research: improving the quality of teaching and learning (London, Kogan Page).
KEMBER, D. & WONG, A. (2000) Implications for evaluation from a study of students' perceptions of good and poor teaching, Higher Education, 40(1), pp. 69–97.
KEMBER, D., LAM, B. H., YAN, L., YUM, J. C. K. & LIU, S. B. (1997) Case studies of improving teaching and learning from the Action Learning Project (Hong Kong, Action Learning Project).
KOLITCH, E. & DEAN, A. V. (1999) Student ratings of instruction in the USA: hidden assumptions and missing conceptions about good teaching, Studies in Higher Education, 24(1), pp. 27–42.
KWAN, K. P. (1999) How fair are student ratings in assessing the teaching performance of university teachers?, Assessment and Evaluation in Higher Education, 24(2), pp. 181–195.
KWAN, K. P. (2000) How university students rate their teachers: a study of the attitudes and rating behaviours of university students in teaching evaluations. Unpublished Ed.D. thesis, University of Durham, UK.
MARSH, H. W. (1987) Students' evaluations of university teaching: research findings, methodological issues, and directions for future research, International Journal of Educational Research, 11, pp. 253–388.
MARSH, H. W. & HOCEVAR, D. (1991) Students' evaluations of teaching effectiveness: the stability of mean ratings of the same teachers over a 13-year period, Teaching and Teacher Education, 7, pp. 303–314.
MARSH, H. W. & ROCHE, L. (1993) The use of students' evaluations and an individually structured intervention to enhance university teaching effectiveness, American Educational Research Journal, 30(1), pp. 217–251.
MCKEACHIE, W. (1997) Student ratings: the validity of use, American Psychologist, 52(11), pp. 1218–1225.
PICCININ, S., CRISTI, C. & MCCOY, M. (1999) The impact of individual consultation on student ratings of teaching, International Journal for Academic Development, 4(2), pp. 75–88.
RAMSDEN, P. (1992) Learning to teach in higher education (London, Routledge).
RAMSDEN, P. & MARTIN, E. (1996) Recognition of good university teaching: policies from an Australian study, Studies in Higher Education, 21(3), pp. 299–316.
STEVENS, J. J. & ALEAMONI, L. M. (1985) The use of evaluative feedback for instructional improvement: a longitudinal perspective, Instructional Science, 13, pp. 285–304.
TAGAMORI, H. & BISHOP, L. (1995) Student evaluation of teaching: flaws in the instruments, Thought and Action: The National Education Association Higher Education Journal, 11, pp. 63–78.

Appendix 1
The Student Feedback Questionnaire

Please fill in the appropriate circle to indicate your attitude to the following statements.

Learning Outcomes
1. I have understood the subject matter taught by the staff member.
2. The staff member's method of teaching has helped my understanding.

Interaction
3. The staff member gave students opportunities to ask questions and discuss ideas.
4. The staff member encouraged active participation in class.

Individual Help
5. The staff member provided appropriate help for students with learning problems.
6. Assistance was available from the staff member when necessary.

Organisation & Presentation
7. The staff member's teaching was well-organised.
8. The staff member presented the subject material clearly.

Motivation
9. The staff member explained the significance of what was taught.
10. The staff member's teaching stimulated my interest in the subject.

Feedback
11. The staff member gave me regular feedback on my progress.
12. The feedback from the staff member was helpful and constructive.
