
Original article

doi: 10.1111/j.1365-2729.2011.00473.x


Can online course-based assessment methods be fair and equitable? Relationships between students' preferences and performance within online and offline assessments

C. Hewson
Psychology Department, The Open University, Milton Keynes, UK

Accepted: 22 November 2011. Correspondence: Claire Hewson, Psychology Department, The Open University, Walton Hall, Milton Keynes MK7 6AA, UK. Email: c.m.hewson@open.ac.uk

Abstract

To address concerns raised regarding the use of online course-based summative assessment methods, a quasi-experimental design was implemented in which students who completed a summative assessment either online or offline were compared on performance scores when using their self-reported preferred or non-preferred modes. Performance scores were found not to differ depending on whether the assessment was completed in the preferred or non-preferred mode. These findings provide preliminary support for the validity of online assessment methods. Future studies could help determine the extent to which this finding generalizes beyond the assessment procedures and type of sample used here. Suggestions for follow-up studies are offered, including exploring the validity of more complex computer-related online assessment tasks and investigating the impact of using preferred and non-preferred modes upon the quality of the student experience.

Keywords

equity, fairness, online assessment, performance, preferences

Introduction

The rapid increase over the last decade or so in the use of online assessment methods has led to questions regarding the validity of these methods. There is now a substantial body of research relating to the validity of online psychological assessments (e.g. psychometric scales), which, for the most part, has shown that these measures are able to provide results comparable with those derived using offline approaches (e.g. Buchanan & Smith 1999; Davis 1999; Epstein et al. 2001; Cronk & West 2002; Meyerson & Tryon 2003; Hewson & Charlton 2005; Herrero & Meneses 2006). The validity of online course-based assessment methods, however, has received less attention to date, although these methods are becoming increasingly widely used among practitioners (e.g. Buchanan 2000; Hemard & Cushion 2003; Henley 2003; Clarke et al. 2004; Aisbitt & Sangster 2005; Marriott 2009; Sieber 2009). The present paper reviews existing work relevant to the validity of online course-based assessment, and then describes a study providing evidence that online assessment methods in this domain can also provide a valid measure of performance, equivalent to that obtained using offline methods. Such evidence is important in motivating and justifying the use of online course-based assessments, especially in summative contexts, and avenues for further research in this area are outlined.

The recent increase in the number of reports of online forms of course-based assessment is not surprising, given the potential benefits of this approach.
These benefits include (compared with traditional pen and paper modes of delivery): cost and time savings because of automated delivery, scoring and storing of responses; scope for providing tailored and/or immediate feedback, which may have pedagogical benefits; enhanced levels of student engagement because of the relative novelty and appeal of the approach; enhanced flexibility, for example in allowing students to submit an assessment or assignment remotely without coming into college; and enhanced validity because of automation of the marking process, which can reduce scope for human error (e.g. Buchanan 2000; Cassady & Gridley 2005; Hewson et al. 2007; Angus & Watson 2009; Jordan & Mitchell 2009).

Such benefits are appealing. However, questions concerning the reliability and validity of this relatively new assessment medium arise. Reliability issues stem from the necessary reliance on computing technologies, including computer networking technologies, to implement online assessment methods. The potential for server crashes, as well as local hardware and software failures, introduces scope for problems that would not normally be present with traditional offline approaches (e.g. Hewson et al. 2007; Warburton 2009). Validity concerns relate to whether online assessments are able to provide a fair and accurate measure of what is intended, i.e. in this context, course-related learning outcomes (e.g. Hewson et al. 2007; Dennick et al. 2009; Whitelock 2009). Two key issues can be identified here: (1) does the mode of administration itself impact upon performance, that is, might there be features of the online testing medium that lead to performance differences compared with using traditional pen and paper tests; and (2) is there an interaction between mode of administration and individual difference factors, in particular computer-related attitudes such as computer anxiety, and perceptions of and attitudes towards online assessments (Hewson et al. 2007)? Both of these questions are of critical importance in properly evaluating the extent to which online summative assessments are able to provide a fair and valid measure of course-related knowledge and skills.

While perhaps most salient in relation to summative forms of assessment (where performance outcomes contribute to determining the overall course result), validity concerns are also relevant to formative online assessment methods; the equity of formative online assessment is brought into question if certain groups of students (e.g. the more highly computer anxious) might procure fewer pedagogical benefits from using these resources because of factors unrelated to course-specific learning outcomes. Students with less positive computer-related attitudes may actively avoid using online assessment methods, for example, and thus be at a disadvantage if alternative forms of formative assessment resources are not also offered. While in some contexts computer-related skills and competencies may also be among the specified learning outcomes for a course using online and computer-assisted testing methods, this often may not be the case. Hence, the potential impact of computer-related attitudes, perceptions and preferences when using online modes needs to be considered (Hewson et al. 2007).
Validity of online course-based assessment methods

While the potential impact of students' computer-related attitudes, perceptions and preferences on the use of formative online assessment is important to consider (as just noted), the focus of the present paper is on how such factors may impact upon performance on summative course-based assessments. Arguably, it is here that this issue is most pressing, because performance contributes directly to determining the overall course result obtained. Indeed, practitioners have generally been more reluctant to adopt online methods for administering high-stakes summative assessments (Boyle & Hutchison 2009), although low-stakes usages have been more common, at least within the natural sciences and more numerate disciplines (e.g. Aisbitt & Sangster 2005; Walker et al. 2008; Angus & Watson 2009; Jordan & Mitchell 2009). Within the social sciences and humanities, however, implementation of online summative assessment methods remains less prevalent (Clarke et al. 2004), with authors tending to express concerns about using online methods for summative assessment (e.g. Roy & Armarego 2003; Cassady & Gridley 2005).

Two key issues were identified earlier in relation to the validity of online assessment methods: the extent to which the mode of delivery itself may have an impact on performance, and the extent to which computer-related attitudes, preferences and perceptions may interact with delivery mode to influence performance. Of interest in relation to the latter question is the extensive existing body of literature concerned with the relationship between computer attitudes (e.g. computer anxiety, computer engagement, etc.) and performance on computer-related tasks (e.g. see Brosnan 1998a). While these studies have tended to involve primarily laboratory-based computerized tasks, rather than the type of online tasks involving greater flexibility in time and location of completion, which are of primary interest here, their findings may nevertheless give some indication of what might also be expected in relation to unproctored online course-based assessments.

For present purposes, the term 'unproctored' will be used when referring to situations where online assessments are made available via a computer network for access from a range of locations at the respondent's convenience, and which the respondent will (most likely) complete at the computer screen; 'proctored' will be used when talking about in-class, supervised, exam-type online assessment contexts. While other authors have tended to use the term 'online' to refer to either or both of these assessment contexts, the distinction is sometimes important to clarify.

Although the findings from the aforementioned computer attitudes literature have been equivocal, there is some evidence that while computer attitudes can have an impact upon performance on computer-related tasks in certain contexts (e.g. Brosnan 1998b), such effects may be diminished or absent when using course-related computer-based assessment; one suggested explanation for this finding is that motivation and academic ability become more salient variables in a course-related assessment context (Mahar et al. 1997). These results are thus promising when considering the validity of online course-related assessment methods, although, as several authors have pointed out, there is a clear need for further evidence relating to the impact of online assessment practices on student performance in a course-related context (Buchanan 2000; Cassady & Gridley 2005). Because the concern of the present paper is essentially with course-based online assessment, the literature on general computer attitudes, which has been primarily concerned with computer-related assessments in a non-course-based context, will not be considered in any greater depth here. In what follows, evidence is considered for, firstly, the role of mode effects and, secondly, the role of online assessment-related preferences and perceptions, in a course-related assessment context.


Mode effects

Only a very few studies have directly addressed the impact of mode effects in relation to online and offline forms of course-based assessment. Of those studies that have reported evidence for mode effects, most have made use of in-class proctored online contexts, as opposed to the type of unproctored contexts of primary interest here. Evidence for mode effects within an exam setting has been reported by Goldberg and Pedulla (2002), who found that a pen and paper group outperformed a computer group when taking a practice graduate record exam. Level of computer experience was also found to have an impact, with more experienced students outperforming those with less experience; however, perhaps counter-intuitively, this effect was observed only for the pen and paper and the computer-without-editorial-control groups. The computer-with-editorial-control group (which had the ability to skip, review and change items and answers) showed no effect of computer experience. Goldberg and Pedulla also report evidence that having a set, limited amount of time in which to complete the exam had more of a negative impact on performance in the computerized delivery modes than in the pen and paper mode, concluding that it is important when implementing computer-delivered assessments to consider whether more time may be needed than for traditional pen and paper modes. Overall these results are interesting, and follow-up studies investigating some of these factors further would be useful. One possibility worth exploring is that having the ability to review, skip and change items might be a key factor influencing performance on computerized assessments. One study suggesting that such finer-grained design features in computerized testing methods may impact upon performance has been presented by Ricketts and Wilks (2002), who found that being required to scroll through questions in a (proctored) computerized multiple choice question (MCQ) exam led to worse performance than receiving questions one at a time via a computer screen, or receiving a pen and paper version. Because of the apparent presence of a number of confounds in this study (e.g. exam question content), as well as a lack of information regarding sample size, effect sizes and statistical significance levels, further research is needed to verify this result.

Other studies have reported finding a lack of mode effect when using online and offline assessment approaches (e.g. Cassady & Gridley 2005; Hewson et al. 2007). Hewson et al. employed an experimental design in which students were assigned to take a summative MCQ assessment (unproctored) either online or offline. As well as finding no overall mode effect, they report finding no evidence that levels of computer anxiety or computer engagement (measured using standardized tests) interacted with mode to influence performance. Because this study employed an experimental design in which the online and offline assessments were comparable in all key respects other than delivery mode, it does not suffer from problems relating to confounding variables and thus offers more conclusive findings than some other studies that have investigated mode effects to date.

To summarize, a handful of studies have considered the impact of delivery mode (online/computerized versus pen and paper) upon course-based assessment performance, and in some cases also the way in which general computer-related attitudes and levels of experience may interact with mode to influence performance. The available evidence to date is inconclusive, but if anything would seem to suggest that modality and general computer-related attitudes do not have a major impact upon performance in a course-based context, at least when using relatively straightforward computer-related tasks (e.g. Cassady & Gridley 2005; Hewson et al. 2007). Although some studies have suggested such effects may exist, these results presently remain largely inconclusive because of various design issues (e.g. the presence of confounding factors). Some studies have suggested that particular finer-grained features of the way computerized/online tests are administered, such as whether scrolling is required, may impact upon performance (e.g. Goldberg & Pedulla 2002; Ricketts & Wilks 2002). An additional related question, which has received very limited attention to date, concerns the role of preferences, attitudes and perceptions relating to online assessment methods themselves. Existing evidence relating to this issue is considered below, before the present study, which focuses on this question, particularly in relation to preferences for online or traditional pen and paper assessments, is presented.

Online assessment-related preferences and perceptions

Presently, there seem to be no studies that have explored how preferences, perceptions and attitudes relating to online assessment methods themselves may be related to actual performance when using these methods. However, studies that have attempted to measure attitudes towards, and perceptions of, online assessment methods do exist. Several studies have suggested that students hold relatively favourable attitudes towards online assessment methods (including in-class computerized tests), or at least not overly negative attitudes (e.g. Dermo 2009; Marriott 2009), and some authors have reported a majority preference for online over offline assessment methods (e.g. Sheader et al. 2006; Marriott 2009). Conclusions from these studies may remain tentative, however, because of the presence of confounding factors (e.g. phased versus non-phased assessments), as can often be the case in practice-based educational research in this area. Thus, further studies to try and tease out these various influences would be of value.

Liu et al. (2001) present a study that looked at students' attitudes and anxiety about an in-class multimedia exam, which highlights the importance of considering the relationship between teaching/learning delivery and assessment methods. They found that students who were taking an online course had lower anxiety about a multimedia exam than those taking the same course delivered in the classroom. The online group were also found to have more positive attitudes towards the exam. These findings, however, should also be treated tentatively because of the presence of confounding factors (including levels of computer experience, the availability of practice tests and the nature of the course content). Follow-up studies on this theme would be of value. There is also evidence to suggest that while students may have some reservations about the use of online assessment practices, these reservations can be somewhat alleviated by actually taking an assessment online (e.g. Sheader et al. 2006).

The above review has shown that some useful exploratory work exists, which suggests that students may hold fairly positive, and at least not overly negative, attitudes towards online assessment methods. However, there is also a clear need for further studies investigating students' attitudes, perceptions and preferences in relation to online assessment methods, because the evidence that exists to date remains largely inconclusive; the problem of confounding variables often makes it impossible to assess the unique influence of mode when considering students' attitudes and preferences.

The present study

One salient omission within the literature on online course-based assessment methods to date concerns the question of how online assessment-related perceptions, preferences and attitudes may actually impact upon performance when using online and offline assessment methods. Although there is existing evidence relating to the role of more general computer-related attitudes, such as computer anxiety, in influencing students' performance when using online assessment methods in a course-based context (e.g. Hewson et al. 2007), no studies to date have considered the role of attitudes, perceptions and preferences relating to online assessment methods themselves (to the best of the author's knowledge). Thus, the present paper sets out to explore this issue.

The primary research question investigated in this study was: Does performance on an assessment differ depending on whether it is taken in a preferred or non-preferred (online or pen and paper) mode? This question addresses concerns raised within the existing literature in this area that students required to take an assessment in a non-preferred mode, especially where this mode is online, may be at a disadvantage (e.g. Matsumura & Hann 2004). Also of interest was the prevalence of students' preferences for online or offline methods.
Method

Participants

Seventy-four students at the University of Bolton, all taking the same undergraduate year 2 advanced research methods in psychology course, participated (11 males, 59 females, four non-responses; age range 18–55 years, mean 26.25, standard deviation 8.97; see Note 1). Thirty-three students took the course during 2003–04 (cohort 1), and 41 during 2004–05 (cohort 2). All participants had already successfully completed a previous introductory undergraduate year 1 methods course, and a year 2 methods course, which meant they had acquired moderate to good computer-literacy and numeracy skills (including familiarity with the Statistical Package for the Social Sciences) prior to participation. All students were enrolled on a BSc (Hons) Psychology course, due to complete this approximately 18 months from the date of participating in the present study. Recruitment involved presenting a written participation invitation, placed at the end of an assessment which all students were required to pass in order to successfully complete the course. The participation response items themselves were presented directly below the invitation to participate. Participation was entirely voluntary, and no incentive or reward was offered. Assessment scores were also used as data, but only for those students who had agreed to participate in the study.

Materials

Materials consisted of an MCQ assessment, containing 20 single-response selection items and generating a score ranging from 0 to 20, and a set of six additional items to measure experiences, preferences and perceptions relating to aspects of online assessments. The assessment tested knowledge (in the sense used in Bloom's taxonomy) related to research methods in psychology (see Note 2). An online version of the MCQ assessment was constructed using a dedicated tool (Web-MCQ) developed at the University of Bolton for this purpose (see Note 3). The 20 MCQ items were presented on one page, which students were required to scroll through to complete. The offline version of the assessment was constructed in pen and paper format and maintained an identical layout to the online version, with just the response method differing (i.e. circling with a pen as opposed to using a mouse radio-button click to select the correct answer). The pen and paper version spanned a total of four pages. For both the online and offline versions of the assessment, the six additional items were appended at the end (preceded by the invitation to answer these additional non-compulsory items). The first item asked whether students had ever previously completed an assignment (see Note 4) online; the second item asked whether students would prefer to complete an assignment online, offline, or had no strong preference. The remaining four items were single-item attitudinal statements (e.g. 'If doing an assignment online I would feel more worried that my answers may not be sent properly and my assignment may get lost') requiring a Likert-type response on a 5-point scale ranging from 'strongly agree' to 'strongly disagree'. Because the primary focus of the present paper is on how students perform when required to use their preferred or non-preferred assessment mode, these Likert-type response items will not be considered further here.
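For concreteness, the sketch below shows how a 20-item, single-response assessment of this kind can be represented and scored out of 20. It is purely illustrative Python, not the Web-MCQ implementation (which is an HTML form processed by a CGI script; see Note 3), and the item identifiers and answer key are hypothetical.

```python
# Illustrative sketch only: a 20-item, single-response MCQ scored out of 20.
# The answer key below is hypothetical; the real assessment and its key are not reproduced here.
ANSWER_KEY = {f"q{i}": "a" for i in range(1, 21)}  # 20 items, options 'a'-'d'

def score_mcq(responses: dict) -> int:
    """Count the items where the selected option matches the key (0-20)."""
    return sum(1 for item, key in ANSWER_KEY.items() if responses.get(item) == key)

# Example: a student answering 'a' on the first 13 items and 'b' on the rest scores 13.
example = {f"q{i}": ("a" if i <= 13 else "b") for i in range(1, 21)}
print(score_mcq(example))  # 13
```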

Design and procedure

Students were pseudorandomly assigned to take the compulsory course-based assessment (described earlier) either online or offline. Allocation to condition was based on surname, and counterbalanced between the cohorts (for Cohort 1, surnames A–K were assigned to the online condition and L–Z to the offline condition; for Cohort 2 this was reversed). The assessment was given out to students within the first few weeks of the advanced methods course, and they were allowed 3 weeks to complete and return it by the specified date. Participants assigned to the pen and paper condition were handed a copy to take away and complete in their own time; those assigned to the online condition were provided with the web address (URL) to allow them to access and complete the assessment from any computer with Internet access, at their convenience. Pen and paper assessments were submitted via a coursework box on campus and marked manually by a tutor; online assessments were submitted electronically, and answers were scored and stored by an automated computer program (a Common Gateway Interface script). The additional non-compulsory six items appended to the end of the assessment were answered by those students who chose to accept the invitation to do so, and these were stored electronically for the online condition and in hard copy format for the pen and paper condition. Most likely (but not necessarily), students would have answered these items after having first completed the assessment.
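The counterbalanced allocation rule just described can be written down compactly. The sketch below is an illustration only, assuming surnames and cohort numbers are available as simple values; the function name and example surnames are invented for the example and were not part of the study materials.

```python
# Sketch of the surname-based, counterbalanced allocation described above.
# Cohort 1: surnames A-K -> online, L-Z -> offline; reversed for Cohort 2.
def assign_mode(surname: str, cohort: int) -> str:
    in_first_half = surname.strip().upper()[0] <= "K"
    if cohort == 1:
        return "online" if in_first_half else "offline"
    return "offline" if in_first_half else "online"

print(assign_mode("Archer", 1))  # online
print(assign_mode("Mills", 1))   # offline
print(assign_mode("Archer", 2))  # offline (counterbalanced across cohorts)
```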
Results

Demographics

Table 1 shows the total number of participants assigned to take the assessment online and offline and, for each of these groups, the mean and standard deviation age (excluding two non-responses to this question) and gender distribution (excluding four non-responses to this question). There was no significant difference in the distribution of male and female students across conditions (chi-square = 1.15, n = 70, P = 0.283, Cramér's V = 0.128). However, a t-test indicated that the mean age of participants in the online condition (m = 28.19) was significantly higher than the mean age of participants in the offline condition (m = 23.45) [t(69.97) = 2.38, P = 0.02, Cohen's d = 0.57]. Only a small proportion of students (14%, n = 10) had previously completed an assignment online (86%, n = 64, had not). The proportions of students expressing a preference for online (27%, n = 20), offline (34%, n = 25), or either (no strong preference: 39%, n = 29) method of assessment were similar (and did not differ significantly, according to a one-way chi-square analysis: chi-square = 1.65, n = 74, P = 0.439, w = 0.147). No difference in the mean age of students expressing a preference for online (m = 28.7), offline (m = 26.7), or either method (m = 24.5) was observed [F(2, 56.08) = 1.23, P = 0.299, eta squared = 0.035; see Note 5].

Table 1. The total number of participants, mean and SD age, and gender distribution, for both the online and offline groups.

Mode of completion    N     Mean age (SD)    Male    Female
Online                43    28.19 (9.87)      5       37
Offline               31    23.45 (6.71)      6       22
Overall               74    26.25 (8.97)     11       59

SD, standard deviation.
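The two demographic checks reported above can be reproduced from the summary figures in Table 1, as in the sketch below (standard SciPy calls; the raw per-student ages would be needed for the Welch t-test, so only the call is indicated).

```python
# Sketch of the demographic checks reported above, using the Table 1 counts.
import numpy as np
from scipy import stats

# Gender by mode of completion: rows = online/offline, columns = male/female (n = 70).
gender_by_mode = np.array([[5, 37],
                           [6, 22]])
chi2, p, dof, expected = stats.chi2_contingency(gender_by_mode, correction=False)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")  # approximately 1.15 and 0.283, as reported

# Age difference between modes (Welch's t-test, unequal variances), given raw ages per group:
# t, p = stats.ttest_ind(ages_online, ages_offline, equal_var=False)
```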

Performance in preferred and non-preferred modes
The primary question of interest here was whether students required to use their non-preferred mode to complete an assessment would perform any differently from those required to use their preferred mode. Essentially, this question asked whether there would be an interaction between preferred and actual mode of taking an assessment. To investigate this, a 3 × 2 multifactor analysis of variance was performed with self-reported preferred mode and actual mode of completion as independent variables, and test performance (MCQ score) as the dependent variable. Table 2 shows the mean and standard deviation MCQ scores for each main and interaction effect. No significant effects were observed. Thus there was no difference in MCQ test scores between the groups who took the assessment online (m = 13.8) and offline (m = 13.1) [F(1, 68) = 0.599, P = 0.442, partial eta squared = 0.009], indicating no overall main effect of actual mode on assessment performance scores. Neither was there any difference in the assessment scores of the groups expressing a preference for online (m = 13.7) or offline (m = 13.2) modes or no strong preference (m = 13.6) [F(2, 68) = 0.323, P = 0.725, partial eta squared = 0.009], indicating no overall main effect of preferred mode on performance. There was also no interaction between preferred and actual mode of test completion in determining test scores [F(2, 68) = 0.842, P = 0.435, partial eta squared = 0.024], indicating that students who completed the assessment in their self-reported preferred mode did not perform any better or worse than students who completed it in their non-preferred mode.
Table 2. MCQ test scores (out of 20), and standard deviations, for students taking the assessment online and offline, grouped according to self-reported preferred mode (online, offline or no preference).

                     Online mode               Offline mode              Total
Preference           MCQ score   SD    n       MCQ score   SD    n       MCQ score   SD    n
Online               13.4        3.1   12      14.1        1.0    8      13.7        2.5   20
Offline              13.9        2.8   14      12.4        2.5   11      13.2        2.8   25
No preference        13.9        3.5   17      13.3        2.3   12      13.6        3.0   29
Total                13.8        3.1   43      13.1        2.2   31

MCQ, multiple choice question; SD, standard deviation.
As can be seen with reference to Table 2, the mean scores for each of the actual and preferred mode subgroups were very similar. The possibility that students who had performed well in the assessment may have been more inclined to report preferring the mode in which they took it, thus creating a potential confound, can be assessed by considering whether assessment scores were higher for those who completed it in their self-reported preferred mode than for those using their self-reported non-preferred mode. As already noted, and as can be verified with reference to the mean values in Table 2, there was no interaction between actual and preferred mode, and so there is no evidence that this was the case. In case the mode to which participants were assigned might have influenced their self-reported preferences in any way (as they likely completed the assessment prior to answering the question about their preferences), a 3 × 2 chi-square analysis was carried out to test for an association between self-reported preferred mode and actual mode of completion. No significant association was observed (chi-square = 0.078, n = 74, P = 0.962, Cramér's V = 0.033); referring to the frequencies in Table 2, the proportion of students reporting a preference for online, offline or no strong preference did not seem to differ depending on the mode to which participants had been assigned.
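For readers wishing to run the same style of analysis on their own data, the sketch below assumes a pandas DataFrame with one row per student and illustrative column names ('score', 'preference', 'mode') that are not taken from the original dataset; the choice of type II sums of squares is likewise an assumption, since the original software settings are not reported.

```python
# Sketch of the 3 x 2 ANOVA and the preference-by-mode chi-square described above.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from scipy import stats

def preference_by_mode_anova(df: pd.DataFrame) -> pd.DataFrame:
    """Main effects of preferred and actual mode on MCQ score, plus their interaction."""
    model = ols("score ~ C(preference) * C(mode)", data=df).fit()
    return sm.stats.anova_lm(model, typ=2)  # type II sums of squares

def preference_mode_association(df: pd.DataFrame):
    """3 x 2 chi-square test of association between stated preference and assigned mode."""
    table = pd.crosstab(df["preference"], df["mode"])
    return stats.chi2_contingency(table)
```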
Discussion

This study has contributed evidence to support the validity of online assessment methods by showing that the performance of students taking an online or offline assessment does not differ depending on whether they are required to use their preferred or non-preferred mode. This result goes some way towards alleviating concerns raised by previous authors about the use of online assessment approaches for summative forms of assessment (e.g. Roy & Armarego 2003; Cassady & Gridley 2005). Recommendations that students should be given a choice of online or offline assessment and/or assessment feedback methods in order not to disadvantage students whose preference is for offline methods (e.g. Matsumura & Hann 2004) seem premature, based on the present findings. These findings are thus encouraging. They add further support to existing research, which has indicated that more general computer-related attitudes also are not related to performance when taking an assessment online (Hewson et al. 2007). Because care was taken in the present study to control salient potential confounding variables, which have been problematic in many previous studies in this area (e.g. comparability of assessment or assignment content and teaching and learning experiences; prior levels of computer-related experience; location and timescale for completing an assessment or assignment), the reported findings would appear to be relatively trustworthy.

One possibility considered in relation to these key findings, however, was that the mode to which a student was assigned might have influenced the mode preference they reported, because (most likely) students would have answered the question about their preferred mode after having completed the assessment. In fact, no relationship was observed between preferred and actual mode of completion (see Results section), which would suggest that no such influence occurred. This point is related to questions about the possible influence on preferences and attitudes of prior experience of taking online assessments. In this case, there was no evidence that having taken an online assessment immediately prior to answering questions about preferences had any effect on the answers given. However, there is some existing evidence that experiences of online assessments can be associated with changes in attitudes (Sheader et al. 2006). The question of how and when taking an assessment online might lead to changes in opinions, attitudes or preferences is worthy of further investigation. It is encouraging that in the present study, although most participants had not previously taken an assignment online, there was no evidence to suggest that being required to use a non-preferred (either online or offline) assessment mode had any negative impact upon performance. One might speculate that with greater prior experience of online assessment methods such effects may become even less likely, following changes in both competency levels and attitudes.

A further possibility in relation to the present findings is that students who felt they had done well on the assessment may have been more inclined to report preferring the mode in which they had taken it; because there was no evidence that performance was better in the preferred than the non-preferred mode (see Results section), this possibility also seems unlikely. It is worth bearing in mind that, had students been observed to perform better in their self-reported preferred mode, this explanation would have been one possible interpretation of that finding; future studies should be designed with such possibilities in mind (e.g. students could be asked to report their preferences prior to taking an assessment, so that performance cannot influence these self-reports). All in all, it would seem that the condition to which students were allocated to take the assessment in the present study, and how well they performed in this condition, did not influence their reported preferences. Future research could usefully pick up on this theme, and explore further the factors which might potentially influence students' expressed preferences for online or offline modes of assessment (such as, for instance, their more generic computer-related attitudes and experiences).

The finding in the present study that preference for either online or offline assessment methods does not seem to impact upon performance in either mode is simple, but non-trivial, particularly within a context of reluctance by practitioners (especially in the social sciences and humanities) to use online methods for summative course-based assessments. The present finding, alongside existing research (e.g. Hewson et al. 2007), indicates that such methods can offer a fair and valid alternative to traditional pen and paper approaches, and thus allows practitioners to adopt such methods more confidently, taking advantage of the various benefits they can offer. In the present higher-education financial climate, the need for cost-effective teaching, learning, and assessment methods is clearly pressing, and this makes online approaches a highly attractive option.

The present study also provides data concerning the prevalence of students' self-reported preferences for either online or offline assessment (assignment) modes. Whereas previous authors have reported a majority preference for online methods (e.g. Sheader et al. 2006; Marriott 2009), the present study found a fairly even split in terms of preference for online, offline, or either (no strong preference) modes, with the largest group (39%) reporting no strong preference. Various factors may be responsible for these differing findings (aside from the presence of confounding factors in previous studies), including perhaps the greater mean age (26.25) of the sample of students taking part in the present study, compared with some previous samples (e.g. Dermo 2009). The scope of the present study did not extend to considering the issue of students' attitudes towards online assessment methods in any depth; however, this topic is important and worthy of investigation, particularly regarding how such attitudes might be related to performance when using online methods. Future studies could usefully explore responses to a broad range of attitudinal statements; indeed, future researchers may find it fruitful to pursue the theme of multi-item measures of attitudes towards online assessments. To the author's knowledge, no such validated attitudinal scale measure has yet been devised. That preference for taking an assignment online or offline was found not to be related to age is also noteworthy; follow-up studies with less computer-literate samples would be useful in verifying the extent of generalizability of this finding.

In summary, the main finding of the present study, that students' performance on online and offline assessments is not related to their self-reported preference for taking an assignment online or offline, is certainly encouraging and provides some support for the validity of online assessment methods. Several caveats should be noted, however. First, the assessment used in the present study was relatively straightforward in computer-related terms; that is, students were required to answer MCQs by selecting an answer using radio buttons, so no complex computing-related skills were required. Future studies should follow up on this finding in order to explore the extent to which it may generalize to more complex computing tasks, e.g. editing and manipulating a database, or using multimedia-based applications. The assessment used in the present study was also purposely designed to be low stakes (not least because it was being piloted in an online assessment experiment); in high-stakes contexts, using more demanding computer-based tasks, different findings may emerge. Secondly, computer experience levels were relatively high in the present sample, and likely also in previous studies that have found a lack of effect of mode/individual difference factors (students as a population tending to be relatively computer literate). Future studies may usefully explore the extent to which the present findings generalize to less computer-literate populations. It is also unlikely that the present sample, or those used in the studies cited here, included students with high levels of computer anxiety; further studies using populations with higher levels of computer anxiety, or computer phobia (Thorpe & Brosnan 2007), would be informative. Also noteworthy is that the present study used an unproctored online assessment context, whereas many previous studies have made use of proctored online assessment tasks (in-class exams). Given the potential benefits of flexible delivery in an e-learning context, more studies exploring the use of online assessments where students are free to complete them with flexibility of time and location would be worthwhile.

A final noteworthy point relates to the levels of statistical power achieved in the present study, and in previous studies in this area. In the present study, the question of primary interest was whether students would perform differently when required to use their preferred or non-preferred mode; for the related analysis (the 3 × 2 analysis of variance), the sample size of N = 74 was sufficient to detect a large effect (f = 0.37) at the 0.8 level, but the power to detect a medium effect did not reach this level. Although power lower than 0.8 is common in this area, for practical reasons relating to limited access to large samples (in fact power is rarely reported, but can be estimated from reported sample sizes), future studies using larger sample sizes would be highly valuable in further confirming the present results. It should be noted, however, that although power was lower than ideally desirable here, only a very small effect size was observed nevertheless.
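As a rough check on these power figures, the sketch below recomputes power from the noncentral F distribution for the 2-numerator-df terms of the 3 × 2 design (df = 2, 68, as in the reported F tests), with alpha = 0.05 and N = 74; the exact procedure used for the original calculation is not reported, so this is an approximation.

```python
# Approximate power for the 3 x 2 ANOVA terms with numerator df = 2 and denominator df = 68.
from scipy.stats import f as f_dist, ncf

alpha, N, df1, df2 = 0.05, 74, 2, 68
f_crit = f_dist.ppf(1 - alpha, df1, df2)            # critical F at alpha = 0.05
for label, effect_f in [("large (f = 0.37)", 0.37), ("medium (f = 0.25)", 0.25)]:
    lam = (effect_f ** 2) * N                        # noncentrality parameter
    power = 1 - ncf.cdf(f_crit, df1, df2, lam)       # P(reject H0 | effect of size f)
    print(f"{label}: power = {power:.2f}")
# The large-effect figure comes out close to 0.8 and the medium-effect figure well below it,
# consistent with the statement above.
```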
In conclusion, the validity of online (summative) assessment practices, which are still in their infancy but becoming increasingly widely adopted, is clearly a highly important consideration. The present study lends support to the validity of these approaches to assessment, at least within the context of the type of assessment and sample used here. While validity is clearly a crucial issue, future research may also usefully explore the quality of the student experience when required to use online assessment methods. Even if these methods are found to be valid, in the sense that they can effectively measure course-related learning outcomes without interference from factors relating to computer-related attitudes and preferences (where these do not form part of the intended learning outcomes), the question of the extent to which students feel comfortable and confident when using these methods also deserves consideration. If students do not hold positive attitudes towards, and experiences of using, online assessment methods, then this may call into question whether such methods are appropriate. Of course, in the long term, attitudes can change, and experience and training with new methods and approaches can help to initiate such change. The author would not wish to suggest that students' attitudes should play a determining role in deciding whether to introduce online assessment practices, but surely they should be a consideration if the quality of the student experience is to be given the emphasis that it deserves.

Author note

The author gratefully acknowledges the Higher Education Academy Psychology Network, which provided a grant to support this research.

Notes
1. This sample comprised two cohorts (Cohort 1, n = 33; Cohort 2, n = 41) taking the same course, delivered by the same lecturers, in consecutive academic years.
2. For example, one question asked which of a number of P values would be considered significant; another asked which of a selection of statements best represented the way that social constructionists view language.
3. A full description of this tool, and implementation details, is provided in Hewson (2007); it consists of an HTML form with embedded JavaScript commands and a CGI script which processes the data submitted.
4. The term 'assignment' was used in the questionnaire to maintain consistency with the module descriptions of the assessment elements. Elsewhere in this paper, the term 'assessment' is used for consistency with the relevant existing literature, which would normally use this term to refer to multiple choice type assessments of the variety presented here, 'assignment' being reserved for essay or report type assessments.
5. A one-way analysis of variance and Levene's test indicated a lack of homogeneity of variances, so the Brown–Forsythe test statistic was used.

References
Aisbitt S. & Sangster A. (2005) Using Internet-based on-line assessment: a case study. Accounting Education: An International Journal 14, 383–394.
Angus S.D. & Watson J. (2009) Does regular online testing enhance student learning in the numerical sciences? Robust evidence from a large data set. British Journal of Educational Technology 40, 255–272.
Boyle A. & Hutchison D. (2009) Sophisticated tasks in e-assessment: what are they and what are their benefits? Assessment & Evaluation in Higher Education 34, 305–319.
Brosnan M.J. (1998a) Technophobia. Routledge, London.
Brosnan M.J. (1998b) The impact of computer anxiety and self-efficacy upon performance. Journal of Computer Assisted Learning 14, 223–234.
Buchanan T. (2000) The efficacy of WWW-mediated formative assessment. Journal of Computer Assisted Learning 16, 193–200.
Buchanan T. & Smith J.L. (1999) Using the Internet for psychological research: personality testing on the world-wide web. British Journal of Psychology 90, 125–144.
Cassady J.C. & Gridley B.E. (2005) The effects of online formative and summative assessment on test anxiety and performance. Journal of Technology, Learning, and Assessment 4, 1–30. Available at: http://ejournals.bc.edu/ojs/index.php/jtla/article/view/1648/1490 (last accessed 9 October 2009).
Clarke S., Lindsay K., McKenna C. & New S. (2004) INQUIRE: a case study in evaluating the potential of online MCQ tests in a discursive subject. ALT-J, Research in Learning Technology 12, 249–260.
Cronk B.C. & West J. (2002) Personality research on the Internet: a comparison of web-based and traditional instruments in take-home and in-class settings. Behavior Research Methods, Instruments & Computers 34, 177–180.
Davis R.N. (1999) Web-based administration of a personality questionnaire: comparison with traditional methods. Behavior Research Methods, Instruments & Computers 31, 572–577.
Dennick R., Wilkinson S. & Purcell N. (2009) Online eAssessment: AMEE guide no. 39. Medical Teacher 31, 192–206.
Dermo J. (2009) E-assessment and the student learning experience: a survey of student perceptions of e-assessment. British Journal of Educational Technology 40, 203–214.
Epstein J., Klinkenberg W.D., Wiley L. & McKinley L. (2001) Insuring sample equivalence across Internet and paper-and-pencil assessments. Computers in Human Behavior 17, 339–346.
Goldberg A.L. & Pedulla J.J. (2002) Performance differences according to test mode and computer familiarity on a practice graduate record exam. Educational and Psychological Measurement 62, 1053–1067.
Hemard D. & Cushion S. (2003) Design and evaluation of an online test: assessment conceived as a complementary CALL tool. Computer Assisted Language Learning 16, 119–139.
Henley D.C. (2003) Use of web-based formative assessment to support student learning in a metabolism/nutrition unit. European Journal of Dental Education 7, 116–122.
Herrero J. & Meneses J. (2006) Short web-based versions of the perceived stress (PSS) and Center for Epidemiological Studies-Depression (CESD) Scales: a comparison to pencil and paper responses among Internet users. Computers in Human Behavior 22, 830–846.
Hewson C. (2007) Web-MCQ: a set of methods and freely available open source code for administering online multiple choice question assessments. Behavior Research Methods 39, 471–481.
Hewson C. & Charlton J.P. (2005) Measuring health beliefs on the Internet: a comparison of paper and Internet administrations of the Multidimensional Health Locus of Control Scale. Behavior Research Methods, Instruments & Computers 37, 691–702.
Hewson C., Charlton J. & Brosnan M. (2007) Comparing online and offline administration of multiple choice question assessments to psychology undergraduates: do assessment modality or computer attitudes influence performance? Psychology Learning and Teaching 6, 37–46.
Jordan S. & Mitchell T. (2009) e-Assessment for learning? The potential of short-answer free-text questions with tailored feedback. British Journal of Educational Technology 40, 371–385.
Liu M., Papathanasiou E. & Yung-Wei H. (2001) Exploring the use of multimedia examination formats in undergraduate teaching: results from the fielding testing. Computers in Human Behavior 17, 225–248.
Mahar D., Henderson R. & Deane F. (1997) The effects of computer anxiety, state anxiety, and computer experience on users' performance of computer based tasks. Personality & Individual Differences 22, 683–692.
Marriott P. (2009) Students' evaluation of the use of online summative assessment on an undergraduate financial accounting module. British Journal of Educational Technology 40, 237–254.
Matsumura S. & Hann G. (2004) Computer anxiety and students' preferred feedback methods in EFL writing. The Modern Language Journal 88, 403–415.
Meyerson P. & Tryon W.W. (2003) Validating Internet research: a test of the psychometric equivalence of Internet and in-person samples. Behavior Research Methods, Instruments & Computers 35, 614–620.
Ricketts C. & Wilks S.J. (2002) Improving student performance through computer-based assessment: insights from recent research. Assessment & Evaluation in Higher Education 27, 475–479.
Roy G.G. & Armarego J. (2003) The development of on-line tests based on multiple choice questions. In Web-Powered Databases (eds D. Taniar & W. Rahayu), pp. 121–143. Idea Group, Hershey, PA. Available at: http://eng.murdoch.edu.au/~jocelyn/papers/WPDBv2.pdf (last accessed 30 March 2005).
Sheader E., Gouldsborough I. & Grady R. (2006) Staff and student perceptions of computer-assisted assessment for physiology practical classes. Advances in Physiological Education 30, 174–180.
Sieber V. (2009) Diagnostic online assessment of basic IT skills in 1st-year undergraduates in the Medical Sciences Division, University of Oxford. British Journal of Educational Technology 40, 215–226.
Thorpe S.J. & Brosnan M.J. (2007) Does computer anxiety reach levels which conform to DSM IV criteria for specific phobia? Computers in Human Behavior 23, 1258–1272.
Walker D.J., Topping K. & Rodrigues S. (2008) Student reflections on formative e-assessment: expectations and perceptions. Learning, Media and Technology 33, 221–234.
Warburton B. (2009) Quick win or slow burn: modelling UK HE CAA uptake. Assessment & Evaluation in Higher Education 34, 257–272.
Whitelock D. (2009) Editorial: e-assessment: developing new dialogues for the digital age. British Journal of Educational Technology 40, 199–202.


