
Week 1

Introduction to language testing/assessment
What do you think are the purposes of assessment?
Give the teacher information about where the Ss are at the
moment, to help decide what they need to teach next;
Give the Ss information about what they know, so that they also
have an awareness of what they need to learn or review;
Assess current teaching;
Motivate Ss to learn or review specific material;
Get a noisy class to keep quiet and concentrate;
Provide a clear indication that the class has reached a station in
learning, such as the end of a unit;
Get Ss to make an effort (in doing the test itself), which is likely
to lead to better results and a feeling of satisfaction;
Provide Ss with a sense of achievement and progress in their
learning.
[Diagram labels: Teaching; Assessment (formal, informal, alternative); Tests]
Formal assessment
Alternative Assessments
Evaluate students' knowledge or skills in a context that approximates
the real world or real life. The emphasis is on the process as well as
the product.
Examples include:
Model of a Turkish village with a written description.
Family tree with ancestor anecdotes.
Creating an Italian food menu and a meal.

Tests
A method of measuring a person's ability,
knowledge, or performance in a given domain
The instrument: a set of techniques or procedures
Answers the question "How much?"
Testing helps students better understand
standards and quality in terms of their production.
It is primarily for students.
Alternative Assessment | Traditional Assessment (Testing)
Direct examination of student performance and knowledge on real-life-like tasks | Indirect examination of Ss' performance and knowledge; test items represent competence
Formative | Summative
Open-ended, creative answers | Focus on the right answer
Continuous long-term assessment (process-oriented) | One-shot standardized exams (product-oriented)
Untimed, free-response format, e.g. essays, oral presentations, portfolios | Timed, MC, T/F, or matching format
Multiple modes of assessment, e.g. conduct research, write, revise, discuss papers, provide oral analysis of an event or reading (interactive performance) | Modes of assessment usually limited to paper-and-pencil, one-answer questions (non-interactive performance)
Contextualized communicative tasks | Decontextualized test items
Individualized feedback | Scores as feedback
Kinds of tests
Language tests can be described from
different perspectives, depending on
our criteria of classification.
Dimension One: How test scores are reported
Characteristic | Norm-Referenced | Criterion-Referenced
Type of interpretation | Relative (a S's performance is compared to that of all other Ss in percentile terms) | Absolute (a S's performance is compared only to the amount or percentage of material learned)
Type of measurement | To measure general language abilities or proficiencies | To measure specific objectives-based language points
Distribution of scores | Normal distribution of scores around the mean | Varies, usually non-normal (Ss who know all the material should score 100%)
Purpose of testing | Spread Ss out along a continuum of general abilities or proficiencies | Assess the amount of material known, or learned, by each student
Test structure | A few relatively long subtests with a variety of question contents | A series of short, well-defined subtests with similar question contents
Knowledge of questions | Ss have little or no idea what content to expect in questions | Ss know exactly what content to expect
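The difference between relative and absolute interpretation can be made concrete with a small calculation. The sketch below is only an illustration: the class scores, the maximum score, and the 80% cut score are invented for the example, and the function names are hypothetical. It reports the same raw score once as a percentile rank (norm-referenced) and once as a percentage of material answered correctly (criterion-referenced).

```python
def percentile_rank(score, all_scores):
    """Norm-referenced view: percentage of test takers who scored below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100.0 * below / len(all_scores)


def percent_mastered(score, max_score):
    """Criterion-referenced view: percentage of the tested material answered correctly."""
    return 100.0 * score / max_score


# Hypothetical raw scores out of 40 for a class of eight Ss (invented data).
class_scores = [12, 18, 22, 25, 25, 30, 34, 38]
max_score = 40
cut_score = 80.0  # assumed mastery threshold in percent, not taken from the slides

for raw in (25, 34):
    rank = percentile_rank(raw, class_scores)
    mastered = percent_mastered(raw, max_score)
    print(f"raw={raw}: percentile rank={rank}, percent correct={mastered}, "
          f"meets cut score={mastered >= cut_score}")
```

The percentile rank changes if the rest of the class changes, while the percentage of material mastered depends only on the student's own answers.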
Dimension Two: Purpose/Content
Norm-referenced: Proficiency, Placement. Criterion-referenced: Achievement, Diagnostic.
Test Qualities | Proficiency | Placement | Achievement | Diagnostic
Detail of info. | Very general | General but according to program | Specific | Very specific
Focus | Language ability regardless of any training program | Learning points all levels and skills | Terminal objectives of course or program | Terminal and enabling objectives of courses
Purpose of decision | Overall comparison of an individual with other individuals | To find each student's appropriate level | To determine the degree of learning for advancement or graduation | To inform students and teachers of objectives needing more work
Relationship to program | Comparisons with other institutions | Comparisons within program | Directly related to objectives of program | Directly related to objectives still needing work
When administered | Before entry & sometimes at exit | Beginning of program | End of courses | Beginning and/or middle of courses
Interpretation of scores | Spread of scores | Spread of scores | Number & amount of objectives learned | Number & amount of objectives learned
Dimension Three: Tasks to be performed in a test
Direct Tests
Candidate performs precisely the skill being measured
Straightforward assessment of performance and interpretation of scores
Positive backwash effect
Easier with productive skills (i.e. writing & speaking)
Small sample of tasks and teaching objectives (generalizability problem)
Better for final achievement and proficiency testing as long as a wide sample of tasks is used.
Indirect Tests
Measure the abilities which underlie the skills being measured (e.g. testing writing or pronunciation through tasks that require recognition, identification skills)
Allow for a representative sample of tasks and teaching objectives
Weak relationship between the performance on the task and on the skill
Dimension Four: What language components are tested
Discrete-point Tests | Integrative Tests
Testing of one element at a time; performance in very restricted areas of TL use. | Combine many language elements in the completion of the task (e.g. writing a composition, dictation, cloze tests)
Based on the assumption that language can be broken down into its component parts | Based on the assumption that language proficiency is indivisible (Oller's unitary trait hypothesis)
Tend to be indirect tests | Tend to be direct
Used for diagnostic purposes | Used to test overall language ability
Dimension Five: How tests are scored (methods of scoring)
Subjective Tests
Judgement is required
There are degrees of subjectivity
Reliable (objectified) subjective scoring is possible through the use of a) precise rating scales, b) multiple independent raters
Require special expertise in content area on the part of the scorer
Objective Tests
Do not involve judgement
Do not require special expertise in the content area on the part of the scorer
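To illustrate how multiple independent raters can make subjective scoring more dependable, the sketch below averages two raters' band scores and checks how often they agree. It is a simplified illustration, not a full reliability analysis: the ratings are invented, the 1-5 band scale is an assumption, and the function names are hypothetical.

```python
def combined_scores(rater_a, rater_b):
    """Average the two independent ratings given to each script."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


def agreement_rate(rater_a, rater_b, tolerance=0):
    """Proportion of scripts on which the raters differ by at most `tolerance` bands."""
    pairs = list(zip(rater_a, rater_b))
    agree = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return agree / len(pairs)


# Hypothetical band scores (1-5) given by two raters to the same six essays.
rater_a = [4, 3, 5, 2, 4, 3]
rater_b = [4, 2, 5, 3, 4, 5]

print("Combined scores:", combined_scores(rater_a, rater_b))
print("Exact agreement:", agreement_rate(rater_a, rater_b))
print("Agreement within one band:", agreement_rate(rater_a, rater_b, tolerance=1))
```

A low agreement rate would suggest that the rating scale needs to be made more precise or that the raters need further training before the scores are used.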
Dimension Six: Test formats
Written
Oral
Dimension Seven: What technology is used in test administration

Computer-based testing (CBT):
Electronic equivalent of traditional paper-and-pencil
tests. Measurement of test quality
and student scores often uses classical
test theory (which will be discussed in this
course).
Computer adaptive testing (CAT):
The computer selects the next test item for the
student based on his/her response to the
previous item. It uses Item Response Theory.
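The adaptive idea can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not how any particular CAT system works: it assumes the Rasch (one-parameter) IRT model, a hypothetical six-item bank with invented difficulties, and a deliberately simplified ability update (a step up after a correct answer, a step down after a wrong one, with the step halved each time) in place of a proper statistical estimate.

```python
import math


def p_correct(ability, difficulty):
    """Rasch (one-parameter IRT) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def next_item(ability, item_bank, used):
    """Pick the unused item whose difficulty is closest to the current ability estimate."""
    candidates = [i for i in range(len(item_bank)) if i not in used]
    return min(candidates, key=lambda i: abs(item_bank[i] - ability))


def run_cat(item_bank, answer_fn, n_items=5, start=0.0, step=1.0):
    """Administer `n_items` adaptively; return the final estimate and the item order."""
    ability, used, administered = start, set(), []
    for _ in range(n_items):
        i = next_item(ability, item_bank, used)
        used.add(i)
        administered.append(i)
        # Simplified update (an assumption, not maximum-likelihood estimation):
        # move up after a correct answer, down after a wrong one, then halve the step.
        ability += step if answer_fn(i) else -step
        step /= 2
    return ability, administered


# Hypothetical item difficulties on the logit scale (invented data).
item_bank = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
true_ability = 1.5

# Deterministic simulated student: correct whenever the item is easier than true_ability.
ability, order = run_cat(item_bank, lambda i: item_bank[i] < true_ability)
print("Items administered (by index):", order)
print("Final ability estimate:", ability)
```

Choosing the item closest in difficulty to the current estimate reflects the fact that, under the Rasch model, an item is most informative when the probability of a correct answer is about 0.5, which is what `p_correct` returns when ability equals difficulty.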
Backwash/Washback/Impact
The effect of testing on teaching and learning. (If a test is regarded
as important, then preparation for the test will dominate all teaching
and learning activities.)

Beneficial backwash if the test is supportive of good teaching
practice, and exerts a corrective influence on bad teaching.

Harmful backwash if test content and test techniques do not match
the course objectives.

Teachers should know that the way they teach and the way they test
should go hand in hand. There should be a strong fit between
teaching and testing.
Achieving Beneficial Backwash
Test the abilities whose development you want to
encourage
Sample widely and unpredictably (reduce the
guessing factor)
Use direct testing
Base tests on objectives rather than on detailed
teaching and textbook content
Ensure that the test format is known and
understood by Ss and teachers (Ss must be
familiar with task types).
