
Week 2 Sutrisno Sadji Evenddy, M.Pd.

Practicality, Reliability, Validity, Authenticity, Washback

A practical test is not expensive, stays within appropriate time constraints, is relatively easy to administer, and has a scoring/evaluation procedure that is specific and time-efficient.

1. Are administrative details clearly established before the test?
2. Can students complete the test reasonably within the set time frame?
3. Is the cost of the test within budget limits?

Consistency of assessment results (Linn & Gronlund).

A test is reliable if it yields similar results when you give the same test to the same student or matched students on two different occasions (Brown, 2004).
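The "similar results on two occasions" idea can be put into numbers. One common approach, which the slides do not name and which is assumed here only for illustration, is to correlate the scores from the two sittings (a test-retest coefficient). The function name and the score lists below are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores: the same five students, same test, two occasions.
first_sitting = [72, 85, 60, 90, 78]
second_sitting = [70, 88, 63, 91, 75]

test_retest_r = pearson_r(first_sitting, second_sitting)
print(f"test-retest correlation: {test_retest_r:.2f}")
```

A coefficient close to 1 suggests the two sittings rank students consistently; how high is "high enough" depends on the stakes of the test and is not specified in the source.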

Student-related reliability, Rater reliability, Test administration reliability, Test reliability

The most common learner-related issues in reliability are caused by temporary illness, fatigue, a bad day, anxiety, and other physical or psychological factors.

Inter-rater reliability: a problem when two or more scorers yield inconsistent scores for the same test. Factors: lack of attention to scoring criteria, inexperience, inattention, etc.

Intra-rater reliability: a problem when a single scorer is inconsistent. Factors: unclear scoring criteria, fatigue, bias toward particular "good" and "bad" students, or simple carelessness.
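Rater consistency can also be checked numerically. The sketch below uses Cohen's kappa, a chance-corrected agreement statistic; the slides do not mention it, so treat it as an assumed illustration. The same function could compare two raters (inter-rater) or one rater's scores on two occasions (intra-rater). The band scores are hypothetical.

```python
from collections import Counter


def cohen_kappa(scores_a, scores_b):
    """Cohen's kappa for two lists of categorical scores on the same scripts."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(scores_a)
    # Proportion of scripts on which the two sets of scores agree exactly.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(scores_a)
    counts_b = Counter(scores_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical band scores (1-5) given by two raters to the same eight essays.
rater_a = [3, 4, 2, 5, 3, 4, 2, 3]
rater_b = [3, 4, 3, 5, 3, 4, 2, 2]

kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa: {kappa:.2f}")
```

Kappa is 1 for perfect agreement and near 0 when agreement is no better than chance. It treats every disagreement as equally serious; for ordered bands a weighted variant may be more appropriate.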

Unreliability can be caused by administration factors, e.g., noise from outside, photocopying variations, room conditions, and even the condition of desks and chairs.

Factors that cause unreliability: If a test is too long, test-takers may become fatigued by the time they reach the later items and hastily respond incorrectly. Ambiguous items are another cause.

Measuring what should be measured


Content-related evidence, Criterion-related evidence, Construct-related evidence, Consequential validity, Face validity

A test has content-related evidence of validity if it samples the subject matter about which conclusions are to be drawn, and if it requires the test-taker to perform the behavior that is being measured.

Criterion-related evidence is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure that has already been demonstrated to be valid.

Example: imagine a hands-on driving test has been shown to be an accurate test of driving skills. The written driving test can then be validated with a criterion-related strategy: its scores are compared with the scores from the hands-on driving test.

1. Concurrent validity (empirical validity)

A test has concurrent validity if its results are supported by other concurrent performance beyond the assessment itself.

e.g., the validity of a high score on the final exam of a foreign language course will be substantiated by actual proficiency in the language.

2. Predictive validity

Used to assess (and predict) a test-taker's likelihood of future success, e.g., SNMPTN.

Construct-related evidence: how well performance on the assessment can be interpreted as a meaningful measure of some characteristic or quality.

Consequential validity: how well the use of assessment results accomplishes intended purposes and avoids unintended effects.

Face validity refers to the degree to which a test looks right, and appears to measure the knowledge or ability it claims to measure, based on the subjective judgment of the examinees who take it, the administrative personnel who decide on its use, and other psychometrically unsophisticated observers (Mousavi in Brown, 2004).

In an authentic test, the language is as natural as possible. Items are contextualized rather than isolated. Topics are meaningful (relevant, interesting) for the learner. Some thematic organization to items is provided, such as through a story line or episode. Tasks represent, or closely approximate, real-world tasks.

Contextualized
"Going to"
1. What _______ this weekend?
a. you are going to do
b. are you going to do
c. your gonna do
2. I'm not sure. _______ anything special?
a. Are you going to do
b. You are going to do
c. Is going to do

Decontextualized
1. There are three countries I would like to visit. One is Italy. _______
a. The other is New Zealand and other is Nepal
b. The others are New Zealand and Nepal
c. Others are New Zealand and Nepal
2. When I was twelve years old, I used _______ every day.
a. swimming
b. to swimming
c. to swim

The effect of testing on teaching and learning (Hughes in Brown, 2004). Generally refers to the effects tests have on instruction in terms of how students prepare for the test (Brown, 2004).
