
Reliability

The consistency of scores obtained from one instrument to another, or from the same instrument across different groups or occasions. The main methods of estimating reliability are test-retest, equivalency, inter-rater, internal consistency, and split-half.

Test-retest reliability

A method in which the same subjects are tested again at a later date, and the correlation between the two sets of results is examined. For example, an educational test taken again after a month should yield results similar to those of the original administration.
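
As a rough illustration, test-retest reliability can be estimated as the Pearson correlation between the two administrations. A minimal sketch, assuming made-up scores for eight students (none of these numbers come from the text; requires Python 3.10+ for statistics.correlation):

```python
from statistics import correlation  # Python 3.10+

# Hypothetical scores for eight students tested twice, one month apart.
first_administration = [72, 85, 90, 64, 78, 88, 55, 70]
second_administration = [70, 83, 92, 66, 75, 90, 58, 72]

# A Pearson r near 1.0 indicates stable (reliable) scores over time.
# The same computation estimates equivalency reliability when the two
# lists hold scores from two parallel forms rather than two occasions.
r = correlation(first_administration, second_administration)
print(f"Test-retest reliability: r = {r:.3f}")
```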

Equivalency reliability
A method for assessing the extent to which two forms of a test measure identical concepts at an identical level of difficulty. It is determined by relating the two sets of test scores to one another to highlight the degree of relationship; computationally this is the same correlation used in the test-retest sketch above, applied to scores from the two forms.

Inter-rater reliability
A method in which two or more independent judges score the test. The scores are then compared to determine the consistency of the raters' estimates. One way to test inter-rater reliability is to have each rater assign each test item a score and then measure how closely those assignments agree.
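
One common statistic for this comparison is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch, assuming two hypothetical judges assigning pass/fail scores to the same ten items:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each assigned categories at random at their observed rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two judges scoring the same ten test items as pass ("P") or fail ("F").
judge_1 = ["P", "P", "F", "P", "F", "P", "P", "F", "P", "P"]
judge_2 = ["P", "F", "F", "P", "F", "P", "P", "F", "P", "P"]
print(f"Cohen's kappa: {cohens_kappa(judge_1, judge_2):.3f}")
```

Kappa values above roughly 0.6 are often read as substantial agreement, though the cutoffs in use vary.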

Internal consistency reliability

A method used to judge the consistency of results across items on the same test. Test items that measure the same construct are compared to determine the test's internal consistency. For example, when one question is very similar to another question on the test, consistent responses to the pair indicate that the test is measuring the same construct reliably.
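
Internal consistency across items is often summarized with Cronbach's alpha. A minimal sketch, assuming hypothetical right/wrong (1/0) item scores for five students:

```python
from statistics import pvariance  # population variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha over a list of items, each a list of per-student scores."""
    k = len(item_scores)
    sum_item_variances = sum(pvariance(item) for item in item_scores)
    total_scores = [sum(student) for student in zip(*item_scores)]  # per-student totals
    return (k / (k - 1)) * (1 - sum_item_variances / pvariance(total_scores))

# Four items answered by five students (0 = wrong, 1 = right); invented data.
items = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1],
]
print(f"Cronbach's alpha: {cronbach_alpha(items):.3f}")
```

Alpha values around 0.7 or higher are commonly treated as acceptable for classroom tests, though the threshold depends on the stakes of the decision.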

Split-half reliability
A measure of consistency in which a test is split into two halves and the scores for each half are compared with one another. If the two halves agree, the experimenter can conclude that the test is most likely measuring the same thing throughout.
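
A common way to compute this is to correlate scores on the odd-numbered items with scores on the even-numbered items, then apply the Spearman-Brown correction, since each half is only half the length of the full test. A minimal sketch with invented item scores:

```python
from statistics import correlation  # Python 3.10+

def split_half_reliability(scores_by_student):
    """Correlate odd-item and even-item half scores, then step up with
    the Spearman-Brown correction to estimate full-test reliability."""
    odd_halves = [sum(s[0::2]) for s in scores_by_student]
    even_halves = [sum(s[1::2]) for s in scores_by_student]
    r_half = correlation(odd_halves, even_halves)
    return (2 * r_half) / (1 + r_half)  # Spearman-Brown step-up

# Six students answering an eight-item test (0 = wrong, 1 = right).
scores = [
    [1, 1, 1, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 0, 0, 1],
]
print(f"Split-half reliability: {split_half_reliability(scores):.3f}")
```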

Factors affecting reliability

Test length - the longer a test is, the more reliable it tends to be (see the sketch after this list).
Speed - when a test is a speed test, reliability can be problematic.
Group homogeneity - the more heterogeneous the group of students who take the test, the more reliable the measure will be.
Item difficulty - reliability will be low if a test is so easy or so difficult that scores cluster together.
Objectivity - objectively scored tests show higher reliability than subjectively scored tests.
Test-retest interval - the shorter the time interval between two administrations of a test, the higher the reliability will be.
Variation within the testing situation - errors in the testing situation can cause test scores to vary.
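
The test-length factor can be quantified with the Spearman-Brown prophecy formula, which predicts reliability when a test is lengthened with comparable items. A hypothetical sketch (the 0.70 starting reliability and the doubling factor are illustrative, not from the text):

```python
def spearman_brown_prophecy(current_reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor
    (e.g. 2.0 doubles the number of comparable items)."""
    r = current_reliability
    return (length_factor * r) / (1 + (length_factor - 1) * r)

# Doubling a test whose current reliability is 0.70:
print(f"Predicted reliability: {spearman_brown_prophecy(0.70, 2.0):.3f}")  # ~0.824
```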

Relationship between validity and reliability

Reliability is an essential component of validity but, on its own, is not a sufficient measure of validity. A test can be reliable but not valid, whereas a test cannot be valid yet unreliable.

Validity
Validity refers to the accuracy of an assessment -- whether or not it measures what it is supposed to measure. Even if a test is reliable, it may not provide a valid measure. Imagine a bathroom scale that consistently tells you that you weigh 130 pounds. The reliability (consistency) of this scale is very good, but it is not accurate (valid) because you actually weigh 145 pounds (perhaps you re-set the scale in a weak moment). Since teachers, parents, and school districts make decisions about students based on assessments (such as grades, promotions, and graduation), the validity inferred from the assessments is essential -- even more crucial than the reliability.

Types of validity

Content validity
Definition: The extent to which the content of the test matches the instructional objectives.
Example: A semester or quarter exam that only includes content covered during the last six weeks is not a valid measure of the course's overall objectives -- it has very low content validity.

Criterion validity
Definition: The extent to which scores on the test are in agreement with (concurrent validity) or predict (predictive validity) an external criterion.
Example: If the end-of-year math tests in 4th grade correlate highly with the statewide math tests, they would have high concurrent validity.

Construct validity
Definition: The extent to which an assessment corresponds to other variables, as predicted by some rationale or theory.
Example: If you can correctly hypothesize that ESOL students will perform differently on a reading test than English-speaking students (because of theory), the assessment may have construct validity.
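
To make the criterion example concrete, concurrent validity can be checked by correlating the test's scores with the external criterion (here, a statewide test). A minimal sketch with invented scores for eight 4th-grade students:

```python
from statistics import correlation  # Python 3.10+

# Hypothetical end-of-year classroom math scores and statewide math
# test scores for the same students.
classroom_scores = [81, 92, 67, 74, 88, 59, 95, 70]
statewide_scores = [78, 95, 70, 71, 90, 62, 93, 74]

# A high coefficient supports the concurrent validity of the classroom test.
r = correlation(classroom_scores, statewide_scores)
print(f"Concurrent validity: r = {r:.3f}")
```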

Factors that influence validity assessment

Nature of the group
The validity coefficient should be consistent for subgroups which differ in some characteristic (e.g. age, gender, educational level).

Criterion-predictor relationship
There must be a linear relationship between predictor and criterion.

Moderator variables
Variables such as age, gender, and personality characteristics may help to predict performance for particular groups only.
