

Descriptive Statistics
A cheat sheet to help you navigate the statistics a special educator needs to understand.
Key Terms

Standard Deviation
A unit of measurement that represents the typical amount that a score can be expected to vary from the mean in a given set of data.
SE Relevance: Average scores fall within one standard deviation of the mean.

Reliability
The dependability or consistency of an instrument across time or items.
SE Relevance: We must have confidence that the same score would be given to the same student if the assessment were administered more than once. The best tests in SE are those with high reliability.

Distribution
Includes the normal distribution, the bell curve, where 100 is the mean, median, and mode. A distribution may also be positively or negatively skewed, which occurs with a smaller score sample.
SE Relevance: The bell curve can help describe student achievement.

Correlation
A statistical method of observing the degree of relationship between two sets of data on two variables.
SE Relevance: Reliability is determined based upon administration of an instrument and one other variable. There should be a relationship, positive or negative.

Variance
Describes the total amount that a group of scores varies in a set of data.
SE Relevance: This is the first step in finding the standard deviation. It describes how much variability exists in the scores of the data set.

Standard Error of Measurement
The amount of error determined to exist in a specific instrument, calculated using the instrument's standard deviation and reliability.
SE Relevance: An instrument with a large SEM is less desirable than one with a small SEM.

Standard Score
Derived scores that represent equal units.
SE Relevance: Used mostly with standardized assessment in a normal distribution, where scores between 85 and 115 are considered average (within 1 standard deviation).

Confidence Interval
The range of scores around an obtained score, determined by adding and subtracting standard error of measurement units.
SE Relevance: Describes how confident an assessor can be about where in the range of scores a student's true score falls.

Percentile Scores
Scores that express the percentage of students who scored as well as or lower than a given student's score, based upon 100%.
SE Relevance: Used in standardized assessment and curriculum-based measures.

Validity
The quality of the test; the degree to which an instrument measures what it was designed to measure.
SE Relevance: If a special educator is seeking particular information about a student's ability, he/she would want to know the validity of the instrument for that particular skill.

Z Scores
Derived scores that are expressed in standard deviation units.
SE Relevance: Used in psychological testing, where the mean is described as 0 and other scores are described as +1 (1 SD above the mean), and so on.
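Several of these terms are connected by simple arithmetic. Here is a minimal Python sketch (the scale values come from the Standard Score and Z Score entries above; the conversions shown are illustrative, not part of the original handout) of how a z score maps onto the standard-score scale where the mean is 100 and one standard deviation is 15:

```python
# Standard scores and z scores describe the same position in a
# normal distribution; only the scale differs. Scale assumed here:
# mean 100, SD 15, as in the Standard Score entry above.

MEAN, SD = 100, 15

def z_to_standard(z):
    """A z score of 0 is the mean; +1 is one SD above the mean."""
    return MEAN + z * SD

def standard_to_z(score):
    """Express a standard score in standard deviation units."""
    return (score - MEAN) / SD

print(z_to_standard(0))    # 100: the mean
print(z_to_standard(1))    # 115: upper edge of the average range
print(standard_to_z(85))   # -1.0: lower edge of the average range
```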

[Figure] Key terms related to gathering data to determine the distribution of scores, as well as variance and standard deviation.

[Figure] Examples of normal distribution, negatively skewed distribution, and positively skewed distribution.

[Figure] Scatterplots illustrating correlation: a) no correlation, b) strong positive, c) strong negative, d) and e) low correlation with little change in one variable, f) spurious high correlation.

Sierra Nevada College SPED 510

Standard Error of Measurement


Error always exists in measurement. It accounts for the slight differences introduced by who is assessing the student, the student's mood on a given day, differences in maturity, the time of day of an assessment, and so on. Many variables can affect how a test is taken. To account for this, an equation was developed to determine a range of scores appropriate for a student, rather than a single specific score. Standard error of measurement (SEM) is the amount of error determined to exist in a specific instrument, calculated using the instrument's standard deviation and reliability. Conceptually, the error is what remains when a student's true score is subtracted from the obtained score. Once the standard error of measurement has been determined, a range of scores around the obtained score can be calculated by adding and subtracting SEM units. This range is known as the confidence interval. The standard error of measurement is important because it can erase differences in scores that might otherwise be seen as significant. That is why this data is important to collect and evaluate when looking to qualify a student for special education.
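As a concrete illustration, here is a minimal Python sketch of the calculation just described; the standard deviation, reliability coefficient, and obtained score are invented for the example:

```python
import math

# Hypothetical instrument properties (invented for illustration).
sd = 15             # the instrument's standard deviation
reliability = 0.91  # the instrument's reliability coefficient (r)

# SEM = SD multiplied by the square root of (1 - r).
sem = sd * math.sqrt(1 - reliability)

# Confidence interval: add and subtract SEM units from the obtained score.
obtained = 96
low, high = obtained - sem, obtained + sem

print(f"SEM = {sem:.1f}")                               # SEM = 4.5
print(f"Confidence interval: {low:.1f} to {high:.1f}")  # 91.5 to 100.5
```

Adding and subtracting one SEM unit gives a narrow range; adding and subtracting two SEM units widens the range and increases the confidence that the student's true score falls inside it.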

Reliability and Validity


When an instrument measures the same item consistently across time and individuals, it is considered reliable. When an instrument measures the skill or trait it is supposed to measure, it is considered valid.

Reliability is the dependability or consistency of an instrument across time or items. Reliability is determined based upon the correlation of the assessment and another variable. The ways that reliability is determined include test-retest reliability, equivalent (or alternate) forms reliability, and internal consistency measures.

Test-retest reliability is a study that employs the readministration of a single instrument to check consistency across time. If the trait being measured remains constant, the readministration of the instrument results in similar scores across time. The readministration should be completed within a fairly short period of time to control influencing variables.

Equivalent forms reliability finds the consistency of the test to measure some domain, trait, or skill using like forms of the same instrument (having more than one form of the assessment, where each form has items that measure the same trait or skill set). The scores on the two forms are then compared; the correlation between the scores indicates whether there is a high degree of reliability.

Internal consistency measures are used when there is only one test, and one form of that test. Examples of internal consistency measures include split-half and Kuder-Richardson. In the split-half reliability measure, the test is split and individual items said to measure the same trait are compared. For Kuder-Richardson (K-R) 20, a formula is used to check consistency across items.

Finally, it is important that examiners agree about how tests are scored. Interrater reliability is the consistency of a test to measure a skill, domain, or trait across examiners. This is typically used for writing assessments.

Validity is concerned not with repeated, desirable results (as reliability is) but rather with how well the results serve the purpose of the test. In other words, does the test measure what we would expect the test to measure? The following are methods to determine the degree to which the instrument measures what it was intended to measure.

Criterion-related validity is a method of comparing an instrument's ability to measure a skill, trait, or domain against an existing instrument or other criterion. It does so by comparing the scores on an instrument with other criteria known to be indicators of the same trait. There are two main types of criterion-related validity, differentiated by time factors: concurrent validity and predictive validity. Concurrent validity is a comparison of one instrument to another within a short period of time, even a single day; if the validity coefficient is similar for both assessments, the instruments are said to be measuring the same trait. Predictive validity is a measure of a specific instrument's ability to predict performance on another measure at a later date. For example, how well do SAT scores predict how a student will do on similar subtests of the GRE?

Content validity occurs when the items contained within the test are representative of the content purported to be measured. Just because a test is named something doesn't mean it is going to test that specific skill; it may only test a component of the skill. A good representation of content will contain all forms of the individual skill.
Construct validity is the ability of the instrument to measure psychological constructs. A construct, in psychological terms, is used to describe a psychological trait, personality trait, psychological concept, attribute, or theoretical characteristic. These constructs must be clearly defined, and they are typically abstract, like intelligence or creativity.
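Most of the reliability checks described above come down to correlating two sets of scores. Below is a minimal Python sketch, using invented scores, of a test-retest check (a Pearson correlation) and a split-half check. The Spearman-Brown correction applied at the end is not named in the text above, but it is the standard adjustment when correlating two half-length tests:

```python
import math

def pearson_r(xs, ys):
    """Correlation between two lists of scores (the basis of
    reliability and criterion-related validity coefficients)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)

# Test-retest: the same students take the same instrument twice,
# a short time apart, and the two score sets are correlated.
first  = [88, 95, 102, 110, 79, 123]
second = [90, 93, 105, 108, 82, 120]
print(f"test-retest r = {pearson_r(first, second):.2f}")

# Split-half: items measuring the same trait are split into two
# halves (e.g. odd vs. even items) and the half scores correlated.
odd_half  = [10, 14, 9, 16, 12]
even_half = [11, 13, 10, 15, 13]
half_r = pearson_r(odd_half, even_half)

# Spearman-Brown correction: estimates the reliability of the
# full-length test from the half-test correlation.
full_r = (2 * half_r) / (1 + half_r)
print(f"split-half r = {half_r:.2f}, corrected = {full_r:.2f}")
```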

Formulas
Variance:
1. Find the mean of the data set, then subtract the mean from each score.
2. Square each of the difference scores.
3. Add all the squares together and divide by the total number of scores.

Standard Deviation (SD):
1. Take the square root of the variance.

Standard Error of Measurement (SEM):
1. Obtained score - true score = error.
2. SEM = SD × √(1 − r), where r is the reliability coefficient.
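The variance and standard deviation steps translate directly into code. A minimal Python sketch, with an invented data set:

```python
import math

scores = [85, 90, 100, 105, 110, 120]  # invented data set

# Variance, following the three steps above.
mean = sum(scores) / len(scores)        # step 1: find the mean...
diffs = [s - mean for s in scores]      # ...and subtract it from each score
squares = [d ** 2 for d in diffs]       # step 2: square each difference
variance = sum(squares) / len(scores)   # step 3: add the squares, divide by n

# Standard deviation: the square root of the variance.
sd = math.sqrt(variance)

print(f"mean = {mean:.1f}, variance = {variance:.1f}, SD = {sd:.1f}")
```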

Wiedenmayer
