
FINAL REPORT: OUTLINE & OVERVIEW OF SURVEY ERRORS

Lu Ann Aday, Ph.D., The University of Texas School of Public Health

STEPS IN DESIGNING AND CONDUCTING HEALTH SURVEYS

TYPES OF SURVEY ERRORS

Systematic Error (bias): Difference (positive or negative) between survey estimate and actual population value

Variable (random) Error: Range or variation (variance) in values of an estimate across observations (or cases)
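
The two definitions above can be illustrated with a minimal sketch. The five estimates and the true value of 50 are hypothetical numbers chosen for illustration, and the combination of the two error types into a mean squared error is the standard decomposition, not something stated in this report.

```python
from statistics import mean, pvariance


def bias(estimates, true_value):
    """Systematic error: average estimate minus the actual population value."""
    return mean(estimates) - true_value


def variable_error(estimates):
    """Variable error: variance of the estimates around their own mean."""
    return pvariance(estimates)


def mean_squared_error(estimates, true_value):
    """Total error, assuming the usual decomposition: variance + bias squared."""
    return variable_error(estimates) + bias(estimates, true_value) ** 2


# Hypothetical estimates from five repeated surveys of the same population.
estimates = [52.0, 54.0, 53.0, 55.0, 51.0]
true_value = 50.0

print(bias(estimates, true_value))                # systematic component
print(variable_error(estimates))                  # random component
print(mean_squared_error(estimates, true_value))  # combined
```

In this example the estimates cluster tightly (small variable error) but sit consistently above the true value (positive bias), which is the pattern a systematic error produces.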

EFFECTS OF SURVEY ERRORS ON SURVEY ESTIMATES

OVERVIEW OF SURVEY ERRORS


SURVEY PROCEDURE: SYSTEMATIC ERROR / VARIABLE ERROR

Study Design
Systematic error: Poor internal validity; Poor external validity
Variable error: Design specification ambiguity

Sample Design
Systematic error: Noncoverage bias; Weighting errors
Variable error: Standard errors; Design effects

Data Collection
Systematic error: Unit nonresponse bias; Item nonresponse bias
Variable error: Interviewer variability; Mode effects

Questionnaire Design
Systematic error: Under/over-reporting; Yea-saying
Variable error: Order & context effects

Measurement
Systematic error: Low or poor validity
Variable error: Low or poor reliability

Data Preparation
Systematic error: Imputation/Estimation errors
Variable error: Data coding, editing, or data entry errors

Analysis Plan
Systematic error: Poor statistical conclusion validity
Variable error: Low statistical precision or power

SURVEY ERRORS: Study Design

Systematic

Poor internal validity. The study design does not adequately and accurately address the study's hypotheses, particularly with respect to demonstrating a causal relationship between the independent (predictor) and dependent (outcome) variables.

Poor external validity. Findings based on the study design cannot be widely or universally applied to related populations or subgroups.

Variable

Design specification ambiguity. The study objectives and related concepts to be measured in the survey are not clearly and unambiguously stated, particularly in relation to the underlying study design and data analysis plan for the study.

SURVEY ERRORS: Sample Design

Systematic

Noncoverage bias. Not all units of the target population (e.g., households, individuals) are included in the sampling frame (frame bias), or the respondent is not selected randomly (respondent selection bias).

Weighting errors. Respondents are disproportionately represented in the survey sample because the cases are not weighted to account for their unequal probabilities of falling into the sample (sampling fraction).

Variable

Standard errors. The standard error measures random sampling variation in an estimate (e.g., mean or proportion) across all possible random samples of a certain size that could theoretically be drawn from the target population.

Design effects. The design effect, computed as the ratio of the variance of an estimate under a complex sample design to its variance under a simple random sample of the same size, measures the increase in random sampling variation in an estimate due to the complex nature of a sample design.
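
The sample design quantities above can be sketched in a few lines. This is a minimal illustration, not a procedure taken from the report: it assumes weights equal to the inverse of each case's selection probability, uses the simple random sampling formula for the standard error of a proportion, and all function names and numbers are hypothetical.

```python
import math


def weighted_mean(values, selection_probs):
    """Weight each case by the inverse of its probability of selection."""
    weights = [1.0 / p for p in selection_probs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def srs_standard_error_of_proportion(p, n):
    """Standard error of a proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


def design_effect(complex_variance, srs_variance):
    """Ratio of the complex-design variance to the simple random sample variance."""
    return complex_variance / srs_variance


def effective_sample_size(n, deff):
    """Size of the simple random sample that would give the same precision."""
    return n / deff


# Hypothetical data: the first two cases were oversampled (p = 0.5 vs 0.25).
values = [1, 1, 0, 0]
probs = [0.5, 0.5, 0.25, 0.25]
print(sum(values) / len(values))     # unweighted estimate
print(weighted_mean(values, probs))  # weighted estimate

se_srs = srs_standard_error_of_proportion(0.5, 400)
deff = design_effect(complex_variance=0.00125, srs_variance=se_srs ** 2)
print(se_srs, deff, effective_sample_size(400, deff))
```

Skipping the weights leaves the oversampled cases over-represented in the estimate, which is the weighting error described above. A design effect greater than 1 means the complex design yields less precision than a simple random sample of the same size.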

SURVEY ERRORS: Data Collection

Systematic

Unit nonresponse bias. Selected units of the study sample (e.g., households, individuals) are not included in the final study due to respondent refusals or unavailability during the data collection process.

Item nonresponse bias. Selected questions on the survey questionnaire are not answered due to respondent refusals or interviewer or respondent errors or omissions during the data collection process.

Variable

Interviewer variability. Survey interviewers or data collectors vary in how they ask or record answers to the survey questions.

Mode effects. The responses to comparable questions by respondents vary across different data collection methods (e.g., personal interview, telephone interview, mail self-administered questionnaire, web survey). [Note: If these effects differ in a particular direction across modes, they become systematic errors.]

SURVEY ERRORS: Questionnaire Design

Systematic

Under/over-reporting. An estimate (e.g., mean or proportion) across samples differs in a particular (negative or positive) direction from the underlying actual (or true) population value for the estimate, i.e., it is lower (underreporting) or higher (overreporting).

Yea-saying. Respondents tend to agree rather than disagree with statements as a whole (acquiescent response set) or with what are perceived to be socially desirable responses (social desirability bias).

Variable

Order & context effects. Answers to selected survey questions vary depending on whether they are asked before or after other questions and/or appear at the beginning or the end of the survey questionnaire.

SURVEY ERRORS: Measurement

Systematic

Low or poor validity. Systematic departures exist in answers to the content of a survey question from the meaning of the concept itself (content validity), a criterion for what constitutes an accurate answer based on another data source (criterion validity), and/or hypothesized relationships of the concept being measured with other measures or concepts (construct validity).

Variable

Low or poor reliability. Random variation exists in answers to a survey question due to when it is asked (test-retest reliability), who asked it (inter-rater reliability), and/or that it is simply one of a number of questions that could have been asked to obtain the information (internal consistency reliability).
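
Internal consistency reliability is commonly summarized with Cronbach's alpha. The report does not name a specific statistic, so the sketch below is an assumption about how one might quantify it; the function name and the example scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
    """
    k = len(item_scores[0])
    items = list(zip(*item_scores))              # one tuple of scores per item
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scale with two items answered identically by three respondents.
consistent = [[1, 1], [2, 2], [3, 3]]
print(cronbach_alpha(consistent))  # items move together perfectly

# Hypothetical scale where the items agree less closely.
noisy = [[1, 3], [2, 1], [3, 2], [4, 4]]
print(cronbach_alpha(noisy))
```

Values near 1 indicate that the items vary together, while lower values indicate more random variation across the questions intended to measure the same concept.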

SURVEY ERRORS: Data Preparation

Systematic

Imputation/Estimation errors. Procedures for assigning values to survey questions for which answers are not available (missing values due to item nonresponse), using data either internal or external to the survey (imputation or estimation, respectively), introduce systematic errors (biases) in estimating or examining relationships between variables.

Variable

Data coding, editing, or data entry errors. Data coding, editing, or data entry personnel or procedures introduce random errors in producing data files based on the survey questionnaires.

SURVEY ERRORS: Analysis Plan

Systematic

Poor statistical conclusion validity. The accuracy of statistical conclusions is compromised due to the application of statistical procedures that do not meet underlying assumptions related to the study design and objectives, level of measurement of study variables, and/or the underlying population distribution.

Variable

Low statistical precision or power. There are insufficient cases in the study sample to estimate population parameters with a reasonable level of precision or to have enough statistical power to detect statistically (and substantively) significant relationships between variables if they do exist.
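
Whether a sample has "sufficient cases" can be checked before fielding the survey. The sketch below uses textbook normal-approximation formulas under simple random sampling; these formulas, the function names, and the default confidence and power levels are assumptions for illustration and are not taken from the report. A complex design would require inflating the result by the design effect.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence=0.95):
    """Cases needed to estimate a proportion p within +/- margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Cases per group to detect a standardized mean difference (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Precision: estimate a proportion near 0.5 to within 5 percentage points.
print(sample_size_for_proportion(p=0.5, margin=0.05))

# Power: detect a medium standardized difference of 0.5 between two groups.
print(sample_size_per_group(effect_size=0.5))
```

The first function addresses precision (how tightly a parameter is estimated) and the second addresses power (the chance of detecting a relationship that does exist), the two shortfalls named above.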
