• File: cherry.sav
• Model:
⇒ y = f(x1, x2)
• File: air2.sav
Specific Issues in Multiple Regression
• Diagnosing
– influential observations
– problematic observations (outliers)
• Variable selection
• Numerical problems
Diagnostic Tools:
• (Deleted) residuals
• Influence measures:
– DfBeta: change in each parameter estimate when an observation is removed
– DfFit: change in the fitted value when an observation is removed
• Distance measures:
– Mahalanobis distance, leverage
– Cook's distance

• Dependent: body fat percentage
• Independent:
– age, height, weight
– 10 body measurements (circumferences)
• 252 men between 22 and 81 years of age
• File: fat2.sav
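
The diagnostic tools named above (deleted residuals, DfBeta, DfFit, leverage, Mahalanobis distance, Cook's distance) can all be obtained from a single least-squares fit, without refitting the model once per observation. The following is a minimal sketch in Python with NumPy, not the SPSS procedure itself; the function name `regression_diagnostics`, the dictionary keys, and the synthetic data are assumptions made for illustration. DfBeta and DfFit are returned unstandardized.

```python
import numpy as np

def regression_diagnostics(X, y):
    """Case diagnostics for an ordinary least-squares fit.

    X: (n, k) array of predictors WITHOUT an intercept column.
    y: (n,) response vector.
    Hypothetical helper for illustration only.
    """
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    p = A.shape[1]                                # number of estimated parameters
    XtX_inv = np.linalg.inv(A.T @ A)
    beta = XtX_inv @ A.T @ y                      # least-squares coefficients
    resid = y - A @ beta                          # ordinary residuals
    # Leverage: diagonal of the hat matrix A (A'A)^-1 A'
    h = np.einsum("ij,jk,ik->i", A, XtX_inv, A)
    mse = resid @ resid / (n - p)                 # residual variance estimate
    scaled = resid / (1.0 - h)
    return {
        "leverage": h,
        # Deleted residual: y_i minus the prediction from the fit without case i
        "deleted_resid": scaled,
        # DfBeta: coefficients (all cases) minus coefficients (case i removed)
        "dfbeta": (A @ XtX_inv) * scaled[:, None],
        # DfFit: fitted value (all cases) minus fitted value (case i removed)
        "dffit": h * scaled,
        # Cook's distance
        "cooks": resid**2 * h / (p * mse * (1.0 - h) ** 2),
        # Squared Mahalanobis distance of the predictors from their mean
        "mahalanobis_sq": (n - 1) * (h - 1.0 / n),
    }

# Synthetic data, invented for this example
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 2))
y_demo = 1.0 + X_demo @ np.array([2.0, -1.0]) + rng.normal(size=30)
diag = regression_diagnostics(X_demo, y_demo)
print("case with largest Cook's distance:", int(np.argmax(diag["cooks"])))
```

Each row of `dfbeta` holds one value per coefficient, intercept first. Large values of `cooks` or `leverage` flag observations worth inspecting, not observations to delete automatically.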
• 'All models are wrong, some are useful.' (G. Box)
• Selection of variables:
– automatic
– manual
• known causal relationships
• t-statistics and confidence intervals for coefficients
• Correlations with dependent variable:
– zero-order: simple correlation
– partial: correlation after removing effects of other variables
– part: correlation after removing effects of other variables from the independent variable only

• Using the F-statistic (ANOVA)
• Stepwise: add one variable per step, remove another one if necessary

• Collinearity:
– correlation between independent variables
– calculations too sensitive to small changes in data
– due to rounding errors
– Symptoms:
∗ 'wrong' sign of coefficients
∗ important variables have small t-statistic
∗ large standard errors
• Measures:
– tolerance, variance inflation factor
– eigenvalues
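
The collinearity measures listed above can be illustrated with a short sketch in Python with NumPy. Tolerance for a predictor is 1 − R², where R² comes from regressing that predictor on all the others, and the variance inflation factor (VIF) is its reciprocal. The function name `collinearity_measures` and the synthetic data are assumptions for illustration. The eigenvalues here are those of the predictor correlation matrix, which is one common choice; statistical packages may instead report eigenvalues of a scaled cross-products matrix that includes the intercept.

```python
import numpy as np

def collinearity_measures(X):
    """Tolerance, VIF and correlation-matrix eigenvalues.

    X: (n, k) array of predictors WITHOUT an intercept column.
    Hypothetical helper for illustration only.
    """
    n, k = X.shape
    tolerance = np.empty(k)
    for j in range(k):
        # Regress predictor j on an intercept plus all other predictors
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        centered = X[:, j] - X[:, j].mean()
        # Tolerance = 1 - R^2 = residual sum of squares / total sum of squares
        tolerance[j] = (resid @ resid) / (centered @ centered)
    vif = 1.0 / tolerance
    # Eigenvalues close to zero indicate a near-linear dependence
    eigenvalues = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    return tolerance, vif, eigenvalues

# Synthetic data, invented for this example: x2 is almost a copy of x1
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = x1 + 0.1 * rng.normal(size=50)
x3 = rng.normal(size=50)
tol, vif, eig = collinearity_measures(np.column_stack([x1, x2, x3]))
print("tolerance:", tol)
print("VIF:", vif)
print("eigenvalues:", eig)
```

In this constructed example the first two predictors should show low tolerance and high VIF, while the third, generated independently, should not.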