ROBUST REGRESSION
Yi = β0 + β1 Xi1 + β2 Xi2 + ... + βk Xik + εi
where
Yi is the value of the response variable in the ith trial
Xij are known constants: the value of the jth independent
variable in the ith trial
β0, β1, β2, ..., βk are parameters
εi is a random error term
(i = 1, 2, ..., n and j = 0, 1, 2, ..., k)
DISADVANTAGE OF ORDINARY LEAST SQUARES (OLS):
OLS estimates are highly influenced by outliers; OLS is not
robust to them.
Note: Regression outliers (in either x or y) pose a serious threat
to standard least squares analysis.
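To make the sensitivity concrete, here is a minimal sketch in Python with NumPy. The data are hypothetical (an exact line y = 2 + 3x), and `ols_fit` is an illustrative helper name, not something from the slides. Changing a single response value is enough to reverse the sign of the fitted slope.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope) of the least squares line through (x, y)."""
    X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares solution
    return beta[0], beta[1]

# Hypothetical data lying exactly on y = 2 + 3x.
x = np.arange(10, dtype=float)
y = 2.0 + 3.0 * x
clean = ols_fit(x, y)

# Replace one response value with a gross outlier.
y_bad = y.copy()
y_bad[-1] = -100.0
contaminated = ols_fit(x, y_bad)

print("clean fit (intercept, slope):       ", clean)
print("contaminated fit (intercept, slope):", contaminated)
```

One bad observation out of ten is enough to pull the fitted slope from positive to negative.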
ROBUST REGRESSION
An alternative to the ordinary least squares (OLS) method
A regression method that is less sensitive to outliers
and to non-normal errors, in contrast to the usual
assumption that the errors in regression models are
normally distributed
Robustness is the insensitivity to small deviations from the
assumptions the model imposes on the data (Huber, 1981)
The breakdown point of an estimate is the smallest
fraction of the data that, when changed by an arbitrarily
large amount, can cause an arbitrarily large change in
the estimate.
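As an illustration of the definition, the sketch below compares the sample mean and the sample median on a small hypothetical sample. The helper `corrupt` and the contamination value 1e9 are assumptions made for the example. Corrupting a single point is enough to carry the mean away, while the median stays within the range of the original data until more than half of the points are corrupted.

```python
import numpy as np

# Hypothetical sample of 7 observations.
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])

def corrupt(sample, m, value=1e9):
    """Return a copy of `sample` with its first m entries replaced by `value`."""
    out = sample.copy()
    out[:m] = value
    return out

for m in range(len(data)):
    bad = corrupt(data, m)
    print(f"corrupted points: {m}  mean: {np.mean(bad):.3g}  median: {np.median(bad):.3g}")
```

For this sample the mean breaks down with 1 corrupted point out of 7, whereas the median only breaks down once 4 out of 7 points are corrupted.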
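Least Median of Squares (LMS)
Minimize the median of the squared residuals, med ri²,
where ri is the ith residual
Procedure (one-dimensional case):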
1.
2.
3.
4.
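A minimal sketch of a one-dimensional LMS location estimate, assuming the "shortest half" approach described in Rousseeuw and Leroy (1987): sort the observations, take h = floor(n/2) + 1, find the shortest interval that covers h consecutive ordered observations, and report its midpoint. The function name `lms_location`, the choice of h, and the sample data are assumptions for illustration.

```python
import numpy as np

def lms_location(y):
    """One-dimensional LMS estimate: midpoint of the shortest half of the data."""
    y = np.sort(np.asarray(y, dtype=float))   # order the observations
    n = len(y)
    h = n // 2 + 1                            # size of a "half" (assumed convention)
    # Width of every interval covering h consecutive ordered observations.
    widths = y[h - 1:] - y[:n - h + 1]
    start = int(np.argmin(widths))            # first shortest half
    return 0.5 * (y[start] + y[start + h - 1])

# Hypothetical observations with one gross outlier.
sample = [10, 11, 12, 14, 18, 90]
print(lms_location(sample))   # midpoint of the shortest half [10, 14]
print(np.mean(sample))        # the mean is pulled toward the outlier
```

Here the shortest half is [10, 11, 12, 14], so the estimate is 12, and the value 90 has no effect on it.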
Exercises:
Cuteness ratings were taken from randomly selected people in
engineering. Here are the observations:
Remarks:
1. Unlike LS, LMS does not have a closed-form formula.
2. Since the median is an order (rank) statistic, it is not
amenable to calculation via derivatives or other
calculations that rely on continuous functions.
3. The LMS estimator may not be the estimator with the
smallest variance, but it generalizes to multiple
regression.
4. The LMS estimate lies where the points are
concentrated, not in the center of the good observations.
5. The LMS is similar to a mode estimator.
6. For n = 3, the LMS estimator is not satisfactory, since
two of the points tend to be close to each other by
chance, making the third one look like an outlier.
7. The LMS has a 50% breakdown point.
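Because there is no closed-form solution, LMS fits are usually found by search. The sketch below is one simple approximation for a single predictor, not an exact LMS algorithm: it tries the line through every pair of points and keeps the one with the smallest median of squared residuals. The function name `lms_line` and the data are hypothetical, and `np.median` averages the two middle values when n is even, which is only one possible convention.

```python
import itertools
import numpy as np

def lms_line(x, y):
    """Approximate LMS line: best candidate among lines through pairs of points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_crit, best_intercept, best_slope = np.inf, None, None
    for i, j in itertools.combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue                               # vertical line, skip
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        # LMS criterion: median of the squared residuals.
        crit = np.median((y - intercept - slope * x) ** 2)
        if crit < best_crit:
            best_crit, best_intercept, best_slope = crit, intercept, slope
    return best_intercept, best_slope

# Hypothetical data on y = 2 + 3x with one gross outlier in y.
x = np.arange(10, dtype=float)
y = 2.0 + 3.0 * x
y[-1] = -100.0
intercept, slope = lms_line(x, y)
print(intercept, slope)
```

On this data the search recovers the line followed by the nine good points, ignoring the outlier. The exhaustive pair search costs on the order of n² candidate lines, so it is only practical for small samples.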
Exercises
Nursing Board Exam 2015, Philippine Daily Inquirer
Remarks:
1. The number of computations can be drastically
reduced by updating each half's mean from the mean of
the preceding half.
2. The residuals are squared first and then ordered.
3. According to Rousseeuw (1998), the least trimmed
squares (LTS) procedure is more efficient than the LMS.
4. The objective function of LTS is similar to the
objective function of LS. The only difference is that
the largest squared residuals are not used in the
summation, thus allowing the fit to stay away from
the outliers.
5. The LTS also has a 50% breakdown point.
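The remarks above can be sketched in code. This is a minimal illustration with assumptions: `lts_objective` sums the h smallest squared residuals (squared first, then ordered), and `lts_location` gives a one-dimensional LTS estimate as the mean of the contiguous half of the ordered data with the smallest sum of squares, using h = floor(n/2) + 1. Both function names, the choice of h, and the data are hypothetical. For clarity each half's mean is recomputed directly rather than updated from the preceding half.

```python
import numpy as np

def lts_objective(residuals, h):
    """Sum of the h smallest squared residuals (square first, then order)."""
    r2 = np.sort(np.asarray(residuals, dtype=float) ** 2)
    return r2[:h].sum()

def lts_location(y):
    """One-dimensional LTS estimate: mean of the half with the smallest sum of squares."""
    y = np.sort(np.asarray(y, dtype=float))
    n = len(y)
    h = n // 2 + 1                              # size of a "half" (assumed convention)
    best_mean, best_ss = None, np.inf
    for start in range(n - h + 1):
        half = y[start:start + h]               # h consecutive ordered observations
        m = half.mean()
        ss = ((half - m) ** 2).sum()
        if ss < best_ss:
            best_mean, best_ss = m, ss
    return best_mean

# The largest squared residual (100) is left out of the sum.
print(lts_objective([3, -1, 2, -10], h=2))      # uses 1 and 4 only

# Hypothetical observations with one gross outlier.
print(lts_location([10, 11, 12, 14, 18, 90]))
```

With these observations the selected half is [10, 11, 12, 14], so the estimate is its mean and the value 90 never enters the sum.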
References:
Rousseeuw, P. J., & Leroy, A. M. (1987). Robust
Regression and Outlier Detection. Canada.
Jacoby, Bill. Regression III: Advanced Methods.
Michigan State University
Garner, Will. Robust Regression
Simons, Kenneth. (2013). Useful Stata Commands
(for Stata version 12)