Transforms Revisited
Now we're going to consider a more general
Strategy
First, transform Y.
If that doesn't work, transform the
predictors, but not Y.
Keep in mind
Don't remove outliers, influential points,
etc. until the transforming is done.
Transform Y
Basic idea: What if
$E(Y \mid X) \neq \beta_1 x_1 + \dots + \beta_p x_p$
but instead:
$E(Y \mid X) = g(\beta_1 x_1 + \dots + \beta_p x_p)$
Transform Y: 2 approaches
$g^{-1}(E(Y \mid X)) = \beta_1 x_1 + \dots + \beta_p x_p$
$Y_{new} = g^{-1}(g(\beta_1 x_1 + \dots + \beta_p x_p))$
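As a rough illustration of the idea, the sketch below simulates data in which Y is the cube of a linear function of the predictors, so g is the cube and its inverse is the cube root. The data, the variable names (`x1`, `x2`, `eta`, `y_new`) and the choice of g are assumptions made up for this example; they are not the ozone data.

```r
# Simulated data (hypothetical): Y = g(linear predictor + noise), g(u) = u^3.
set.seed(1)
n   <- 200
x1  <- rnorm(n, mean = 2, sd = 0.5)
x2  <- rnorm(n, mean = 2, sd = 0.5)
eta <- 1 + x1 + 0.5 * x2                 # linear predictor, positive here
y   <- (eta + rnorm(n, sd = 0.05))^3

# Fit on the original scale and on the inverse-g (cube root) scale.
fit_raw <- lm(y ~ x1 + x2)
y_new   <- y^(1/3)
fit_new <- lm(y_new ~ x1 + x2)

# Compare how well a straight-line model describes each version of Y.
r2_raw <- summary(fit_raw)$r.squared
r2_new <- summary(fit_new)$r.squared
c(raw = r2_raw, transformed = r2_new)
coef(fit_new)                            # should be close to 1, 1, 0.5
```

On the transformed scale the linear model should fit better, and its coefficients should sit near the values used to simulate the data.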
> m1 <- lm(ozone ~ temperature + pressure, data = ozonetext)
> plot(m1)
> library(alr3)
> invResPlot(m1)
      lambda      RSS
1  0.3658881 1989.771
2 -1.0000000 3412.912
3  0.0000000 2082.377
4  1.0000000 2196.992
Note: the log transform (lambda = 0) isn't too different from the optimal one.
$Y_{new} = Y^{0.3658881}$
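The lambda/RSS search behind an inverse response plot can be sketched in base R. This is only an outline of the idea under simulated data, not the alr3 implementation: fit the untransformed model, then look for the power of Y that best predicts its fitted values. The data and the helper names (`power_y`, `rss_for`) are hypothetical.

```r
# Simulated data (hypothetical): the "right" power for Y is 1/3.
set.seed(1)
n  <- 200
x1 <- rnorm(n, mean = 2, sd = 0.5)
x2 <- rnorm(n, mean = 2, sd = 0.5)
y  <- (1 + x1 + 0.5 * x2 + rnorm(n, sd = 0.05))^3

# Step 1: fit the untransformed model and keep its fitted values.
yhat <- fitted(lm(y ~ x1 + x2))

# Step 2: for each candidate lambda, regress the fitted values on the
# power-transformed response (log for lambda = 0) and record the RSS.
power_y <- function(y, lambda) if (lambda == 0) log(y) else y^lambda
rss_for <- function(lambda) deviance(lm(yhat ~ power_y(y, lambda)))

lambdas <- round(seq(-1, 2, by = 0.05), 2)   # round so that 0 is exactly 0
rss     <- sapply(lambdas, rss_for)

# Step 3: the lambda with the smallest RSS is the suggested power.
lambda_hat <- lambdas[which.min(rss)]
lambda_hat
```

With these simulated data the selected lambda should land near 1/3, in the same way the ozone output picks the lambda with the smallest RSS.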
[Figures: plots of the original and transformed models]
> summary(m2)
Call:
lm(formula = ozone.t ~ temperature + pressure, data = ozone.t1)
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.4004629  0.1774149  -2.257   0.0256 *
temperature  0.0423812  0.0027663  15.321   <2e-16 ***
pressure    -0.0001918  0.0010937  -0.175   0.8610
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.3794 on 138 degrees of freedom
Multiple R-squared: 0.6688, Adjusted R-squared: 0.664
F-statistic: 139.3 on 2 and 138 DF, p-value: < 2.2e-16
Another approach:
Box-Cox
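Choose a transform of Y, $\psi_\lambda(Y)$, where
$\psi_\lambda(Y) = gm(Y)^{1-\lambda} (Y^\lambda - 1)/\lambda$ for $\lambda \neq 0$
$\psi_\lambda(Y) = gm(Y) \log(Y)$ for $\lambda = 0$
and $gm(Y) = (\prod_{i=1}^{n} Y_i)^{1/n}$ is the geometric mean of the $Y_i$.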
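The scaled power transform defined here is short enough to write out directly. The sketch below is one way to code it in base R; the function names `gm` and `psi` and the toy vector `y` are chosen for this example only.

```r
# Geometric mean of a positive vector.
gm <- function(y) exp(mean(log(y)))

# Scaled Box-Cox (modified power) transform of Y for a given lambda.
psi <- function(y, lambda) {
  stopifnot(all(y > 0))                  # the transform needs positive Y
  if (lambda == 0) {
    gm(y) * log(y)
  } else {
    gm(y)^(1 - lambda) * (y^lambda - 1) / lambda
  }
}

y <- c(1, 2, 4, 8)
psi(y, 1)      # equals y - 1, because gm(y)^0 = 1
psi(y, 0)      # equals gm(y) * log(y)
psi(y, 1e-6)   # very close to the lambda = 0 case
```

The last two calls show why the log is used at lambda = 0: it is the limit of the power form as lambda shrinks to zero.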
To find lambda, use maximum likelihood estimation.
> library(MASS)
> boxcox(m1)
or
> library(alr3)
> summary(powerTransform(y ~ x1 + x2, data=))
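To show roughly what the maximum likelihood search involves, the sketch below profiles the log-likelihood over a grid of lambda values in base R. It assumes simulated data and hypothetical helper names (`psi`, `loglik`); it is not the code used by `boxcox` or `powerTransform`.

```r
# Simulated data (hypothetical): the "right" power for Y is 1/3.
set.seed(1)
n  <- 200
x1 <- rnorm(n, mean = 2, sd = 0.5)
x2 <- rnorm(n, mean = 2, sd = 0.5)
y  <- (1 + x1 + 0.5 * x2 + rnorm(n, sd = 0.05))^3

# Scaled power transform; the geometric-mean factor puts the RSS values
# for different lambda on a comparable scale.
gm  <- exp(mean(log(y)))
psi <- function(y, lambda) {
  if (lambda == 0) gm * log(y) else gm^(1 - lambda) * (y^lambda - 1) / lambda
}

# Profile log-likelihood, up to an additive constant: -n/2 * log(RSS / n).
loglik <- function(lambda) {
  rss <- deviance(lm(psi(y, lambda) ~ x1 + x2))
  -n / 2 * log(rss / n)
}

lambdas    <- round(seq(-1, 2, by = 0.05), 2)   # round so that 0 is exactly 0
ll         <- sapply(lambdas, loglik)
lambda_hat <- lambdas[which.max(ll)]
lambda_hat
```

Plotting `ll` against `lambdas` gives a curve of the same kind that `boxcox(m1)` draws; here its peak should fall near 1/3.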
> boxcox(m1)
[Plot: Box-Cox profile log-likelihood against lambda, annotated at 1/3]
> summary(powerTransform(m1))
bcPower Transformation to Normality
Y1
which confirms our previous transformation using lambda = .37
Transform Predictors
You can use Box-Cox to transform the predictors.
> library(alr3)
> summary(powerTransform(ozone~temperature+height,data=o2.mini))
box.cox Transformations to Multinormality
            Est.Power Std.Err. Wald(Power=0) Wald(Power=1)
temperature    1.1383   0.3246        3.5070         0.426
height        18.9126   4.5176        4.1864         3.965

                                 LRT df      p.value
LR test, all lambda equal 0 25.50600  2 2.893633e-06
LR test, all lambda equal 1 17.30179  2 1.749703e-04
(probably not)
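The test statistics in this output can be checked by hand. The sketch below copies the estimates, standard errors and LRT values from the output in the text and recomputes the Wald statistics and p-values; the variable names are chosen for this example.

```r
# Estimated powers and standard errors, copied from the output in the text.
est <- c(temperature = 1.1383, height = 18.9126)
se  <- c(temperature = 0.3246, height = 4.5176)

wald0 <- est / se          # tests power = 0 (log transform)
wald1 <- (est - 1) / se    # tests power = 1 (no transform)
round(cbind(wald0, wald1), 3)

# Likelihood ratio tests: chi-squared upper tail with df = 2.
lrt <- c(all_zero = 25.50600, all_one = 17.30179)
p   <- pchisq(lrt, df = 2, lower.tail = FALSE)
p
```

The small Wald statistic for temperature at power 1 (about 0.43) gives no reason to transform it, while both joint hypotheses (all powers 0, all powers 1) are rejected.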
[Figures: residual plots with no transform and with transformed predictors]
Once again, $Y^{1/3}$ looks best.
This is consistent with the 1/3 power of ozone, a 20th power for
height, and no change (power 1) for temperature.