
Computational Statistics & Data Analysis 36 (2001) 511–523

www.elsevier.com/locate/csda

LAD estimation with random coefficient autocorrelated errors

Marilena Furno ∗
Department of Economics, Università di Cassino, Italy

Received 1 July 1998; accepted 1 August 2000

Abstract

In this paper we compare the performance of LAD and OLS in the linear regression model with
errors which are randomly autocorrelated. This model yields thick-tailed error distributions, which
makes it profitable to estimate the model by LAD. The LAD estimator for randomly autocorrelated
errors is proved to be asymptotically normal. The Monte Carlo results show that LAD improves upon
OLS, unless we revert to a constant autocorrelation model, where the two methods are comparable.
© 2001 Elsevier Science B.V. All rights reserved.

Keywords: Thick-tailed distributions; Least absolute deviation (LAD); Random coefficient
autocorrelation (RCA); Conditional heteroskedasticity (ARCH)

0. Introduction

This paper considers a linear regression model with random coefficient autocorrelated
(RCA) errors. As discussed in Tsay (1987), the RCA model is characterized by
changing conditional variance. In a time series setting, the conditional heteroskedasticity
caused by RCA is a function of the past observations of the variable under study,
while in the ARCH model the variances depend upon the lagged errors of the equation.
When we consider, however, a linear regression with randomly autocorrelated
errors, the difference between the two models disappears, and the conditional variance
is a function of past innovations in the RCA just as in the standard ARCH case.
In the ARCH literature, there is a wealth of empirical evidence discussing how
conditional heteroskedasticity affects the unconditional error distribution, causing
non-normality such as leptokurtosis and/or skewness (Engle and Gonzales-Rivera, 1991).

Correspondence address: Via S. Lucia 173, 80132 Napoli, Italy.
E-mail address: furnoma@tin.it (M. Furno).

0167-9473/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved.
PII: S0167-9473(00)00050-5

This leads us to set aside the maximum-likelihood estimator and to consider a
distribution-free estimator. The presence of thick tails suggests the choice of a robust
estimator, which provides efficiency gains with respect to least squares. In this paper,
to deal with thick-tailed distributions, we propose to implement the least absolute
deviation (LAD) estimator.
LAD coincides with maximum likelihood when the errors follow a double exponential
distribution. In all other cases, LAD is less affected by observations coming
from the tails, since it minimizes the absolute value and not the squared value of the
errors. This is particularly useful with leptokurtic error distributions.
LAD has already been considered in models with constant autocorrelation (Weiss,
1990). We propose to implement LAD in the presence of random autocorrelation
and we prove its asymptotic normality. The simulations we perform show that LAD
improves upon OLS in the case of RCA errors, both in terms of bias reduction and of
efficiency gains. However, when we revert to the constant autocorrelation model,
our results agree with Weiss's (1990) findings. His Monte Carlo study shows that the
LAD-based procedure is not particularly advantageous, especially in small samples,
since its sampling distribution differs from the asymptotic one. In addition, the OLS-
and LAD-based procedures yield results which are comparable in many respects,
thus discouraging the use of LAD.
The first section of the paper briefly reviews the relevant literature. The linear
regression model with random autocorrelation of the first order and the corresponding
LAD objective function are in Section 2.1. In Section 2.2 we discuss the asymptotic
distribution of the LAD estimator considered here. Section 3 presents more general
random coefficient ARMA models for the error term, analyzing the resulting conditional
heteroskedasticity. A Monte Carlo experiment is described in Sections 4 and
5, while the final section draws the conclusions.

1. Review of the literature


1.1. Random coefficient autocorrelation
Consider the linear regression model

y_t = x_t'β + e_t,                                        (1)
e_t = ρ_t e_{t−1} + a_t,   a_t ∼ i.i.d.,                  (2)

where x_t is a (k,1) vector of exogenous variables, y_t is the dependent variable, and β
is a (k,1) vector of unknown parameters. The error term e_t is characterized by zero
mean and first-order random autocorrelation. Tsay (1987) shows that model (1) can
be rewritten as y_t = x_t'β + ρe_{t−1} + r_t e_{t−1} + a_t = x_t'β + ρe_{t−1} + v_t, where |ρ| < 1, r_t is i.i.d.
and is independent of a_t. The conditional variance of v_t is given by var(v_t | I_{t−1}) = σ_a² +
σ_r² e_{t−1}², which defines an ARCH process (Engle, 1982).
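The ARCH effect induced by the random coefficient can be seen in a short simulation. This sketch is not part of the paper; the parameter values (ρ = 0 and r_t ∼ N(0, 0.5²), so that σ_r² = 0.25) are illustrative choices. The squared errors come out serially correlated and the unconditional kurtosis exceeds the Gaussian value of 3:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
rho, s_r = 0.0, 0.5               # fixed part rho and sd of the random coefficient r_t
a = rng.standard_normal(n)        # i.i.d. innovations a_t
r = s_r * rng.standard_normal(n)  # random coefficient r_t ~ N(0, s_r^2)

e = np.zeros(n)
for t in range(1, n):
    e[t] = (rho + r[t]) * e[t - 1] + a[t]   # e_t = rho_t e_{t-1} + a_t

kurt = np.mean(e**4) / np.mean(e**2) ** 2          # Gaussian value would be 3
arch1 = np.corrcoef(e[1:] ** 2, e[:-1] ** 2)[0, 1]  # lag-1 autocorr of e_t^2
print(f"kurtosis={kurt:.2f}  corr(e_t^2, e_{{t-1}}^2)={arch1:.2f}")
```

With ρ = 0 the process is exactly an ARCH(1) with coefficient σ_r² = 0.25, so the squared errors follow an AR(1) with that coefficient and both statistics should come out clearly above their Gaussian benchmarks.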
In model (1) the random correlation, and therefore the conditional heteroskedasticity,
is defined in its simplest form. The following section will briefly consider
the case of higher-order serial correlation and more general ARMA errors. It is
not so easy, however, to generalize the conditional heteroskedasticity to GARCH

(Bollerslev, 1987), EGARCH (Nelson, 1991) and more sophisticated conditionally
heteroskedastic processes, since the structure of the conditional heteroskedasticity is
linked to the assumed form of random autocorrelation.
In the paper, we avoid any distributional assumption about e_t. The errors are
generally assumed to be normal in the literature. However, it is well known that, even
with a Gaussian conditional distribution, the unconditional one has tails larger than
normal. This leads Bollerslev and Wooldridge (1992) to propose the quasi-maximum
likelihood (QML) estimator, which consistently estimates the model under the
assumption of normality, even if this assumption is false. In addition, by adjusting
the covariance matrix estimator in order to account for the discrepancy between the
assumed and the real distribution, they present an LM test which is robust to
non-normality. Unfortunately, Engle and Gonzales-Rivera (1991) show that the QML
approach is highly inefficient, and propose a semi-parametric method: the error density
is estimated non-parametrically, and then maximum likelihood is implemented
using the density computed in the sample. The efficiency gain they obtain
is only up to 50%, and their approach requires a very large sample, at least 500
observations. The sample size requirement can be a serious limitation, particularly
in non-financial problems.
A different approach is to assume a thick-tailed distribution for the error terms,
like the Student-t or a mixture of normals. However, there is no rigorous result about
the consistency of QML built on a non-normal distribution, let alone in a model of
conditional heteroskedasticity.
To deal with this problem we propose to implement LAD, which does not rely
on a distributional assumption.

1.2. The LAD estimator with fixed autocorrelation

In the case of fixed autocorrelation, Weiss (1990) proposes a generalized LAD (GLAD)
estimator. The basic LAD estimator, presented by Koenker and Bassett (1978), is
a robust procedure which computes the conditional median regression, just as OLS
considers the conditional mean regression. LAD is more efficient than OLS when
the error distribution has thick tails, that is, when [2f(0.5)]^{−2} < σ², where f(0.5) is
the height of the error density at the median and σ² is the variance of the errors.
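The efficiency condition can be verified numerically; the following check using scipy is my addition, not the paper's. For the standard normal, [2f(0.5)]^{−2} = π/2 ≈ 1.57 > σ² = 1, so OLS is preferable; for the double exponential (Laplace), [2f(0.5)]^{−2} = 1 < σ² = 2, so LAD is preferable:

```python
from scipy import stats

def lad_factor(f_at_median):
    """Asymptotic variance factor of the median: [2 f(0.5)]^(-2)."""
    return (2.0 * f_at_median) ** -2

# Normal(0,1): [2f]^-2 = pi/2 (about 1.57) > sigma^2 = 1  -> OLS more efficient
print(lad_factor(stats.norm.pdf(0)), stats.norm.var())
# Laplace(0,1): [2f]^-2 = 1 < sigma^2 = 2                 -> LAD more efficient
print(lad_factor(stats.laplace.pdf(0)), stats.laplace.var())
```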
GLAD is the LAD estimator implemented on the transformed variables. The
transformation purges the fixed autocorrelation from the data: y*_t = y_t − ρy_{t−1} for the
dependent variable, and x*_it = x_it − ρx_{i,t−1} for the explanatory variables, where ρ is
the coefficient of autocorrelation, t refers to the time period, and i to the explanatory
variables. GLAD is the analogue of generalized least squares (GLS), which coincides
with a standard OLS regression computed using the transformed variables (y*_t, x*_it).
The presence of serial correlation is generally verified by means of a Lagrange
multiplier (LM) test.¹

¹The LM test is very easy to implement, since it coincides with nR², where n is the sample size
and R² is the coefficient of determination in the regression with a function of the residuals from the
main equation as dependent variable and the lagged residuals as explanatory variable. The nR² test is
asymptotically distributed as a χ² with one degree of freedom under the null of zero autocorrelation.
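The footnote's recipe can be sketched in a few lines. This is my illustration, not the paper's code; the specification here (residuals regressed on their own lag, with a constant) is one common variant of the auxiliary regression:

```python
import numpy as np

def lm_autocorr(resid):
    """nR^2 statistic: regress residuals on their own lag; chi2(1) under H0."""
    u, u_lag = resid[1:], resid[:-1]
    X = np.column_stack([np.ones_like(u_lag), u_lag])   # constant + lagged residual
    beta, *_ = np.linalg.lstsq(X, u, rcond=None)
    r2 = 1.0 - np.sum((u - X @ beta) ** 2) / np.sum((u - u.mean()) ** 2)
    return len(u) * r2

rng = np.random.default_rng(1)
noise = rng.standard_normal(500)         # no autocorrelation
ar1 = np.zeros(500)
for t in range(1, 500):
    ar1[t] = 0.6 * ar1[t - 1] + noise[t]  # AR(1) with coefficient 0.6

# compare each statistic to the chi2(1) 5% critical value, 3.84
print(lm_autocorr(noise), lm_autocorr(ar1))
```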

Furno (2000) investigates the performance of LAD residuals used to build LM tests for
AR and/or ARCH processes. In the case of non-normal distributions, LAD-based tests
have greater power than the same tests built on OLS residuals. In addition, Machado
and Silva (2000) show that the Glejser test for heteroskedasticity improves with the
use of LAD residuals, even if the error distribution is skewed.
The characteristics of the LAD estimator, together with the good performance of
the LAD residuals in terms of testing procedures (Furno, 2000; Machado and Silva,
2000), lead us to believe that LAD can improve upon OLS.

2. LAD with random autocorrelation

2.1. The estimator

In model (1) the estimation of the autocorrelation coefficients is feasible only
for the fixed term ρ. Indeed, the use of the transformed variables y*_t = y_t − ρy_{t−1}
and x*_it = x_it − ρx_{i,t−1} does not purge the random autocorrelation and the resulting
conditional heteroskedasticity:

y*_t = x*_t'β + r_t e_{t−1} + a_t = x*_t'β + v_t,                          (3)

var(v_t | I_{t−1}) = h_t = σ_a² + σ_r² e_{t−1}² = γ_0 + γ_1 e_{t−1}².      (4)

Eq. (4) coincides with the auxiliary regression defining the pattern of the conditional
heteroskedasticity. It can be estimated by replacing the unknown h_t with a function
of v_t. The latter are the residuals computed in Eq. (3), that is, after purging the fixed
autocorrelation. The e_{t−1}² in Eq. (4) are the lagged errors of Eq. (1), which is in
terms of the original variables y_t and x_t.
The terms e_t and v_t can be computed by implementing LAD in Eqs. (1) and (3),
respectively. When we estimate Eq. (1) by LAD, we minimize

(i) Σ_t |y_t − x_t'β| = Σ_t |e_t|.

When in Eq. (3) we transform the variables to account for the fixed correlation,
the objective function is

(ii) Σ_t |y_t − ρy_{t−1} − x_t'β + ρx_{t−1}'β| = Σ_t |v_t|.

In addition, though this is not implemented in this paper, we can purge the conditional
heteroskedasticity as well, and the objective function is given by

(iii) Σ_t |(y_t − ρy_{t−1} − x_t'β + ρx_{t−1}'β)/√h_t| = Σ_t |v_t/√h_t|.
Eq. (4), which describes the pattern of conditional heteroskedasticity, produces
an important by-product. The slope coefficient γ_1 provides an estimate of σ_r² =
[2f_r(0.5)]^{−2}, and the constant term estimates σ_a² = [2f_a(0.5)]^{−2}. Both σ_a² and σ_r²
are very useful in computing the variance–covariance matrix of the coefficients of
the main equation, thus simplifying the problem of estimating f(0.5), which usually
involves non-parametric estimators. Eq. (4) can be estimated by LAD as well. This
implies the minimization of the objective function

(iv) Σ_t |h_t − γ_0 − γ_1 e_{t−1}²|,

which differs from the formulation proposed by Koenker and Zhao (1996) for a closely
related problem.²
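Objective (i) is a median (0.5th-quantile) regression, and a standard way to minimize Σ_t |y_t − x_t'β| exactly is as a linear program. The sketch below is my addition (the LP reformulation is textbook; the data are made up for illustration, using the paper's β_0 = 0.3, β_1 = 0.6 but i.i.d. errors):

```python
import numpy as np
from scipy.optimize import linprog

def lad(X, y):
    """Minimize sum |y - X b| via the LP: min 1'(u+ + u-) s.t. X b + u+ - u- = y."""
    n, k = X.shape
    c = np.concatenate([np.zeros(k), np.ones(2 * n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)  # b free, u+ and u- >= 0
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:k]

rng = np.random.default_rng(2)
x = rng.uniform(size=200)                       # regressor as in the paper's design
y = 0.3 + 0.6 * x + rng.standard_normal(200)    # i.i.d. errors for this illustration
X = np.column_stack([np.ones(200), x])
b_lad = lad(X, y)
print(b_lad)   # roughly (0.3, 0.6), up to sampling noise
```

By construction the solution cannot have a larger sum of absolute residuals than the OLS coefficients, which is a useful sanity check on any LAD implementation.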

2.2. Asymptotic distribution

Weiss (1990) proves the asymptotic normality of LAD and GLAD in the case of fixed
serial correlation.
To estimate the RCA model of Eqs. (3) and (4), we minimize Σ_t |v_t| for the main
equation and Σ_t ||v_t| − γ_0 − γ_1 e_{t−1}²| for the auxiliary regression, where we approximate
the term h_t with |v_t|. The parameters of interest are θ = (β, ρ, γ_1), and their normal
equations are given by

n^{−1/2} Σ_t ψ(v_t) x*_t = 0,
n^{−1/2} Σ_t ψ(v_t) e_{t−1} = 0,
n^{−1/2} Σ_t ψ(|v_t| − γ_0 − γ_1 e_{t−1}²) e_{t−1}² = 0,                   (5)
where ψ(·) = sign(·) is the directional derivative of the absolute value function.
To prove the asymptotic normality of √n(θ̂ − θ), define the following functions:

g_0 = n^{−1/2} Σ_t [ψ(y*_t − F^{−1}(0.5) − n^{−1/2} x*_t'(b − β)) x*_t,
                    ψ(e_t − F^{−1}(0.5) − n^{−1/2} e_{t−1}(ρ̂ − ρ)) e_{t−1},
                    ψ(|v_t| − F^{−1}(0.5) − n^{−1/2} e_{t−1}²(a_1 − γ_1)) e_{t−1}²]',   (6)

g_n = n^{−1/2} Σ_t [ψ(y*_t − x*_t'b) x*_t,
                    ψ(e_t − e_{t−1} ρ̂) e_{t−1},
                    ψ(|v_t| − a_1 e_{t−1}²) e_{t−1}²]',                                  (7)

where F^{−1}(0.5) is the median of the error distribution.
By Lemma A3 of Ruppert and Carroll (1980), one has

sup ||g_n − g_0 + f(0.5) M √n(θ̂ − θ)|| = o_p(1),

where

M = lim n^{−1} Σ_t [ x*_t x*_t'/σ_t      x*_t e_{t−1}/σ_t    x*_t e_{t−1}²/σ_t
                     x*_t'e_{t−1}/σ_t    e_{t−1}²/σ_t        e_{t−1}³/σ_t
                     x*_t'e_{t−1}²/σ_t   e_{t−1}³/σ_t        e_{t−1}⁴/σ_t ]       (8)

²Koenker and Zhao (1996), in the model y_t = α_0 + Σ_{i=1,…,p} α_i y_{t−i} + e_t, present the quantile regression
estimator of the auxiliary equation defining the ARCH process e_t = (γ_0 + γ_1|e_{t−1}| + · · · + γ_q|e_{t−q}|)ε_t.
Assuming sufficient conditions for the stationarity and ergodicity of y_t and e_t, and provided a consistent
estimate of the coefficients of the main equation, they prove the asymptotic normality of the coefficients
γ_i of the auxiliary regression.
and √n(θ̂ − θ) = [f(F^{−1}(0.5))]^{−1} M^{−1} g_0 + o_p(1). This allows us to state the
asymptotic distribution of the θ vector, which is normally distributed with zero mean and
covariance matrix

W = [f(F^{−1}(0.5))]^{−2} M^{−1} A M^{−1},
where

A = lim n^{−1} Σ_t [ x*_t x*_t'      x*_t e_{t−1}   x*_t e_{t−1}²
                     x*_t'e_{t−1}    e_{t−1}²       e_{t−1}³
                     x*_t'e_{t−1}²   e_{t−1}³       e_{t−1}⁴ ].
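The density factor [f(F^{−1}(0.5))]^{−2} in W is the usual price a median-based estimator pays. A quick Monte Carlo check of this factor (my addition, in the simplest possible case: the sample median of i.i.d. Laplace(0,1) draws, where n · var(median) → [2f(0.5)]^{−2} = 1):

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 400, 2000
# sampling distribution of the median of n Laplace(0,1) draws
medians = np.median(rng.laplace(size=(reps, n)), axis=1)
scaled_var = n * medians.var()     # should approach [2 f(0.5)]^-2
theory = (2 * 0.5) ** -2           # f(0.5) = 1/2 for Laplace(0,1)
print(scaled_var, theory)
```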

3. Extensions

The first possible generalization is to assume that the errors follow a pth-order
random correlation process. After purging the constant autocorrelation, the errors are
defined as v_t = Σ_{i=1,…,p} r_it e_{t−i} + a_t. This implies the following conditional variance:

var(v_t | I_{t−1}) = σ_a² + Σ_{i=1,…,p} σ_{ri}² e_{t−i}² + Σ_{i=1,…,p} Σ_{j≠i} σ_{rij} e_{t−i} e_{t−j},   (9)

which defines the augmented ARCH (AARCH) process (Bera et al., 1992).
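For p = 2 with mutually independent random coefficients (so the cross-covariance terms σ_{rij} vanish), the conditional variance in Eq. (9) can be checked by simulation. The sketch below is my addition, with made-up parameter values:

```python
import numpy as np

rng = np.random.default_rng(3)
s_a, s_r1, s_r2 = 1.0, 0.4, 0.2    # sd of a_t and of the two random coefficients
e1, e2 = 1.5, -0.8                 # conditioning values for e_{t-1}, e_{t-2}

m = 200_000                        # draws of v_t given (e_{t-1}, e_{t-2})
v = (s_r1 * rng.standard_normal(m) * e1
     + s_r2 * rng.standard_normal(m) * e2
     + s_a * rng.standard_normal(m))
implied = s_a**2 + s_r1**2 * e1**2 + s_r2**2 * e2**2   # eq. (9) without cross terms
print(v.var(), implied)
```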
If the errors follow a random coefficient ARMA(p,q) process, after purging the
constant correlation the errors become v_t = Σ_{j=1,…,p} r_jt e_{t−j} + a_t + Σ_{i=1,…,q} g_it a_{t−i}, with
conditional variance

var(v_t | I_{t−1}) = Σ_{j=1,…,p} σ_{rj}² e_{t−j}² + Σ_{s=1,…,p} Σ_{j≠s} σ_{rsj} e_{t−s} e_{t−j} + σ_a²
                   + Σ_{i=1,…,q} σ_{gi}² a_{t−i}² + Σ_{i=1,…,q} Σ_{k≠i} σ_{gik} a_{t−i} a_{t−k}.        (10)

The above equation can be related to the CHARMA model in Tsay (1987), where
the terms in a_{t−h} of Eq. (10) are replaced by forecast errors.

4. Monte Carlo

In a sample of 35 observations, the dependent variable is defined as y_t = 0.3 +
0.6x_t + e_t. The independent variable is given by the realizations of a uniform
distribution defined on the unit interval. The errors, e_t = ρ_t e_{t−1} + a_t = (ρ + r_t)e_{t−1} + a_t,
are computed by choosing for ρ the values (0.0, 0.3, 0.6, 0.9) and by equating r_t to
the realizations of the following distributions: standard normal, Student-t with four
degrees of freedom, χ² with four degrees of freedom, and contaminated normal. The
latter is computed as a mixture of a standard normal and a contaminating normal. The
contaminating normal has zero mean and a standard error equal to 10. The degree
of contamination, that is, the percentage of observations coming from the normal
with larger variance, is 5% (CN(5%,10) in the tables). The Student-t distribution has
been chosen to model leptokurtosis, as in Bollerslev (1987). The χ² is chosen to
consider skewness, which is an additional concern with conditional heteroskedasticity.³
Finally, the mixture of normals is one way of creating fat-tailed distributions.
The e_t's and the a_t's follow two independent standard normal distributions.⁴ For
each experiment we implement 500 replicates.
The main equation of the model is estimated by GLS and GLAD, using the
Cochrane–Orcutt procedure. Thus we estimate the equation y_t = ρy_{t−1} + x_t'β −
ρx_{t−1}'β + v_t.
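The Cochrane–Orcutt loop with LAD in place of OLS at each step can be sketched as follows. This is my reconstruction of the procedure, not the author's code; the LP-based `lad` helper is one standard way to compute a median regression, and the data-generating values mimic the paper's design (with fixed AR(1) errors for simplicity):

```python
import numpy as np
from scipy.optimize import linprog

def lad(X, y):
    # median regression via LP: min 1'(u+ + u-) s.t. X b + u+ - u- = y
    n, k = X.shape
    c = np.concatenate([np.zeros(k), np.ones(2 * n)])
    A = np.hstack([X, np.eye(n), -np.eye(n)])
    bnd = [(None, None)] * k + [(0, None)] * (2 * n)
    return linprog(c, A_eq=A, b_eq=y, bounds=bnd, method="highs").x[:k]

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.standard_normal()   # fixed AR(1) errors for the sketch
y = 0.3 + 0.6 * x + e
X = np.column_stack([np.ones(n), x])

b = lad(X, y)                              # step 1: LAD on the raw data
for _ in range(5):                         # iterate transform / re-estimate
    res = y - X @ b
    rho = lad(res[:-1, None], res[1:])[0]  # step 2: rho from lagged residuals
    ys = y[1:] - rho * y[:-1]              # step 3: Cochrane-Orcutt transform
    Xs = X[1:] - rho * X[:-1]
    b = lad(Xs, ys)                        # step 4: GLAD on transformed data
print(b, rho)
```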
In order to assess the validity of GLAD with respect to GLS, we compare the
mean and the standard error of the distributions of the estimated coefficients b_i and
of the estimated autocorrelation ρ̂ over the 500 replicates (part (a) in the tables).
Then, we consider the average bias of the β's, E(b_i − β_i), and of ρ, E(ρ̂ − 1/n Σ_t ρ_t),
over the 500 replicates (part (b) of the tables). Finally, we look at the average
value over the 500 replicates of the ratio between the GLS and GLAD standard errors of
the estimates, E(se_GLS/se_GLAD) (part (c) in the tables).
The use of random variables in the definition of the serial correlation poses a
stationarity problem. As we move away from normality, the probability of large
shocks in r_t becomes higher and higher. This raises problems of stability in the model,
since the total serial correlation (fixed plus random) quickly gets above unity. This is
why, for the Student-t and the χ² distributions, we run a different set of experiments
where the autocorrelation coefficient is defined as ρ_t = ρ + λr_t, with λ < 1. For the
standard normal and the contaminated normal experiments there is no need to set
λ < 1. Indeed, the average level of serial correlation in each experiment does not
exceed one, as can be seen in the second column of Tables 1 and 2.
The choice of λ < 1 implies a reduction of the random correlation. By gradually
reducing the impact of the random component, we end up restoring the fixed-coefficient
case. Indeed, in some experiments we set λ as small as 0.1 and 0.05. Such
small values are needed to keep the average value of ρ_t at a reasonable level. For
instance, if we do not control for the random component r_t and set λ = 1, when r_t
follows a Student-t with 4 degrees of freedom and ρ = 0.6, the average serial correlation
is equal to 1.59. This makes the model greatly unstable and any estimating
procedure meaningless.
To balance the amount of randomness with stability issues, we reduce the value of
λ as we increase the fixed component ρ. Their values are chosen so that the average
correlation does not exceed unity.⁵
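The paper monitors the average correlation; a complementary back-of-the-envelope check (my addition, using the standard second-moment stationarity condition for RCA models, E(ρ_t²) < 1) shows why the unrestricted t(4) experiments explode and why a small λ restores stability:

```python
# second-moment stationarity of e_t = rho_t e_{t-1} + a_t requires E(rho_t^2) < 1;
# with rho_t = rho + lam * r_t and E(r_t) = 0 this is rho^2 + lam^2 * var(r_t) < 1
def second_moment(rho, lam, var_r):
    return rho**2 + lam**2 * var_r

var_t4 = 4.0 / (4 - 2)                    # variance of a Student-t(4) variate
print(second_moment(0.6, 1.0, var_t4))    # 2.36 > 1: unrestricted t(4) is explosive
print(second_moment(0.6, 0.2, var_t4))    # 0.44 < 1: damping lambda restores stability
```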

5. Results

Table 1 presents the summary statistics for GLS and GLAD when the random
correlation follows a standard normal distribution. In this set of experiments the

³Engle et al. consider a gamma distribution to analyse skewness, but the χ² is just a special case of
the gamma distribution.
⁴We could choose error terms e_t following other non-normal distributions. However, Furno (2000)
shows that it is the t distribution that has the greatest influence on the results.
⁵Herce (1996) presents the asymptotic distribution of LAD in the presence of unit roots.

Table 1
Random autocorrelation ρ_t = ρ + r_t; r_t is distributed as a standard normal

(a) Mean and standard deviation of the distributions of the estimated coefficients

       ρ     E(1/n Σ_t ρ_t)   E(b_0); β_0 = 0.3   E(b_1); β_1 = 0.6   E(ρ̂)
GLS    0.0   0.00             0.30 (1.5)          0.73 (2.3)          −0.03 (0.2)
GLAD   0.0   0.00             0.30 (1.1)          0.62 (1.3)          −0.02 (0.3)
GLS    0.3   0.29             0.15 (1.9)          0.81 (3.1)          0.17 (0.2)
GLAD   0.3   0.29             0.20 (1.1)          0.59 (1.4)          0.20 (0.3)
GLS    0.6   0.59             −0.08 (4.2)         1.25 (8.0)          0.36 (0.2)
GLAD   0.6   0.59             0.10 (1.6)          0.65 (2.0)          0.41 (0.3)
GLS    0.9   0.89             −1.84 (24)          3.28 (33)           0.53 (0.3)
GLAD   0.9   0.89             0.14 (4.2)          0.60 (3.5)          0.62 (0.3)

(b) Average bias over 500 replicates

       ρ     E(1/n Σ_t ρ_t)   E(b_0 − β_0)   E(b_1 − β_1)   E(ρ̂ − 1/n Σ_t ρ_t)
GLS    0.0   0.00             0.006          0.13           −0.03
GLAD   0.0   0.00             0.006          0.02           −0.01
GLS    0.3   0.29             −0.14          0.21           −0.12
GLAD   0.3   0.29             −0.09          −0.002         −0.08
GLS    0.6   0.59             −0.38          0.65           −0.23
GLAD   0.6   0.59             −0.19          0.05           −0.18
GLS    0.9   0.89             −2.14          2.68           −0.36
GLAD   0.9   0.89             −0.15          0.004          −0.27

(c) Average ratio of GLS versus GLAD standard errors

ρ     E(1/n Σ_t ρ_t)   E(se_GLS/se_GLAD)_b0   E(se_GLS/se_GLAD)_b1   E(se_GLS/se_GLAD)_ρ̂
0.0   0.00             1.16                   1.17                   1.21
0.3   0.29             1.26                   1.27                   1.32
0.6   0.59             1.55                   1.55                   1.63
0.9   0.89             2.66                   2.67                   2.85

autocorrelation is defined as ρ_t = ρ + r_t. The first column of the table reports the
different values of the constant component of the autocorrelation, ρ. The second
column gives the average over the 500 replicates of the mean autocorrelation computed
within each replicate, 1/500 Σ_{j=1,…,500} (1/35 Σ_{t=1,…,35} ρ_t)_j, denoted E(1/n Σ_t ρ_t) in the tables.
In these experiments, by increasing the degree of autocorrelation, ρ̂ increasingly
underestimates the true value. This can be seen by comparing its average over 500

Table 2
Random autocorrelation ρ_t = ρ + r_t; r_t is distributed as a contaminated normal

(a) Mean and standard deviation of the distributions of the estimated coefficients

       ρ     E(1/n Σ_t ρ_t)   E(b_0); β_0 = 0.3   E(b_1); β_1 = 0.6   E(ρ̂)
GLS    0.0   0.03             −11 (385)           −0.19 (253)         −0.001 (0.7)
GLAD   0.0   0.03             −0.32 (10)          1.06 (10)           −0.02 (0.3)
GLS    0.3   0.34             −4.29 (980)         −6.30 (376)         0.18 (0.8)
GLAD   0.3   0.34             0.50 (8.2)          0.47 (5.3)          0.17 (0.3)
GLS    0.6   0.56             −4550 (106170)      3093 (76051)        0.32 (0.5)
GLAD   0.6   0.56             −0.04 (9.3)         1.59 (19)           0.39 (0.4)
GLS    0.9   0.93             147 (3874)          −726 (18802)        0.48 (0.7)
GLAD   0.9   0.93             25 (458)            −70 (1414)          0.58 (0.7)

(b) Average bias over 500 replicates

       ρ     E(1/n Σ_t ρ_t)   E(b_0 − β_0)   E(b_1 − β_1)   E(ρ̂ − 1/n Σ_t ρ_t)
GLS    0.0   0.03             −12            −0.79          −0.03
GLAD   0.0   0.03             −0.62          0.46           −0.06
GLS    0.3   0.34             −4.59          −6.90          −0.15
GLAD   0.3   0.34             0.20           −0.12          −0.16
GLS    0.6   0.56             −4551          3093           −0.24
GLAD   0.6   0.56             −0.34          0.99           −0.17
GLS    0.9   0.93             147            −726           −0.44
GLAD   0.9   0.93             25             −70            −0.34

(c) Average ratio of GLS versus GLAD standard errors

ρ     E(1/n Σ_t ρ_t)   E(se_GLS/se_GLAD)_b0   E(se_GLS/se_GLAD)_b1   E(se_GLS/se_GLAD)_ρ̂
0.0   0.03             14                     13                     21
0.3   0.34             17                     17                     29
0.6   0.56             63                     66                     82
0.9   0.93             38                     39                     72

replicates, presented in the last column of the table, with the first column of the
table reporting its true value. Table 1(b) shows that the GLAD bias is lower than
the GLS bias, particularly in the slope coefficient. Part (c) shows that, on average, GLAD
is more efficient than GLS, since the ratios between the standard errors are always
greater than one.
In sum, with random autocorrelation following a normal distribution, GLAD improves
upon GLS in terms of both bias and efficiency. This is due to the conditional

heteroskedasticity induced by RCA, and thus to the thick-tailed nature of the
unconditional distribution.
Table 2 reports the results for RCA with a random component following a
contaminated distribution. By comparing the first with the last column of the table, once
again we can see that the degree of autocorrelation is increasingly underestimated as
ρ increases. GLAD presents a reduced bias and a greater efficiency than GLS, and
the improvements (bias reduction and efficiency gains) are larger than
those reported in the previous table. With a contaminated normal the GLS procedure
is quite unreliable in all the experiments here considered. GLAD becomes less
reliable only in the highly correlated experiments, when the average correlation is equal
to 0.93.
The last two tables present a different kind of experiment. In order to preserve
stationarity, the impact of the random component of the autocorrelation is strongly
reduced: ρ_t = ρ + λr_t. The value of λ is reported in the second column of Tables 3
and 4.
Table 3 summarizes the results for r_t following a Student-t distribution with 4
degrees of freedom. The fixed correlation increases from 0 to 0.8, while the coefficient
controlling the impact of the random correlation, λ, decreases from 1 to 0.1.
Therefore, the first row of each section in this table provides the results for a fully
randomly correlated experiment, while the fourth row of each section presents the
case of an almost fixed autocorrelation coefficient. GLAD has a smaller bias than
GLS in all but the fourth experiment, where the two estimators are comparable. In
terms of efficiency, GLAD is preferable in the first two experiments, where the
random correlation prevails. GLS instead is more efficient in the last two experiments,
where the fixed serial correlation dominates (last two rows of the table). The fixed
autocorrelation coefficient is overestimated in the first two rows of the table, where
the random correlation prevails.
In Table 4, r_t follows a χ² distribution with 4 degrees of freedom. This table
confirms the results of the previous set of experiments. When the random correlation
dominates, the GLAD estimated coefficients have smaller bias and greater efficiency than
GLS. When the fixed correlation prevails, instead, the two estimators are comparable
in terms of bias, while GLS is more efficient than GLAD. The fixed autocorrelation
coefficient is overestimated throughout the table.
This confirms the results of Weiss (1990): when serial correlation has a fixed
coefficient, LAD- and OLS-based procedures are comparable. However, we find that,
when the correlation has a random component, GLAD improves upon GLS and can
be profitably implemented.

6. Conclusions

This study compares the behavior of the OLS and LAD procedures in the context
of randomly autocorrelated errors. The performance of the two estimators was
analyzed by Weiss (1990) in the case of constant correlation. Weiss finds that the two
estimators yield very similar results, so that LAD, being more cumbersome than OLS,
is not really advisable.

Table 3
Random autocorrelation ρ_t = ρ + λr_t; r_t is distributed as a t(4)

(a) Mean and standard deviation of the distributions of the estimated coefficients

       ρ     λ     E(1/n Σ_t ρ_t)   E(b_0); β_0 = 0.3   E(b_1); β_1 = 0.6   E(ρ̂)
GLS    0.0   1.0   1.0              −0.15 (5.7)         0.59 (8.1)          0.63 (0.1)
GLAD   0.0   1.0   1.0              0.06 (1.1)          0.66 (1.3)          0.60 (0.2)
GLS    0.3   0.6   0.9              0.02 (1.1)          0.61 (1.3)          0.69 (0.1)
GLAD   0.3   0.6   0.9              0.10 (0.8)          0.60 (1.0)          0.66 (0.1)
GLS    0.6   0.2   0.8              0.12 (0.5)          0.58 (0.6)          0.68 (0.1)
GLAD   0.6   0.2   0.8              0.14 (0.6)          0.59 (0.8)          0.68 (0.1)
GLS    0.8   0.1   0.9              0.05 (0.6)          0.62 (0.6)          0.77 (0.13)
GLAD   0.8   0.1   0.9              0.04 (0.7)          0.62 (0.7)          0.76 (0.16)

(b) Average bias over 500 replicates

       ρ     λ     E(1/n Σ_t ρ_t)   E(b_0 − β_0)   E(b_1 − β_1)   E(ρ̂ − 1/n Σ_t ρ_t)
GLS    0.0   1.0   1.0              −0.45          −0.001         −0.38
GLAD   0.0   1.0   1.0              −0.23          0.06           −0.40
GLS    0.3   0.6   0.9              −0.27          0.01           −0.20
GLAD   0.3   0.6   0.9              −0.19          0.002          −0.24
GLS    0.6   0.2   0.8              −0.17          −0.01          −0.11
GLAD   0.6   0.2   0.8              −0.15          −0.004         −0.11
GLS    0.8   0.1   0.9              −0.24          0.02           −0.13
GLAD   0.8   0.1   0.9              −0.25          0.02           −0.13

(c) Average ratio of GLS versus GLAD standard errors

ρ     λ     E(1/n Σ_t ρ_t)   E(se_GLS/se_GLAD)_b0   E(se_GLS/se_GLAD)_b1   E(se_GLS/se_GLAD)_ρ̂
0.0   1.0   1.0              1.97                   1.98                   2.25
0.3   0.6   0.9              1.08                   1.07                   1.13
0.6   0.2   0.8              0.81                   0.81                   0.83
0.8   0.1   0.9              0.81                   0.81                   0.82

However, we show that in the case of random coefficient autocorrelation the two
estimators behave quite differently. The improvement granted by LAD is linked to
the presence of conditional heteroskedasticity caused by the RCA model, that is, to
the thick tails of the error distribution.
We prove the asymptotic normality of LAD with RCA errors. Our simulations
show that, in the case of RCA with a random coefficient following a normal or a

Table 4
Random autocorrelation ρ_t = ρ + λr_t; r_t is distributed as a χ²(4)

(a) Mean and standard deviation of the distributions of the estimated coefficients

       ρ     λ      E(1/n Σ_t ρ_t)   E(b_0); β_0 = 0.3   E(b_1); β_1 = 0.6   E(ρ̂)
GLS    0.0   0.2    0.80             0.13 (1.0)          0.54 (1.4)          0.60 (0.1)
GLAD   0.0   0.2    0.80             0.10 (0.8)          0.62 (1.0)          0.58 (0.2)
GLS    0.2   0.2    0.99             −0.27 (1.8)         1.38 (2.2)          0.75 (0.1)
GLAD   0.2   0.2    0.99             0.12 (1.1)          0.64 (1.3)          0.73 (0.1)
GLS    0.6   0.1    0.99             0.01 (1.6)          0.69 (1.8)          0.83 (0.1)
GLAD   0.6   0.1    0.99             0.02 (1.1)          0.67 (1.03)         0.81 (0.1)
GLS    0.8   0.05   0.99             0.04 (0.8)          0.63 (0.7)          0.85 (0.12)
GLAD   0.8   0.05   0.99             0.02 (0.9)          0.66 (0.9)          0.84 (0.14)

(b) Average bias over 500 replicates

       ρ     λ      E(1/n Σ_t ρ_t)   E(b_0 − β_0)   E(b_1 − β_1)   E(ρ̂ − 1/n Σ_t ρ_t)
GLS    0.0   0.2    0.80             −0.16          −0.05          −0.19
GLAD   0.0   0.2    0.80             −0.19          0.02           −0.21
GLS    0.2   0.2    0.99             −0.57          0.78           −0.23
GLAD   0.2   0.2    0.99             −0.17          0.04           −0.25
GLS    0.6   0.1    0.99             −0.28          0.09           −0.16
GLAD   0.6   0.1    0.99             −0.27          0.07           −0.17
GLS    0.8   0.05   0.99             −0.25          0.3            −0.14
GLAD   0.8   0.05   0.99             −0.27          0.6            −0.15

(c) Average ratio of GLS versus GLAD standard errors

ρ     λ      E(1/n Σ_t ρ_t)   E(se_GLS/se_GLAD)_b0   E(se_GLS/se_GLAD)_b1   E(se_GLS/se_GLAD)_ρ̂
0.0   0.2    0.80             1.00                   1.01                   1.04
0.2   0.2    0.99             1.31                   1.32                   1.36
0.6   0.1    0.99             0.97                   0.97                   0.98
0.8   0.05   0.99             0.82                   0.82                   0.83

contaminated normal distribution, LAD provides a sizable bias reduction and a relevant
improvement in efficiency with respect to least squares. In the experiments with
random coefficients following a Student-t or a χ² distribution there are stability issues
involved, since these distributions render the average correlation greater than one.
Therefore, we need to balance the fixed and the random components of the serial
correlation in order to keep the average value of the serial correlation below unity. This

fine-tuning allows us to see that, when the random component prevails, LAD
substantially improves upon least squares. On the other hand, when the random component
is small, least squares and LAD yield similar results, thus confirming Weiss's (1990)
findings.
Summarizing, the LAD-based procedure can be seen as an insurance policy against
undetected RCA. In the case of fixed correlation OLS and LAD are comparable, but
LAD is more cumbersome and possibly less efficient. In the case of random correlation,
the LAD procedure induces bias reduction and efficiency gains with respect to OLS.
These considerations make its implementation highly recommendable.

References

Bera, A., Higgins, M., Lee, S., 1992. Interaction between autocorrelation and conditional
heteroscedasticity: a random coefficient approach. J. Bus. Econom. Statist. 10, 133–142.
Bollerslev, T., 1987. A conditional heteroskedastic time series model for speculative prices and rates
of returns. Rev. Econom. Statist. 69, 542–547.
Bollerslev, T., Wooldridge, J., 1992. Quasi-maximum likelihood estimations and inference in dynamic
models. Econometric Rev. 11 (2), 143–172.
Engle, R., 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of the UK
inflation. Econometrica 50, 987–1008.
Engle, R., Gonzales-Rivera, G., 1991. Semiparametric ARCH models. J. Bus. Econom. Statist. 9, 345–359.
Furno, M., 2000. LM tests in the presence of non-normal error distributions. Econometric Theory 16,
249–261.
Herce, M., 1996. Asymptotic theory of LAD estimation in a unit root process with finite variance
errors. Econometric Theory 12, 129–153.
Koenker, R., Bassett, G., 1978. Regression quantiles. Econometrica 46, 33–50.
Koenker, R., Zhao, Q., 1996. Conditional quantile estimation and inference for ARCH models.
Econometric Theory 12, 793–813.
Machado, J., Silva, J., 2000. Glejser's test revisited. J. Econometrics 97, 189–202.
Nelson, D., 1991. Conditional heteroskedasticity in asset returns: a new approach. Econometrica 59,
307–346.
Ruppert, D., Carroll, R., 1980. Trimmed least-squares estimation in the linear model. J. Amer. Statist.
Assoc. 75, 828–838.
Tsay, R., 1987. Conditional heteroscedastic time series models. J. Amer. Statist. Assoc. 82, 590–604.
Weiss, A., 1990. Least absolute error estimation in the presence of serial correlation. J. Econometrics
44, 127–158.
