
Computational Statistics and Data Analysis 55 (2011) 1462–1478


Local influence for Student-t partially linear models


Germán Ibacache-Pulgar a,∗, Gilberto A. Paula b

a Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil
b Instituto de Matemática e Estatística, Universidade de São Paulo, USP, Caixa Postal 66281 (Ag. Cidade de São Paulo), CEP 05314-970 São Paulo, SP, Brazil

Article history:
Received 5 November 2009
Received in revised form 2 June 2010
Accepted 7 October 2010
Available online 20 October 2010

Keywords:
Student-t distribution
Nonparametric models
Maximum penalized likelihood estimates
Robust estimates
Sensitivity analysis

Abstract

In this paper we extend partial linear models with normal errors to Student-t errors. Penalized likelihood equations are applied to derive the maximum likelihood estimates, which appear to be robust against outlying observations in the sense of the Mahalanobis distance. In order to study the sensitivity of the penalized estimates under some usual perturbation schemes in the model or data, the local influence curvatures are derived and some diagnostic graphics are proposed. A motivating example, preliminarily analyzed under normal errors, is reanalyzed under Student-t errors. The local influence approach is used to compare the sensitivity of the model estimates.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Diagnostic methods for parametric regression models have been extensively investigated in the statistical literature. Most of this work has emphasized the effect of deleting observations on the results of the fitted model, particularly on the parameter estimates. This approach has also been extended to nonparametric and semiparametric
models. For example, Eubank (1984, 1985) derived influence diagnostic measures based on the leverage and residuals for
spline regression. Silverman (1985) discussed the application of residuals in spline regression. Eubank and Gunst (1986)
derived some influence diagnostic measures for penalized least-squares estimates from a Bayesian perspective. Eubank
and Thomas (1993) proposed diagnostic tests and graphics for assessing heteroscedasticity in spline regression. Kim (1996)
discussed the application of residuals, leverage and Cook-type distance for smoothing splines. Wei (2004) presented some
influence diagnostic and robustness measures for smoothing splines. Kim et al. (2002) derived influence measures for the
partial linear models based on residuals and leverage for the estimates of the regression coefficients and the nonparametric
function suggested in Speckman (1988). Recently, Fung et al. (2002) studied influence diagnostics for normal semiparametric
mixed models with longitudinal data. They considered the single influential case or subject for the maximum penalized
likelihood estimates suggested in Zhang et al. (1998).
Case deletion does not directly reflect the impact of other perturbations in the model. Alternatively, Cook (1986) proposed an interesting method, named local influence, to assess the effect of small perturbations in the model (or data) on the parameter estimates. Local influence analysis does not involve recomputing the parameter estimates for every case deletion, so it is often computationally simpler. Several authors have extended the local influence method to various
regression models. Beckman et al. (1987) applied the approach to detect influential observations in normal linear mixed
models with emphasis on studying the influence of single observations. Lesaffre and Verbeke (1998) extended the local
influence methodology to normal linear mixed models in repeated-measurement context and under the case-weight

∗ Corresponding author. Tel.: +55 11 30916129; fax: +55 11 30916130.


E-mail addresses: germanp@ime.usp.br, germaury@hispavista.com (G. Ibacache-Pulgar), giapaula@ime.usp.br (G.A. Paula).

doi:10.1016/j.csda.2010.10.009

perturbation scheme. Ouwens et al. (2001) applied the local influence approach in generalized linear mixed models. Galea
et al. (1997), Liu (2000, 2002) and Díaz-Garcia et al. (2003) extended the local influence methodology to elliptical linear
regression models. Galea et al. (2005a,b) applied the local influence method in functional and structural comparative
calibration models under elliptical t-distributions. Paula et al. (2003) developed local influence for symmetrical nonlinear
models. Osorio et al. (2007) derived local influence curvatures under various perturbation schemes for elliptical linear
models with longitudinal structure. Other results can be found in Paula (1993).
In nonparametric and semiparametric regression models local influence diagnostics are still quite rare. Among them, Thomas (1991) constructed local influence diagnostics for the smoothing parameter and, more recently, Zhu et al. (2003) extended the work of Cook (1986) to provide local influence measures under different perturbation schemes in normal partially linear models.
The aim of this paper is to apply the local influence approach to Student-t partial linear models. As typically considered in the literature, the relevance of using the t-distribution lies in its capability of down-weighting outlying observations (see, for instance, Lange et al., 1989). This paper is organized as follows. In Section 2 Student-t partial linear models are defined. Section 3 contains the estimation and inference procedures for the regression coefficients, nonparametric function and scale parameter. In Section 4 the main concepts of local influence are considered and normal curvatures for some perturbation schemes are derived. An illustration of the methodology with a real data set is presented in Section 5, and finally some concluding remarks are given in Section 6.

2. Student-t partial linear models

In this section we define partial linear models with Student-t errors and we describe the penalized function method,
which is often required for maximizing the penalized likelihood function.

2.1. The model

Semiparametric or partial linear models (PLMs) have become an important tool in modeling economic and biometric data and are a flexible generalization of the linear model, obtained by including a nonparametric component for some covariates. Such models assume that the relationship between the response variable and the explanatory variables can be represented as
yi = xiᵀβ + f(ti) + ϵi (i = 1, . . . , n),  (1)

where yi denotes the response from the experiment, xi is a (p × 1) vector of explanatory variable values, β is a (p × 1) fixed parameter vector, ti is a scalar, f is a smooth function, and the ϵi are independent random errors, for i = 1, . . . , n. Alternatively, we can write model (1) as

y = Xβ + Nf + ϵ,  (2)

where y = (y1, y2, . . . , yn)ᵀ is an (n × 1) vector of observed responses, X is an (n × p) design matrix with rows xiᵀ, f = (f(t01), . . . , f(t0q))ᵀ with t01, . . . , t0q being the distinct and ordered values of ti, N is an (n × q) incidence matrix whose (i, j)th element equals the indicator function I(ti = t0j) for j = 1, . . . , q, and ϵ = (ϵ1, . . . , ϵn)ᵀ is an (n × 1) vector of random errors.
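For concreteness, the quantities in (2) can be assembled in a few lines of code. The sketch below is illustrative only: the sample size, the coefficients and the sine function playing the role of f are hypothetical choices, and the errors are drawn as scaled Student-t variates in anticipation of Section 2.2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for illustration.
n, p = 50, 2
X = rng.normal(size=(n, p))                        # rows x_i^T of the design matrix
t = rng.choice(np.linspace(0.0, 1.0, 20), size=n)  # scalar covariate t_i (with ties)

# Distinct ordered values t_{01} < ... < t_{0q} and the incidence matrix N,
# whose (i, j) entry is the indicator I(t_i = t_{0j}).
t0 = np.unique(t)
q = t0.size
N = (t[:, None] == t0[None, :]).astype(float)

beta = np.array([1.0, -0.5])        # illustrative parameter vector
f0 = np.sin(2.0 * np.pi * t0)       # a smooth f evaluated at the knots
phi, nu = 0.25, 4.0                 # dispersion and degrees of freedom
eps = np.sqrt(phi) * rng.standard_t(nu, size=n)

y = X @ beta + N @ f0 + eps         # model (2): y = X beta + N f + eps
```

Each row of N has a single unit entry, so N f0 simply looks up f at the knot matching ti.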
The partial linear model (1) has been studied by various authors. For example, in balanced cases of covariance analysis,
Heckman (1986) established asymptotic normality for the estimator of the regression coefficients and showed that their
bias is asymptotically negligible. Rice (1986) showed that the bias of the regression coefficient estimators can asymptotically
dominate the variance in unbalanced cases where the covariates are correlated, unless f is under-smoothed. Green (1987)
studied the asymptotic behavior of the maximum penalized likelihood estimators and derived appropriate definitions of
deviance, degrees of freedom and residuals for a general semiparametric regression model as well as presented quadratic
approximations for all the required statistics. Speckman (1988) used two methods for estimating β and f, one related to partial smoothing splines and the other motivated by partial residual analysis. In addition, under suitable assumptions, the asymptotic bias and variance are obtained for both methods, and it is shown that estimating β by partial residuals improves the bias with no asymptotic loss in variance. Heckman (1988) proposed two minimax linear estimators for β and showed that for each estimator the maximum mean squared error is of order n⁻¹, even if the covariates are highly correlated. Pitrun et al. (2006) derived smoothing-spline-based tests for nonlinearity in a partial linear model. Bianco et al. (2006) considered
the problem of hypothesis testing for the regression coefficients and studied their asymptotic distributions. Finally, Liang
(2006) studied some inferential topics and proposed two tests for assessing the linearity hypothesis of the nonparametric
component.

2.2. Penalized function

We will assume that the ϵi (i = 1, . . . , n) are independent random variables such that ϵi follows a Student-t distribution with mean zero, dispersion parameter φ and νi degrees of freedom, namely ϵi ∼ t(0, φ, νi). Therefore,

yi ∼ t(µi, φ, νi)  (3)

has density function given by

fy(yi) = φ^(−1/2) Γ((1 + νi)/2) / {(πνi)^(1/2) Γ(νi/2)} · {1 + νi⁻¹δi}^(−(1+νi)/2),  (4)

where Γ(·) denotes the gamma function, δi = φ⁻¹(yi − µi)² with µi = xiᵀβ + niᵀf, niᵀ is the ith row of N, and νi denotes the degrees of freedom. For simplicity, we will assume that νi = ν, for i = 1, . . . , n. Then, the log-likelihood function can be expressed as
L(θ) = Σ_{i=1}^n Li(θ),  (5)

where

Li(θ) = log{ Γ((1 + ν)/2) / [(πν)^(1/2) Γ(ν/2)] } − (1/2) log φ − ((1 + ν)/2) log{1 + ν⁻¹δi},  (6)

and θ = (βᵀ, fᵀ, φ)ᵀ ∈ Θ ⊆ R^(p*), with p* = p + q + 1.
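As a sanity check on the density (4) and the log-likelihood contribution (6), the log-density can be coded directly with the log-gamma function and verified to integrate to one numerically; the parameter values below are arbitrary.

```python
import numpy as np
from math import lgamma, log, pi

def t_logdensity(y, mu, phi, nu):
    """Log of the Student-t density (4): location mu, dispersion phi, df nu."""
    delta = (y - mu) ** 2 / phi
    return (-0.5 * log(phi)
            + lgamma((1.0 + nu) / 2.0) - lgamma(nu / 2.0) - 0.5 * log(pi * nu)
            - 0.5 * (1.0 + nu) * np.log1p(delta / nu))

# Numerical check that the density integrates to one (trapezoidal rule
# on a wide grid; the t tails decay fast enough for nu = 5).
mu, phi, nu = 1.0, 2.0, 5.0
grid = np.linspace(mu - 60.0, mu + 60.0, 400001)
dens = np.exp(t_logdensity(grid, mu, phi, nu))
mass = np.sum((dens[1:] + dens[:-1]) * np.diff(grid)) / 2.0
```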
The direct maximization of (5) without imposing restrictions on the function f may cause over-fitting and non-identification of β (see, for instance, Green, 1987). A well-known procedure based on the idea of log-likelihood penalization consists in incorporating a penalty function into the log-likelihood, such that Lp(θ, α) = L(θ) + α*J(f), where J(f) denotes the penalty function over f, which in general depends on the specific application, and α* = α*(α) is a constant that depends on the smoothing parameter α > 0. The parameter α, known in the literature as the smoothing parameter, controls the tradeoff between goodness of fit, measured by large values of L(θ), and smoothness of the estimated function, measured by small values of J(f). Therefore, the determination of α is a crucial part of the estimation process, and different methods of choice are available in the literature. For a smoothing spline, for example, one may use the generalized cross-validation method (see, for instance, Green and Silverman, 1994). However, we will assume α fixed in this work.
In the semiparametric context different penalty functions have been proposed. Here, we will consider

J(f) = α* ∫_a^b [f^(l)(t)]² dt,  (7)

where f^(l)(t) = d^l f(t)/dt^l, t0j ∈ [a, b], and the function f belongs to the Sobolev function space W2^(l)[a, b] = {f : f^(l) ∈ L2[a, b]; f^(1), f^(2), . . . , f^(l−1) absolutely continuous}, where L2[a, b] = {f : ∫_a^b f²(t)dt < ∞}. When l = 2, the estimation of f leads to a cubic smoothing spline with knots at the points t0j, j = 1, . . . , q.
cubic spline with knots at the points t0j , j = 1, . . . , q. According to Green and Silverman (1994, Theorem 2.1), we may express
J(f) = α* ∫_a^b {f^(2)(t)}² dt = α* fᵀKf,  (8)

where f^(2)(t) = d²f(t)/dt² and K ∈ R^(q×q) is a nonnegative definite matrix that depends only on the knots. If we consider the function L(θ) defined in (5) and J(f) defined by (8) with α* = −α/2, then the penalized log-likelihood function associated with model (1) can be expressed as
Lp(θ, α) = Σ_{i=1}^n Lpi(θ, α),  (9)

where

Lpi(θ, α) = Li(θ) − (α/2n) fᵀKf,  (10)
with Li (θ) defined in Eq. (6).
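The matrix K in (8) depends only on the knots and can be computed directly. The sketch below follows the Q R⁻¹ Qᵀ construction for natural cubic smoothing splines given by Green and Silverman (1994); the function name and the knot values are our own illustrative choices.

```python
import numpy as np

def spline_penalty_matrix(knots):
    """K = Q R^{-1} Q^T such that f^T K f equals the integrated squared
    second derivative of the natural cubic spline through the ordered knots."""
    t = np.asarray(knots, dtype=float)
    q = t.size
    h = np.diff(t)                     # knot spacings h_j = t_{j+1} - t_j
    Q = np.zeros((q, q - 2))
    R = np.zeros((q - 2, q - 2))
    for i in range(q - 2):             # one column of Q per interior knot
        Q[i, i] = 1.0 / h[i]
        Q[i + 1, i] = -1.0 / h[i] - 1.0 / h[i + 1]
        Q[i + 2, i] = 1.0 / h[i + 1]
        R[i, i] = (h[i] + h[i + 1]) / 3.0
    for i in range(q - 3):
        R[i, i + 1] = R[i + 1, i] = h[i + 1] / 6.0
    return Q @ np.linalg.solve(R, Q.T)

knots = np.array([0.0, 0.1, 0.3, 0.35, 0.6, 1.0])   # arbitrary knots
K = spline_penalty_matrix(knots)
```

Since the penalty measures curvature, K is nonnegative definite and annihilates constant and linear functions of the knots.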
The penalized log-likelihood method has been an important tool for fitting semiparametric models. Since under non-Gaussian models the estimating equations are typically nonlinear and require an iterative solution, various methods have been studied in the literature. For example, Green (1987) proposed the Newton–Raphson algorithm to solve the penalized estimating equations in general semiparametric regression problems (see also Green and Silverman, 1994). The EM algorithm (Dempster et al., 1977) has also been applied in semiparametric modeling. In this context, Green (1990) presented an interesting discussion of the properties of the EM algorithm and proposed a modified version of this algorithm for penalized likelihood maximization. In addition, Segal et al. (1994) presented an interesting discussion of variance estimation for estimates from the EM algorithm. Recently, Rigby and Stasinopoulos (2005) proposed algorithms for maximizing penalized likelihood functions in the context of generalized additive models in which location, scale and shape parameters are estimated.

3. Parameter estimation

The estimation problem for model (1) has been discussed by various authors. For example, Heckman (1986) developed
spline-type methods by using a smoothing spline. Speckman (1988) developed an alternative method based on kernel
smoothing. Robinson (1988) studied the estimation problem and showed that the estimates of the regression parameters
based on an incorrect parameterization of the nonparametric function are generally inconsistent, and proposed a least-squares estimate of β which is √n-consistent. He and Shi (1996) used bivariate tensor-product B-splines as an approximation of the nonparametric function and considered M-type regression splines. Hamilton and Truong (1997) used local polynomial fitting techniques. He et al. (2002) proposed approximating the nonparametric function by a regression spline and estimating both the regression parameter β and the spline coefficients by an M-estimator under partially linear models for longitudinal data.
Recently, Gannaz (2007) developed an estimation method based on a wavelet expansion of the nonparametric part of the
partial linear model.

3.1. Score function

Assume that the function (9) is regular with respect to β, f and φ, and let D(v) = diag{v1, v2, . . . , vn}, vi = −2ζi, ζi = −(1/2)(ν + 1)/(ν + δi) and µ = Xβ + Nf. Then, the penalized score function of θ is given by

Up(θ) = (Up^β(θ)ᵀ, Up^f(θ)ᵀ, Up^φ(θ))ᵀ,  (11)

where

Up^β(θ) = (1/φ) XᵀD(v)(y − µ),
Up^f(θ) = (1/φ) NᵀD(v)(y − µ) − αKf and  (12)
Up^φ(θ) = (2φ)⁻¹{φ⁻¹(y − µ)ᵀD(v)(y − µ) − n}.
The estimation equations given above can be extended to a larger class of distributions including normal and Student-
t distributions as special cases. This class is known in the statistical literature as elliptically contoured distributions
and includes, in addition to the distributions listed above, all the symmetrical continuous distributions such as power
exponential, Pearson VII and logistic, among others. For the power exponential distribution, for example, with shape parameter γ, one has vi = γδi^(γ−1). For more details see, for instance, Cysneiros and Paula (2005).
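The weights vi = −2ζi = (ν + 1)/(ν + δi) appearing in (12) can be computed in one line; the residual values below are arbitrary and serve only to show the down-weighting behavior discussed later in Section 3.5.

```python
import numpy as np

def student_t_weights(resid, phi, nu):
    """Weights v_i = -2*zeta_i = (nu + 1)/(nu + delta_i) from the score (12),
    with delta_i = resid_i**2 / phi."""
    delta = np.asarray(resid, dtype=float) ** 2 / phi
    return (nu + 1.0) / (nu + delta)

resid = np.array([0.1, 0.5, 5.0])                    # last value mimics an outlier
w_t = student_t_weights(resid, phi=1.0, nu=4.0)      # outlier is down-weighted
w_norm = student_t_weights(resid, phi=1.0, nu=1e8)   # nu -> infinity: v_i -> 1 (normal)
```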

3.2. Hessian matrix

Let θ = (βᵀ, fᵀ, φ)ᵀ and L̈p(θ) be the (p* × p*) matrix with (j*, ℓ*)-element given by ∂²Lp(θ, α)/∂θ_j*∂θ_ℓ*, for j*, ℓ* = 1, . . . , p*. After some algebraic manipulation we find

         [ L̈p^ββ      L̈p^βf       L̈p^βφ ]
L̈p(θ) = [ (L̈p^βf)ᵀ   L̈p^ff       L̈p^fφ ]
         [ (L̈p^βφ)ᵀ   (L̈p^fφ)ᵀ    L̈p^φφ ],

where

L̈p^ββ(θ) = −(1/φ) XᵀD(a)X,
L̈p^βf(θ) = −(1/φ) XᵀD(a)N,
L̈p^βφ(θ) = (2/φ²) Xᵀb,
L̈p^ff(θ) = −(1/φ) NᵀD(a)N − αK,
L̈p^fφ(θ) = (2/φ²) Nᵀb and
L̈p^φφ(θ) = (1/φ²){n/2 + δᵀD(ζ′)δ − (1/φ) ϵᵀD(v)ϵ},

with D(a) = diag{a1, . . . , an}, D(ζ′) = diag{ζ1′, . . . , ζn′}, b = (b1, . . . , bn)ᵀ, δ = (δ1, . . . , δn)ᵀ, ai = −2(ζi + 2ζi′δi), bi = (ζi + ζi′δi)ϵi, ζi′ = (1/2)(ν + 1)/(ν + δi)², δi = φ⁻¹ϵi², ϵi = yi − µi and µi = xiᵀβ + niᵀf, for i = 1, . . . , n.

3.3. Existence of the MPLEs

Because f(t) is an infinite-dimensional parameter, we consider the maximum penalized likelihood estimate of θ, which leads to a natural cubic spline estimate of f(t). Specifically, the value of θ that maximizes Lp(θ, α) over Θ, denoted by θ̂, is called the maximum penalized likelihood estimate (MPLE) and satisfies

Lp(θ̂, α) ≥ sup_{θ∈Θ} Lp(θ, α).

The determination of the MPLE θ̂ can be performed by considering successive maximizations as described, for instance, in Gourieroux and Monfort (1995, Chap. 7). Specifically, for α fixed, the solution θ̂ to the maximization problem

max_{θ∈Θ} Lp(θ, α) = max_{β,f,φ} Lp(β, f, φ, α)
can be obtained via the following three-step procedure:

(a) First, we maximize the function Lp(β, f, φ, α) over β, keeping the parameters f and φ fixed. The maximum, β̂(f, φ), is attained for values of β in a set B(f, φ) depending on f and φ. Thus, if β̂ ∈ B(f, φ), the penalized log-likelihood function value is

Lp^c(f, φ, α) = max_β Lp(β, f, φ, α).

Here Lp^c is called the concentrated penalized log-likelihood function in β.
(b) Then, in the second step, we maximize the concentrated penalized log-likelihood function Lp^c(f, φ, α) = Lp(β̂(f, φ), f, φ, α) over f, keeping φ fixed. The maximum, f̂(φ), is attained for values of f in a set F(φ) depending on φ. Therefore, if f̂ ∈ F(φ), the penalized log-likelihood function value is

Lp^c(φ, α) = max_f Lp^c(f, φ, α).

Here Lp^c is called the concentrated penalized log-likelihood function in β and f.
(c) Finally, in the third step, we maximize the concentrated penalized log-likelihood function Lp^c(φ, α) = Lp(β̂(f, φ), f̂(φ), φ, α) over φ. The maximum, φ̂, is attained on a set C of φ values.
To ensure the existence of the MPLE we have to study the concavity of the penalized log-likelihood function Lp(β, f, φ, α) in β, f and φ, following the sequence (a)–(c) above. The necessary and sufficient conditions for concavity are checked case by case and concern the Hessian matrices associated with the parameters β, f and φ.
(a′) In step (a), the concavity (in β) of Lp(β, f, φ, α) is guaranteed if and only if the matrix L̈p^ββ(θ) = −(1/φ)XᵀD(a)X ≤ 0 (negative semidefinite) or, equivalently, if and only if −L̈p^ββ(θ) ≥ 0 (positive semidefinite). One has −L̈p^ββ(θ) ≥ 0 if the matrix D(a) ≥ 0, that is, if ai ≥ 0, ∀i = 1, . . . , n.
(b′) Then, in step (b), one has concavity (in f) of Lp^c(f, φ, α) if and only if the matrix L̈p^ff(θ) = −{(1/φ)NᵀD(a)N + αK} ≤ 0 or, equivalently, if and only if −L̈p^ff(θ) ≥ 0. Consequently, −L̈p^ff(θ) ≥ 0 if (1/φ)NᵀD(a)N ≥ 0 and αK ≥ 0. Since α and φ are positive scalars and K ≥ 0, we have αK ≥ 0; on the other hand, (1/φ)NᵀD(a)N ≥ 0 if D(a) ≥ 0, that is, if ai ≥ 0, ∀i = 1, . . . , n.
(c′) Finally, in step (c), the concavity (in φ) of Lp^c(φ, α) is guaranteed if and only if L̈p^φφ(θ) < 0.

Therefore, if θ̂ is the solution of the three-step procedure (a)–(c), then θ̂ is the MPLE of θ if ai ≥ 0, ∀i = 1, . . . , n, which is equivalent to

√ν ≥ |yi − µi|/√φ, ∀i = 1, . . . , n.

Thus, an interpretation we can draw from the expression above is that concavity of Lp(θ, α) is more difficult to attain for small than for large values of the degrees of freedom. Nevertheless, we may attain concavity of Lp(θ, α) even when some ai < 0. These results are in agreement with the ones presented by Pratt (1981) and Cysneiros and Paula (2005) in the parametric case.
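The condition above is easy to check numerically. Direct algebra on the definitions of ζi and ζi′ in Section 3.2 gives ai = (ν + 1)(ν − δi)/(ν + δi)², so ai ≥ 0 exactly when δi ≤ ν; the residual values below are arbitrary.

```python
import numpy as np

def a_weights(resid, phi, nu):
    """a_i = -2*(zeta_i + 2*zeta_i'*delta_i) from Section 3.2, which reduces
    to (nu + 1)*(nu - delta_i)/(nu + delta_i)**2."""
    delta = np.asarray(resid, dtype=float) ** 2 / phi
    zeta = -0.5 * (nu + 1.0) / (nu + delta)
    zeta_prime = 0.5 * (nu + 1.0) / (nu + delta) ** 2
    return -2.0 * (zeta + 2.0 * zeta_prime * delta)

# With nu = 4 and phi = 1, a_i >= 0 requires |y_i - mu_i| <= sqrt(nu) = 2.
resid = np.array([0.5, 1.0, 3.0])
a = a_weights(resid, phi=1.0, nu=4.0)
```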

3.4. Finding the solution in practice: iterative process

The three-step procedure (a)–(c) is equivalent to solving the score equations

Up^β(θ) = 0, Up^f(θ) = 0 and Up^φ(θ) = 0,

which from (12) lead to the following equations:

XᵀD(v)Xβ = XᵀD(v)(y − Nf),
(NᵀD(v)N + αφK)f = NᵀD(v)(y − Xβ) and
(y − µ)ᵀD(v)(y − µ) = nφ,

and consequently to the following backfitting algorithm with weight matrix D(v):

β^(r+1) = (XᵀD(v^(r))X)⁻¹XᵀD(v^(r))(y − Nf^(r)),  (13)
f^(r+1) = (NᵀD(v^(r))N + αφ^(r)K)⁻¹NᵀD(v^(r))(y − Xβ^(r+1)) and  (14)
φ^(r+1) = (1/n)(y − µ^(r))ᵀD(v^(r))(y − µ^(r)),  (15)

for r = 0, 1, . . .. We should start the iterative process (13)–(15) with initial values β^(0), f^(0) and φ^(0), for example the estimates from the normal model. If the conditions of Theorem 4.1 of Green and Silverman (1994, pp. 66–67) are satisfied, one has a unique solution and it corresponds to the limit of the iterative process (13)–(15). These conditions mean that the weight matrix D(v) should be positive definite, which is verified since vi > 0 ∀i = 1, . . . , n, and that the matrix [X, NT] is of full column rank, where T = [1, t0], t0 = (t01, . . . , t0q)ᵀ and 1 is a (q × 1) vector of 1's.
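The iterations (13)–(15) can be sketched on simulated data as below. The data-generating choices are illustrative only, and for brevity a second-difference matrix is used as a stand-in for the spline penalty K (any nonnegative definite K enters the equations the same way).

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative simulated data: one parametric covariate plus a smooth term.
n = 120
X = rng.normal(size=(n, 1))
t = rng.choice(np.linspace(0.0, 1.0, 30), size=n)
t0 = np.unique(t)
q = t0.size
N = (t[:, None] == t0[None, :]).astype(float)
y = X @ np.array([1.5]) + N @ np.sin(2.0 * np.pi * t0) \
    + 0.3 * rng.standard_t(4, size=n)

# Stand-in penalty: squared second differences (nonnegative definite).
D2 = np.diff(np.eye(q), n=2, axis=0)
K = D2.T @ D2

nu, alpha = 4.0, 1.0
beta, f, phi = np.zeros(1), np.zeros(q), np.var(y)   # crude starting values

for _ in range(200):
    mu = X @ beta + N @ f
    v = (nu + 1.0) / (nu + (y - mu) ** 2 / phi)      # weights v_i^{(r)}
    Dv = np.diag(v)
    beta = np.linalg.solve(X.T @ Dv @ X, X.T @ Dv @ (y - N @ f))     # (13)
    f = np.linalg.solve(N.T @ Dv @ N + alpha * phi * K,
                        N.T @ Dv @ (y - X @ beta))                   # (14)
    phi = (y - mu) @ (v * (y - mu)) / n                              # (15)
```

In practice the starting values would be the estimates from the normal fit, as suggested above.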

3.5. Robustness of the MPLEs

Under Student-t partially linear models the quantities vi can be interpreted as weights, since vi > 0. Moreover, the current weight vi^(r) in (13)–(15) is inversely proportional to the distance between the observed value yi and its current predicted value µi^(r) = xiᵀβ^(r) + niᵀf^(r), so that outlying observations tend to receive small weights in the estimation process. Thus, we may expect the MPLEs for the Student-t model to be less sensitive to outlying observations than the MPLEs for the normal model, for which vi = 1.
In the parametric context Lucas (1997) developed an interesting study on the robust aspects of the Student-t M-estimator
in the univariate case using influence functions. He showed that the protection against outliers is preserved only if the
degrees of freedom parameter is kept fixed. Otherwise, if the degrees of freedom are also estimated by maximum likelihood,
the influence functions for φ and ν and the change-of-variance function of the location parameter are unbounded.
In this work we will keep the degrees of freedom fixed for the Student-t model and use a model selection procedure based on the Schwarz information criterion (SIC) to choose the most appropriate value of ν.

3.6. Asymptotic results

We consider in this section the problem of deriving the variance–covariance matrix of the MPLE θ̂. According to Segal et al. (1994), the variance estimates for the MPLEs developed by Wahba (1983) and Silverman (1985), under the Bayesian context, correspond to the inverse of the observed information matrix obtained by treating the penalized likelihood as a usual likelihood. Therefore, if we obtain the MPLE of θ through the Fisher scoring algorithm, it is reasonable to derive the variance–covariance matrix by using the inverse of the penalized Fisher information matrix. Thus, the asymptotic variance–covariance matrix of θ̂ is given by

Var(θ̂) ≈ Ip⁻¹(θ),
where the penalized Fisher information matrix Ip(θ) takes the form

         [ XᵀWX    XᵀWN         0                 ]
Ip(θ) =  [ NᵀWX    NᵀWN + αK    0                 ]   (16)
         [ 0       0            n(3κν − 1)/(4φ²)  ],

with W = diag{κν/φ, . . . , κν/φ} and κν = (ν + 1)/(ν + 3). Assuming that all the necessary inverses exist, some algebraic manipulation of Eq. (16) shows that the inverse of Ip(θ) assumes the following block form:

           [ (XᵀWxX)⁻¹    −E                 0                  ]
Ip⁻¹(θ) =  [ −Eᵀ          (NᵀWfN + αK)⁻¹    0                  ]   (17)
           [ 0            0                 4φ²/{n(3κν − 1)}   ],

where

E = (XᵀWxX)⁻¹(XᵀWN)(NᵀWN + αK)⁻¹,
Wx = W − WN(NᵀWN + αK)⁻¹NᵀW and
Wf = W − WX(XᵀWX)⁻¹XᵀW.
In particular, if we are interested in drawing inferences about β, f and φ, the variance–covariance matrices can be estimated by using the corresponding diagonal blocks of (17), that is,

Var(β̂) ≈ (XᵀWxX)⁻¹,
Var(f̂) ≈ (NᵀWfN + αK)⁻¹ and
Var(φ̂) ≈ 4φ²/{n(3κν − 1)}.
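The block-inverse formulas above can be checked numerically. In the sketch below X, N and K are random stand-ins (N need not be a genuine incidence matrix for the algebra to hold), and the φ block is handled separately since φ is orthogonal to (β, f) in (16).

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 40, 3, 6
X = rng.normal(size=(n, p))
N = rng.normal(size=(n, q))
A = rng.normal(size=(q, q))
K = A @ A.T                                   # a nonnegative definite K
nu, phi, alpha = 5.0, 2.0, 0.7
kappa = (nu + 1.0) / (nu + 3.0)
W = (kappa / phi) * np.eye(n)

# (beta, f) block of the penalized Fisher information (16).
Ip = np.block([[X.T @ W @ X, X.T @ W @ N],
               [N.T @ W @ X, N.T @ W @ N + alpha * K]])

# Closed-form diagonal blocks of (17) via Wx and Wf.
C = np.linalg.inv(N.T @ W @ N + alpha * K)
Wx = W - W @ N @ C @ N.T @ W
Wf = W - W @ X @ np.linalg.inv(X.T @ W @ X) @ X.T @ W

Ip_inv = np.linalg.inv(Ip)
var_beta = Ip_inv[:p, :p]          # should equal (X' Wx X)^{-1}
var_f = Ip_inv[p:, p:]             # should equal (N' Wf N + alpha K)^{-1}
```

The agreement is a consequence of the standard Schur-complement identities for a symmetric block matrix.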

4. Local influence

In this section we derive the ∆p matrix for different perturbation schemes. We will consider case-weight, scale parameter, explanatory variable and response variable perturbations. The case-weight perturbation is considered to detect observations with a large contribution to the likelihood function that may exercise great influence on the MPLEs. The scale parameter perturbation is used to evaluate the sensitivity of the MPLEs to small modifications of φ, whereas perturbations of explanatory variable values are used to detect observations whose explanatory variable values may exercise great influence on the MPLEs. Finally, perturbations of response variable values may have connections with generalized leverage, as pointed out, for instance, by Wei et al. (1998).

4.1. The method

Let ω = (ω1, . . . , ωn)ᵀ be an n-dimensional vector of perturbations restricted to some open subset Ω ⊆ Rⁿ, and denote the logarithm of the perturbed penalized likelihood by Lp(θ, α|ω). It is assumed that there exists ω0 ∈ Ω, a vector of no perturbation, such that Lp(θ, α|ω0) = Lp(θ, α). To assess the influence of minor perturbations on the MPLE θ̂, we can consider the likelihood displacement

LD(ω) = 2{Lp(θ̂, α) − Lp(θ̂ω, α)} ≥ 0,

where θ̂ω is the MPLE under Lp(θ, α|ω). The measure LD(ω) is useful for assessing the distance between θ̂ and θ̂ω. Cook (1986) suggests studying the local behavior of LD(ω) around ω0. The procedure consists in selecting a unit direction ℓ ∈ Ω (‖ℓ‖ = 1) and then considering the plot of LD(ω0 + aℓ) against a, where a ∈ R. This plot is called the lifted line. Each lifted line can be characterized by the normal curvature Cℓ(θ) around a = 0. The suggestion is to consider the direction ℓ = ℓmax corresponding to the largest curvature Cℓmax(θ). The index plot of ℓmax may reveal those observations that, under small perturbations, exercise a notable influence on LD(ω). According to Cook (1986), the normal curvature in the unit direction ℓ is given by

Cℓ(θ) = −2{ℓᵀ∆pᵀL̈p⁻¹∆pℓ},  (18)

where

L̈p = ∂²Lp(θ, α)/∂θ∂θᵀ evaluated at θ = θ̂, and ∆p = ∂²Lp(θ, α|ω)/∂θ∂ωᵀ evaluated at θ = θ̂ and ω = ω0.

Note that −L̈p is the penalized observed information matrix evaluated at θ̂ (see Section 3.2) and ∆p is the penalized perturbation matrix evaluated at θ̂ and ω0. Cℓ(θ) denotes the local influence on the estimate θ̂ after perturbing the model or data. Escobar and Meeker (1992) proposed studying the normal curvature in the direction ℓ = ei, where ei is an n-dimensional vector with one at the ith position and zeros at the remaining positions. In this case the normal curvature, called the total local influence of the ith individual, takes the form Cei(θ) = 2|cii| (i = 1, . . . , n), where cii is the ith principal diagonal element of the matrix C = ∆pᵀL̈p⁻¹∆p.
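Given the perturbation matrix ∆p and the Hessian L̈p, the quantities Cei(θ), Cℓmax(θ) and ℓmax follow from an eigendecomposition. The sketch below uses random matrices of the right shapes, with a negative definite stand-in for L̈p; the function name is ours.

```python
import numpy as np

def local_influence(Delta, Lpp):
    """Normal-curvature diagnostics from (18): total local influences
    C_{e_i} = 2|c_ii|, largest curvature C_max and its direction l_max."""
    Cmat = -2.0 * Delta.T @ np.linalg.solve(Lpp, Delta)  # -2 Delta' Lpp^{-1} Delta
    Ci = np.abs(np.diag(Cmat))                           # C_{e_i} = 2|c_ii|
    eigval, eigvec = np.linalg.eigh((Cmat + Cmat.T) / 2.0)
    return Ci, eigval[-1], eigvec[:, -1]

rng = np.random.default_rng(3)
pstar, n = 4, 10
B = rng.normal(size=(pstar, pstar))
Lpp = -(B @ B.T + np.eye(pstar))       # negative definite, as at a maximum
Delta = rng.normal(size=(pstar, n))
Ci, Cmax, lmax = local_influence(Delta, Lpp)
```

An index plot of the entries of lmax (or of the Ci) then highlights the locally influential observations.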

4.2. Local influence on subvectors

Consider the partition θ = (θ1ᵀ, θ2ᵀ)ᵀ, where θ1 and θ2 are subvectors of dimensions s and (p* − s), respectively. From Cook (1986), the normal curvature for θ1 in the unit direction ℓ is given by

Cℓ(θ1) = −2{ℓᵀ∆pᵀ(L̈p⁻¹ − G22)∆pℓ},

where

G22 = [ 0   0
        0   (L̈p22)⁻¹ ],

with L̈p22 obtained from the partition of L̈p according to the partition of θ. In this case, the index plot of the eigenvector ℓ = ℓmax, which corresponds to the largest absolute eigenvalue of the matrix G = ∆pᵀ(L̈p⁻¹ − G22)∆p, may indicate the points with a large influence on θ̂1. Alternatively, we can inspect the normal curvature Cℓ(θ1) in the direction ℓ = ei.

4.3. Conformal normal curvature

In order to have a curvature invariant under uniform changes of scale, Poon and Poon (1999) proposed the conformal normal curvature, defined as

Bℓ(θ) = Cℓ(θ) / {2[tr{(∆pᵀL̈p⁻¹∆p)²}]^(1/2)} = −ℓᵀ∆pᵀL̈p⁻¹∆pℓ / [tr{(∆pᵀL̈p⁻¹∆p)²}]^(1/2).  (19)

This curvature has the property that 0 ≤ Bℓ(θ) ≤ 1 for any unit direction ℓ. A suggestion is to consider the direction ℓ = ℓmax corresponding to the largest curvature Bℓmax(θ) or, alternatively, to evaluate the normal curvature in the direction ℓ = ei and examine the index plot of Bei(θ).

4.4. Some cutoff criteria

In local influence analysis there is no definitive rule for deciding whether an observation is influential or not. However, several authors have proposed criteria that may be useful in identifying such observations. For example, Verbeke and Molenberghs (2000) and Poon and Poon (1999) proposed the cutoffs Ci > 2C̄ and Bi > 2B̄, respectively, where C̄ and B̄ are the means of

C = {Ci = Cei(θ) : i = 1, . . . , n} and B = {Bi = Bei(θ) : i = 1, . . . , n}.

Zhu and Lee (2001) proposed the cutoff Bi > B̄ + 2SE(B) to take into account the variability of B, where SE(B) denotes the standard error of B. Recently, Lee and Xu (2004) suggested using Bi > B̄ + c*SE(B), with c* selected appropriately. Based on these works, and since our main objective is to compare the influence of certain observations under the normal and Student-t models, we will use for comparative purposes the cutoffs Bi > 2B̄ and Bi > B̄ + c*SE(B), for c* = 2, 3.
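The conformal curvatures Bei(θ) of (19) and these cutoffs can be computed together. In the sketch below SE(B) is taken as the sample standard deviation of the Bi, which is one common reading of the criterion, and the input matrices are random stand-ins of the right shapes.

```python
import numpy as np

def conformal_influence(Delta, Lpp):
    """B_{e_i} from (19) plus the cutoffs 2*mean(B) and mean(B) + c*SE(B)."""
    M = -(Delta.T @ np.linalg.solve(Lpp, Delta))   # -Delta' Lpp^{-1} Delta, PSD
    Bi = np.diag(M) / np.sqrt(np.trace(M @ M))
    Bbar, se = Bi.mean(), Bi.std(ddof=1)           # SE(B) as sample std (assumption)
    cutoffs = {"2*mean": 2.0 * Bbar,
               "mean+2se": Bbar + 2.0 * se,
               "mean+3se": Bbar + 3.0 * se}
    return Bi, cutoffs

rng = np.random.default_rng(4)
pstar, n = 3, 12
B0 = rng.normal(size=(pstar, pstar))
Lpp = -(B0 @ B0.T + np.eye(pstar))     # negative definite stand-in for the Hessian
Delta = rng.normal(size=(pstar, n))
Bi, cutoffs = conformal_influence(Delta, Lpp)
```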

4.5. Perturbation schemes

The (p* × n) matrix ∆p for each perturbation scheme assumes the form

∆p = ∂²Lp(θ, α|ω)/∂θ∂ωᵀ evaluated at θ = θ̂ and ω = ω0,  (20)

where θ̂ is the MPLE of θ and ω0 is the vector of no perturbation. In the sequel we present the expression of ∆p for the case-weight, scale parameter, explanatory variable and response variable perturbation schemes.

4.5.1. Case-weight perturbation

Let us consider attributing weights to the observations in the penalized log-likelihood function as

Lp(θ, α|ω) = Σ_{i=1}^n ωiLi(θ) − (α/2)fᵀKf,  (21)

where ω = (ω1, . . . , ωn)ᵀ is the vector of weights, with 0 ≤ ωi ≤ 1 (i = 1, . . . , n). In this case, the vector of no perturbation is given by ω0 = (1, . . . , 1)ᵀ. Differentiating Lp(θ, α|ω) with respect to the elements of θ and ωi, we obtain after some algebraic manipulation

∂²Lpi(θ, α|ω)/∂β∂ωi, at θ = θ̂ and ω = ω0: −(2/φ̂)ζ̂iϵ̂ixi,
∂²Lpi(θ, α|ω)/∂f∂ωi, at θ = θ̂ and ω = ω0: −(2/φ̂)ζ̂iϵ̂ini and
∂²Lpi(θ, α|ω)/∂φ∂ωi, at θ = θ̂ and ω = ω0: −1/(2φ̂) − (1/φ̂)ζ̂iδ̂i,

for i = 1, . . . , n.
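Since the perturbed log-likelihood (21) is linear in ω, the ith column of ∆p is just the per-observation score contribution evaluated at θ̂, and the three expressions above stack as below. The data and parameter values are arbitrary illustrations.

```python
import numpy as np

def delta_caseweight(y, X, N, beta, f, phi, nu):
    """Case-weight perturbation matrix: column i stacks -(2/phi)*zeta_i*eps_i*x_i,
    -(2/phi)*zeta_i*eps_i*n_i and -1/(2*phi) - zeta_i*delta_i/phi."""
    eps = y - (X @ beta + N @ f)
    delta = eps ** 2 / phi
    zeta = -0.5 * (nu + 1.0) / (nu + delta)
    row_beta = -(2.0 / phi) * X.T * (zeta * eps)
    row_f = -(2.0 / phi) * N.T * (zeta * eps)
    row_phi = -1.0 / (2.0 * phi) - zeta * delta / phi
    return np.vstack([row_beta, row_f, row_phi[None, :]])

rng = np.random.default_rng(5)
n, p, q = 8, 2, 3
X = rng.normal(size=(n, p))
N = (rng.integers(0, q, size=n)[:, None] == np.arange(q)).astype(float)
beta, f = np.array([0.5, -1.0]), np.array([0.2, 0.0, -0.3])
phi, nu = 1.3, 4.0
y = rng.normal(size=n)
Dp = delta_caseweight(y, X, N, beta, f, phi, nu)   # (p + q + 1) x n
```

The beta-block can be cross-checked against a finite difference of the per-observation log-likelihood, since the ith column equals the score contribution of case i.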

4.5.2. Scale perturbation

Model (3) is assumed to be homoscedastic, that is, the scale parameter of the random errors is assumed constant across observations. Under the scale parameter perturbation scheme it is assumed that yi ∼ t(µi, ωi⁻¹φ, ν), where ω = (ω1, . . . , ωn)ᵀ is the vector of perturbations, with ωi > 0, for i = 1, . . . , n. In this case, the vector of no perturbation is given by ω0 = (1, . . . , 1)ᵀ, such that Lp(θ, α|ω0) = Lp(θ, α). Taking differentials of Lp(θ, α|ω) with respect to the elements of θ and ωi, we obtain after some algebraic manipulation

∂²Lpi(θ, α|ω)/∂β∂ωi, at θ = θ̂ and ω = ω0: −(2/φ̂){ζ̂i′δ̂i + ζ̂i}ϵ̂ixi,
∂²Lpi(θ, α|ω)/∂f∂ωi, at θ = θ̂ and ω = ω0: −(2/φ̂){ζ̂i′δ̂i + ζ̂i}ϵ̂ini and
∂²Lpi(θ, α|ω)/∂φ∂ωi, at θ = θ̂ and ω = ω0: −1/(2φ̂) − (1/φ̂){ζ̂i′δ̂i + ζ̂i}δ̂i,

for i = 1, . . . , n.

4.5.3. Explanatory variable perturbation

Here the dth explanatory variable, assumed continuous, is perturbed by considering the additive perturbation scheme, namely xidω = xid + ωi (i = 1, . . . , n), where ω = (ω1, . . . , ωn)T is the vector of perturbations such that ωi ∈ R. In this case, the vector of no perturbation is given by ω0 = (0, . . . , 0)T and the perturbed penalized log-likelihood function is constructed from (9) with xid replaced by xidω, that is,

Lp(θ, α|ω) = L(θ|ω) − (α/2) fT Kf,    (22)

where L(·) is given by (5) with δiω = φ−1(yi − µiω)² in the place of δi and µiω = xiT β + ωi βd + niT f. Differentiating Lp(θ, α|ω) with respect to the elements of θ and ωi, we obtain, after some algebraic manipulation, that

∂²Lpi(θ, α|ω)/∂β∂ωi |θ=θ̂,ω=ω0 = (4/φ̂) ζ̂i′ β̂d δ̂i xi + (2/φ̂) ζ̂i (β̂d xi − zd ϵ̂i),

∂²Lpi(θ, α|ω)/∂f∂ωi |θ=θ̂,ω=ω0 = (2β̂d/φ̂){2ζ̂i′ δ̂i + ζ̂i} ni  and

∂²Lpi(θ, α|ω)/∂φ∂ωi |θ=θ̂,ω=ω0 = (2β̂d/φ̂²){ζ̂i′ δ̂i + ζ̂i} ϵ̂i,

for i = 1, . . . , n. Here zd denotes a (p × 1) vector with 1 at the dth position and zero elsewhere, and β̂d denotes the dth element of β̂.

4.5.4. Response variable perturbation

To perturb the response variable values we consider yiω = yi + ωi (i = 1, . . . , n), where ω = (ω1, . . . , ωn)T is the vector of perturbations. Here, the vector of no perturbation is given by ω0 = (0, . . . , 0)T and the perturbed penalized log-likelihood function is constructed from (9) with yi replaced by yiω, that is,

Lp(θ, α|ω) = L(θ|ω) − (α/2) fT Kf,    (23)

where L(·) is given by (5) with δiω = φ−1(yiω − µi)² in the place of δi. Differentiating Lp(θ, α|ω) with respect to the elements of θ and ωi, we obtain, after some algebraic manipulation, that

∂²Lpi(θ, α|ω)/∂β∂ωi |θ=θ̂,ω=ω0 = −(1/φ̂){4ζ̂i′ δ̂i + 2ζ̂i} xiT,

∂²Lpi(θ, α|ω)/∂f∂ωi |θ=θ̂,ω=ω0 = −(1/φ̂){4ζ̂i′ δ̂i + 2ζ̂i} niT  and

∂²Lpi(θ, α|ω)/∂φ∂ωi |θ=θ̂,ω=ω0 = −(ϵ̂i/φ̂²){2ζ̂i′ δ̂i + 2ζ̂i},

for i = 1, . . . , n. Connections between the normal curvature Ci and the generalized leverage have been studied, for instance, by Wei et al. (1998) in generalized linear models and more recently by Galea et al. (2005a,b) and Cysneiros et al. (2007) in symmetrical regression models. However, the study of this connection under Student-t partially linear models is beyond the scope of this work.

Fig. 1. Scatter plots: return versus IPSA (a) and return versus time (b).

Table 1
Values of minus twice the penalized log-likelihood, −2Lp(θ̂, α), and of the Schwarz information criterion, SIC(θ̂), under the Student-t model for different degrees of freedom, fitted to the Chilean Stock Market data.

ν        −2Lp(θ̂, α)   SIC(θ̂)
1        617.34        1488.4
2        615.26        1486.3
3        601.82        1472.9
4        600.34        1471.4
5        600.98        1472.1
ν → ∞    Normal

Table 2
Maximum penalized likelihood estimates (standard errors) under normal and Student-t (ν = 4) models fitted to the Chilean Stock Market data.

            Normal                       Student-t
Parameter   Estimate (SE)   Lp(θ̂, α)    Estimate (SE)   Lp(θ̂, α)
β           7.924 (1.961)   −315.32      7.752 (1.876)   −300.17
φ           2.433 (0.045)                1.193 (0.121)

5. Application

In this section we discuss the analysis of a data set from the Chilean Stock Market corresponding to the monthly returns of the company Cuprum, which manages pension funds. The data cover the period from January 1990 to December 2003. The monthly return of the Selective Share Price Index (IPSA) and time (in months) were used as explanatory variables. Fig. 1 displays the relationship between the return of the company Cuprum and the IPSA return, and the relationship between the return of the company and time.
We notice from Fig. 1(a) strong evidence of a linear tendency between the return and IPSA. In addition, Fig. 1(b) suggests that the return depends on time in a nonparametric fashion. Thus, we can assume the following semiparametric model:

yi = xi β + f(ti) + ϵi,  i = 1, . . . , 168,    (24)
where yi denotes the value of the ith return and xi represents the value of the ith IPSA return, both at time ti, whereas the ϵi are independent errors. Using the notation given in Section 2.1, one has for model (24) that X is a (168 × 1) vector of values x1, . . . , x168, N is a (168 × 168) incidence matrix and f = (f(t1), . . . , f(t168))T. For comparative purposes, in our analysis we assume that the random errors follow normal and Student-t distributions, and we use the penalized likelihood method to fit the models (in the penalized log-likelihood function K is a (168 × 168) smoothing matrix). In both cases, the generalized cross-validation criterion, used to choose the smoothing parameter, provided values close to α = 100. The degrees of freedom ν for the Student-t model was selected by the Schwarz information criterion, and we found ν = 4 (see Table 1).
The MPLEs of θ = (β, fT, φ)T and the corresponding penalized log-likelihood values for the PLM under normal and Student-t errors are presented in Table 2.
Comparing these values we notice a similarity between the regression coefficient estimates under the normal and Student-t models, but the standard error appears to be smaller and the penalized log-likelihood larger under the Student-t model. Fig. 2 displays

Fig. 2. Normal probability plots of the transformed distance: normal model (a) and Student-t model (b).

Fig. 3. Scatter plots of return versus estimated return: normal model (a) and Student-t model (b).

the transformed distance plots, see Lange et al. (1989), for the normal and Student-t models with ν = 4. The transformed
distance under the Student-t model seems to be closer to normality than under the normal model. Fig. 3 displays the graphics
of the return versus the estimated return from the two models, indicating suitable fits for both models.
In order to identify outlying observations under the fitted models, index plots of the distance δ̂i = φ̂−1(yi − µ̂i)² are displayed in Fig. 4(a) and (b). We can see from these figures that observations 22, 23, 52 and 105 appear as possible outliers. Fig. 4(c) displays the estimated weights under the Student-t model, and we notice that the estimated weights for observations 22, 23, 52 and 105 take the smallest values, confirming the robust aspects of the MPLEs against outlying observations under heavier-tailed error models.
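This behavior is easy to reproduce numerically: under the Student-t model the weight attached to the ith observation in the estimating equations is proportional to the usual Student-t weight function v(δ̂i) = (ν + 1)/(ν + δ̂i) (see Lange et al., 1989), which decays as the distance δ̂i grows. A small sketch with hypothetical distances:

```python
import numpy as np

nu = 4.0
delta_hat = np.array([0.1, 0.5, 2.0, 9.0, 25.0])  # hypothetical distances δ̂_i
weights = (nu + 1.0) / (nu + delta_hat)           # v(δ̂_i) = (ν+1)/(ν+δ̂_i)
print(np.round(weights, 3))                       # largest δ̂_i gets the smallest weight
```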

5.1. Local influence analysis

In order to identify influential observations under the models fitted to the Chilean Stock Market data, we present some index plots of Bei(λ) (total local influence), for λ = β, f, φ, under three of the perturbation schemes discussed in the previous sections. The lines drawn on the graphs correspond to the cutoffs 2B̄ (bottom line), B̄ + 2SE(B) (middle line) and B̄ + 3SE(B) (top line), with B = {Bi = Bei(λ) : i = 1, . . . , n}. In this application we use the criterion Bi > B̄ + 3SE(B) to discriminate whether or not an observation is influential. The index plots of Bei(λ) under scale perturbation are not presented here due to their similarity with the results obtained under case-weight perturbation.
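The cutoff lines described above can be sketched as follows, reading SE(B) as the sample standard deviation of the Bi (our assumption) and flagging observations with Bi > B̄ + 3SE(B):

```python
import numpy as np

def influence_cutoffs(B):
    # cutoff lines used in the index plots: 2·B̄, B̄ + 2SE(B) and B̄ + 3SE(B)
    B = np.asarray(B, dtype=float)
    bbar, se = B.mean(), B.std(ddof=1)
    return 2.0 * bbar, bbar + 2.0 * se, bbar + 3.0 * se

# hypothetical B_i values with a single spike at the last index
B = np.r_[np.full(19, 0.1), 2.0]
bottom, middle, top = influence_cutoffs(B)
flagged = [i for i, b in enumerate(B) if b > top]
print(flagged)  # [19]
```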

5.1.1. Case-weight perturbation


Figs. 5–7 present the index plots of Bi = Bei (λ), for λ = β, f, φ , respectively, for the case-weight perturbation scheme
under the two fitted models. The index plots of Bei (θ) are similar to the ones given in Figs. 5–7, so they were omitted here.
Considering Fig. 5, we notice that observations 22, 23, 49 and 105 are pointed out under the normal model and observations

Fig. 4. Index plots of the distance δ̂i under normal (a) and Student-t (b) models, and of the estimated weights versus δ̂i under the Student-t model (c).

Fig. 5. Index plots of Bi for assessing local influence on β̂ under case-weight perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

51, 98 and 107 have the greatest values under the Student-t model. Based on Fig. 6, we notice that observations 22, 23 and 105 are more influential under the normal model, whereas observation 2 appears with a small influence under the Student-t model. Looking at Fig. 7 we observe the large influence of observation 105 under the normal model, but no influential observation is pointed out under the Student-t model.

Fig. 6. Index plots of Bi for assessing local influence on f̂ under case-weight perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

Fig. 7. Index plots of Bi for assessing local influence on φ̂ under case-weight perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

5.1.2. IPSA perturbation


The index plots of Bi = Bei(λ), for λ = β, f, φ, under the IPSA perturbation scheme are given in Figs. 8–10. Considering Fig. 8, we note that observations 22, 52 and 105 are more influential under the normal model, whereas under the Student-t model observation 104 appears with a small influence. In Fig. 9 observations 1, 22, 105 and 168 have the largest influence under the normal model, whereas observations 1 and 168 appear with a large influence under the Student-t model. From Fig. 10 we notice that observations 23, 52 and 105 are more influential under the normal model; however, no observation is pointed out as influential under the Student-t model.

5.1.3. Response variable perturbation


The index plots of Bi = Bei(λ), for λ = β, f, φ, under the response variable perturbation scheme are given in Figs. 11–13. Considering Fig. 11, we note that observation 104 is more influential under both the normal and Student-t models. In Fig. 12 observations 1 and 168 have the largest influence under both models. From Fig. 13 we notice that observations 23, 52 and 105 are more influential under the normal model; however, no observation is pointed out as influential under the Student-t model.
Based on these local influence graphics we can conclude that the MPLEs of the nonparametric component and the scale parameter appear to be less sensitive under the Student-t model under case-weight perturbation, whereas the sensitivity of β̂ appears to be similar under the two fitted models. Under IPSA and response perturbations the MPLE of the scale parameter from the Student-t model with 4 degrees of freedom appears to be less sensitive than the MPLE from the normal model. Observation 105, which is pointed out in various graphics for the normal model, corresponds to the smallest return. Note that the influence of this observation is not apparent under the Student-t model.

Fig. 8. Index plots of Bi for assessing local influence on β̂ under IPSA perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

Fig. 9. Index plots of Bi for assessing local influence on f̂ under IPSA perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

Fig. 10. Index plots of Bi for assessing local influence on φ̂ under IPSA perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

Fig. 11. Index plots of Bi for assessing local influence on β̂ under response perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

Fig. 12. Index plots of Bi for assessing local influence on f̂ under response perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

Fig. 13. Index plots of Bi for assessing local influence on φ̂ under response perturbation under normal and Student-t models fitted to the Chilean Stock Market data.

Table 3
Relative changes (%) on maximum penalized likelihood estimates of β and φ under normal and Student-t (ν = 4) models fitted to the Chilean Stock Market data.

Dropped observation    Normal           Student-t
                       β̂      φ̂        β̂      φ̂
None                   –       –        –       –
22                     12      5        2       5
23                     10      7        2       6
49                     10      3        2       6
52                     4       15       5       9
105                    9       20       3       6
I1                     6       39       1       17
2                      9       1        3       13
51                     6       0        13      5
98                     4       0        8       3
107                    8       2        9       5
I2                     1       5        7       8
1                      0       1        3       2
104                    8       0        9       2
168                    0       3        1       6
I3                     8       2        8       1
I1 ∪ I2 ∪ I3           16      42       2       21

5.2. Confirmatory analysis

Table 3 presents the relative changes (in %) of the MPLEs of β and φ after removing from the data set the observations pointed out by the distance δ̂i and by the local influence graphics under the normal and Student-t models. Let I1 = {22, 23, 49, 52, 105} and I2 = {2, 51, 98, 107} be the sets of observations identified as influential under the normal and Student-t models, respectively. In addition, let I3 = {1, 104, 168} be the set of observations identified as influential under both models. The RC (in %) of each estimated parameter is defined by RCψ = |(ψ̂ − ψ̂(Ij))/ψ̂| × 100%, where ψ̂(Ij) denotes the MPLE of ψ, with ψ = β, φ, after the set Ij (j = 1, 2, 3) has been removed. Even though some RCs are large, inferential changes are not observed. It is interesting to notice from Table 3 the coherence with the diagnostic graphics. For instance, elimination of the observations with large δ̂i under the normal model leads to smaller changes in the MPLEs from the Student-t model, confirming the robust aspects of such estimates against outlying observations. On the other hand, elimination of observations detached in the Student-t diagnostic graphics causes larger changes in the parameter estimates from this model. Thus, the well-known robustness of the maximum likelihood estimates from Student-t models does not necessarily extend to all perturbation schemes, indicating the need for a diagnostic examination in each case.
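The relative change measure used in Table 3 can be sketched as follows; the post-deletion estimate below is hypothetical, for illustration only, while the full-data value 7.752 is the Student-t MPLE of β reported in Table 2:

```python
def relative_change(est_full, est_dropped):
    # RC_ψ = |(ψ̂ − ψ̂_(I)) / ψ̂| × 100%
    return abs((est_full - est_dropped) / est_full) * 100.0

# full-data Student-t MPLE of β from Table 2
beta_hat = 7.752
beta_dropped = 7.907   # hypothetical re-estimate after deleting a set I_j
rc = relative_change(beta_hat, beta_dropped)
print(round(rc, 1))    # 2.0
```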

6. Concluding remarks

In this paper we discuss parameter estimation and some statistical diagnostics for Student-t partially linear models, which can be considered as a generalization of the normal partially linear model studied by Zhu et al. (2003). Local influence approaches for the proposed model under case-weight, scale parameter, explanatory variable and response variable perturbations are developed. Closed-form expressions are obtained for the penalized observed and expected information matrices. A real data set previously analyzed under normal errors is reanalyzed under Student-t errors by assuming the smoothing parameter fixed and by applying the Schwarz information criterion to choose the degrees-of-freedom parameter. The empirical study provides evidence of the robustness of the MPLEs from the Student-t partially linear model with small degrees of freedom against outlying observations, as pointed out by Lange et al. (1989) in parametric regression and multivariate analysis. However, this robustness does not seem to extend to all perturbation schemes of the local influence approach, indicating the usefulness of the normal curvatures derived in this work for assessing the sensitivity of the MPLEs from Student-t models. Thus, we can recommend Student-t partially linear models as an option for fitting symmetric data sets with nonparametric components and indications of heavy tails.

Acknowledgements

The authors are grateful to the Associate Editor and the referees for their helpful comments. This work was supported by
CAPES, CNPq and FAPESP, Brazil.

References

Beckman, R.J., Nachtsheim, C.J., Cook, R.D., 1987. Diagnostics for mixed-model analysis of variance. Technometrics 29, 413–426.
Bianco, A., Boente, G., Martínez, E., 2006. Robust tests in semiparametric partly linear models. Scandinavian Journal of Statistics 33, 435–450.
Cook, R.D., 1986. Assessment of local influence (with discussion). Journal of the Royal Statistical Society B 48, 133–169.

Cysneiros, F.J.A., Paula, G.A., 2005. Restricted methods in symmetrical linear regression models. Computational Statistics and Data Analysis 49, 689–708.
Cysneiros, F.J.A., Paula, G.A., Galea, M., 2007. Heteroscedastic symmetrical linear models. Statistics and Probability Letters 77, 1084–1090.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B 39,
1–38.
Díaz-García, J.A., Galea, M., Leiva-Sánchez, V., 2003. Influence diagnostics for elliptical multivariate linear regression models. Communications in Statistics, Theory and Methods 32, 625–641.
Escobar, L.A., Meeker, W.Q., 1992. Assessing local influence in regression analysis with censored data. Biometrics 48, 507–528.
Eubank, R.L., 1984. The hat matrix for smoothing splines. Statistics and Probability Letters 2, 9–14.
Eubank, R.L., 1985. Diagnostics for smoothing splines. Journal of the Royal Statistical Society B 47, 332–341.
Eubank, R.L., Gunst, R.F., 1986. Diagnostics for penalized least-squares estimators. Statistics and Probability Letters 4, 265–272.
Eubank, R.L., Thomas, W., 1993. Detecting heteroscedasticity in nonparametric regression. Journal of the Royal Statistical Society B 55, 145–155.
Fung, W., Zhu, Z., Wei, B., He, X., 2002. Influence diagnostics and outlier tests for semiparametric mixed models. Journal of the Royal Statistical Society B 64, 565–579.
Galea, M., Paula, G.A., Bolfarine, H., 1997. Local influence in elliptical linear regression models. The Statistician 46, 71–79.
Galea, M., Bolfarine, H., Vilca, F., 2005a. Local influence in comparative calibration models under elliptical t-distributions. Biometrical Journal 47, 691–706.
Galea, M., Paula, G.A., Cysneiros, F.J.A., 2005b. On diagnostics in symmetrical nonlinear models. Statistics and Probability Letters 73, 459–467.
Gannaz, I., 2007. Robust estimation and wavelet thresholding in partially linear models. Statistics and Computing 17, 293–310.
Gourieroux, C., Monfort, A., 1995. Statistics and Econometric Models, Vols. 1 and 2. Cambridge University Press, Cambridge.
Green, P.J., 1987. Penalized likelihood for general semi-parametric regression models. International Statistical Review 55, 245–259.
Green, P.J., 1990. On use of the EM algorithm for penalized likelihood estimation. Journal of the Royal Statistical Society B 52, 443–452.
Green, P.J., Silverman, B.W., 1994. Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach. Chapman and Hall, Boca
Raton.
Hamilton, S.A., Truong, Y.K., 1997. Local linear estimation in partly linear models. Journal of Multivariate Analysis 60, 1–19.
He, X., Shi, P., 1996. Bivariate tensor-product B-splines in a partly linear model. Journal of Multivariate Analysis 58, 162–181.
He, X., Zhu, Z.Y., Fung, W.K., 2002. Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 89,
579–590.
Heckman, N., 1986. Spline smoothing in a partly linear model. Journal of the Royal Statistical Society B 48, 244–248.
Heckman, N., 1988. Minimax estimates in a semiparametric model. Journal of the American Statistical Association 83, 1090–1096.
Kim, C., 1996. Cook’s distance in spline smoothing. Statistics and Probability Letters 31, 139–144.
Kim, C., Park, B.U., Kim, W., 2002. Influence diagnostics in semiparametric regression models. Statistics and Probability Letters 60, 49–58.
Lange, K.L., Little, R.J.A., Taylor, J.M.G., 1989. Robust statistical modeling using the t distribution. Journal of the American Statistical Association 84, 881–896.
Lesaffre, E., Verbeke, G., 1998. Local influence in linear mixed models. Biometrics 54, 570–582.
Lee, S.Y., Xu, L., 2004. Influence analyses of nonlinear mixed-effects models. Computational Statistics and Data Analysis 45, 321–341.
Liang, H., 2006. Checking linearity of non-parametric component in partially linear models with an application in systemic inflammatory response
syndrome study. Statistical Methods in Medical Research 15, 273–284.
Liu, S., 2000. On local influence for elliptical linear models. Statistical Papers 41, 211–224.
Liu, S., 2002. Local influence in multivariate elliptical linear regression models. Linear Algebra and its Applications 354, 159–174.
Lucas, A., 1997. Robustness of the Student t based M-estimator. Communications in Statistics, Theory and Methods 26, 1165–1182.
Osorio, F., Paula, G.A., Galea, M., 2007. Assessment of local influence in elliptical linear models with longitudinal structure. Computational Statistics and
Data Analysis 51, 4354–4368.
Ouwens, M.J.N., Tan, F., Berger, M., 2001. Local influence to detect influential data structures for generalized linear mixed models. Biometrics 57, 1166–1172.
Paula, G.A., 1993. Assessing local influence in restricted regression models. Computational Statistics and Data Analysis 16, 63–79.
Paula, G.A., Cysneiros, F.J.A., Galea, M., 2003. Local influence and leverage in elliptical nonlinear regression models. In: Verbeke, G., Molenberghs, G., Aerts,
A., Fieuws, S. (Eds.), Proceedings of the 18th International Workshop on Statistical Modelling. Katholieke Universiteit Leuven, Leuven, pp. 361–365.
Pitrun, I., King, M.L., Zhang, X., 2006. Smoothing spline based tests for non-linearity in a partially linear model. Journal of Statistical Planning and Inference
136, 2446–2469.
Poon, W., Poon, Y.S., 1999. Conformal normal curvature and assessment of local influence. Journal of the Royal Statistical Society B 61, 51–61.
Pratt, J.W., 1981. Concavity of the log likelihood. Journal of the American Statistical Association 76, 103–106.
Rice, J., 1986. Convergence rates for partially splined models. Statistics and Probability Letters 4, 203–208.
Rigby, R., Stasinopoulos, D., 2005. Generalized additive models for location, scale and shape. Applied Statistics 54, 507–554.
Robinson, P., 1988. Root-n-consistent semiparametric regression. Econometrica 56, 931–954.
Segal, M.R., Bacchetti, P., Jewell, N.P., 1994. Variances for maximum penalized likelihood estimates obtained via the EM algorithm. Journal of the Royal
Statistical Society B 56, 345–352.
Silverman, B.W., 1985. Some aspects of the spline smoothing approach to non-parametric regression curve fitting. Journal of the Royal Statistical Society B
47, 1–52.
Speckman, P., 1988. Kernel smoothing in partial linear models. Journal of the Royal Statistical Society B 50, 413–436.
Thomas, W., 1991. Influence diagnostics for the cross-validated smoothing parameter in spline smoothing. Journal of the American Statistical Association
86, 693–698.
Verbeke, G., Molenberghs, G., 2000. Linear Mixed Models for Longitudinal Data. Springer, New York.
Wahba, G., 1983. Bayesian confidence intervals for the cross-validated smoothing spline. Journal of the Royal Statistical Society B 45, 133–150.
Wei, W.H., 2004. Derivatives diagnostics and robustness for smoothing splines. Computational Statistics and Data Analysis 46, 335–356.
Wei, B.C., Hu, Y.Q., Fung, W.K., 1998. Generalized leverage and its applications. Scandinavian Journal of Statistics 25, 25–37.
Zhang, D., Lin, X., Raz, J., Sowers, M., 1998. Semiparametric stochastic mixed models for longitudinal data. Journal of the American Statistical Association 93, 710–719.
Zhu, Z.Y., He, X., Fung, W.K., 2003. Local influence analysis for penalized Gaussian likelihood estimators in partially linear models. Scandinavian Journal of
Statistics 30, 767–780.
Zhu, H.T., Lee, S.Y., 2001. Local influence for incomplete data models. Journal of the Royal Statistical Society B 63, 111–126.