
Allin Cottrell


Let y denote the dependent variable, an $n \times 1$ vector, and let X denote the $n \times k$ matrix of regressors (independent variables). Write b for the k-vector of regression coefficients, and write e for the n-vector of residuals, such that $e_i = y_i - X_i b$.

The vector b is the ordinary least squares (OLS) solution if and only if it is chosen such that the sum of squared residuals,

$$\mathrm{SSR} = \sum_{i=1}^{n} e_i^2,$$

is at a minimum.

Attaining the minimum SSR can be approached as a calculus problem. In matrix notation, we can write the SSR as

$$\begin{aligned}
e'e &= (y - Xb)'(y - Xb) \\
    &= (y' - (Xb)')(y - Xb) \\
    &= y'y - y'Xb - (Xb)'y + (Xb)'Xb \\
    &= y'y - 2b'X'y + b'X'Xb
\end{aligned}$$

The manipulations above exploit the properties of the matrix transpose, namely $(A + B)' = A' + B'$ and $(AB)' = B'A'$. In addition we rely on the point that $y'Xb$ and $(Xb)'y$ (which can also be written as $b'X'y$) are identical by virtue of the fact that they are $1 \times 1$: they are transposes of each other, but transposing a $1 \times 1$ matrix gives back the same matrix.

Now take the derivative of the last expression above with respect to b and set it to zero. This gives

$$\begin{aligned}
-2X'y + 2X'Xb &= 0 \\
-X'y + X'Xb &= 0 \\
X'Xb &= X'y
\end{aligned}$$

(In general, the derivative of $b'Cb$ with respect to b is $(C + C')b$, but $X'X$ is a symmetric matrix and the derivative simplifies to that shown, $2X'Xb$.)

If the matrix $X'X$ is non-singular (i.e. it possesses an inverse) then we can multiply by $(X'X)^{-1}$ to get

$$b = (X'X)^{-1} X'y \qquad (1)$$

This is a classic equation in statistics: it gives the OLS coefficients as a function of the data matrices, X and y.
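Equation (1) is easy to check numerically. The sketch below, which assumes NumPy and uses a small made-up dataset, solves the normal equations $X'Xb = X'y$ directly rather than forming the inverse, which yields the same b by a numerically safer route.

```python
import numpy as np

# Hypothetical small dataset: n = 5 observations, k = 2 regressors
# (a constant and one explanatory variable). y = 1 + 2x exactly.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

# Equation (1): b = (X'X)^{-1} X'y, computed by solving X'X b = X'y.
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # -> [1. 2.]: intercept 1, slope 2
```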

Orthogonality of residuals and regressors. Here's another perspective. As we said above, the regression residuals are $e = y - Xb$. Suppose we require that the residuals be orthogonal to the regressors, X. We then have

$$\begin{aligned}
X'e &= 0 \\
X'(y - Xb) &= 0 \\
X'Xb &= X'y
\end{aligned}$$

which replicates the solution above. So another way of thinking about the least squares coefficient vector, b, is that it satisfies the condition that the residuals are orthogonal to the regressors. Why might that be a good thing? Well, intuitively, if this orthogonality condition were not satisfied, that would mean that the predictive power of X with regard to y is not exhausted. The residuals represent the unexplained variation in y; if they were not orthogonal to X, that would mean that more explanation could be squeezed out of X by a different choice of coefficients.
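The orthogonality condition can be verified for any dataset at all: the OLS residuals satisfy $X'e = 0$ up to floating-point error. A minimal sketch, assuming NumPy and using random made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 3
X = rng.normal(size=(n, k))   # arbitrary regressors
y = rng.normal(size=n)        # arbitrary dependent variable

b = np.linalg.solve(X.T @ X, X.T @ y)  # OLS coefficients
e = y - X @ b                          # residuals

# X'e should be zero up to machine precision for the OLS solution.
print(np.max(np.abs(X.T @ e)))
```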

Up to this point we have been concerned with the arithmetic of least squares. From an econometric viewpoint, however, we're interested in the properties of least squares as an estimator of some data generating process (DGP) that we presume to be operating in the economy. Let us write the DGP as

$$y = X\beta + u$$

where u is the error or disturbance term, satisfying $E(u) = 0$.

From this standpoint we consider the least squares coefficient vector b as an estimator, $\hat\beta$, of the unknown parameter vector $\beta$. By the same token our regression residuals, e, can be considered as estimates of the unknown errors u, or $e = \hat{u}$.

In this section we consider the expected value of $\hat\beta$, with a question of particular interest being: under what conditions does $E(\hat\beta)$ equal $\beta$? In the following section we examine the variance of $\hat\beta$.

Since, as we saw above, the least squares $\hat\beta$ equals $(X'X)^{-1} X'y$, we can write

$$E(\hat\beta) = E[(X'X)^{-1} X'y]$$

The next step is to substitute the presumed DGP for y, giving

$$\begin{aligned}
E(\hat\beta) &= E[\beta + (X'X)^{-1} X'u] \\
             &= \beta + E[(X'X)^{-1} X'u]
\end{aligned}$$

(Since $\beta$ itself is not a random variable, it can be moved through the expectation operator.) We then see that $E(\hat\beta) = \beta$, in other words that the least squares estimator is unbiased, on condition that

$$E[(X'X)^{-1} X'u] = 0$$

What does it take to satisfy this condition? If we take expectations conditional on the observed values of the regressors, X, the requirement is just that

$$E(u \mid X) = 0 \qquad (2)$$

i.e., the error term is independent of the regressors.

So: the least squares estimator is unbiased provided that (a) our favored specification of the DGP is correct and (b) the independent variables in X offer no predictive power over the errors u. We might say that the errors must be uncorrelated with the regressors, but the requirement in (2) is actually stronger than that.
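Unbiasedness can be illustrated with a small Monte Carlo experiment: hold X fixed, draw many error vectors u satisfying $E(u \mid X) = 0$, re-estimate b each time, and average the estimates. A sketch assuming NumPy and i.i.d. normal errors; the "true" $\beta$ below is of course made up:

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed regressors
beta = np.array([1.0, 2.0])                            # true parameter vector

# Repeatedly simulate the DGP y = X beta + u and re-estimate b.
reps = 5000
estimates = np.empty((reps, k))
for r in range(reps):
    u = rng.normal(size=n)          # i.i.d. errors with E(u|X) = 0
    y = X @ beta + u
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

# The average of the estimates should be close to the true beta.
print(estimates.mean(axis=0))
```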

The variance of a random variable X is the expectation of the squared deviation from its expected value:

$$\mathrm{Var}(X) = E[X - E(X)]^2 = E[(X - E(X))(X - E(X))]$$

The above holds good for a scalar random variable. For a random vector, such as the least squares $\hat\beta$, the concept is similar except that squaring has to be replaced with the matrix counterpart, multiplication of a vector into its own transpose.

This requires a little thought. The vector $\hat\beta$ is a column vector of length k, and so is the difference $\hat\beta - E(\hat\beta)$. Given such a k-vector, v, if we form $v'v$ (the inner product) we get a $1 \times 1$ or scalar result, the sum of squares of the elements of v. If we form $vv'$ (the outer product) we get a $k \times k$ matrix. Which do we want here? The latter: by the variance of a random vector we mean the (co-)variance matrix, which holds the variances of the elements of the vector on its principal diagonal, and the covariances in the off-diagonal positions.
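The inner-versus-outer-product distinction can be seen directly in code; a quick sketch, assuming NumPy:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])

inner = v @ v           # v'v: a scalar, the sum of squares of the elements
outer = np.outer(v, v)  # vv': a 3x3 matrix

print(inner)            # -> 14.0
print(outer.shape)      # -> (3, 3)
print(np.diag(outer))   # squares of the elements on the diagonal
```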

Therefore, from first principles,

$$\mathrm{Var}(\hat\beta) = E\left[\left(\hat\beta - E(\hat\beta)\right)\left(\hat\beta - E(\hat\beta)\right)'\right]$$

If we suppose that the conditions for the estimator to be unbiased are met, then $E(\hat\beta) = \beta$ and, substituting in,

$$\mathrm{Var}(\hat\beta) = E\left[\left((X'X)^{-1}X'y - \beta\right)\left((X'X)^{-1}X'y - \beta\right)'\right]$$

Then, once again, substitute the DGP, $X\beta + u$, for y, to get

$$\begin{aligned}
\mathrm{Var}(\hat\beta) &= E\left[\left((X'X)^{-1}X'(X\beta + u) - \beta\right)\left((X'X)^{-1}X'(X\beta + u) - \beta\right)'\right] \\
&= E\left[\left(\beta + (X'X)^{-1}X'u - \beta\right)\left(\beta + (X'X)^{-1}X'u - \beta\right)'\right] \\
&= E\left[\left((X'X)^{-1}X'u\right)\left((X'X)^{-1}X'u\right)'\right] \\
&= E\left[(X'X)^{-1}X'uu'X(X'X)^{-1}\right]
\end{aligned} \qquad (3)$$

(Note that the symmetric matrix $(X'X)^{-1}$ is equal to its transpose.) As with figuring the expected value of $\hat\beta$, we will take expectations conditional on X. So we can write

$$\mathrm{Var}(\hat\beta) = (X'X)^{-1}X'E(uu')X(X'X)^{-1} \qquad (4)$$

The awkward bit is the matrix in the middle, $E(uu')$. This is the covariance matrix of the error term, u; note that it is $n \times n$ and so potentially quite large. However, under certain conditions it simplifies greatly; specifically, if the error term is independently and identically distributed (i.i.d.). In that case $E(uu')$ boils down to an $n \times n$ version of the following,

$$\begin{pmatrix}
\sigma^2 & 0 & 0 \\
0 & \sigma^2 & 0 \\
0 & 0 & \sigma^2
\end{pmatrix},$$

that is, a matrix with a constant value on the diagonal and zeros elsewhere. The constant $\sigma^2$ reflects the "identical" part of the i.i.d. condition (the error variance is the same at each observation) and the zeros off the diagonal reflect the "independent" part (there's no correlation between the error at any given observation and the error at any other observation). Such an error term is sometimes referred to as white noise.

Under the assumption that u is i.i.d., then, we can write $E(uu') = \sigma^2 I$, where I is the $n \times n$ identity matrix. And the scalar $\sigma^2$ can be moved out, giving

$$\begin{aligned}
\mathrm{Var}(\hat\beta) &= \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}
\end{aligned} \qquad (5)$$

The somewhat nasty equation (4) therefore gives the variance of $\hat\beta$ in the general case, while equation (5) gives the nice and simple version that arises when the error term is assumed to be i.i.d.

How do we actually get a value for (that is, an estimate of) the variance of $\hat\beta$? The true error variance, $\sigma^2$, is unknown, but we can substitute an estimate, $s^2$, based on the regression residuals:

$$s^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - k}$$

and our estimated variance is then $s^2 (X'X)^{-1}$, provided, of course, that we are willing to assume that the error term is i.i.d.
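Putting the pieces together, here is a sketch of the full computation on simulated data: coefficients via equation (1), then $s^2$, then the estimated covariance matrix $s^2(X'X)^{-1}$ and the standard errors on its diagonal. NumPy is assumed and the DGP parameters are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(size=n)                  # i.i.d. errors, true sigma^2 = 1
y = X @ np.array([0.5, -1.0]) + u       # made-up true beta

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ (X.T @ y)                 # equation (1)
e = y - X @ b                           # residuals

s2 = (e @ e) / (n - k)                  # s^2 = SSR / (n - k)
cov_b = s2 * XtX_inv                    # estimated Var(b) under i.i.d. errors
se = np.sqrt(np.diag(cov_b))            # standard errors of the coefficients

print(b)   # close to [0.5, -1.0]
print(s2)  # close to 1
print(se)
```

With $n = 100$ and $k = 2$ the divisor $n - k = 98$ corrects for the degrees of freedom used up by the estimated coefficients.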
