This lecture
Introduction
Advantages of panel data
Issues involved in using panel data
Panel data models typology
What is a panel data set?
A data set that combines time series and cross-sections, i.e., that
follows the same cross-sectional units over time, is called a panel
data set.
Examples:
Data on the expenditure pattern of a sample of households followed
over time.
Data on macro aggregates of many countries over time.
Balance sheet data for a set of firms over time.
Agricultural profile of states over time.
Two famous panels (from the USA) are the National Longitudinal Survey
of Labour Market Experience (NLS) and the Michigan Panel Study of
Income Dynamics (PSID).
In the Indian context, an example is the Village Level Surveys
conducted by ICRISAT.
In-house, a panel can be constructed with firm-level data from
PROWESS, CMIE.
A panel may be unbalanced, i.e., the number of cross-sectional units
may vary across time periods.
Typically, a panel data set is wide, i.e., it covers a large number of
cross-sectional units, but short, i.e., it covers only a few points in
time.
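The distinction between balanced and unbalanced panels, and between wide and short ones, can be checked mechanically. The following is a minimal sketch in Python, assuming the panel is available as a list of (unit, period) pairs, one per observed row. The function name `panel_shape` and the toy firm data are hypothetical and not part of the lecture.

```python
from collections import defaultdict


def panel_shape(observations):
    """Summarise a panel given (unit, period) pairs, one per observed row."""
    units_by_period = defaultdict(set)
    for unit, period in observations:
        units_by_period[period].add(unit)
    # Every unit that appears in at least one period.
    all_units = set().union(*units_by_period.values())
    # Balanced: each period contains exactly the same set of units.
    balanced = all(units == all_units for units in units_by_period.values())
    return {
        "n_units": len(all_units),
        "n_periods": len(units_by_period),
        "balanced": balanced,
    }


# Hypothetical toy panel: firm C is not observed in 2002.
rows = [("A", 2001), ("B", 2001), ("C", 2001),
        ("A", 2002), ("B", 2002)]
summary = panel_shape(rows)
print(summary)  # {'n_units': 3, 'n_periods': 2, 'balanced': False}
```

A panel would be called wide and short in the sense above when `n_units` is much larger than `n_periods`.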
Advantages of panel data sets
Panel data permit analysis of a number of economic questions that
cannot be addressed using cross-section or time-series data sets alone.
Complicated behavioural models can be constructed and tested that would
not be possible with just cross-section or time-series data sets.
Example 1: Ben-Porath (1973) observes that at a certain point in
time, in a cohort of women, 50% may appear to be working. Does
this imply that (i) each woman in the cohort has a 50% chance of
working in any given period, or (ii) the same one-half of the
women work in every period? Cross-sectional data alone are
inadequate to answer this question, since both cases produce the
same 50% rate in any single period.
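The point of Example 1 can be illustrated with a small simulation. This is only a sketch under assumed settings: the sample size, the number of periods, the seed and all function names are hypothetical choices, not taken from Ben-Porath (1973). It generates work histories under the two cases and compares what a single cross-section shows with what following the same women over time shows.

```python
import random


def simulate(scenario, n_women=10_000, n_periods=5, seed=0):
    """Return one work history (list of 0/1 per period) for each woman."""
    rng = random.Random(seed)
    histories = []
    for _ in range(n_women):
        if scenario == "independent":
            # Case (i): each woman works with probability 0.5 in each period.
            history = [int(rng.random() < 0.5) for _ in range(n_periods)]
        else:
            # Case (ii): half the women always work, the other half never do.
            always = int(rng.random() < 0.5)
            history = [always] * n_periods
        histories.append(history)
    return histories


def cross_section_rate(histories, t=0):
    """Share of women working in period t (all a cross-section can reveal)."""
    return sum(h[t] for h in histories) / len(histories)


def share_with_constant_status(histories):
    """Share of women whose status never changes (needs panel data)."""
    return sum(len(set(h)) == 1 for h in histories) / len(histories)


for scenario in ("independent", "persistent"):
    h = simulate(scenario)
    print(scenario, cross_section_rate(h), share_with_constant_status(h))
```

In both cases the single-period participation rate is close to one-half, but the share of women whose status never changes is very different, and that second quantity is only observable when the same individuals are followed over time.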
Example 2: Disentangling economies of scale and technological
change in a production function analysis. Cross-sectional data
provide information on scale economies alone, while time-series
data muddle the two effects. Panel data allow estimation of both
the rate of technological change (over time) and economies of
scale (across firms of different sizes).
Example 3: Estimation of a demand system. With cross-section
data pertaining to a particular time period one can estimate only
the income elasticity but not the price elasticity (as all individuals
in the sample would face the same price). With panel data, both
income and price variation are captured in the model.
Since panel data allow us to capture both intertemporal dynamics
and the heterogeneity of individuals, one can better control for
the effects of missing or unobserved variables.
Issues involved in using panel data
Heterogeneity bias:
Consider modelling total factor productivity (TFP) for a panel of firms.
Typically, the data would reveal differences in TFP across firms and over
time. An important source of the differences in productivity levels
across firms is economies of scale, and an important source of the
differences over time is technological change. Scale economies, however,
may not account for all the differences across firms: unobserved
firm-specific factors may also matter. Similarly, technological change
may not account for all the inter-temporal differences in productivity
levels. There could be time-specific factors (such as energy price
shocks, tax policy changes, etc.) that are important but are not
captured in the observed data.
Panel data models
Approaches to modelling with panel data can be categorised based on the
way heterogeneity is specified. The basic framework of panel data models
that characterise heterogeneity across individuals considers a regression
model of the form,
$$y_{i,t} = x_{i,t}'\beta + z_i'\alpha + \varepsilon_{i,t}, \qquad i = 1, \dots, n; \; t = 1, \dots, T$$
In this model, $x_{i,t}$ contains K regressors. The constant term is not
part of $x_{i,t}$. If $z_i$ is observed for all individuals, then the
model can be estimated with OLS.
Pooled regression
If $z_i$ contains only a constant term, the model reduces to
$$y_{i,t} = \alpha + x_{i,t}'\beta + \varepsilon_{i,t}$$
This is a CLRM with a common intercept term $\alpha$ and common slope
vector $\beta$. OLS provides consistent and efficient estimates of
$\alpha$ and $\beta$.
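The pooled regression can be sketched in a few lines of Python. This sketch assumes a single regressor (K = 1) so that the OLS formulas can be written out by hand. The true parameter values, the panel dimensions, the seed and the function name `pooled_ols` are illustrative assumptions, not values from the lecture. The data are simulated with a common intercept and no individual effect, which is exactly the case in which pooling is appropriate.

```python
import random


def pooled_ols(y, x):
    """OLS of y on a constant and one regressor, ignoring the panel structure."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx                # common slope
    alpha_hat = y_bar - beta_hat * x_bar  # common intercept
    return alpha_hat, beta_hat


# Simulated panel (assumed values): n = 200 units, T = 5 periods.
rng = random.Random(42)
n_units, n_periods = 200, 5
alpha_true, beta_true = 1.0, 2.0

y, x = [], []
for i in range(n_units):
    for t in range(n_periods):
        x_it = rng.gauss(0, 1)
        eps_it = rng.gauss(0, 1)
        x.append(x_it)
        y.append(alpha_true + beta_true * x_it + eps_it)

alpha_hat, beta_hat = pooled_ols(y, x)
print(alpha_hat, beta_hat)
```

All n x T observations are simply stacked, so the estimates should lie close to the assumed true values of 1.0 and 2.0.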
In the fixed effects model, the term "fixed" is used only to indicate
that the individual-specific effect does not vary over time.
Random effects models
If the unobserved individual heterogeneity is assumed to be uncorrelated
with $x_{i,t}$, then the model may be formulated as,
$$y_{i,t} = x_{i,t}'\beta + E[z_i'\alpha] + \{z_i'\alpha - E[z_i'\alpha]\} + \varepsilon_{i,t} = x_{i,t}'\beta + \alpha + u_i + \varepsilon_{i,t}$$
The key difference between the fixed and random effects models is
whether the unobserved individual effect embodies elements that are
correlated with the regressors.
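Why this correlation matters can be seen in a small simulation. The sketch below makes several assumptions: a single regressor, a true slope of 2, and an individual effect u_i that, in the correlated case, is simply added to the regressor. The within (unit-demeaning) transformation used here is a standard way to remove a time-invariant individual effect; it is introduced for illustration and is not derived in the text above. All function names are hypothetical.

```python
import random


def slope(y, x):
    """OLS slope of y on x (with an intercept)."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx


def demean_by_unit(values, n_periods):
    """Subtract each unit's time average; values are stacked unit by unit."""
    out = []
    for start in range(0, len(values), n_periods):
        block = values[start:start + n_periods]
        mean = sum(block) / len(block)
        out.extend(v - mean for v in block)
    return out


def simulate(correlated, n_units=500, n_periods=5, beta=2.0, seed=1):
    """y_it = beta * x_it + u_i + eps_it, with u_i unobserved."""
    rng = random.Random(seed)
    y, x = [], []
    for _ in range(n_units):
        u_i = rng.gauss(0, 1)  # individual effect, constant over time
        for _ in range(n_periods):
            # In the correlated case the regressor loads on u_i.
            x_it = rng.gauss(0, 1) + (u_i if correlated else 0.0)
            x.append(x_it)
            y.append(beta * x_it + u_i + rng.gauss(0, 1))
    return y, x


T = 5
y_c, x_c = simulate(correlated=True, n_periods=T)
y_u, x_u = simulate(correlated=False, n_periods=T)

pooled_correlated = slope(y_c, x_c)
within_correlated = slope(demean_by_unit(y_c, T), demean_by_unit(x_c, T))
pooled_uncorrelated = slope(y_u, x_u)
print(pooled_correlated, within_correlated, pooled_uncorrelated)
```

Under these assumptions, the pooled slope drifts away from 2 when u_i is correlated with the regressor, while the within slope stays close to 2. When u_i is uncorrelated with the regressor, treating it as part of the error term leaves the pooled slope close to 2, which is the situation the random effects formulation relies on.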