David Garson
Professor of Public Administration, North Carolina State University, Raleigh, North Carolina.
Path Analysis
Lecture Notes, 2008
Contents
Key concepts and terms
Path coefficients
Path multiplication rule
Effect decomposition
Path analysis with structural equation modeling
Assumptions
Frequently asked questions
Bibliography
Overview
Path analysis is an extension of the regression model, used to test the fit of the correlation
matrix against two or more causal models which are being compared by the researcher. The
model is usually depicted in a circle-and-arrow figure in which single-headed arrows
indicate causation. A regression is done for each variable in the model as a dependent on
others which the model indicates are causes. The regression weights predicted by the model
are compared with the observed correlation matrix for the variables, and a goodness-of-fit
statistic is calculated. The best-fitting of two or more models is selected by the researcher as
the best model for advancement of theory.
Path analysis requires the usual assumptions of regression. It is particularly sensitive to
model specification because failure to include relevant causal variables or inclusion of
extraneous variables often substantially affects the path coefficients, which are used to
assess the relative importance of various direct and indirect causal paths to the dependent
variable. Such interpretations should be undertaken in the context of comparing alternative
models, after assessing their goodness of fit discussed in the section on structural equation
modeling (SEM packages are commonly used today for path analysis in lieu of stand-alone
path analysis programs). When the variables in the model are latent variables measured by
multiple observed indicators, path analysis is termed structural equation modeling, treated
separately. We follow the conventional terminology by which path analysis refers to
modeling single-indicator variables.
Path multiplication rule: The value of any compound path is the product of
its path coefficients. Imagine a simple three-variable compound path where
education causes income causes conservatism. Let the regression coefficient of
income on education be 1000: for each year of education, income goes up
$1,000. Let the regression coefficient of conservatism on income be .0002: for
every dollar income goes up, conservatism goes up .0002 points on a 5-point
scale. Thus if education goes up 1 year, income goes up $1,000, which means
conservatism goes up .2 points. This is the same as multiplying the
coefficients: 1000*.0002 = .2. The same principle would apply if there were
more links in the path. If standardized path coefficients (beta weights) were
used, the path multiplication rule would still apply, but the interpretation
would be in standardized terms. Either way, the product of the coefficients
along the path reflects the weight of that path.
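The arithmetic of the rule can be sketched in a few lines of Python. This is only an illustration using the two coefficients from the education, income, conservatism example; the function name compound_path is a hypothetical helper, not part of any path analysis package.

```python
from math import prod

def compound_path(coefficients):
    """Return the value of a compound path: the product of its path coefficients."""
    return prod(coefficients)

# Unstandardized coefficients from the example in the text.
income_on_education = 1000.0     # dollars of income per year of education
conservatism_on_income = 0.0002  # scale points per dollar of income

effect = compound_path([income_on_education, conservatism_on_income])
print(round(effect, 4))  # 0.2 scale points per year of education
```

A longer path is handled the same way, by adding more coefficients to the list.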
9. The spurious indirect effect of Skill Level and Job Status as common
anteceding variables directly causing both dependents, indicated by
multiplying the path coefficient from Skill Level to Income by the
correlation of Skill Level and Job Status by the path from Job Status to
Median House Value, and adding the product of the path from Job Status
to Income by the correlation of Skill Level and Job Status by the path
from Skill Level to Median House Value.
10. The residual effect is the difference between the correlation of Income
and Median House Value and the sum of the spurious direct and
indirect effects.
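Items 9 and 10 can be sketched as two small Python functions. All numeric values below are hypothetical and chosen only to show the arithmetic; they are not estimates from the Skill Level, Job Status, Income, and Median House Value example, and the spurious direct effect is simply taken as an input.

```python
def spurious_indirect_effect(p_income_skill, p_income_job,
                             p_house_skill, p_house_job, r_skill_job):
    """Item 9: sum of the two tracings that pass through the
    Skill Level-Job Status correlation."""
    via_skill_then_job = p_income_skill * r_skill_job * p_house_job
    via_job_then_skill = p_income_job * r_skill_job * p_house_skill
    return via_skill_then_job + via_job_then_skill

def residual_effect(r_income_house, spurious_direct, spurious_indirect):
    """Item 10: observed correlation minus the summed spurious effects."""
    return r_income_house - (spurious_direct + spurious_indirect)

# Hypothetical path coefficients and correlations, for illustration only.
indirect = spurious_indirect_effect(
    p_income_skill=0.4, p_income_job=0.3,
    p_house_skill=0.2, p_house_job=0.5,
    r_skill_job=0.6,
)
residual = residual_effect(r_income_house=0.55,
                           spurious_direct=0.23,
                           spurious_indirect=indirect)
print(round(indirect, 3), round(residual, 3))
```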
Correlated exogenous variables. The path weights connecting correlated
exogenous variables are equal to the Pearson correlations. When calculating
indirect paths, both direct arrows and the double-headed arrows connecting
correlated exogenous variables are used in tracing possible indirect
paths, except:
Tracing rule: An indirect path cannot enter and exit on an arrowhead. This
means that you cannot have a direct path composed of the paths of two
correlated exogenous variables.
Select outputs. Statistical tests and other outputs are selected under
View, Analysis Properties, in the AMOS menu system, yielding the
dialog shown below:
Overall test of the model. The likelihood ratio chi-square test, also called
the model chi-square test or deviance test, assesses the overall fit of the
model. A finding of nonsignificance corresponds to an adequate model
- one whose model-implied covariance matrix does not differ from the
observed covariance matrix. For this example, there is adequate fit:
Direct and indirect effects. AMOS will use the multiplication rule
automatically to partition overall effects into direct and indirect effects
for the endogenous variables (for Intent and Behavior in this example).
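The partition can be illustrated by hand with a minimal Python sketch. The coefficients below are hypothetical, and "X" is a placeholder for an exogenous variable; only Intent and Behavior are names taken from the example. This is not AMOS output.

```python
# Hypothetical standardized path coefficients, keyed by (cause, effect).
paths = {
    ("X", "Intent"): 0.50,
    ("Intent", "Behavior"): 0.40,
    ("X", "Behavior"): 0.10,
}

# Direct effect: the single arrow from X to Behavior.
direct = paths[("X", "Behavior")]

# Indirect effect: the compound path X -> Intent -> Behavior,
# valued by the path multiplication rule.
indirect = paths[("X", "Intent")] * paths[("Intent", "Behavior")]

# Total effect: the sum of the direct and indirect effects.
total = direct + indirect
print(round(direct, 2), round(indirect, 2), round(total, 2))
```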
calls for adding or dropping one arrow at a time as each change will
affect the coefficients. For this example, the MIs are so small that no
addition of arrows is warranted. In fact, all MIs are well below the
usual lower threshold of 4.0.
Assumptions
o Linearity: relationships among variables are linear (though, of course,
variables may be nonlinear transforms).
o Additivity: there are no interaction effects (though, of course, variables may
be interaction cross-product terms).
o Interval level data for all variables, if regression is being used to estimate
path parameters. As in other forms of regression modeling, it is common to use
dichotomies and ordinal data in practice. If dummy variables are used to code a
categorical variable, one must be careful that they are represented as a block in
the path diagram (ex., if an arrow is drawn to one dummy it must be drawn to
all others in the set). If an arrow were to be drawn from one dummy variable to
another dummy variable in the same set, this would violate the recursivity
assumption discussed below.
o Residual (unmeasured) variables are uncorrelated with any of the variables
in the model other than the one they cause.
How do you assess the significance of the total (direct and indirect) effect
of exogenous variable x on endogenous variable y?
Run a regression with y as dependent and all others as independents,
leaving out any variable which mediates between x and y. The
significance of the b or beta for x in this equation is a test of the
significance of the total effect.
How are path coefficients related to the correlation matrix for purposes of
testing a model?
First, recognize that computation of the model-estimated correlations
and their comparison with observed correlations is best done by relying
on a model-estimating program such as LISREL or AMOS. The model
path coefficients can be compared to the predicted path coefficients as
computed from the correlation matrix, following which the model
coefficients can be tested for goodness-of-fit with the predicted
coefficients.
The tracing rule is a rule for identifying all the paths, the sum of effects
of which is the estimated correlation between two variables in the
model. This model-estimated correlation can be compared to the
observed correlation to assess the fit of the model to the data. The
tracing rule is simply that the model-implied correlation between two
variables in a model is the sum of all valid paths (tracings) between the
two variables. These include the total effect (which is the sum of direct
and indirect effects) plus any associational effects due to correlated
exogenous variables. These associational effects are calculated by
multiplying the correlation between the exogenous variable under
consideration with a second exogenous variable, by this second
exogenous variable's total effect on the target variable under
consideration.
For simplicity, consider this simple model:
       A       B       C
A      1
B    .379      1
C   -.652   -.451      1

       A       B       C
A      1
B    .379      1
C   -.562   -.238      1
If one had only the path output and wanted to estimate back to the
correlation matrix, one would use these equations, one for each path:
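Under the tracing rule, these equations can be sketched as follows. The sketch assumes that the first matrix above holds the observed correlations, that the second holds the standardized path coefficients from A and B to the third variable (called C here), and that .379 is the correlation between the exogenous variables A and B. The variable names are illustrative only.

```python
# Values read from the second matrix (assumed to be path output).
r_ab = 0.379   # correlation between exogenous variables A and B
p_ca = -0.562  # path from A to C
p_cb = -0.238  # path from B to C

# Tracing rule: each model-implied correlation is the direct path plus the
# tracing that passes through the A-B correlation.
r_ac = p_ca + r_ab * p_cb
r_bc = p_cb + r_ab * p_ca

print(round(r_ac, 3), round(r_bc, 3))  # -0.652 -0.451
```

Under these assumptions the results reproduce the entries in the bottom row of the first matrix to three decimal places.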
The path coefficients from VARA to VARC and from VARB to VARC
are given by this second regression command:
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT VARC
/METHOD=ENTER VARA VARB.
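With two predictors, the standardized coefficients (betas) from a regression of this form can also be recovered from the correlations alone. The Python sketch below uses the standard two-predictor formula and assumes the correlations .379, -.652, and -.451 from the matrix given earlier; the function name standardized_betas is a hypothetical helper, not an SPSS facility.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for y on two predictors,
    computed from the three pairwise correlations."""
    denom = 1.0 - r_12 ** 2
    beta1 = (r_y1 - r_12 * r_y2) / denom
    beta2 = (r_y2 - r_12 * r_y1) / denom
    return beta1, beta2

# Assumed correlations: r(A,C) = -.652, r(B,C) = -.451, r(A,B) = .379.
beta_a, beta_b = standardized_betas(-0.652, -0.451, 0.379)
print(round(beta_a, 3), round(beta_b, 3))  # -0.562 -0.238
```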
How can multiple group path analysis determine if the path model differs
across groups in my sample?
Multiple group path analysis may be accomplished simply by running
separate path analysis for each group in the sample, then comparing the
path estimates. A more sophisticated approach supported by some path
analysis and SEM packages involves a second step: to impose a
cross-group equality constraint on the path estimates, then run the analysis
separately for each group, then see if the goodness-of-fit for the
constrained models is as good as for the unconstrained models. If the fit
of the constrained model is worse than that for the corresponding
unconstrained model, then the researcher concludes that model direct
effects differ by group.
Bibliography
o Ingram, K. L., Cope, J. G., Harju, B. L., & Wuensch, K. L. (2000). Applying
to graduate school: A test of the theory of planned behavior. Journal of Social
Behavior and Personality, 15, 215-226.