
Forecasting the Probability of Recession

May 13, 2013
Andrew Gellert and Arman Oganisian

Economic Forecasting ECN 409-001

Dr. Fang Dong

Abstract

Building on previous research, we estimate a probit model to forecast the probability of recession one month ahead. We use data from the St. Louis Federal Reserve's FRED database to estimate four different models. We choose the optimal model based on its ability to make in-sample predictions of turning points from recession to expansion and on its overall fit. The optimal model is then used to generate 18 out-of-sample forecasts from October 2011 to March 2013. These forecasts demonstrate an ability to capture real events: the predicted probability of recession jumped in periods of instability and dropped during periods of stability.

Introduction

The paper begins with a literature review surveying key papers that build probit models of the probability of recession. Some papers build dynamic models that exploit the autocorrelation structure of the binary dependent variable. Others use financial explanatory variables, such as the yield curve, to capture the so-called "wisdom of the crowd" contained in liquid secondary markets. We estimate four different models: two are static models and the other two are nonhomogeneous Markov processes. The main model is described in our Model section, which is followed by a brief description of our data. The next section estimates the four models and compares their in-sample fits as well as their ability to predict turning points in the economy. Model 1 is chosen as the optimal model because it exhibits the best fit and turning-point predictions. In the final section, we use model 1 to generate 18 out-of-sample forecasts from October 2011 to March 2013. The model demonstrates a fine ability to capture the real

macroeconomic risks stemming from the European sovereign debt crisis, which raged from fall 2011 to fall 2012. During the crisis, predictions became very volatile, fluctuating around 50%. After the German constitutional court approved Greek bailout funds and the probability of a Greek exit declined, the model's predictions decreased to about 10% and stayed there through the end of the forecast period.

Literature Review

Our model has several features that we borrow from previous models. First, we include the slope of the yield curve¹ as an explanatory variable. Second, we include a lagged recession dummy as an explanatory variable, transforming the model into a first-order nonhomogeneous Markov process. Finally, we use a probit model, the most widely used probability model in similar research.

Dueker (1997) presents a theoretical argument for why the slope of the yield curve contains forward-looking information about the economy. He argues that the yields of long-term and short-term securities, because they are traded on liquid secondary markets, contain the so-called wisdom of the crowd. The yield curve, which plots the yields of bonds of different maturities, normally slopes upward: longer-maturity debt carries a larger risk of issuer default, so the market demands a yield premium over shorter-maturity bonds². When the economic outlook dims, the yield curve may flatten or invert. This is because investors expect looser monetary policy (i.e., lower short-term rates), so they sell their short-term debt and buy long-term debt to lock in higher yields. This causes short-term rates to rise and long-term rates to decline.

1 Unless otherwise noted, "yield curve" hereafter refers to the difference in yields between a 10-year government bond and a 3-month T-bill.
2 Longer-maturity debt carries a yield premium because other risks, such as inflation spikes, also increase with time.

Thus, the slope of the yield curve (the 10-year yield minus the 3-month yield) flattens or turns negative. The magnitude of the decline depends on the crowd's view of the severity and duration of the coming downturn.³

There is a wide body of literature devoted to producing multi-period forecasts of the probability of recession using the yield curve. Estrella and Mishkin (1998) examine the out-of-sample forecasting performance of several financial variables, including the yield curve (the spread between the 10-year and 3-month Treasury yields), the NYSE composite stock index, the Commerce Department leading index, and the Stock-Watson leading index. They evaluate the performance of the variables using a pseudo-R² after estimating their probit model by maximum likelihood.⁴ For short forecasting horizons of one to three quarters, stock prices have superior forecasting ability. Beyond this period, however, the slope of the yield curve dominates. The pseudo-R² for the yield curve is only .072 for 1-quarter-ahead forecasts, but it increases to .295 for 4-quarters-ahead forecasts. The NYSE index's pseudo-R², by contrast, is .161 for 1-quarter-ahead forecasts but declines to .016 for 4-quarters-ahead forecasts.

Dueker runs Estrella and Mishkin's probit model using monthly 30-year Treasury yields (as opposed to the quarterly 10-year Treasury yields) and confirms their results. He finds that the yield curve's predictive power is optimized with a lag of 9 months. The yield curve becomes the dominant predictor after 3 months (1 quarter), which is
3 The yield curve is not foolproof. Monetary policy is not based solely on the expected future state of the economy; it is also based on inflation expectations and pure randomness.

4 The use of this metric is justified in Estrella and Mishkin's paper, as is the use of Newey-West standard errors to handle autocorrelated forecast errors. The metric is pseudo-R² = 1 - (log Lu / log Lc)^(-(2/T) log Lc), where Lu and Lc are the unconstrained and constant-only maximized likelihoods and T is the number of observations; 0 and 1 correspond to no fit and perfect fit, respectively.

consistent with Estrella and Mishkin. Before ending his paper, Dueker presents a probit model augmented with a Markov-switching process. He argues that this may be superior because it exploits the autocorrelation structure of the binary dependent variable.

Chauvet and Potter (2001) criticize the use of Estrella and Mishkin's probit model, claiming that it is misspecified in two fundamental ways: (1) the estimated parameters are not constant over time, and (2) the model does not properly account for autocorrelated errors. Estrella, Rodrigues, and Schich (2003), however, examine both U.S. and German data and find no evidence of breakpoints. Chauvet and Potter develop a computationally difficult method by applying Bayesian numerical methods (Kauppi, 2008). According to Kauppi, this approach and similar ones have problems in their interpretation, practical implementation, and flexibility. Instead, he builds a dynamic probit model by including a lagged dependent variable as an explanatory variable, thus modeling the economy as a first-order nonhomogeneous Markov chain. It is nonhomogeneous because the transition matrix varies with the slope of the yield curve. He finds no evidence of parameter instability, provided that the apparent serial dependence of the recession indicator is taken into account through the lagged dependent variable.

Kauppi's model is Pr(y_t = 1) = Φ(ω + δ·y_{t-1} + β·x_{t-12}), where x_{t-12} is the lagged yield curve, y_{t-1} is the lagged dependent variable, and Φ(·) is the cumulative distribution function of N(0, σ²). The model predicts the probability of

recession 12 months ahead, where y_t = 1 indicates a state of recession. This probability varies with the slope of the yield curve. The probabilities output by the model form the transition matrix, which likewise varies with the slope of the yield curve. In a two-state Markov chain (state 1 is recession, state 2 is no recession), the transition probabilities can be expressed in a 2x2 matrix⁵:

!! !" =

!" !! = !! 1 !!

1 !! !!
5

(! + ! + ! !!!" ) (1 ! + ! + ! !!!" )

1 [1 ! + ! !!!" ] 1 (! + ! !!!" )
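Once ω, δ, and β are estimated, the transition matrix can be evaluated at any value of the lagged yield-curve slope. A sketch using illustrative coefficients of the same order as the model 4 estimates in our appendix (not Kauppi's estimates):

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def transition_matrix(omega: float, delta: float, beta: float, x_lag: float):
    """2x2 transition matrix of the nonhomogeneous two-state Markov chain.

    Row 1 conditions on recession at time t, row 2 on expansion at time t;
    the probabilities vary with the lagged yield-curve slope x_lag.
    """
    p11 = phi(omega + delta + beta * x_lag)  # recession -> recession
    p21 = phi(omega + beta * x_lag)          # expansion -> recession
    return [[p11, 1.0 - p11],
            [p21, 1.0 - p21]]

# With a flat yield curve (x_lag = 0), a recession already under way is far
# more likely to continue than a new one is to start.
P = transition_matrix(omega=-1.9, delta=3.3, beta=-0.4, x_lag=0.0)
```

Each row sums to one by construction, and lowering `x_lag` raises both recession probabilities because β is negative.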

We extend Kauppi's dynamic probit model by adding a causal dimension by way of several leading indicators of consumption, housing, and investment. We will see whether this extension significantly improves the model's probability forecasts.

Model

We construct a time series model that outputs a 1-month-ahead forecast of the probability of recession. Our main model takes the following matrix form:

Pr(R = 1) = Φ(Xβ + ε),   ε ~ i.i.d. N(0, σ²)


Where:
Φ(·) = the c.d.f. of the normal distribution
n = 414, k = 4
R = a column vector containing n observations of either 0 or 1, where 0 indicates no recession in time t+1 and 1 indicates recession in time t+1
5 P11 is the probability of moving from a state of recession in period t to a state of recession in period t+1; P12 is the probability of moving from recession in period t to no recession in period t+1.

β = a vector containing 5 coefficients and one constant term to be estimated
X = a 415x6 matrix of the following independent variables at time t (except for the yield curve and the lagged dependent variable, which are lagged): housing starts (hs), industrial production index (ip), consumer sentiment (cs), the yield curve (yc, lagged 11 months), and the dependent variable (lagged 1 month)

As mentioned at the end of the literature review, our research seeks to improve on previous models that use the yield curve and Markov switching by adding several covariates. We include housing starts as a leading indicator of the housing sector, which is a large component of residential investment and, consequently, GDP. Building permits would be an equally valid, yet nearly identical, leading indicator: a simple correlation coefficient indicates that the two variables are correlated with r = .98. The industrial production index is a good leading indicator of the industrial sector (which includes manufacturing, mining, and utilities). This is an interest-sensitive sector, so it is particularly useful when dealing with the business cycle; adverse shocks to the economy hit this sector before all others. Thus, IP is a good leading indicator of the economy as a whole. Additionally, we include the Michigan sentiment survey as a leading indicator of consumption activity, which comprises some 70% of total U.S. output. We also include the yield curve (lagged 11 months) and the dependent variable (lagged 1 month) for the reasons outlined in the literature review. We lag the yield curve 11 months because previous research shows that the yield curve's predictive power is optimal at a 3-4 quarter horizon. Since the yield curve at time t is the value from t-11, t+1 (the 1-month-ahead forecast) falls in the optimal time horizon. For a 2-months-ahead forecast, a 10-month lag must be used for the yield curve.

Thus, the previous model,

Pr(R_{t+1} = 1) = Φ(β0 + β1·HS_t + β2·IP_t + β3·CS_t + β4·YC_{t-11} + β5·R_t + ε_t),   ε_t ~ i.i.d. N(0, σ²),

which predicts the probability of recession in t+1, can be generalized to predict f steps ahead:

Pr(R_{t+f} = 1) = Φ(β0 + β1·HS_t + β2·IP_t + β3·CS_t + β4·YC_{t-(12-f)} + β5·Pr(R_{t+(f-1)} = 1) + ε_t)

This generalized model assumes that all forecasts are made at time t, so the information set available at time t excludes all information available after that period. This is a serious drawback: the information set does not grow with the prediction horizon, which decreases the accuracy of high-f forecasts. The most accurate forecast, therefore, is the forecast for f = 1. If the forecast for period t+f were instead made in period t+(f-1), this would not be the case.

We also estimate three other variations of this model. One omits the lagged recession variable, the third includes only the yield curve, and the fourth includes only the yield curve and the lagged recession variable. Model 1, which will prove superior to the other three, does not succumb to the shortcomings described above: since it is not a dynamic model, previous predictions are not explanatory variables, so the information set depends only on values of HS, IP, CS, and YC, which are exogenously determined.
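The iterated f-step scheme just described, with information fixed at time t and each forecast feeding the next, can be sketched as follows; the coefficients passed in are illustrative, not our estimates:

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def iterated_forecasts(b, hs, ip, cs, yc_path, r_t, steps):
    """f-step-ahead recession probabilities for f = 1..steps.

    HS, IP, and CS are held at their time-t values; yc_path[f-1] supplies
    YC_{t-(12-f)}, which is already observed at time t; the lagged dependent
    term for f > 1 is replaced by the previous forecast probability, as in
    the generalized model.
    """
    prev, out = float(r_t), []
    for f in range(1, steps + 1):
        z = b[0] + b[1]*hs + b[2]*ip + b[3]*cs + b[4]*yc_path[f-1] + b[5]*prev
        p = phi(z)
        out.append(p)
        prev = p  # feed the forecast back in place of the unobserved state
    return out

# Illustrative coefficients and data; yc_path holds the observed slopes.
fcasts = iterated_forecasts(
    b=(0.5, -0.005, 0.02, 0.05, -1.5, 2.0),
    hs=1460, ip=72, cs=85, yc_path=[1.2, 1.0, 0.8], r_t=0, steps=3,
)
```

Because only the time-t information set is used, errors compound through `prev`, which is exactly the accuracy drawback noted above for high-f forecasts.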

Data

We retrieved all of our data from the Federal Reserve Economic Data (FRED) database at the St. Louis Federal Reserve Bank. All data series are seasonally adjusted and recorded at a monthly frequency. Our sample period runs from March 1, 1977 to September 1, 2011. Summary statistics are given in the table below. All lagging was done in Excel.

Table 1: Data Summary

Variable               N     Mean     St. Dev.  Minimum  Maximum
Housing Starts         415   1459.95  405.98    478      2273
Industrial Production  415   72.15    17.68     46.6     100.7
Consumer Sentiment     415   85.39    13.28     51.7     112
Yield Curve            415   1.16     1.21      -3.1     3.4
Recession              415   0.13     0.34      0        1
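The lagging done in Excel can be reproduced with pandas `shift`, which moves a series down so that row t holds the value from t-k. A sketch with placeholder values (the column names are ours):

```python
import pandas as pd

# 15 months of placeholder values standing in for the FRED series.
df = pd.DataFrame({
    "yc": [round(-0.3 + 0.25 * i, 2) for i in range(15)],
    "rec": [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
})
df["yc_lag11"] = df["yc"].shift(11)   # yield curve lagged 11 months
df["rec_lag1"] = df["rec"].shift(1)   # recession dummy lagged 1 month
df = df.dropna()  # the first 11 rows have no 11-month lag available
```

After `dropna`, the first remaining row pairs month 12's recession indicator with month 1's yield-curve slope, mirroring the structure of our regression sample.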

The chart below is a time series of the 11-month lagged yield curve from March 1, 1977 to September 1, 2011, a key component of our data. Since the series is lagged, the yield curve is flat or negative during or right before recession (indicated in blue).
[Figure 1: Yield Curve Slope Dips Into Negative Territory Before Recessions. Line chart of the 11-month lagged yield curve slope (%), Mar-77 to Mar-11, with recessions shaded in blue.]

Estimation and Results

We estimate the four models and compare their fits. The fourth is almost identical to Kauppi's model.

Model 1: Pr(R_{t+1} = 1) = Φ(β0 + β1·HS_t + β2·IP_t + β3·CS_t + β4·YC_{t-11} + ε_t)
Model 2: Pr(R_{t+1} = 1) = Φ(β0 + β1·HS_t + β2·IP_t + β3·CS_t + β4·YC_{t-11} + β5·R_t + ε_t)
Model 3: Pr(R_{t+1} = 1) = Φ(β0 + β1·YC_{t-11} + ε_t)
Model 4: Pr(R_{t+1} = 1) = Φ(β0 + β1·YC_{t-11} + β2·R_t + ε_t)

The estimation results can be found in the appendix to this paper. In model 1, all coefficients on the independent variables are significant with p < .001 except industrial production (p = .017). In model 2, the lagged recession indicator is by far the most significant coefficient (p < .0001); housing starts and the yield curve remain significant at the 5% level, while industrial production and consumer sentiment do not. In model 3, the yield curve slope's coefficient is highly significant and has the expected (negative) sign. In model 4, the lagged recession indicator is more significant than the yield curve, which is nonetheless significant (p = .007) with the expected sign; the recession lag's coefficient is also much larger in magnitude than the slope's. To compare the fit of the individual models, we use the max-rescaled R², which differs from the pseudo-R².⁶ This information is summarized in the table below.

6 Economists disagree about the superiority of the max-rescaled R² over the pseudo-R². We chose the max-rescaled R² because it is easily calculated in SAS. A preliminary R² is defined as R² = 1 - exp{-2[logL(M) - logL(0)] / n}, where logL(M) and logL(0) are the maximized log-likelihoods of the fitted model M and of the null model containing only an intercept, and n is the number of observations. This quantity can never equal 1; a perfect fit yields an R² of only 1 - exp(2·logL(0)/n), e.g. .75 when the two outcomes are equally frequent. It is therefore rescaled: max-rescaled R² = R² / [1 - exp(2·logL(0)/n)]. More information is available at http://www2.sas.com/proceedings/sugi25/25/st/25p256.pdf.
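Footnote 6's two-step calculation can be checked directly against the appendix: using model 1's -2 log-likelihoods (328.406 intercept-only, 142.452 with covariates) reproduces both R² values reported in the SAS output.

```python
import math

def max_rescaled_r2(loglik_m: float, loglik_0: float, n: int) -> float:
    """Max-rescaled R-squared as defined in footnote 6.

    loglik_m: maximized log-likelihood of the fitted model.
    loglik_0: log-likelihood of the intercept-only model.
    """
    r2 = 1.0 - math.exp(-2.0 * (loglik_m - loglik_0) / n)
    r2_max = 1.0 - math.exp(2.0 * loglik_0 / n)  # value of r2 at a perfect fit
    return r2 / r2_max

# Model 1: -2 Log L is 328.406 (intercept only) and 142.452 (full model).
value = max_rescaled_r2(-142.452 / 2, -328.406 / 2, 415)
print(round(value, 4))  # -> 0.6605, matching Table 2
```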


Table 2: Max-Rescaled R²

Model 1 (Static)                0.6605
Model 2 (Dynamic)               0.8585
Model 3 (Static, Yield Curve)   0.2831
Model 4 (Dynamic, Yield Curve)  0.8286

Clearly, the dynamic model with all of the explanatory variables (model 2) achieves the best fit. Indeed, its R² value is slightly higher than that of model 4, our imitation of Kauppi's model.⁷

A graphical analysis of predicted versus actual events is also instructive. The chart below plots the two static models' predicted probability of recession in time t; the shaded (blue) areas indicate recession in time t. Model 1 fits the data better than model 3, indicating that a model with both the yield curve and the selected independent variables outperforms the yield curve alone. Both, however, are extremely volatile: model 1's predictions fluctuated sharply from 1995 to 2001, before the recession of the early 2000s.

Nevertheless, model 1 is very good at identifying turning points in the economy. When the economy was not yet in recession in the early '80s, the model predicted an 84.5% chance of recession the next month, up from only 32% the month before; the next month, the recession began. For each of the 3 months before the 2008 recession, the model predicted a 65% chance of recession. While the economy was still in the 2008 recession,

7 This model is only an imitation: Kauppi used Bayesian estimation and a pseudo-R², so our results are not directly comparable. The highest R² among Kauppi's models was .77. Kauppi and Estrella use the same metric, pseudo-R² = 1 - (log(Lu)/log(Lc))^(-(2/T)·log(Lc)), where Lu is the unconstrained maximized likelihood, Lc is the likelihood constrained so that all coefficients except the constant are zero, and T is the number of observations.

[Figure 2: Static Models. Predicted probability of recession in time t from models 1 and 3, March 1977 to March 2011, with recessions shaded.]
the model predicted a 5% chance of recession the next month, and the recession did end the following month. As another example, the model assigned a 7% chance of recession for March 1990. Its prediction jumped to 20% the next month, then 40% the month after, until finally reaching a 55% chance of recession in August, the month the recession in fact began. While the nation was still in that recession, the model predicted a 27% chance of recession the next month, down from 40% the month before, and the recession did end the next month.
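The turning-point episodes just recounted can be scored mechanically. The rule below is our own formalization (the paper evaluates the charts informally): a turning point counts as called if the forecast for t+1 was on the correct side of a 50% threshold while the economy was still in the old state.

```python
def turning_points(actual):
    """Indices t at which the recession state changes between t and t+1."""
    return [t for t in range(len(actual) - 1) if actual[t] != actual[t + 1]]

def called_in_advance(probs, actual, threshold=0.5):
    """Turning points where the 1-month-ahead forecast made at time t was
    on the right side of the threshold before the state changed."""
    hits = []
    for t in turning_points(actual):
        entering_recession = actual[t + 1] == 1
        if (probs[t] >= threshold) == entering_recession:
            hits.append(t)
    return hits

# Toy series: probs[t] is the forecast of the state at t+1.
actual = [0, 0, 0, 1, 1, 1, 0, 0]
probs = [0.1, 0.2, 0.6, 0.9, 0.9, 0.4, 0.2, 0.1]
print(called_in_advance(probs, actual))  # -> [2, 5]: both turns called
```

Applied to our sample, this rule would credit model 1 with the early-'80s, 1990, and 2008 calls described above.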


[Figure 3: Dynamic Models. Predicted probability of recession in time t from models 2 and 4, March 1977 to March 2011, with recessions shaded.]

The chart above plots the probability of recession in time t as predicted by models 2 and 4; again, the shaded areas represent recession in time t. The dynamic models' forecasts are much less volatile than the static forecasts, but they do a poor job of identifying turning points in the economy. Model 2 assigned a 4% probability of recession to August 1981, yet a recession started that month. That recession ended in December 1982, a month to which model 2 had assigned a 97% chance of recession. Model 4 made a similar blunder, assigning a 5% probability of recession to August 1981 and a 97% probability to December 1982. For every month between the first and last months of a recession, the models consistently predicted probabilities of recession above 90%, and both dynamic models follow this pattern for every recession in our sample.

We believe this occurs because of the large coefficient on the recession lag, which is the largest and most significant coefficient in both dynamic models. This heavy reliance on the recessionary state in the current month as a predictor of recession next month is a flaw: the models are unable to predict turning points. We therefore select our optimal model from the static group, choosing the one with the highest max-rescaled R²: model 1, with an R² of .66.

Out-of-Sample Predictions

While the in-sample forecasts are good, it remains to be seen whether the out-of-sample forecasts are accurate. Figure 4 plots model 1's out-of-sample and in-sample predictions of the probability of recession. The out-of-sample segment is marked in red with an arrow pointing to the future. We make 18 out-of-sample predictions.

[Figure 4: Model 1 with Out-of-Sample Predictions. Predicted probability of recession, March 1977 to March 2013; the final 18 months are out-of-sample.]

The model's forecasts are volatile over the first few out-of-sample months. It predicted a 55% chance of recession in October 2011, the first month in the out-of-sample forecast


period, and it fluctuated widely around 50% for the following months before settling near 10% from December 2012 to March 2013. This fluctuation and apparent uncertainty are not without cause, and we do not believe they reflect inaccuracies in the model. Rather, the period of uncertainty, from October 2011 to November 2012, corresponds to the European sovereign debt crisis. Yields on Spanish, Greek, and Italian long-maturity bonds soared throughout this period. Analysts entertained the possibility of contagion, as U.S. banks with large stakes in European sovereign debt were at risk, and experts and political leaders questioned the very existence of the European Union. There was widespread fear that a Greek exit from the euro would spark capital flight out of the continent, plunge the EU into recession, and cause a double-dip recession in the United States. These fears largely subsided after the German constitutional court ruled that Greek bailout funding was legal; the fear of a Greek exit and subsequent macroeconomic shocks receded, and the model reflects this with lower recession probability forecasts. The average forecast from December 2012 to March 2013 was 10%.

Conclusion

Both in-sample and out-of-sample predictions confirm that model 1 is the superior performer, as discussed in the previous section. It is worth noting that model 1, which includes additional explanatory variables, is superior to model 3 in terms of in-sample fit: a model that augments the yield curve with IP, HS, and CS beats a model with YC as the sole explanatory variable. Model 1 is also superior to both dynamic models, which fail to predict turning points in the economy.

Bibliography

1. Estrella, Arturo & Frederic S. Mishkin, 1996. "Predicting U.S. Recessions: Financial Variables as Leading Indicators," Research Paper 9609, Federal Reserve Bank of New York. http://www.albany.edu/~xl843228/teaching/ECON350/EstrellaMishkin1998.pdf
2. Estrella, Arturo & Frederic S. Mishkin, 1996. "The Yield Curve as a Predictor of U.S. Recessions," Current Issues in Economics and Finance, Federal Reserve Bank of New York, June issue. http://www.newyorkfed.org/research/current_issues/ci2-7.pdf
3. Kauppi, Heikki, 2008. "Yield-Curve Based Probit Models for Forecasting U.S. Recessions: Stability and Dynamics," Discussion Papers 31, Aboa Centre for Economics. http://ethesis.helsinki.fi/julkaisut/eri/hecer/disc/221/yieldcur.pdf
4. Chauvet, Marcelle & Simon Potter, 2005. "Forecasting Recessions Using the Yield Curve," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 24(2), pages 77-103. http://www.newyorkfed.org/research/staff_reports/sr134.pdf
5. Dueker, Michael, 1997. "Strengthening the Case for the Yield Curve as a Predictor of U.S. Recessions," Review, Federal Reserve Bank of St. Louis, March issue, pages 41-51. http://research.stlouisfed.org/publications/review/97/03/9703md.pdf
6. Estrella, Arturo, Anthony P. Rodrigues & Sebastian Schich, 2000. "How Stable Is the Predictive Power of the Yield Curve? Evidence from Germany and the United States," Staff Reports 113, Federal Reserve Bank of New York. http://www.newyorkfed.org/research/staff_reports/sr113.pdf


Appendix: Model Estimates


The LOGISTIC Procedure: MODEL 1

Model Information
  Data Set: WORK.SET1; Response Variable: rec (Recession Dummy); Response Levels: 2
  Model: binary probit; Optimization Technique: Fisher's scoring
  Observations Read / Used: 415 / 415
  Response Profile: rec = 0 (359 obs), rec = 1 (56 obs); probability modeled is rec = 1
  Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
  Criterion   Intercept Only   Intercept and Covariates
  AIC         330.406          152.452
  SC          334.435          172.594
  -2 Log L    328.406          142.452
  R-Square: 0.3611    Max-Rescaled R-Square: 0.6605

Testing Global Null Hypothesis: BETA=0
  Test               Chi-Square   DF   Pr > ChiSq
  Likelihood Ratio   185.9540     4    <.0001
  Score              137.1321     4    <.0001
  Wald               61.7645      4    <.0001

Analysis of Maximum Likelihood Estimates
  Parameter   DF   Estimate   Std. Error   Wald Chi-Sq   Pr > ChiSq
  Intercept   1    0.5335     0.8871       0.3617        0.5475
  hs          1    -0.00521   0.000800     42.3478       <.0001
  ip          1    0.0197     0.00830      5.6534        0.0174
  cs          1    0.0507     0.0145       12.2649       0.0005
  yc          1    -1.4715    0.2272       41.9414       <.0001

Association of Predicted Probabilities and Observed Responses
  Percent Concordant 95.5   Percent Discordant 4.4   Percent Tied 0.1   Pairs 20104
  Somers' D 0.912   Gamma 0.912   Tau-a 0.213   c 0.956

Partition for the Hosmer and Lemeshow Test
  Group   Total   rec=1 Obs   rec=1 Exp   rec=0 Obs   rec=0 Exp
  1       111     0           0.00        111         111.00
  2       42      0           0.00        42          42.00
  3       42      0           0.02        42          41.98
  4       42      1           0.39        41          41.61
  5       42      2           1.81        40          40.19
  6       42      4           3.99        38          38.01
  7       42      8           11.56       34          30.44
  8       52      41          38.47       11          13.53

Hosmer and Lemeshow Goodness-of-Fit Test
  Chi-Square 3.1285   DF 6   Pr > ChiSq 0.7926

NOTE: In calculating the Expected values, predicted probabilities less than 1E-6 and greater than 0.999999 were changed to 1E-6 and 0.999999 respectively.


The LOGISTIC Procedure: MODEL 2

Model Information
  Data Set: WORK.SET1; Response Variable: rec (Recession Dummy); Response Levels: 2
  Model: binary probit; Optimization Technique: Fisher's scoring
  Observations Read / Used: 415 / 415
  Response Profile: rec = 0 (359 obs), rec = 1 (56 obs); probability modeled is rec = 1
  Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
  Criterion   Intercept Only   Intercept and Covariates
  AIC         330.406          77.425
  SC          334.435          101.595
  -2 Log L    328.406          65.425
  R-Square: 0.4694    Max-Rescaled R-Square: 0.8585

Testing Global Null Hypothesis: BETA=0
  Test               Chi-Square   DF   Pr > ChiSq
  Likelihood Ratio   262.9813     5    <.0001
  Score              336.6404     5    <.0001
  Wald               85.5654      5    <.0001

Analysis of Maximum Likelihood Estimates
  Parameter   DF   Estimate   Std. Error   Wald Chi-Sq   Pr > ChiSq
  Intercept   1    -1.2086    1.3237       0.8336        0.3612
  hs          1    -0.00249   0.00106      5.5320        0.0187
  ip          1    0.0174     0.0111       2.4710        0.1160
  cs          1    0.0181     0.0205       0.7749        0.3787
  yc          1    -0.8500    0.3032       7.8562        0.0051
  reclag      1    2.7474     0.3600       58.2376       <.0001

Association of Predicted Probabilities and Observed Responses
  Percent Concordant 99.1   Percent Discordant 0.9   Percent Tied 0.0   Pairs 20104
  Somers' D 0.982   Gamma 0.982   Tau-a 0.230   c 0.991

Partition for the Hosmer and Lemeshow Test
  Group   Total   rec=1 Obs   rec=1 Exp   rec=0 Obs   rec=0 Exp
  1       57      0           0.00        57          57.00
  2       45      0           0.00        45          45.00
  3       42      0           0.00        42          42.00
  4       42      0           0.01        42          41.99
  5       42      0           0.11        42          41.89
  6       42      0           0.34        42          41.66
  7       42      1           0.73        41          41.27
  8       42      2           2.40        40          39.60
  9       61      53          52.02       8           8.98

Hosmer and Lemeshow Goodness-of-Fit Test
  Chi-Square 0.7637   DF 7   Pr > ChiSq 0.9978

NOTE: In calculating the Expected values, predicted probabilities less than 1E-6 and greater than 0.999999 were changed to 1E-6 and 0.999999 respectively.


The LOGISTIC Procedure: MODEL 3

Model Information
  Data Set: WORK.SET1; Response Variable: rec (Recession Dummy); Response Levels: 2
  Model: binary probit; Optimization Technique: Fisher's scoring
  Observations Read / Used: 415 / 415
  Response Profile: rec = 0 (359 obs), rec = 1 (56 obs); probability modeled is rec = 1
  Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
  Criterion   Intercept Only   Intercept and Covariates
  AIC         330.406          262.622
  SC          334.435          270.679
  -2 Log L    328.406          258.622
  R-Square: 0.1548    Max-Rescaled R-Square: 0.2831

Testing Global Null Hypothesis: BETA=0
  Test               Chi-Square   DF   Pr > ChiSq
  Likelihood Ratio   69.7843      1    <.0001
  Score              67.5452      1    <.0001
  Wald               53.2809      1    <.0001

Analysis of Maximum Likelihood Estimates
  Parameter   DF   Estimate   Std. Error   Wald Chi-Sq   Pr > ChiSq
  Intercept   1    -0.6813    0.0958       50.5965       <.0001
  yc          1    -0.5956    0.0816       53.2809       <.0001

Association of Predicted Probabilities and Observed Responses
  Percent Concordant 81.9   Percent Discordant 16.3   Percent Tied 1.8   Pairs 20104
  Somers' D 0.656   Gamma 0.668   Tau-a 0.153   c 0.828

Partition for the Hosmer and Lemeshow Test
  Group   Total   rec=1 Obs   rec=1 Exp   rec=0 Obs   rec=0 Exp
  1       41      0           0.26        41          40.74
  2       43      0           0.65        43          42.35
  3       40      2           1.18        38          38.82
  4       37      5           1.58        32          35.42
  5       41      0           2.41        41          38.59
  6       41      3           4.01        38          36.99
  7       41      2           5.51        39          35.49
  8       45      10          8.29        35          36.71
  9       42      10          10.64       32          31.36
  10      44      24          20.94       20          23.06

Hosmer and Lemeshow Goodness-of-Fit Test
  Chi-Square 16.0127   DF 8   Pr > ChiSq 0.0422


The LOGISTIC Procedure: MODEL 4

Model Information
  Data Set: WORK.SET1; Response Variable: rec (Recession Dummy); Response Levels: 2
  Model: binary probit; Optimization Technique: Fisher's scoring
  Observations Read / Used: 415 / 415
  Response Profile: rec = 0 (359 obs), rec = 1 (56 obs); probability modeled is rec = 1
  Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
  Criterion   Intercept Only   Intercept and Covariates
  AIC         330.406          83.992
  SC          334.435          96.077
  -2 Log L    328.406          77.992
  R-Square: 0.4531    Max-Rescaled R-Square: 0.8286

Testing Global Null Hypothesis: BETA=0
  Test               Chi-Square   DF   Pr > ChiSq
  Likelihood Ratio   250.4143     2    <.0001
  Score              335.1638     2    <.0001
  Wald               124.1582     2    <.0001

Analysis of Maximum Likelihood Estimates
  Parameter   DF   Estimate   Std. Error   Wald Chi-Sq   Pr > ChiSq
  Intercept   1    -1.9258    0.2015       91.3024       <.0001
  yc          1    -0.3699    0.1364       7.3548        0.0067
  reclag      1    3.3208     0.3087       115.6948      <.0001

Association of Predicted Probabilities and Observed Responses
  Percent Concordant 98.7   Percent Discordant 1.1   Percent Tied 0.2   Pairs 20104
  Somers' D 0.977   Gamma 0.978   Tau-a 0.229   c 0.988

Partition for the Hosmer and Lemeshow Test
  Group   Total   rec=1 Obs   rec=1 Exp   rec=0 Obs   rec=0 Exp
  1       41      0           0.05        41          40.95
  2       43      0           0.09        43          42.91
  3       38      0           0.14        38          37.86
  4       47      0           0.25        47          46.75
  5       42      0           0.34        42          41.66
  6       34      0           0.41        34          33.59
  7       44      0           0.71        44          43.29
  8       43      2           1.07        41          41.93
  9       40      15          12.04       25          27.96
  10      43      39          40.20       4           2.80

Hosmer and Lemeshow Goodness-of-Fit Test
  Chi-Square 4.4367   DF 8   Pr > ChiSq 0.8157

