Streips, Suen, Sullivan, Zerweck

45-752 Project (Trick)

April 30, 2014

INTRODUCTION
Several factors such as starting pitcher, temperature/weather, team record, traffic, and
more play a role in attendance. However, these factors are unpredictable and cannot be used for
planning ahead. As consultants to Major League Baseball (MLB), our group has the primary goal
of increasing attendance through statistical analysis. Using data from over 12,000 games over
five seasons (2008-2012), we make recommendations to MLB on changes it can make to the schedule to
increase attendance.
THE SPECIFICATION (MODEL)
The choice of estimation procedure builds upon a prior study by Lemke et al. of MLB attendance
in the 2007 season. Both game attendance and log attendance are used as the
dependent variables in ordinary least squares (OLS) and censored regression (CR) models. Right
censored regression is used to model the effects of capacity on sell-out games. All models are
fixed-effect (FE) models in which each home team receives its own fixed-effect to account for
local market conditions and intercity variations. We assume that unobservable factors that might
simultaneously affect the LHS and RHS of the regression are time-invariant. Explanatory
variables include: time factors (day of week, time of day, year, month); factors that influence
attendance (interleague and opening day games and games on holidays); and, whether two
games are played in a city at once (New York City, San Francisco Bay Area, Chicago,
Washington, DC, and Los Angeles). The OLS models are AR(1) to account for correlation of
errors in the time-series data. The Newey-West estimator is used to correct for autocorrelation
and heteroskedasticity in the error terms of the OLS models, serving to weaken the assumptions
of the model. Nine dummy variables control for the day of the week and the time of the game.
There is a separate dummy variable for each day, Monday through Friday, plus a variable for
playing a day game during the week. Saturday and Sunday games are each further separated by
time of day. Additionally, there are five dummy variables to control for the month and four
more variables to control for the year.
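The nine day-and-time indicators described above can be sketched as follows. This is a minimal illustration, not the authors' EViews code: the function and key names are hypothetical, and the encoding follows the prose description (one shared weekday day-game indicator, with the Sunday day game as the omitted baseline).

```python
# Hypothetical names; the report's own models were estimated in EViews.
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri")


def day_time_dummies(day: str, is_day_game: bool) -> dict:
    """Return the nine day/time indicators for one game.

    The omitted (baseline) category is a Sunday day game, so that
    game maps to all zeros.
    """
    # One indicator per weekday, Monday through Friday.
    dummies = {d: int(day == d) for d in WEEKDAYS}
    # One shared indicator for a day game played Monday through Friday.
    dummies["weekday_day"] = int(day in WEEKDAYS and is_day_game)
    # Weekend games are split by time of day; Sunday day is the baseline.
    dummies["sat_day"] = int(day == "Sat" and is_day_game)
    dummies["sat_night"] = int(day == "Sat" and not is_day_game)
    dummies["sun_night"] = int(day == "Sun" and not is_day_game)
    return dummies


thursday_night = day_time_dummies("Thu", is_day_game=False)
sunday_day = day_time_dummies("Sun", is_day_game=True)
```

Because every game switches on at most two of these indicators, each coefficient reads as a shift relative to a Sunday day game.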
THE DATA
The data includes the date, time of day, and attendance records of all MLB games played
over the 2008-2012 seasons (inclusive) for a total of 12,100 observations. Mean attendance at
MLB games was 30,860 people for the period in question, ranging from 8,269 (TOR vs. TMB
on April 22, 2008) to 57,099 (SFN vs. LAN on April 13, 2009) (see full detail of descriptive
statistics at Table 2, Appendix). The observations also include whether or not each game was at
capacity, was played on opening day or a holiday, involved interleague play, or was held on the
same day as another game in the same metropolitan area (as indicator variables).
REGRESSION RESULTS
When using attendance or log attendance as the dependent variables, estimated
coefficients are interpreted as changes in attendance or percentage changes in attendance
(respectively). For example, under the OLS model, a Thursday night game averages 3,288 fewer
attendees than a Sunday afternoon game (see OLS regression at Table 3, Appendix). Using log
attendance, the same comparison is interpreted as roughly 14.41 percent lower attendance. The
baseline (21,007 people) is attendance at a Sunday afternoon game held in FLO in April 2008 that is
not on opening day, not on a holiday, and not an interleague game.
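As a worked illustration of the two readings, the sketch below uses the Thursday coefficients from Tables 3 and 4. It assumes the usual semi-log conventions: 100 times the coefficient is the approximate percent change, and exp(b) - 1 is the exact proportional change implied by the same coefficient. The variable names are illustrative only.

```python
import math

# Coefficients copied from Tables 3 and 4 of the Appendix.
BASELINE_ATTENDANCE = 21007.21   # constant, level model (Table 3)
THU_LEVEL = -3288.636            # DAY="Thu", level model (Table 3)
THU_LOG = -0.144117              # DAY="Thu", semi-log model (Table 4)

# Level model: the coefficient is a change in head count.
thursday_level_prediction = BASELINE_ATTENDANCE + THU_LEVEL

# Semi-log model: 100*b approximates the percent change, while
# exp(b) - 1 is the exact proportional change for the same b.
approx_pct = 100 * THU_LOG
exact_pct = 100 * (math.exp(THU_LOG) - 1)

print(f"Level model, Thursday vs. baseline: {thursday_level_prediction:,.0f}")
print(f"Semi-log, approximate change: {approx_pct:.2f}%")
print(f"Semi-log, exact change:       {exact_pct:.2f}%")
```

The approximation and the exact figure differ by about one percentage point for a coefficient of this size, and the gap widens for larger coefficients such as the home-team effects.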
Comparing the CR models, the semi-log functional form is judged to be the better model
based on the Akaike information criterion (0.427297 vs. 16.64864). Only OAK and the simultaneous game
cities (except NY2) are not statistically significant factors in both CR models, which confirms the
conclusions that may be drawn from the OLS models.
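For readers unfamiliar with right-censored regression, the sketch below shows the general shape of the log-likelihood such a model maximizes. It is a textbook-style illustration under a normal-error assumption, not the EViews implementation; the function names and the sample numbers are hypothetical.

```python
import math


def normal_logpdf(z: float) -> float:
    """Log density of a standard normal at z."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * z * z


def normal_logsf(z: float) -> float:
    """Log of P(Z > z) for a standard normal (adequate for moderate z)."""
    return math.log(0.5 * math.erfc(z / math.sqrt(2)))


def right_censored_loglik(y, mu, sigma, censored):
    """Log-likelihood of a right-censored normal regression.

    y        : observed attendance (equal to capacity at a sell-out)
    mu       : fitted values from the linear predictor
    sigma    : error scale
    censored : True where the game sold out (demand is at least y)
    """
    total = 0.0
    for y_i, mu_i, is_censored in zip(y, mu, censored):
        z = (y_i - mu_i) / sigma
        if is_censored:
            # Sell-out: we only know latent demand exceeded capacity.
            total += normal_logsf(z)
        else:
            # Ordinary game: the usual normal density contribution.
            total += normal_logpdf(z) - math.log(sigma)
    return total


# Hypothetical two-game example: one ordinary game, one sell-out.
example = right_censored_loglik(
    y=[30000.0, 41000.0],
    mu=[31000.0, 40000.0],
    sigma=7000.0,
    censored=[False, True],
)
```

The key design point is that a sold-out game contributes a tail probability rather than a density, which is how capacity limits are kept from biasing the coefficients downward.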
The proportions of the variance in attendance and log attendance explained by the
OLS models are 0.6919 and 0.6752, respectively, with the adjusted R-squared values being
slightly lower (0.6904 and 0.6736). All OLS model coefficients are statistically significant (at the
0.05 level) with the exception of the simultaneous game variables, Sunday night games,
home team game attendance at OAK, and (for the log attendance model) Friday night games
(see Table 4, Appendix). The simultaneous game coefficients were left in the model to support
the findings and recommendations of this report. Removing these variables from the model did
not have a significant impact on the ability of the model to explain variability in attendance. The signs
and magnitudes of the coefficients are in alignment with expectations relative to the baseline
(FLO having the lowest league attendance) and with the descriptive statistics of the data set (see
Table 2, Appendix). Leverage plots were performed on each coefficient without suggesting
nonlinearities. The model was rejected by the Ramsey test, but given the large time series data
set, we hold the Ramsey test to be uninformative. Choosing the functional form to be
untransformed or semi-log is supported by the academic literature.
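The adjusted R-squared values can be checked against the R-squared values in Tables 3 and 4 with the standard degrees-of-freedom adjustment. The sketch below assumes 58 estimated parameters (the rows of each OLS coefficient table, including the constant and the AR(1) term) and the 12,099 included observations; the function name is illustrative.

```python
def adjusted_r_squared(r2: float, n_obs: int, n_params: int) -> float:
    """Adjusted R-squared, with n_params counting the intercept."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_params)


N_OBS = 12099     # included observations after the AR(1) adjustment
N_PARAMS = 58     # rows in the coefficient table, including C and AR(1)

adj_level = adjusted_r_squared(0.691876, N_OBS, N_PARAMS)  # Table 3
adj_log = adjusted_r_squared(0.675175, N_OBS, N_PARAMS)    # Table 4
```

With so many observations relative to parameters, the adjustment is small in both models.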
From the model we make a few general observations: Monday through Thursday games
draw significantly fewer fans than Saturday or Sunday afternoon games. Day games in general
offer slightly higher attendance than night games. Attendance is expected to be less in
September compared to July and August, and is expected to be more on major holidays.
FINDINGS AND RECOMMENDATIONS
Monday vs. Thursday Off Days
The most commonly scheduled off days in the league are Monday and Thursday, when
teams often travel home or away for a new series. Viewing our OLS regression results (Table 3,
Appendix), we see that Monday and Thursday both imply a statistically significant negative
attendance effect when compared with the baseline of Sunday daytime games. At first glance, it
seems that Monday indicates a larger negative effect on attendance than Thursday, but to be
certain, we conduct a Wald test (Table 7, Appendix).
For this Wald test, we made Monday + Daytime = Thursday + Daytime our null
hypothesis. This resulted in a p-value of 0.1865, which means that, at the 0.05 level, we do not have enough
evidence to reject the hypothesis that Monday and Thursday games draw the same attendance.
From a statistical standpoint, there is no detectable difference between Monday and Thursday games, but
from a managerial perspective, it might be interesting to know that there will occasionally be
differences. It may be prudent to slightly favor Monday off days when scheduling because the
Monday coefficient has a larger negative effect on attendance.
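The mechanics of a Wald test for equal coefficients can be sketched as below. The coefficients and standard errors come from Table 3, but the covariance between the two estimates is not reported in this document, so `cov12=0.0` is a placeholder assumption. The result therefore illustrates the calculation only and is not expected to reproduce the reported p-value of 0.1865.

```python
import math


def wald_equal_coefficients(b1, b2, se1, se2, cov12=0.0):
    """Wald test of H0: b1 == b2; returns (chi-square stat, p-value), 1 df.

    cov12 is the covariance between the two estimates. It defaults to
    zero only because the report does not list it (an assumption).
    """
    variance_of_difference = se1 ** 2 + se2 ** 2 - 2 * cov12
    if variance_of_difference <= 0:
        raise ValueError("variance of the difference must be positive")
    statistic = (b1 - b2) ** 2 / variance_of_difference
    # For 1 degree of freedom, P(chi2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Monday vs. Thursday coefficients and standard errors from Table 3.
stat, p = wald_equal_coefficients(-3932.290, -3288.636, 551.3640, 449.7220)
```

A positive covariance between the Monday and Thursday estimates shrinks the variance of their difference and lowers the p-value, which is why the full covariance matrix from the estimation is needed for an exact figure.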
Annual Attendance
Using numbers from the OLS regression (Table 3, Appendix), we put together an annual
attendance graph (Figure 1, Appendix) as implied by the annual indicator variables (2008 - 2012).
This information will give us the means to analyze some very general attendance trends for Major
League Baseball.
We notice that our baseline year of 2008 indicates peak annual attendance, followed by
strong declines through 2010. The trend then turns upward with some weak growth in 2011 and
2012. We conclude that the trend in attendance is directly related to the Great Recession, which
officially lasted from December 2007 to June 2009 in the U.S. (source:
http://www.nber.org/cycles.html).
Looking at a chart of Real GDP (source: http://www.multpl.com/us-gdp-inflation-adjusted/table; Figure 2, Appendix), we can see that baseball attendance seems to follow these
trends, lagging by about one year. One very important concern is that baseball attendance has not
recovered as quickly as the rest of the American economy. While the league's growth trend is
positive, MLB should try to identify other factors that may be causing the slower recovery. It should
also use this data to anticipate attendance in the event of a future economic downturn. If MLB can
use GDP as an indicator, it can better anticipate and prepare for losses caused by poor
attendance.
Should the MLB be concerned with multiple intra-city games on the same day?
While none of the two-game variables in our OLS model (NY2, BAY2, CHI2, DC2, and LA2)
were statistically significant at the 0.05 level, we believe there is still a useful
interpretation to some of the coefficients. When both NY teams play at home on the same day, the
estimated effect is an increase of 1,564 in attendance (p = 0.1264). When both Bay Area teams
play in the Bay Area, the estimated effect is a 1,102 drop in attendance (p = 0.1471).
Additionally, Chicago shows an estimated 656-person increase in attendance (p = 0.1979). NY2 is
statistically significant under our CR model analysis, further highlighting the managerial
significance of simultaneous games in the New York metropolitan area.
These numbers are what we call managerially significant. While not enough to make
more certain statistical predictions, we recommend using this data to make educated decisions,
with the realization that they will occasionally be incorrect. The NY and Chicago positive effects
could possibly be explained by the rivalry between the intra-city teams. Advising NY and Chicago
teams to work together to schedule same day home games would be a good idea, but it should
be emphasized that this should not be a priority. Considering that there are fewer than 25 days
per season with two NY home games, there could have been other factors (e.g.,
special city-wide events) affecting attendance on those specific days that are not accounted for
in the data.
The Bay Area is unique because of the negative overall effect implied. One possible
explanation is that the Giants are much more popular than the A's, as evidenced by the HTeam
coefficients of 15,789 for the Giants and 198 for the A's (HTeam=OAK is far from statistically
significant, meaning Oakland's attendance is indistinguishable from the FLO baseline). This data suggests that when the Giants and A's
play on the same day in the Bay Area, the Giants overpower the A's and there is an overall
negative effect. It also could be explained by the fact that these two teams do not have a rivalry
with high levels of animosity, unlike NY and Chicago.
Should the MLB care about day versus night games?
Sunday afternoon games are the baseline in the regression, and Saturday night and Sunday night
games both draw more than this baseline. Saturday and Sunday night games experience
an overall increase of 4,209 and 958, respectively. The main explanation for this is that people
generally have more free time on weekends. Furthermore, weekday (including Friday) day games
on average have 757 more in attendance than weekday night games. Our intuitive explanation for
this is that weekday night games do not end until late at night and many people have to work
the following morning. Additionally, many people take advantage of the businessperson's special
games and promotion/giveaway games that are held in the daytime.
Should the MLB move the schedule to start later in April and end in October?
Attendance increases as the season continues, peaking in July and August and dipping
in September, though remaining higher than April (Figure 3, Appendix). While the end of the
season still has better attendance than the beginning, there is more uncertainty in cold weather
cities, the start of the football season, and how the playoffs will affect attendance. However, the
combined effect of summer weekend games is even more powerful (Table 1). For this reason, we
would recommend eliminating as many April and September games as possible and replacing
them with day/night weekend doubleheaders in July and August.

            Saturday Day    Saturday Night    Sunday Night
July        + 6390          + 7943            + 4692
August      + 5547          + 7100            + 3849

Table 1: Coefficients of Saturdays and Sundays during Peak Months
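The entries in Table 1 appear to be sums of the relevant OLS coefficients from Table 3, each truncated to a whole number before adding. The sketch below reproduces the table under that assumption; treating the Saturday Day column as the Saturday day-game interaction plus the day-game coefficient is inferred from the arithmetic, not stated in the text, and the dictionary keys are illustrative.

```python
# Coefficients from Table 3, truncated to whole attendees.
COEF = {
    "day_game": 757,     # NIGHT="D"
    "sat_day": 1899,     # (DAY="Sat")*(NIGHT="D")
    "sat_night": 4209,   # (DAY="Sat")*(NIGHT="N")
    "sun_night": 958,    # (DAY="Sun")*(NIGHT="N")
    "july": 3734,        # MONTH=7
    "august": 2891,      # MONTH=8
}

# Which day/time coefficients are added for each column of Table 1.
SLOT_TERMS = {
    "Saturday Day": ("sat_day", "day_game"),
    "Saturday Night": ("sat_night",),
    "Sunday Night": ("sun_night",),
}


def combined_effect(month: str, slot: str) -> int:
    """Month coefficient plus the day/time coefficients for one slot."""
    return COEF[month] + sum(COEF[term] for term in SLOT_TERMS[slot])


table_1 = {
    month: {slot: combined_effect(month, slot) for slot in SLOT_TERMS}
    for month in ("july", "august")
}
```

Each value is the expected attendance gain over the baseline of a Sunday afternoon game in April, holding the other regressors fixed.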

Because this recommendation would likely be resisted by the players' union, we would
also recommend starting and ending the season later. Overall, the data suggests that doing so
would increase attendance; however, we remain cautious as autocorrelation could affect the
prediction.
CONCLUSION
In conclusion, our study of attendance at MLB games for the 2008-2012 seasons yields
the following observations:
The league should not be concerned with Monday versus Thursday off days, as the variables
were not statistically different from each other.
While baseball attendance had not returned to 2008 levels by the end of 2012, overall attendance
seems to be correlated with the Great Recession and disposable income.
New York, Chicago, and Bay Area teams should all be aware of the effect of
having multiple intra-city games on the same day. However, this should not be a major concern,
as these effects are not statistically significant at the 0.05 level in the OLS models (p-values of roughly 0.13 to 0.20).
Day games have higher attendance than night games on weekdays, but this effect is reversed and
magnified for Saturday and Sunday.
If possible, the league should cut games from the beginning of the season in April and
make them up in the form of doubleheaders on weekends in July and August. If this is not
realistic, the league should cautiously begin to start and end the season later in the year, but
beware of playoff and temperature effects.

Appendix
                    ATTENDANCE
Mean                 30859.70
Median               31369.00
Maximum              57099.00
Minimum              8269.000
Std. Dev.            10653.21
Skewness            -0.091275
Kurtosis             2.047843

Jarque-Bera          473.8801
Probability          0.000000

Sum                  3.73E+08
Sum Sq. Dev.         1.37E+12

Observations         12100

Table 2: Descriptive Statistics

Dependent Variable: ATTENDANCE
Method: Least Squares
Date: 04/30/14 Time: 08:49
Sample (adjusted): 2 12100
Included observations: 12099 after adjustments
Convergence achieved after 14 iterations
HAC standard errors & covariance (Bartlett kernel, Newey-West fixed bandwidth = 12.0000)

Variable                   Coefficient  Std. Error   t-Statistic  Prob.
C                          21007.21     543.0514     38.68365     0.0000
INTER                      2743.328     295.9214     9.270461     0.0000
HOLIDAY                    3131.878     1084.646     2.887465     0.0039
OPENING                    11302.18     943.4765     11.97929     0.0000
NY2                        1564.513     1023.403     1.528736     0.1264
BAY2                      -1102.537     760.4260    -1.449894     0.1471
CHI2                       656.8483     510.1526     1.287552     0.1979
DC2                        145.8249     724.8650     0.201175     0.8406
LA2                        356.4238     841.1608     0.423728     0.6718
YEAR=2009                 -2233.452     237.4131    -9.407449     0.0000
YEAR=2010                 -2406.716     231.7313    -10.38580     0.0000
YEAR=2011                 -2152.011     233.0650    -9.233522     0.0000
YEAR=2012                 -1875.283     242.1100    -7.745582     0.0000
DAY="Fri"                  909.4155     457.9592     1.985800     0.0471
DAY="Mon"                 -3932.290     551.3640    -7.131931     0.0000
DAY="Thu"                 -3288.636     449.7220    -7.312597     0.0000
DAY="Tue"                 -3613.521     484.3133    -7.461124     0.0000
DAY="Wed"                 -3642.751     440.6470    -8.266825     0.0000
NIGHT="D"                  757.0112     202.3443     3.741203     0.0002
(DAY="Sat")*(NIGHT="D")    1899.323     452.9396     4.193325     0.0000
(DAY="Sat")*(NIGHT="N")    4209.857     500.4346     8.412401     0.0000
(DAY="Sun")*(NIGHT="N")    958.4386     535.3731     1.790225     0.0734
MONTH=5                    817.3996     295.7628     2.763700     0.0057
MONTH=6                    1875.373     325.7221     5.757585     0.0000
MONTH=7                    3734.020     289.5645     12.89530     0.0000
MONTH=8                    2891.422     278.0575     10.39865     0.0000
MONTH=9                    1582.842     292.6239     5.409133     0.0000
HTEAM="ANA"                16462.58     416.7005     39.50698     0.0000
HTEAM="ARI"                7488.247     425.5691     17.59584     0.0000
HTEAM="ATL"                9828.479     448.8976     21.89470     0.0000
HTEAM="BAL"                4413.848     480.2770     9.190213     0.0000
HTEAM="BOS"                14994.90     405.6611     36.96412     0.0000
HTEAM="CHA"                7209.076     402.4978     17.91085     0.0000
HTEAM="CHN"                15296.95     417.3289     36.65442     0.0000
HTEAM="CIN"                6352.246     430.1179     14.76862     0.0000
HTEAM="CLE"                2985.583     457.5983     6.524462     0.0000
HTEAM="COL"                12434.99     442.0033     28.13324     0.0000
HTEAM="DET"                12567.91     423.3006     29.69028     0.0000
HTEAM="HOU"                7707.297     427.4164     18.03229     0.0000
HTEAM="KCA"                2497.262     422.2319     5.914432     0.0000
HTEAM="LAN"                20763.37     530.5516     39.13544     0.0000
HTEAM="MIA"                7541.038     536.7056     14.05061     0.0000
HTEAM="MIL"                14165.73     410.0024     34.55035     0.0000
HTEAM="MIN"                12622.57     426.1675     29.61880     0.0000
HTEAM="NYA"                25278.84     428.1216     59.04593     0.0000
HTEAM="NYN"                14576.52     581.4019     25.07133     0.0000
HTEAM="OAK"                198.1210     455.8608     0.434609     0.6639
HTEAM="PHI"                21873.50     429.6720     50.90744     0.0000
HTEAM="PIT"                2914.489     449.6637     6.481487     0.0000
HTEAM="SDN"                7023.768     408.3717     17.19945     0.0000
HTEAM="SEA"                5782.940     440.9146     13.11578     0.0000
HTEAM="SFN"                15789.08     424.2521     37.21628     0.0000
HTEAM="SLN"                17599.24     396.9307     44.33832     0.0000
HTEAM="TBA"                2392.621     438.5424     5.455850     0.0000
HTEAM="TEX"                11944.49     543.7304     21.96767     0.0000
HTEAM="TOR"                4943.496     457.9189     10.79557     0.0000
HTEAM="WAS"                6118.790     446.5431     13.70258     0.0000
AR(1)                      0.359360     0.009121     39.40027     0.0000

R-squared                  0.691876     Mean dependent var        30858.57
Adjusted R-squared         0.690417     S.D. dependent var        10652.91
S.E. of regression         5927.298     Akaike info criterion     20.21731
Sum squared resid          4.23E+11     Schwarz criterion         20.25279
Log likelihood            -122246.6     Hannan-Quinn criter.      20.22920
F-statistic                474.3400     Durbin-Watson stat        2.041023
Prob(F-statistic)          0.000000     Wald F-statistic          212.6663
Prob(Wald F-statistic)     0.000000

Inverted AR Roots          .36
Table 3: Ordinary Least Squares (OLS)

EVIEWS command for OLS model in Table 3:

ls attendance c @expand(year, @dropfirst)
@expand(day, @drop("Sun"), @drop("Sat"))
@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("D"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sat"))*@expand(night, @drop("D"))
@expand(month, @drop(4))
@expand(hteam, @drop("FLO")) inter holiday opening ny2 bay2 chi2 dc2 la2 ar(1)
We then corrected the coefficient covariance with the Newey-West estimator.

Dependent Variable: LOG(ATTENDANCE)
Method: Least Squares
Date: 04/30/14 Time: 20:50
Sample (adjusted): 2 12100
Included observations: 12099 after adjustments
Convergence achieved after 11 iterations
HAC standard errors & covariance (Bartlett kernel, Newey-West fixed bandwidth = 12.0000)

Variable                   Coefficient  Std. Error   t-Statistic  Prob.
C                          9.876957     0.022509     438.8063     0.0000
INTER                      0.096517     0.010977     8.792401     0.0000
HOLIDAY                    0.116159     0.040788     2.847869     0.0044
OPENING                    0.364025     0.032482     11.20712     0.0000
NY2                        0.047621     0.031119     1.530298     0.1260
BAY2                      -0.055850     0.032508    -1.718036     0.0858
CHI2                       0.027433     0.018534     1.480105     0.1389
DC2                        0.002935     0.028721     0.102184     0.9186
LA2                       -0.004848     0.026065    -0.185991     0.8525
YEAR=2009                 -0.075371     0.009326    -8.081495     0.0000
YEAR=2010                 -0.083857     0.009172    -9.143060     0.0000
YEAR=2011                 -0.067519     0.009180    -7.355108     0.0000
YEAR=2012                 -0.054640     0.009388    -5.820300     0.0000
DAY="Fri"                  0.022581     0.016497     1.368836     0.1711
DAY="Mon"                 -0.161743     0.022540    -7.175943     0.0000
DAY="Thu"                 -0.144117     0.017906    -8.048730     0.0000
DAY="Tue"                 -0.155646     0.019781    -7.868260     0.0000
DAY="Wed"                 -0.152052     0.017707    -8.587311     0.0000
NIGHT="D"                  0.032803     0.007686     4.268036     0.0000
(DAY="Sat")*(NIGHT="D")    0.057904     0.015488     3.738729     0.0002
(DAY="Sat")*(NIGHT="N")    0.153090     0.017309     8.844725     0.0000
(DAY="Sun")*(NIGHT="N")    0.025112     0.018642     1.347041     0.1780
MONTH=5                    0.040327     0.012448     3.239515     0.0012
MONTH=6                    0.085631     0.013196     6.489109     0.0000
MONTH=7                    0.155233     0.011743     13.21949     0.0000
MONTH=8                    0.121272     0.011403     10.63520     0.0000
MONTH=9                    0.067639     0.012029     5.622767     0.0000
HTEAM="ANA"                0.622062     0.018021     34.51864     0.0000
HTEAM="ARI"                0.322166     0.017839     18.05966     0.0000
HTEAM="ATL"                0.397875     0.018222     21.83528     0.0000
HTEAM="BAL"                0.179725     0.020528     8.755269     0.0000
HTEAM="BOS"                0.573980     0.017760     32.31889     0.0000
HTEAM="CHA"                0.316453     0.017275     18.31899     0.0000
HTEAM="CHN"                0.584032     0.018058     32.34246     0.0000
HTEAM="CIN"                0.267118     0.018520     14.42345     0.0000
HTEAM="CLE"                0.127516     0.020854     6.114765     0.0000
HTEAM="COL"                0.489427     0.018070     27.08515     0.0000
HTEAM="DET"                0.494991     0.017841     27.74448     0.0000
HTEAM="HOU"                0.328412     0.018346     17.90098     0.0000
HTEAM="KCA"                0.113186     0.018709     6.049737     0.0000
HTEAM="LAN"                0.741537     0.019935     37.19732     0.0000
HTEAM="MIA"                0.330490     0.021572     15.32020     0.0000
HTEAM="MIL"                0.546247     0.017414     31.36806     0.0000
HTEAM="MIN"                0.494215     0.017512     28.22190     0.0000
HTEAM="NYA"                0.888844     0.017798     49.94021     0.0000
HTEAM="NYN"                0.550848     0.021574     25.53320     0.0000
HTEAM="OAK"                0.002631     0.020818     0.126397     0.8994
HTEAM="PHI"                0.790565     0.018376     43.02083     0.0000
HTEAM="PIT"                0.116813     0.020069     5.820464     0.0000
HTEAM="SDN"                0.305410     0.017342     17.61096     0.0000
HTEAM="SEA"                0.252439     0.018677     13.51576     0.0000
HTEAM="SFN"                0.602850     0.017918     33.64538     0.0000
HTEAM="SLN"                0.654934     0.017339     37.77243     0.0000
HTEAM="TBA"                0.107240     0.019856     5.400820     0.0000
HTEAM="TEX"                0.463726     0.020961     22.12300     0.0000
HTEAM="TOR"                0.211442     0.019623     10.77537     0.0000
HTEAM="WAS"                0.270890     0.018908     14.32693     0.0000
AR(1)                      0.396197     0.008809     44.97564     0.0000

R-squared                  0.675175     Mean dependent var        10.26667
Adjusted R-squared         0.673638     S.D. dependent var        0.394989
S.E. of regression         0.225650     Akaike info criterion    -0.134884
Sum squared resid          613.1012     Schwarz criterion        -0.099405
Log likelihood             873.9778     Hannan-Quinn criter.     -0.122987
F-statistic                439.0918     Durbin-Watson stat        2.039759
Prob(F-statistic)          0.000000     Wald F-statistic          147.1499
Prob(Wald F-statistic)     0.000000

Inverted AR Roots          .40

Table 4: Ordinary Least Squares (OLS) for Semi-Log Model

EVIEWS command for OLS semi-log model in Table 4:

ls log(attendance) c @expand(year, @dropfirst)
@expand(day, @drop("Sun"), @drop("Sat"))
@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("D"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sat"))*@expand(night, @drop("D"))
@expand(month, @drop(4))
@expand(hteam, @drop("FLO")) inter holiday opening ny2 bay2 chi2 dc2 la2 ar(1)
We then corrected the coefficient covariance with the Newey-West estimator.

Dependent Variable: ATTENDANCE
Method: ML - Censored Normal (TOBIT) (Quadratic hill climbing)
Date: 04/30/14 Time: 20:54
Sample (adjusted): 1 12100
Included observations: 12100 after adjustments
Right censoring (indicator) series: CAPACITY
Convergence achieved after 5 iterations
Covariance matrix computed using second derivatives

Variable                   Coefficient  Std. Error   z-Statistic  Prob.
C                          18529.80     532.3081     34.81030     0.0000
INTER                      3441.535     279.6429     12.30689     0.0000
HOLIDAY                    4119.488     550.4018     7.484511     0.0000
OPENING                    23183.42     843.1406     27.49651     0.0000
NY2                        2067.946     814.9742     2.537437     0.0112
BAY2                      -768.7924     890.0366    -0.863776     0.3877
CHI2                       708.7618     988.9024     0.716716     0.4735
DC2                       -3.631945     760.0641    -0.004778     0.9962
LA2                        519.1164     893.9740     0.580684     0.5615
YEAR=2009                 -2701.138     213.2914    -12.66407     0.0000
YEAR=2010                 -2840.467     213.5381    -13.30192     0.0000
YEAR=2011                 -2456.163     213.7740    -11.48953     0.0000
YEAR=2012                 -2328.046     215.9733    -10.77932     0.0000
DAY="Fri"                  1170.963     341.8113     3.425758     0.0006
DAY="Mon"                 -4836.565     358.4919    -13.49142     0.0000
DAY="Thu"                 -3859.149     306.6237    -12.58595     0.0000
DAY="Tue"                 -4189.954     343.8090    -12.18687     0.0000
DAY="Wed"                 -4107.736     303.9955    -13.51249     0.0000
NIGHT="D"                  1057.236     247.0709     4.279077     0.0000
(DAY="Sat")*(NIGHT="D")    2411.540     339.7966     7.097011     0.0000
(DAY="Sat")*(NIGHT="N")    5290.438     371.4553     14.24246     0.0000
(DAY="Sun")*(NIGHT="N")    2290.158     703.4261     3.255719     0.0011
MONTH=5                    1153.377     240.3484     4.798772     0.0000
MONTH=6                    2434.475     278.2535     8.749129     0.0000
MONTH=7                    4513.829     246.6223     18.30260     0.0000
MONTH=8                    3612.541     240.1137     15.04512     0.0000
MONTH=9                    1964.708     238.2791     8.245407     0.0000
HTEAM="ANA"                22741.62     550.4439     41.31505     0.0000
HTEAM="ARI"                8777.128     530.1819     16.55494     0.0000
HTEAM="ATL"                11943.82     531.6618     22.46507     0.0000
HTEAM="BAL"                5394.724     539.6740     9.996263     0.0000
HTEAM="BOS"                29154.72     708.1247     41.17173     0.0000
HTEAM="CHA"                8765.713     537.4913     16.30857     0.0000
HTEAM="CHN"                23174.00     577.4846     40.12921     0.0000
HTEAM="CIN"                7684.867     532.1422     14.44138     0.0000
HTEAM="CLE"                3414.697     531.1509     6.428864     0.0000
HTEAM="COL"                15812.97     533.3251     29.64979     0.0000
HTEAM="DET"                17231.61     542.8497     31.74287     0.0000
HTEAM="HOU"                9964.019     532.2119     18.72190     0.0000
HTEAM="KCA"                2908.022     531.5844     5.470481     0.0000
HTEAM="LAN"                25057.94     542.1382     46.22057     0.0000
HTEAM="MIA"                9431.093     896.9613     10.51449     0.0000
HTEAM="MIL"                20029.96     546.5815     36.64587     0.0000
HTEAM="MIN"                17819.92     547.1074     32.57116     0.0000
HTEAM="NYA"                29442.87     549.1606     53.61431     0.0000
HTEAM="NYN"                18738.82     550.4079     34.04533     0.0000
HTEAM="OAK"                765.8447     538.9753     1.420927     0.1553
HTEAM="PHI"                35420.06     696.1475     50.88011     0.0000
HTEAM="PIT"                3697.383     532.4180     6.944512     0.0000
HTEAM="SDN"                8282.613     530.7822     15.60454     0.0000
HTEAM="SEA"                6805.123     530.2401     12.83404     0.0000
HTEAM="SFN"                23097.35     567.0065     40.73559     0.0000
HTEAM="SLN"                23197.41     538.9847     43.03909     0.0000
HTEAM="TBA"                3053.152     530.7654     5.752357     0.0000
HTEAM="TEX"                14286.01     533.5162     26.77708     0.0000
HTEAM="TOR"                5967.397     532.5510     11.20530     0.0000
HTEAM="WAS"                7393.712     542.1213     13.63848     0.0000

Error Distribution
SCALE:C(58)                7041.411     52.09958     135.1529     0.0000

Mean dependent var         30859.70     S.D. dependent var        10653.21
Akaike info criterion      16.64864     Schwarz criterion         16.68411
Log likelihood            -100666.3     Hannan-Quinn criter.      16.66053
Avg. log likelihood       -8.319526

Left censored obs          0            Right censored obs        2469
Uncensored obs             9631         Total obs                 12100

Table 5: Censored Regression (CR) Model

EVIEWS command for CR model in Table 5:


censored(r=capacity, i) attendance c @expand(year, @dropfirst)
@expand(day, @drop("Sun"), @drop("Sat"))
@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("D"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sat"))*@expand(night, @drop("D"))
@expand(month, @drop(4))
@expand(hteam, @drop("FLO")) inter holiday opening ny2 bay2 chi2 dc2 la2 ar(1)


Dependent Variable: LOG(ATTENDANCE)
Method: ML - Censored Normal (TOBIT) (Quadratic hill climbing)
Date: 04/30/14 Time: 20:53
Sample (adjusted): 1 12100
Included observations: 12100 after adjustments
Right censoring (indicator) series: CAPACITY
Convergence achieved after 5 iterations
Covariance matrix computed using second derivatives

Variable                   Coefficient  Std. Error   z-Statistic  Prob.
C                          9.745879     0.020357     478.7583     0.0000
INTER                      0.126170     0.010731     11.75700     0.0000
HOLIDAY                    0.154002     0.021113     7.294075     0.0000
OPENING                    0.815026     0.032859     24.80383     0.0000
NY2                        0.067072     0.031497     2.129493     0.0332
BAY2                      -0.051313     0.034123    -1.503781     0.1326
CHI2                       0.026899     0.038035     0.707226     0.4794
DC2                       -0.001405     0.029027    -0.048402     0.9614
LA2                        0.004449     0.034360     0.129484     0.8970
YEAR=2009                 -0.096225     0.008182    -11.76126     0.0000
YEAR=2010                 -0.103585     0.008190    -12.64733     0.0000
YEAR=2011                 -0.080889     0.008205    -9.858578     0.0000
YEAR=2012                 -0.076432     0.008287    -9.223013     0.0000
DAY="Fri"                  0.053041     0.013116     4.043895     0.0001
DAY="Mon"                 -0.196273     0.013735    -14.28995     0.0000
DAY="Thu"                 -0.153047     0.011752    -13.02325     0.0000
DAY="Tue"                 -0.167310     0.013175    -12.69864     0.0000
DAY="Wed"                 -0.159051     0.011650    -13.65237     0.0000
NIGHT="D"                  0.050972     0.009469     5.382909     0.0000
(DAY="Sat")*(NIGHT="D")    0.078978     0.013068     6.043736     0.0000
(DAY="Sat")*(NIGHT="N")    0.207139     0.014257     14.52940     0.0000
(DAY="Sun")*(NIGHT="N")    0.067081     0.027011     2.483440     0.0130
MONTH=5                    0.055323     0.009194     6.017308     0.0000
MONTH=6                    0.109652     0.010652     10.29418     0.0000
MONTH=7                    0.190371     0.009449     20.14692     0.0000
MONTH=8                    0.151895     0.009193     16.52321     0.0000
MONTH=9                    0.084862     0.009115     9.310593     0.0000
HTEAM="ANA"                0.888019     0.021064     42.15748     0.0000
HTEAM="ARI"                0.414707     0.020249     20.48000     0.0000
HTEAM="ATL"                0.515558     0.020317     25.37623     0.0000
HTEAM="BAL"                0.231430     0.020606     11.23129     0.0000
HTEAM="BOS"                1.157181     0.027078     42.73508     0.0000
HTEAM="CHA"                0.424679     0.020527     20.68834     0.0000
HTEAM="CHN"                0.923253     0.022164     41.65573     0.0000
HTEAM="CIN"                0.344511     0.020322     16.95223     0.0000
HTEAM="CLE"                0.161893     0.020275     7.984914     0.0000
HTEAM="COL"                0.660508     0.020390     32.39424     0.0000
HTEAM="DET"                0.718767     0.020804     34.54928     0.0000
HTEAM="HOU"                0.458283     0.020332     22.54042     0.0000
HTEAM="KCA"                0.144443     0.020291     7.118689     0.0000
HTEAM="LAN"                0.920890     0.020744     44.39280     0.0000
HTEAM="MIA"                0.460859     0.034245     13.45757     0.0000
HTEAM="MIL"                0.813820     0.020989     38.77375     0.0000
HTEAM="MIN"                0.737507     0.020968     35.17231     0.0000
HTEAM="NYA"                1.037362     0.021037     49.31226     0.0000
HTEAM="NYN"                0.751198     0.021099     35.60406     0.0000
HTEAM="OAK"                0.028266     0.020571     1.374096     0.1694
HTEAM="PHI"                1.318332     0.027081     48.68170     0.0000
HTEAM="PIT"                0.153676     0.020319     7.563264     0.0000
HTEAM="SDN"                0.393128     0.020270     19.39433     0.0000
HTEAM="SEA"                0.321248     0.020245     15.86797     0.0000
HTEAM="SFN"                0.920631     0.021807     42.21760     0.0000
HTEAM="SLN"                0.890679     0.020607     43.22205     0.0000
HTEAM="TBA"                0.149402     0.020256     7.375540     0.0000
HTEAM="TEX"                0.580791     0.020393     28.48060     0.0000
HTEAM="TOR"                0.271014     0.020334     13.32834     0.0000
HTEAM="WAS"                0.357785     0.020700     17.28431     0.0000

Error Distribution
SCALE:C(58)                0.268831     0.001975     136.1434     0.0000

Mean dependent var         10.26671     S.D. dependent var        0.394993
Akaike info criterion      0.427297     Schwarz criterion         0.462773
Log likelihood            -2527.149     Hannan-Quinn criter.      0.439193
Avg. log likelihood       -0.208855

Left censored obs          0            Right censored obs        2469
Uncensored obs             9631         Total obs                 12100

Table 6: Censored Regression (CR) Semi-Log Model

EVIEWS command for CR semi-log model in Table 6:


censored(r=capacity, i) log(attendance) c @expand(year, @dropfirst)
@expand(day, @drop("Sun"), @drop("Sat"))
@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("N"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sun"))*@expand(night, @drop("D"))
@expand(day, @drop("Mon"), @drop("Tue"), @drop("Wed"), @drop("Thu"), @drop("Fri"),
@drop("Sat"))*@expand(night, @drop("D"))
@expand(month, @drop(4))
@expand(hteam, @drop("FLO")) inter holiday opening ny2 bay2 chi2 dc2 la2 ar(1)

Table 7: Wald Test for Mondays vs. Thursdays Off

Figure 1: Annual Attendance (2008-2012)

Figure 2: U.S. Annual Real GDP ($ Trillions)

Figure 3: Monthly Attendance (2008-2012)
