Lecture Note 2 - Forecasting Trends

Business Analytics and Forecasting
DS 580
Farideh Dehkordi-Vakil
Introduction
Recall that extrapolative methods of forecasting focus on a single time series to identify past
patterns in the historical data.
These patterns are then extrapolated to map out
the likely future path of the series.
Introduction
Note that the past and present values are already observed, whereas the future values
are unknown and are treated as random variables.
We do not know their values, but we can
describe them in terms of a set of possible
values and the associated probabilities.
Introduction
This figure shows a time series observed for time periods 1-12;
we would like to make a forecast for periods 13-20.
Note the increase in uncertainty as the forecast horizon increases.
Introduction
It is important to know both the forecast origin and how many periods ahead the
forecast is being made.
Extrapolation of the Mean Value
Averaging methods
If a time series is generated by a constant process subject to random error, then the mean of the past values is
a useful statistic and can be used as a forecast for the
next period.
Averaging methods are suitable for stationary time
series data, where the series is in equilibrium around a
constant value (the underlying mean) with a constant
variance over time.
Averaging Methods
The Mean
Uses the average of all the historical data as the forecast:
$F_{t+1} = \dfrac{1}{t} \sum_{i=1}^{t} y_i$
When new data becomes available, the forecast for time t+2 is the new mean, including the previously
observed data plus this new observation:
$F_{t+2} = \dfrac{1}{t+1} \sum_{i=1}^{t+1} y_i$
This method is appropriate when there is no noticeable trend or seasonality.
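As a minimal sketch of the mean forecast, the snippet below recomputes the average each time a new observation arrives. The function name `mean_forecast` and the four data values are illustrative assumptions, not part of the lecture.

```python
def mean_forecast(history):
    """Forecast the next period as the mean of all observed values."""
    if not history:
        raise ValueError("need at least one observation")
    return sum(history) / len(history)


y = [5.3, 4.4, 5.4]            # illustrative observations y_1..y_3
f_next = mean_forecast(y)      # F_{t+1} with t = 3
y.append(5.8)                  # a new observation y_4 becomes available
f_after = mean_forecast(y)     # F_{t+2}: mean of all four values
print(round(f_next, 3), round(f_after, 3))
```

Every observation, however old, keeps the same weight 1/t in this forecast, which is why the method suits only series without trend or seasonality.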
Averaging Methods
How would you describe these weekly sales?
Suppose we are at week 26 and want to forecast sales for the next few weeks. Should
we use the average of all 26 available weeks?
Moving Average Method
The moving average for time period t is the mean of the k most recent observations.
A moving average of order k, MA(k), is the average
of k consecutive observations.
$F_{t+1} = \hat{y}_{t+1} = \dfrac{y_t + y_{t-1} + y_{t-2} + \cdots + y_{t-k+1}}{k}$
$F_{t+1} = \dfrac{1}{k} \sum_{i=t-k+1}^{t} y_i$
k is the number of terms in the moving average.
Some care should be taken in choosing the span k for a moving average forecast model.
As a general rule, large spans smooth the time
series more than smaller spans by averaging many
ups and downs in each calculation.
The smaller the number k, the more weight is
given to recent periods.
The greater the number k, the less weight is given
to more recent periods.
Moving Averages
A large k is desirable when there are wide, infrequent fluctuations in the series.
A small k is most desirable when there are
sudden shifts in the level of the series.
For seasonal data, the length of the season
is often used for the value of k.
For monthly data, a 12-month moving average, MA(12), eliminates or averages out the seasonal effect.
Moving average method
Assigns equal weight to each observation used in the calculation.
As more information becomes available, the new data point
is included in the calculation and the oldest data
point is discarded.
The moving average model does not handle trend or
seasonality very well, although it can do better than the
total mean.
Moving Averages
The following figure shows that MA(3) adapts more quickly to
movements in the series, while MA(7) produces a greater degree of
smoothing.
Example: Weekly Department Store Sales
The weekly sales figures (in millions of dollars) presented in
the following table are used by a major department store to
determine the need for temporary sales personnel.

Period (t)   Sales (y)
 1           5.3
 2           4.4
 3           5.4
 4           5.8
 5           5.6
 6           4.8
 7           5.6
 8           5.6
 9           5.4
10           6.5
11           5.1
12           5.8
13           5.0
14           6.2
15           5.6
16           6.7
17           5.2
18           5.5
19           5.8
20           5.1
21           5.8
22           6.7
23           5.2
24           6.0
25           5.8

[Figure: Weekly Sales. Sales (y) plotted against Weeks.]
Use a three-week moving average (k = 3) for the department store sales to forecast
weeks 24 and 26.
$\hat{y}_{24} = \dfrac{y_{23} + y_{22} + y_{21}}{3} = \dfrac{5.2 + 6.7 + 5.8}{3} = 5.9$
The forecast error is
$e_{24} = y_{24} - \hat{y}_{24} = 6 - 5.9 = 0.1$
The forecast for week 26 is
$\hat{y}_{26} = \dfrac{y_{25} + y_{24} + y_{23}}{3} = \dfrac{5.8 + 6 + 5.2}{3} \approx 5.7$
RMSE = 0.63
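The MA(3) calculation for the department store data can be sketched in a few lines of Python. The function names `moving_average_forecast` and `ma_rmse` are illustrative, and the RMSE is assumed to be taken over the one-step forecasts for periods 4 to 25.

```python
import math


def moving_average_forecast(y, k):
    """MA(k) forecast for the period that follows the last value in y."""
    if len(y) < k:
        raise ValueError("need at least k observations")
    return sum(y[-k:]) / k


def ma_rmse(y, k):
    """RMSE of one-step MA(k) forecasts for periods k+1..n."""
    errors = [y[t] - moving_average_forecast(y[:t], k) for t in range(k, len(y))]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Weekly department store sales for periods 1..25 (millions of dollars).
sales = [5.3, 4.4, 5.4, 5.8, 5.6, 4.8, 5.6, 5.6, 5.4, 6.5, 5.1, 5.8, 5.0,
         6.2, 5.6, 6.7, 5.2, 5.5, 5.8, 5.1, 5.8, 6.7, 5.2, 6.0, 5.8]

f24 = moving_average_forecast(sales[:23], 3)   # uses weeks 21-23
f26 = moving_average_forecast(sales, 3)        # uses weeks 23-25
rmse = ma_rmse(sales, 3)
print(round(f24, 2), round(f26, 2), round(rmse, 2))
```

Changing `k` in the calls above is a quick way to see the trade-off between smoothing and responsiveness on the same data.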
[Figure: Weekly Sales Forecasts. Sales (y) and forecast plotted against Weeks.]
Period (t)   Sales (y)   Forecast
 1           5.3
 2           4.4
 3           5.4
 4           5.8         5.033
 5           5.6         5.200
 6           4.8         5.600
 7           5.6         5.400
 8           5.6         5.333
 9           5.4         5.333
10           6.5         5.533
11           5.1         5.833
12           5.8         5.667
13           5.0         5.800
14           6.2         5.300
15           5.6         5.667
16           6.7         5.600
17           5.2         6.167
18           5.5         5.833
19           5.8         5.800
20           5.1         5.500
21           5.8         5.467
22           6.7         5.567
23           5.2         5.867
24           6.0         5.900
25           5.8         5.967
26                       5.667
Exponential Smoothing Methods
This method provides an exponentially weighted moving average of all previously
observed values.
Appropriate for data with no predictable
upward or downward trend.
The aim is to estimate the current level and
use it as a forecast of future values.
Simple Exponential Smoothing Method
Formally, the exponential smoothing equation is
$F_{t+1} = \alpha y_t + (1 - \alpha) F_t$
where
$F_{t+1}$ = forecast for the next period.
$\alpha$ = smoothing constant.
$y_t$ = observed value of the series in period t.
$F_t$ = old forecast for period t.
The forecast $F_{t+1}$ weights the most recent
observation $y_t$ by $\alpha$ and the most
recent forecast $F_t$ by $1 - \alpha$.
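The implication of exponential smoothing can be better seen if the previous equation
is expanded by replacing $F_t$ with its
components as follows:
$F_{t+1} = \alpha y_t + (1 - \alpha) F_t$
$\quad = \alpha y_t + (1 - \alpha)[\alpha y_{t-1} + (1 - \alpha) F_{t-1}]$
$\quad = \alpha y_t + \alpha(1 - \alpha) y_{t-1} + (1 - \alpha)^2 F_{t-1}$
If this substitution process is repeated by replacing $F_{t-1}$ by its components, $F_{t-2}$ by its
components, and so on, the result is:
$F_{t+1} = \alpha y_t + \alpha(1 - \alpha) y_{t-1} + \alpha(1 - \alpha)^2 y_{t-2} + \alpha(1 - \alpha)^3 y_{t-3} + \cdots + \alpha(1 - \alpha)^{t-1} y_1 + (1 - \alpha)^t F_1$
The last term carries the starting value $F_1$ and shrinks toward zero as t grows.
Therefore, $F_{t+1}$ is a weighted moving average of all past observations, with weights that decline geometrically with age.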
The following table shows the weights assigned to past observations for $\alpha$ = 0.2, 0.4, 0.6, 0.8, 0.9.
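The weights can be reproduced directly from the expansion: the observation j periods back receives weight α(1 − α)^j. The short sketch below prints the first five weights for each value of α; the function name `ses_weights` is an illustrative choice.

```python
def ses_weights(alpha, n_lags):
    """Weight on y_{t-j} in F_{t+1} for j = 0..n_lags-1: alpha * (1 - alpha)**j."""
    return [alpha * (1 - alpha) ** j for j in range(n_lags)]


for alpha in (0.2, 0.4, 0.6, 0.8, 0.9):
    row = "  ".join(f"{w:.4f}" for w in ses_weights(alpha, 5))
    print(f"alpha = {alpha}: {row}")
```

A larger α concentrates the weight on the most recent observations, and the weights over all lags sum to (almost) one.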
The exponential smoothing equation rewritten in the following form elucidates the
role of the weighting factor $\alpha$.
$F_{t+1} = F_t + \alpha (y_t - F_t)$
The exponential smoothing forecast is the old forecast plus an adjustment for the error that
occurred in the last forecast.
[Figure: Effect of Different Weights. Weight plotted against Lag.]
Choosing the smoothing constant $\alpha$ in the exponential smoothing model is similar to
choosing the span k in the moving average model.
Both are related to the smoothness of the model.
Smaller values of $\alpha$ correspond to greater smoothing of the ups and downs in the time series.
Larger values of $\alpha$ put most of the weight on the most
recent observed values, so the forecasts tend to follow
the ups and downs of the series more closely.
The value of the smoothing constant $\alpha$ must be between 0 and 1.
$\alpha$ cannot be equal to 0 or 1.
If stable predictions with smoothed random
variation are desired, then a small value of $\alpha$ is
appropriate.
If a rapid response to a real change in the pattern
of observations is desired, a large value of $\alpha$ is
appropriate.
To estimate $\alpha$, forecasts are computed for $\alpha$
equal to .1, .2, .3, ..., .9 and the sum of
squared forecast errors is computed for each.
The value of $\alpha$ with the smallest RMSE is
chosen for use in producing the future
forecasts.
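The grid search over the smoothing constant can be sketched as follows. This is a minimal illustration that assumes the starting value F1 = y1 and uses a made-up series with a level shift; the names `ses_rmse` and `best_alpha` are hypothetical.

```python
import math


def ses_rmse(y, alpha):
    """RMSE of one-step SES forecasts for periods 2..n, assuming F_1 = y_1."""
    forecast = y[0]                  # F_2 equals y_1 when F_1 = y_1
    squared_errors = []
    for obs in y[1:]:
        error = obs - forecast
        squared_errors.append(error * error)
        forecast += alpha * error    # F_{t+1} = F_t + alpha * e_t
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def best_alpha(y, grid=None):
    """Return the grid value of alpha with the smallest RMSE."""
    grid = grid or [i / 10 for i in range(1, 10)]   # 0.1, 0.2, ..., 0.9
    return min(grid, key=lambda a: ses_rmse(y, a))


# Illustrative series with a sudden level shift (not from the lecture).
shifted = [10, 10, 10, 10, 20, 20, 20, 20]
print(best_alpha(shifted))
```

Because the example series jumps to a new level, the search favors a large smoothing constant, consistent with the guidance on rapid response to real changes.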
To start the algorithm, we need $F_1$ because
$F_2 = \alpha y_1 + (1 - \alpha) F_1$
Since $F_1$ is not known, we can
Set the first estimate equal to the first observation.
Use the average of a number of initial observations.
The first three or four, up to 12, or even the mean of the whole sample can be used.
When either the sample size or $\alpha$ is large, the choice of starting
value is relatively unimportant.
Example: University of Michigan Index of Consumer Sentiment
University of Michigan
Index of Consumer
Sentiment for
January 1995 to December 1996.
We want to forecast the
University of Michigan
Index of Consumer
Sentiment using the Simple
Exponential Smoothing
Method.
Date      Observed
Jan-95    97.6
Feb-95    95.1
Mar-95    90.3
Apr-95    92.5
May-95    89.8
Jun-95    92.7
Jul-95    94.4
Aug-95    96.2
Sep-95    88.9
Oct-95    90.2
Nov-95    88.2
Dec-95    91.0
Jan-96    89.3
Feb-96    88.5
Mar-96    93.7
Apr-96    92.7
May-96    89.4
Jun-96    92.4
Jul-96    94.7
Aug-96    95.3
Sep-96    94.7
Oct-96    96.5
Nov-96    99.2
Dec-96    96.9
Jan-97
Since no forecast is
available for the first
period, we will set the
first estimate equal to
the first observation.
We try $\alpha$ = 0.3 and 0.6.
[Figure: University of Michigan Index of Consumer Sentiment. Consumer Sentiment Index plotted against Date.]

Note the first forecast is the first observed value.
The forecasts for Feb. 95 (t = 2) and Mar. 95 (t = 3)
are evaluated as follows:
$\hat{y}_{t+1} = \hat{y}_t + \alpha (y_t - \hat{y}_t)$
$\hat{y}_2 = \hat{y}_1 + 0.6 (y_1 - \hat{y}_1) = 97.6 + 0.6(97.6 - 97.6) = 97.6$
$\hat{y}_3 = \hat{y}_2 + 0.6 (y_2 - \hat{y}_2) = 97.6 + 0.6(95.1 - 97.6) = 96.1$
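The recursion above is easy to check in code. The sketch below assumes the starting forecast equals the first observation and uses only the first three months of the index; the function name `ses_forecasts` is illustrative.

```python
def ses_forecasts(y, alpha, f1=None):
    """Return [F_1, ..., F_{n+1}] from F_{t+1} = F_t + alpha * (y_t - F_t)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    forecasts = [y[0] if f1 is None else f1]    # F_1 defaults to y_1
    for obs in y:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (obs - previous))
    return forecasts


sentiment = [97.6, 95.1, 90.3]          # Jan-95, Feb-95, Mar-95
f = ses_forecasts(sentiment, alpha=0.6)
print([round(v, 2) for v in f])         # forecasts for Jan-95 .. Apr-95
```

The last element of the returned list is the forecast for the period after the final observation, here April 1995.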
Date      Consumer Sentiment   Alpha = 0.3   Alpha = 0.6
Jan-95     97.6                n/a           n/a
Feb-95     95.1                 97.60         97.60
Mar-95     90.3                 96.85         96.10
Apr-95     92.5                 94.89         92.62
May-95     89.8                 94.17         92.55
Jun-95     92.7                 92.86         90.90
Jul-95     94.4                 92.81         91.98
Aug-95     96.2                 93.29         93.43
Sep-95     88.9                 94.16         95.09
Oct-95     90.2                 92.58         91.38
Nov-95     88.2                 91.87         90.67
Dec-95     91.0                 90.77         89.19
Jan-96     89.3                 90.84         90.28
Feb-96     88.5                 90.38         89.69
Mar-96     93.7                 89.81         88.98
Apr-96     92.7                 90.98         91.81
May-96     89.4                 91.50         92.34
Jun-96     92.4                 90.87         90.58
Jul-96     94.7                 91.33         91.67
Aug-96     95.3                 92.34         93.49
Sep-96     94.7                 93.23         94.58
Oct-96     96.5                 93.67         94.65
Nov-96     99.2                 94.52         95.76
Dec-96     96.9                 95.92         97.82
Jan-97     97.4                 96.22         97.27
Feb-97     99.7                 96.57         97.35
Mar-97    100.0                 97.51         98.76
Apr-97    101.4                 98.26         99.50
May-97    103.2                 99.20        100.64
Jun-97    104.5                100.40        102.18
Jul-97    107.1                101.63        103.57
Aug-97    104.4                103.27        105.69
Sep-97    106.0                103.61        104.92
Oct-97    105.6                104.33        105.57
Nov-97    107.2                104.71        105.59
Dec-97    102.1                105.46        106.55

RMSE = 2.66 for $\alpha$ = 0.6
RMSE = 2.96 for $\alpha$ = 0.3
[Figure: University of Michigan Index of Consumer Sentiment. Sentiment Index by month, showing Consumer Sentiment, SES (Alpha = 0.3), and SES (Alpha = 0.6).]
Evaluating Forecasts
All quantitative forecasting models are developed on the basis of historical data.
When error measures such as RMSE are applied to the historical data, they are
often considered measures of how well various models fit
the data (how well they work in sample).
To determine how accurate the models are in actual
forecasting (out of sample), a hold-out period is often used for
evaluation, and a measure of forecast accuracy based on the
forecast errors (such as RMSE) is computed over that period.
Evaluating Forecasts
To evaluate the relative performance of alternative methods:
The data series is partitioned into two parts.
The first part, called the estimation sample or in-sample, is used
to estimate the starting value and the smoothing parameter.
This sample typically contains the first 75-80 percent of the
observations.
The second part, called the hold-out sample, validation sample,
or out-of-sample, is used to assess forecasting performance. This
sample contains the last 20-25 percent of the observations.
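A hold-out evaluation along these lines might look like the sketch below. It assumes simple exponential smoothing with a fixed smoothing constant, an 80/20 split, one-step forecasts that keep updating through the hold-out period, and a made-up series; the helper names are hypothetical.

```python
import math


def split_series(y, estimation_share=0.8):
    """Split y into an estimation sample and a hold-out sample."""
    cut = int(round(len(y) * estimation_share))
    return y[:cut], y[cut:]


def holdout_rmse(y, alpha, estimation_share=0.8):
    """RMSE of one-step SES forecasts computed over the hold-out sample only."""
    estimation, holdout = split_series(y, estimation_share)
    forecast = estimation[0]                    # starting value F_1 = y_1
    for obs in estimation:                      # run the smoother through the estimation sample
        forecast += alpha * (obs - forecast)
    squared_errors = []
    for obs in holdout:                         # forecast, score, then update
        error = obs - forecast
        squared_errors.append(error * error)
        forecast += alpha * error
    return math.sqrt(sum(squared_errors) / len(squared_errors))


series = [10.0] * 8 + [12.0, 12.0]              # illustrative data, not from the lecture
print(round(holdout_rmse(series, alpha=0.5), 4))
```

In practice the smoothing constant would be chosen on the estimation sample first, and only then scored on the hold-out sample.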
General Comments
On average, SES tends to outperform MA.
SES corresponds to an intuitively appealing
underlying statistical model of the data (we shall
see this in chapter 5).
Direct use of moving-average-based procedures
is not recommended for forecasting.
Moving averages are useful in the area of seasonal
adjustment (we will see this in chapter 4).
General Comments
For evaluation or fitting purposes, we could minimize RMSE, MAPE, or
MAE. They generally produce similar
results.
Out-of-sample error measures tend to be
somewhat higher than those calculated for
the estimation sample.
Linear Exponential Smoothing
When a time series has a long-term trend (e.g. increases in GDP or sales) the forecasting
method must accommodate such features. There
are two main approaches:
Convert the series to rates of change (growth rates, either absolute or percentage), then predict the rate of
change, OR
Develop forecasting methods that account for trends.
Linear Exponential Smoothing
[Figure: Linear trend fitted to Quarterly Sales. Netflix Sales plotted against Period.]
Quarterly Sales = - 6.157 + 4.567 Period
S = 5.47757, R-Sq = 93.7%, R-Sq(adj) = 93.3%
[Figure: Quadratic trend fitted to Quarterly Sales. Netflix Sales plotted against Period.]
Quarterly Sales = 6.914 - 0.0466 Period + 0.2883 Period**2
S = 1.98024, R-Sq = 99.2%, R-Sq(adj) = 99.1%
Holt's Exponential Smoothing
The previous two models assume that the trend never changes in the future.
The linear exponential smoothing model projects
trends more locally.
Holt's two-parameter exponential smoothing
method is an extension of simple exponential
smoothing.
It adds a growth factor (or trend factor) to the
smoothing equation as a way of adjusting for
changes in the trend.
Holt's Exponential Smoothing
We start by defining the following variables:
$L_t$ = level of the series at time t.
$T_t$ = trend (slope) of the series at time t.
The forecast function for one step ahead is:
$F_{t+1} = L_t + T_t$
The forecast m steps ahead is
$F_{t+m} = L_t + m T_t$
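Holt's Exponential Smoothing
To update the level and the trend, let $e_t = y_t - (L_{t-1} + T_{t-1})$ be the one-step forecast error.
The new level is the old level (adjusted for the increase produced
by the trend) plus a partial adjustment (weight $\alpha$) for the most
recent error.
$L_t = L_{t-1} + T_{t-1} + \alpha e_t$
Equivalently,
$L_t = \alpha y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$
The new trend is the old trend plus a partial adjustment for the error.
With $\beta$ as the trend smoothing weight, the weight on the error is $\alpha\beta$.
$T_t = T_{t-1} + \alpha\beta e_t$
Equivalently,
$T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$
Forecast m steps into the future:
$F_{t+m} = L_t + m T_t$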
The weights $\alpha$ and $\beta$ can be selected subjectively or by minimizing a measure of
forecast error such as RMSE.
Large weights result in more rapid changes
in the component.
Small weights result in less rapid changes.
The initialization process for Holt's linear exponential smoothing requires two estimates:
One to get the first smoothed value $L_1$.
The other to get the initial trend $b_1$.
One alternative is to set $L_1 = y_1$ and
$b_1 = y_2 - y_1$
or
$b_1 = \dfrac{y_4 - y_1}{3}$
or
$b_1 = 0$
Example: Quarterly sales of saws for Acme tool company
The following table shows the sales of
saws for the Acme Tool
Company.
These are quarterly
sales from 1994
through 2000.

Year   Quarter   t    Sales
1994   1          1   500
1994   2          2   350
1994   3          3   250
1994   4          4   400
1995   1          5   450
1995   2          6   350
1995   3          7   200
1995   4          8   300
1996   1          9   350
1996   2         10   200
1996   3         11   150
1996   4         12   400
1997   1         13   550
1997   2         14   350
1997   3         15   250
1997   4         16   550
1998   1         17   550
1998   2         18   400
1998   3         19   350
1998   4         20   600
1999   1         21   750
1999   2         22   500
1999   3         23   400
1999   4         24   650
2000   1         25   850
2000   2         26   600
2000   3         27   450
2000   4         28   700

Acme tool company
Examination of the
plot shows:
A non-stationary time
series.
Seasonal variation
seems to exist.
Sales for the first and fourth quarters are larger
than for the other quarters.
[Figure: Sales of saws for the Acme Tool Company: 1994-2000. Saws plotted over time.]

Acme tool company
The plot of the Acme data shows that there might be a trend in the data; therefore we will try Holt's
model to produce forecasts.
We need two initial values:
The first smoothed value $L_1$.
The initial trend value $b_1$.
We will use the first observation for the estimate of the smoothed value $L_1$, and the initial trend
value $b_1 = 0$.
We will use $\alpha$ = .3 and $\beta$ = .1.
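A minimal sketch of Holt's method for the Acme data follows. It assumes the error-correction form of the updates (error weight α for the level and αβ for the trend), the starting values L1 = y1 and b1 = 0, and an RMSE taken over the forecasts for quarters 2 to 28; the function name `holt` is illustrative.

```python
import math


def holt(y, alpha, beta, level0=None, trend0=0.0):
    """Holt's linear exponential smoothing in error-correction form.

    Returns the levels L_t, trends b_t and one-step forecasts F_t, where F_t
    is the forecast of y_t made at time t-1 (F_1 is set to L_1).
    """
    level = y[0] if level0 is None else level0
    trend = trend0
    levels, trends, forecasts = [level], [trend], [level]
    for obs in y[1:]:
        forecast = level + trend                # F_t = L_{t-1} + b_{t-1}
        error = obs - forecast                  # e_t = y_t - F_t
        level = forecast + alpha * error        # L_t
        trend = trend + alpha * beta * error    # b_t
        levels.append(level)
        trends.append(trend)
        forecasts.append(forecast)
    return levels, trends, forecasts


# Quarterly saw sales, 1994 Q1 to 2000 Q4.
saws = [500, 350, 250, 400, 450, 350, 200, 300, 350, 200, 150, 400, 550, 350,
        250, 550, 550, 400, 350, 600, 750, 500, 400, 650, 850, 600, 450, 700]

levels, trends, forecasts = holt(saws, alpha=0.3, beta=0.1)
errors = [obs - f for obs, f in zip(saws[1:], forecasts[1:])]
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(round(levels[1], 2), round(trends[1], 2), round(forecasts[2], 2), round(rmse, 1))
```

The forecast for the quarter after the sample is `levels[-1] + trends[-1]`, and an m-step forecast adds m times the final trend to the final level.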

Acme tool company
Year   Quarter   t    Sales   Lt       bt       Ft+m
1994   1          1   500     500.00     0.00   500.00
1994   2          2   350     455.00    -4.50   500.00
1994   3          3   250     390.35   -10.52   450.50
1994   4          4   400     385.88    -9.91   379.84
1995   1          5   450     398.18    -7.69   375.97
1995   2          6   350     378.34    -8.90   390.49
1995   3          7   200     318.61   -13.99   369.44
1995   4          8   300     303.23   -14.13   304.62
1996   1          9   350     307.38   -12.30   289.11
1996   2         10   200     266.55   -15.15   295.08
1996   3         11   150     220.98   -18.19   251.40
1996   4         12   400     261.95   -12.28   202.79
1997   1         13   550     339.77    -3.27   249.67
1997   2         14   350     340.55    -2.86   336.50
1997   3         15   250     311.38    -5.49   337.69
1997   4         16   550     379.12     1.83   305.89
1998   1         17   550     431.67     6.90   380.95
1998   2         18   400     427.00     5.74   438.57
1998   3         19   350     407.92     3.26   432.74
1998   4         20   600     467.83     8.93   411.18
1999   1         21   750     558.73    17.12   476.75
1999   2         22   500     553.10    14.85   575.85
1999   3         23   400     517.56     9.81   567.94
1999   4         24   650     564.16    13.49   527.37
2000   1         25   850     659.35    21.66   577.65
2000   2         26   600     656.71    19.23   681.01
2000   3         27   450     608.16    12.45   675.94
2000   4         28   700     644.43    14.83   620.61

Acme tool company
RMSE for this application is:
$\alpha$ = .3 and $\beta$ = .1
RMSE = 155.5
The plot also showed the
possibility of seasonal
variation that needs to be
investigated.
[Figure: Quarterly Saw Sales Forecast, Holt's Method. Sales and forecast (Ht+m) plotted against Quarters.]
Exponential smoothing with a damped Trend
One common feature of time series for sales is a decline in sales as a product line
matures, unless the product is upgraded in
some way.
A procedure that damps down the trend
component as the forecast horizon is
extended assumes that the series will level
out over time.
This kind of life-cycle effect can be accommodated by introducing a damping
factor $\phi$ to the updating equations for level
and trend, where $e_t = y_t - (L_{t-1} + \phi T_{t-1})$ is the one-step forecast error:
$L_t = L_{t-1} + \phi T_{t-1} + \alpha e_t$
$T_t = \phi T_{t-1} + \alpha\beta e_t$
The damping factor $0 < \phi < 1$ will dampen the trend term.
The forecast function for m steps ahead is
$F_{t+m} = L_t + (\phi + \phi^2 + \cdots + \phi^m) T_t$
This forecast levels out over time, approaching the limiting value
$L_t + \dfrac{\phi}{1 - \phi} T_t$
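The levelling-out behaviour of the damped forecast function can be illustrated numerically. The sketch below evaluates the m-step forecast and its limit for made-up values of the level, trend, and damping factor; the function names are hypothetical.

```python
def damped_forecast(level, trend, phi, m):
    """F_{t+m} = L_t + (phi + phi**2 + ... + phi**m) * T_t."""
    if not 0 < phi < 1:
        raise ValueError("phi must lie strictly between 0 and 1")
    damping_sum = sum(phi ** j for j in range(1, m + 1))
    return level + damping_sum * trend


def damped_limit(level, trend, phi):
    """Limiting forecast as m grows: L_t + T_t * phi / (1 - phi)."""
    return level + trend * phi / (1 - phi)


# Illustrative values: level 100, trend 10 per period, damping factor 0.5.
for m in (1, 2, 5, 20):
    print(m, round(damped_forecast(100.0, 10.0, 0.5, m), 4))
print("limit:", damped_limit(100.0, 10.0, 0.5))
```

With an undamped trend the 20-step forecast would be 300; with the damping factor of 0.5 it stays just below the limit of 110.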
Use of Transformations
Use of LES methods requires that the series be locally linear.
In many cases this assumption is not realistic and the forecasts either
underestimate or overestimate the actual value. This becomes more
serious as the forecasting horizon increases.
Use of Transformations
A series with a more complex nonlinear pattern can be forecast in two ways:
Transform the series so that the trend becomes linear.
Convert the series to growth over time, forecast the
growth rate, and then convert back to the
original series.
The Log Transform
If the log transform produces a linear trend, we can apply
LES and then transform back to the original series to
obtain the forecast of interest.
Typically the effect of the log transform is to
improve forecasting performance for an exponential growth
curve. For example:
$Y_t = 1.05\, Y_{t-1}$
Log transform:
$\ln Y_t = \ln 1.05 + \ln Y_{t-1}$
The reverse transformation:
$\exp(\ln Y_t) = \exp(\ln 1.05 + \ln Y_{t-1}) = 1.05\, Y_{t-1}$
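The effect of the log transform on a series that grows by 5 percent per period can be checked with a few lines of Python. The starting value of 100 and the variable names are illustrative assumptions.

```python
import math

growth = 1.05
y = [100 * growth ** t for t in range(6)]               # Y_t = 1.05 * Y_{t-1}
log_y = [math.log(value) for value in y]

# On the log scale the series rises by the same amount, ln(1.05), every period.
steps = [later - earlier for earlier, later in zip(log_y, log_y[1:])]

# Forecast on the log scale by adding one more step, then transform back.
log_forecast = log_y[-1] + steps[-1]
forecast = math.exp(log_forecast)
print(round(steps[0], 5), round(forecast, 3), round(1.05 * y[-1], 3))
```

The back-transformed forecast equals 1.05 times the last observation, which is the reverse transformation written out above.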
Use of Growth Rate
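Define the growth rate
$G_t = 100 \times \dfrac{Y_t - Y_{t-1}}{Y_{t-1}}$
Use SES to predict the growth rate for the
next period.
The one-step forecast for the original series
is given by
$F_{t+1} = Y_t \left(1 + \dfrac{G_{t+1}}{100}\right)$
where $G_{t+1}$ is the predicted growth rate.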
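The two conversions, from levels to percentage growth and from a growth forecast back to a level forecast, can be sketched as below. The function names are hypothetical; the first call uses made-up values, and the second uses a sales value of 7.15 with a predicted growth rate of 38.1 percent.

```python
def growth_rates(y):
    """G_t = 100 * (Y_t - Y_{t-1}) / Y_{t-1} for t = 2..n."""
    return [100 * (current - previous) / previous
            for previous, current in zip(y, y[1:])]


def forecast_from_growth(last_value, growth_forecast):
    """F_{t+1} = Y_t * (1 + G_{t+1} / 100), with G_{t+1} the predicted growth rate."""
    return last_value * (1 + growth_forecast / 100)


g = growth_rates([100.0, 110.0, 121.0])     # illustrative series growing 10% per period
f = forecast_from_growth(7.15, 38.1)        # sales of 7.15 and 38.1% predicted growth
print([round(x, 1) for x in g], round(f, 1))
```

The growth forecast itself would come from simple exponential smoothing applied to the growth-rate series.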
Growth Rate Analysis of Netflix Quarterly Sales

Year   Quarter   Quarterly Sales   Growth (percentage)   Growth Forecast   Sales Forecast
2000   1          5.17*
2000   2          7.15             38.1                  38.1
2000   3         10.18             42.5                  38.1               9.9
2000   4         13.39             31.5                  41.9              14.1
2001   1         17.06             27.4                  33.0              19.0
2001   2         18.36              7.6                  28.2              22.7
2001   3         18.88              2.8                  10.5              23.5
2001   4         21.62             14.5                   3.9              20.9
2002   1         30.53             41.2                  13.0              22.5
2002   2         36.36             19.1                  37.2              34.5
2002   3         40.73             12.0                  21.7              49.9
2002   4         45.19             10.9                  13.4              49.6
2003   1         55.67             23.2                  11.3              51.2
2003   2         63.19             13.5                  21.5              62.0
2003   3         72.20             14.3                  14.6              76.8
2003   4         81.19             12.4                  14.3              82.8
The Box-Cox Transformations
Logarithmic transformation is appealing because it reflects proportional rather than absolute change.
But proportional change may project future
growth in excess of reasonable expectations.
A modified LES to allow for a damped trend was
introduced earlier.
This modification can be applied after the log
transform when appropriate.
The Box-Cox Transformations
A second possibility is to select a transformation that is more moderate than the logarithmic one.
Box and Cox suggested using a power
transformation.
$Z_t = Y_t^{c}, \quad -1 \le c \le 1$
When to Transform
Do not use complex transforms unless they are supported by both theory and data.
Always compare the transformed method with
a benchmark by transforming the forecasts back to
the data series of interest.
