(Demand) Forecasting
Topics
• Introduction to (demand) forecasting
• Overview of forecasting methods
• A generic approach to quantitative forecasting
• Time series-based forecasting
• Building causal models through multiple linear
regression
• Confidence Intervals and their application in forecasting
Forecasting
• The process of predicting the values of a certain quantity, Q,
over a certain time horizon, T, based on past trends and/or a
number of relevant factors.
• Some forecasted quantities in manufacturing
– demand
– equipment and employee availability
– technological forecasts
– economic forecasts (e.g., inflation rates, exchange rates, housing
starts, etc.)
• The time horizon depends on
– the nature of the forecasted quantity
– the intended use of the forecast
Forecasting future demand
• Demand forecasting is based on:
– extrapolating past trends observed in the company's sales into the future;
– understanding the impact of various factors on the company's future sales:
• market data
• strategic plans of the company
• technology trends
• social/economic/political factors
• environmental factors
• etc.
[Flowchart: Collect data <independent variables; observed demand> → Model → Valid? (Yes / No)]
Time Series-based Forecasting
Basic Model:
Historical Data $D(i),\ i = 1, \dots, t$ → Time Series Model → Forecasts $\hat{D}(t+\tau),\ \tau = 1, 2, \dots$
[Figure: plot of a 20-point data series (Series1); horizontal axis 1 to 20, vertical axis 0.00 to 12.00]
The above data points have been sampled from a normal distribution with a
mean value equal to 10.0 and a variance equal to 4.0.
Forecasting constant mean series:
The Moving Average model
The presumed model for the observed data:
$$D(t) = D + e(t)$$
where
$D$ is the constant mean of the series and
$e(t)$ is normally distributed with zero mean and some unknown variance $\sigma^2$.
Then, under a Moving Average of Order N model, denoted as MA(N), the estimate of $D$ returned at period $t$ is equal to:
$$\hat{D}(t) = \frac{1}{N} \sum_{i=0}^{N-1} D(t-i)$$
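The MA(N) estimate is simply the average of the N most recent observations. A minimal Python sketch of that computation follows; the function name, argument names and sample data are illustrative assumptions, not part of the original slides.

```python
def moving_average_forecast(demand, n):
    """Return the MA(n) estimate of the series mean at the latest period.

    demand: observations D(1), ..., D(t), oldest first.
    n: order of the moving average (hypothetical parameter name).
    """
    if n < 1 or n > len(demand):
        raise ValueError("n must be between 1 and len(demand)")
    # Average of the n most recent observations D(t), D(t-1), ..., D(t-n+1).
    return sum(demand[-n:]) / n


history = [9.0, 11.0, 10.0, 12.0, 8.0]      # made-up observations
print(moving_average_forecast(history, 3))  # (10 + 12 + 8) / 3 = 10.0
```

Because the presumed series has a constant mean, the same value serves as the forecast for every future period.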
The forecasting error
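• The forecasting error:
$$\epsilon(t+1) = \hat{D}(t) - D(t+1) = \frac{1}{N} \sum_{i=0}^{N-1} D(t-i) - D(t+1)$$
• Also
$$E[\epsilon(t+1)] = \frac{1}{N} \sum_{i=0}^{N-1} E[D(t-i)] - E[D(t+1)] = \frac{1}{N}(N D) - D = 0$$
$$Var[\epsilon(t+1)] = \frac{1}{N^2} \sum_{i=0}^{N-1} Var[D(t-i)] + Var[D(t+1)] = \frac{1}{N^2} N \sigma^2 + \sigma^2 = \sigma^2 \left(1 + \frac{1}{N}\right)$$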
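The variance result, the series variance times (1 + 1/N), can be checked numerically. The sketch below evaluates the formula and compares it with a small simulation; the function names, the seed and the number of trials are arbitrary choices made for illustration.

```python
import random


def ma_error_variance(sigma2, n):
    """Theoretical variance of the one-step MA(n) forecasting error."""
    return sigma2 * (1.0 + 1.0 / n)


def simulate_ma_error_variance(mean, sigma2, n, trials=20000, seed=0):
    """Estimate the same variance by simulation (illustrative check only)."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    errors = []
    for _ in range(trials):
        window = [rng.gauss(mean, sd) for _ in range(n)]  # D(t-n+1), ..., D(t)
        next_obs = rng.gauss(mean, sd)                    # D(t+1)
        errors.append(sum(window) / n - next_obs)         # forecast minus actual
    avg = sum(errors) / trials
    return sum((e - avg) ** 2 for e in errors) / (trials - 1)


print(ma_error_variance(4.0, 5))    # theoretical value for MA(5), variance 4
print(ma_error_variance(4.0, 10))   # theoretical value for MA(10), variance 4
print(simulate_ma_error_variance(10.0, 4.0, 5))  # should land near the MA(5) value
```

A larger N lowers the error variance under a stable mean, at the cost of a slower reaction when the mean shifts.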
Forecasting error (cont.)
[Figure: plot of three 40-point series (Series1, Series2, Series3); horizontal axis 1 to 40, vertical axis 0.00 to 20.00]
• blue series: the original data series, distributed according to N(10,4) for the first 20
points, and N(20,4) for the last 20 points.
• magenta series: the predictions of the MA(5) forecasting model.
• yellow series: the predictions of the MA(10) forecasting model.
• Remark: the MA(5) model adjusts faster to the experienced jump of the data mean
value, but the mean estimates that it provides under stationary operation are less accurate
than those provided by the MA(10) model.
Forecasting constant mean series:
The Simple Exponential Smoothing model
• The presumed demand model:
$$D(t) = D + e(t)$$
where $D$ is an unknown constant and $e(t)$ is normally distributed with zero mean and an unknown variance $\sigma^2$.
[Figure: plot of four 40-point series (Series1, Series2, Series3, Series4); horizontal axis 1 to 40, vertical axis 0.00 to 20.00]
• dark blue series: the original data series, distributed according to N(10,4) for the first 20
points, and N(20,4) for the last 20 points.
• magenta series: the predictions of the ES(0.2) model initialized at the value of 10.0.
• yellow series: the predictions of the ES(0.2) model initialized as 0.0.
• light blue series: the predictions of the ES(0.8) model initialized at 10.0.
• Remark: the ES(0.8) model adjusts faster to the jump of the series mean value, but the
estimates that it provides under stationary operation are less accurate than those provided by
the ES(0.2) model. Also, notice that the effect of the initial value is only transient.
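The transient effect of the initial value can be reproduced with a short sketch. The smoothing recursion itself does not appear in the surviving text of these slides, so the code assumes the standard form, estimate(t) = alpha * D(t) + (1 - alpha) * estimate(t - 1); the function name and the noise-free test series are likewise illustrative assumptions.

```python
def ses_estimates(demand, alpha, initial):
    """Simple exponential smoothing estimates, one per observation.

    Assumes the standard recursion
        estimate(t) = alpha * D(t) + (1 - alpha) * estimate(t - 1),
    with estimate(0) = initial.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    estimate = initial
    estimates = []
    for observed in demand:
        estimate = alpha * observed + (1.0 - alpha) * estimate
        estimates.append(estimate)
    return estimates


flat = [10.0] * 40                            # noise-free series with mean 10 (made-up data)
good_start = ses_estimates(flat, 0.2, 10.0)   # initialized at the true mean
bad_start = ses_estimates(flat, 0.2, 0.0)     # initialized at 0.0
print(good_start[-1], bad_start[0], bad_start[-1])
```

With alpha = 0.2 the weight of the initial value shrinks by a factor of 0.8 per period, so after 40 periods the two runs are practically indistinguishable.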
The inadequacy of SES and MA
models for data with linear trends
[Figure: plot of the series Dt, SES(0.5) and SES(1.0); horizontal axis 1 to 10, vertical axis 0 to 12]
• blue series: a deterministic data series increasing linearly with a slope of 1.0.
• magenta series: the predictions obtained from the SES(0.5) model initialized at
the exact value of 1.0.
• yellow series: the predictions obtained from the SES(1.0) model initialized at
the exact value of 1.0.
• Remark: Both models under-estimate the actual values, with the most inert
model SES(0.5) under-estimating the most. This should be expected since both
of these models (as well as any MA model) essentially average the past
observations. Therefore, neither the MA nor the SES model is appropriate for
forecasting a data series with a linear trend.
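The size of the under-estimation can be made concrete. The sketch below applies the standard SES recursion (an assumption, since the recursion is not shown in the surviving text) to a deterministic series with slope 1.0 and records the one-step-ahead errors; the function and variable names are hypothetical.

```python
def ses_one_step_errors(demand, alpha, initial):
    """Errors D(t) - forecast(t), where forecast(t) is the estimate held before D(t) arrives."""
    estimate = initial
    errors = []
    for observed in demand:
        errors.append(observed - estimate)                      # actual minus forecast
        estimate = alpha * observed + (1.0 - alpha) * estimate  # standard SES update
    return errors


trend_series = [float(t) for t in range(1, 11)]   # 1, 2, ..., 10 (slope 1.0)
errors_half = ses_one_step_errors(trend_series, 0.5, 1.0)
errors_one = ses_one_step_errors(trend_series, 1.0, 1.0)
print(errors_half)   # grows toward 2.0
print(errors_one)    # stays at 1.0 after the first period
```

For this series the lag settles at slope / alpha, which is why the more inert SES(0.5) model trails the data by roughly twice as much as SES(1.0).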
Forecasting series with linear trend:
The Double Exponential Smoothing Model
The presumed data model:
$$D(t) = I + T \cdot t + e(t)$$
where
$I$ is the model intercept, i.e., the unknown mean value for $t=0$,
$T$ is the model trend, i.e., the mean increase per unit of time, and
$e(t)$ is normally distributed with zero mean and some unknown variance $\sigma^2$.
The Double Exponential Smoothing
Model (cont.)
The model forecasts at period $t$ for periods $t+\tau$, $\tau = 1, 2, \dots$, are given by:
$$\hat{D}(t+\tau) = \hat{I}(t) + \tau \hat{T}(t)$$
with the quantities $\hat{I}(t)$ and $\hat{T}(t)$ obtained through the following recursions:
$$\hat{I}(t) = a D(t) + (1-a)[\hat{I}(t-1) + \hat{T}(t-1)]$$
$$\hat{T}(t) = \beta [\hat{I}(t) - \hat{I}(t-1)] + (1-\beta) \hat{T}(t-1)$$
The parameters $a$ and $\beta$ take values in the interval (0,1) and are the model smoothing constants, while the values $\hat{I}(0)$ and $\hat{T}(0)$ are the initializing values.
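The two recursions translate almost line by line into code. The following is a minimal sketch; the function names, the noise-free test series and the choice of 0.0 as the initial intercept are illustrative assumptions, and the symbol beta for the second smoothing constant is inferred because the symbol did not survive in the extracted text.

```python
def des_fit(demand, a, beta, intercept0, trend0):
    """Run the recursions over D(1), ..., D(t) and return (I_hat(t), T_hat(t))."""
    level, trend = intercept0, trend0
    for observed in demand:
        previous_level = level
        # I_hat(t) = a * D(t) + (1 - a) * [I_hat(t-1) + T_hat(t-1)]
        level = a * observed + (1.0 - a) * (previous_level + trend)
        # T_hat(t) = beta * [I_hat(t) - I_hat(t-1)] + (1 - beta) * T_hat(t-1)
        trend = beta * (level - previous_level) + (1.0 - beta) * trend
    return level, trend


def des_forecast(level, trend, tau):
    """Forecast for period t + tau: I_hat(t) + tau * T_hat(t)."""
    return level + tau * trend


series = [float(t) for t in range(1, 11)]      # 1, 2, ..., 10 (slope 1.0)
level, trend = des_fit(series, 0.5, 0.2, 0.0, 1.0)
print(des_forecast(level, trend, 1))           # exact start: forecast for period 11

long_series = [float(t) for t in range(1, 61)]
level_b, trend_b = des_fit(long_series, 0.5, 0.2, 0.0, 0.0)
print(des_forecast(level_b, trend_b, 1))       # wrong initial trend, yet close to 61
```

With exact initial values and no noise the forecasts match the data exactly; with a wrong initial trend the error decays as more observations are processed.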
The Double Exponential Smoothing
Model (cont.)
• The smoothing constants are chosen by trial and error, using the
MAD, MSD and/or MAPE indices.
• For $t \to \infty$, $\hat{I}(t) \to I$ and $\hat{T}(t) \to T$.
• The variance of the forecasting error, $\sigma_\epsilon^2$, can be estimated as a …
[Figure: plot of the series Dt, DES(T0=1) and DES(T0=0); horizontal axis 1 to 10]
• blue series: a deterministic data series increasing linearly with a slope of 1.0.
• magenta series: the predictions obtained from the DES(0.5;0.2) model
initialized at the exact value of 1.0.
• yellow series: the predictions obtained from the DES(0.5;0.2) model initialized
at the value of 0.0.
• Remark: In the absence of variability in the original data, the first model is
completely accurate (the blue and the magenta series overlap completely), while
the second model overcomes the deficiency of the wrong initial estimate and
eventually converges to the correct values.
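Choosing the smoothing constants by trial and error requires a way to score each candidate against the same history. The sketch below computes the three indices named on the earlier slide, assuming their usual meanings (mean absolute deviation, mean squared deviation, mean absolute percentage error), since the slides do not spell the definitions out. The function name and the sample numbers are made up.

```python
def forecast_error_indices(actual, forecast):
    """Return (MAD, MSD, MAPE) for paired actual and forecast values.

    MAPE is in percent and assumes that no actual value is zero.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [f - a for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    msd = sum(e * e for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(a) for a, e in zip(actual, errors)) / n
    return mad, msd, mape


actual = [100.0, 200.0, 400.0]       # made-up observations
forecast = [110.0, 190.0, 400.0]     # made-up forecasts
mad, msd, mape = forecast_error_indices(actual, forecast)
print(mad, msd, mape)
```

In a trial-and-error search, each candidate pair of smoothing constants would be run over the history and the pair with the smallest index retained.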
Time Series-based Forecasting:
Accommodating seasonal behavior
The data demonstrate a periodic behavior (and maybe some
additional linear trend).
[Figure: plot of a seasonal demand series (Series1); horizontal axis 0 to 14, vertical axis 0 to 350]
Remarks:
• At each cycle, the demand of a particular season is a fairly stable percentage of
the total demand over the cycle.
• Hence, the ratio of a seasonal demand to the average seasonal demand of the
corresponding cycle will be fairly constant.
• This ratio is characterized as the corresponding seasonal index.
A forecasting methodology
Forecasts for the seasonal demand for subsequent years can be obtained by:
i. estimating the seasonal indices corresponding to the various seasons in the
cycle;
ii. estimating the average seasonal demand for the considered cycle (using, for
instance, a forecasting model for a series with constant mean or linear trend,
depending on the situation);
iii. adjusting the average seasonal demand by multiplying it with the
corresponding seasonal index.
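Example (cont.):
          Year 1   Year 2   Year 3   SI(1)   SI(2)   SI(3)   SI
Spring    90       115      120      0.9     0.92    0.78    0.87
Summer    180      230      290      1.8     1.84    1.88    1.84
Fall      70       85       105      0.7     0.68    0.68    0.69
Winter    60       70       100      0.6     0.56    0.65    0.6
Total     400      500      615      4       4       4       4
Average   100      125      153.75
SI(j) is the seasonal index computed from Year j (seasonal demand divided by that year's average seasonal demand); SI is the average of the three yearly indices.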
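The seasonal indices in the table can be recomputed from the raw demand figures, which also illustrates steps i and iii of the methodology. The sketch below is one possible implementation; the function names are assumptions, and the average seasonal demand used for the forecast (180.0) is a made-up placeholder for whatever step ii would produce.

```python
demand_by_season = {          # demand figures from the table, Years 1 to 3
    "Spring": [90, 115, 120],
    "Summer": [180, 230, 290],
    "Fall": [70, 85, 105],
    "Winter": [60, 70, 100],
}


def seasonal_indices(demand):
    """Average, over the years, of (seasonal demand / average seasonal demand of that year)."""
    seasons = list(demand)
    n_years = len(demand[seasons[0]])
    indices = {season: 0.0 for season in seasons}
    for year in range(n_years):
        cycle_average = sum(demand[s][year] for s in seasons) / len(seasons)
        for s in seasons:
            indices[s] += demand[s][year] / cycle_average / n_years
    return indices


def seasonal_forecasts(average_seasonal_demand, indices):
    """Step iii: scale the forecast average seasonal demand by each seasonal index."""
    return {s: average_seasonal_demand * si for s, si in indices.items()}


indices = seasonal_indices(demand_by_season)
print({s: round(si, 2) for s, si in indices.items()})
print(seasonal_forecasts(180.0, indices))   # 180.0 is a hypothetical step-ii estimate
```

Because the indices of each year sum to the number of seasons, the seasonal forecasts add up to the forecast total demand for the cycle.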
Winters' Method for Seasonal
Forecasting
The presumed model for the observed data:
• $c_i$, $i = 1, 2, \dots, N$, is the seasonal index for the $i$-th season in the cycle;
• $e(t)$ is normally distributed with zero mean and some unknown variance $\sigma^2$
Winters' Method for Seasonal Forecasting
(cont.)
The model forecasts at period $t$ for periods $t+\tau$, $\tau = 1, 2, \dots$, are given by: …
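Building causal models through multiple linear regression
$$D = b_0 + b_1 X_1 + \dots + b_k X_k + e$$
• We need to estimate $<b_0, b_1, \dots, b_k>$ and $\sigma^2$ from a set of $n$ observations
$$\{ D_j;\ X_{1j}, X_{2j}, \dots, X_{kj} \},\quad j = 1, \dots, n$$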
Estimating the parameters bi
• The observed data satisfy the following equation:
$$\begin{bmatrix} D_1 \\ D_2 \\ \vdots \\ D_n \end{bmatrix} = \begin{bmatrix} 1 & X_{11} & \cdots & X_{k1} \\ 1 & X_{12} & \cdots & X_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & X_{1n} & \cdots & X_{kn} \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_k \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
or in a more concise form
$$d = X b + e$$
• The vector
$$e = d - X b$$
denotes the difference between the actual observations and the corresponding mean values, and therefore, $\hat{b}$ is selected such that it minimizes the Euclidean norm of the resulting vector $\hat{e} = d - X \hat{b}$.
• The minimizing value for $\hat{b}$ is equal to $\hat{b} = (X^T X)^{-1} X^T d$.
• The necessary and sufficient condition for the existence of $(X^T X)^{-1}$ is that the columns of matrix $X$ are linearly independent.
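The least-squares estimate can be computed without forming the inverse explicitly, by solving the normal equations (X^T X) b = X^T d. The sketch below does this in plain Python with Gaussian elimination; the function names, the singularity tolerance and the noise-free sample data are illustrative assumptions. A numerical library would normally be preferred in practice.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular: columns of X are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares(x_rows, d):
    """Return b_hat from the normal equations (X^T X) b = X^T d.

    x_rows: one list [X_1j, ..., X_kj] per observation; the column of ones is added here.
    """
    design = [[1.0] + list(row) for row in x_rows]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xtd = [sum(r[i] * dj for r, dj in zip(design, d)) for i in range(p)]
    return solve_linear_system(xtx, xtd)


# Made-up, noise-free data generated from D = 2 + 3*X1 - X2.
x_rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 5.0]]
d = [5.0, 1.0, 4.0, 7.0, 6.0]
b_hat = least_squares(x_rows, d)
print(b_hat)   # approximately [2.0, 3.0, -1.0]
```

The singularity check mirrors the existence condition: if two columns of X are linearly dependent, elimination finds no usable pivot and the estimate is not unique.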
Characterizing the model variance
• An unbiased estimate of $\sigma^2$ is given by
$$MSE = \frac{SSE}{n-k-1} \quad \text{(Mean Squared Error)}$$
where
$$SSE = \hat{e}^T \hat{e} = (d - X\hat{b})^T (d - X\hat{b}) \quad \text{(Sum of Squared Errors)}$$
• The quantity $SSE/\sigma^2$ follows a Chi-square distribution with $n-k-1$ degrees of freedom.
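• Given a point $x_0^T = (1, x_{10}, \dots, x_{k0})$, an unbiased estimator of $D(x_0)$ is given by
$$\hat{D}(x_0) = x_0^T \hat{b}$$
• The coefficient of determination is defined as
$$R^2 = \frac{SSR}{SYY}$$
where
$$SSR = \hat{b}^T (X^T d) - n \bar{d}^2 = \sum_{j=1}^{n} (\hat{D}_j - \bar{d})^2$$
$$\bar{d} = \frac{1}{n} \sum_{j=1}^{n} D_j$$
and
$$SYY = SSE + SSR$$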
• Remark: A natural way to interpret $R^2$ is as the fraction of the total variability in the
observed data that is explained by the model.
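Given observed values and the fitted values of a least-squares model with an intercept, the quantities SSE, SSR, SYY, MSE and R2 follow directly from their definitions. The sketch below is illustrative; the function name and the sample data are assumptions (the fitted values come from the least-squares line 1.3 + 0.8 x through the four made-up points).

```python
def regression_summary(observed, fitted, k):
    """Return SSE, SSR, SYY, MSE and R2 for a fitted model with k explanatory variables."""
    n = len(observed)
    if n - k - 1 <= 0:
        raise ValueError("need more than k + 1 observations")
    mean_d = sum(observed) / n
    sse = sum((d - f) ** 2 for d, f in zip(observed, fitted))  # sum of squared errors
    ssr = sum((f - mean_d) ** 2 for f in fitted)               # regression sum of squares
    syy = sse + ssr                                            # total variability
    return {"SSE": sse, "SSR": ssr, "SYY": syy,
            "MSE": sse / (n - k - 1), "R2": ssr / syy}


observed = [1.0, 3.0, 2.0, 4.0]      # made-up data at x = 0, 1, 2, 3
fitted = [1.3, 2.1, 2.9, 3.7]        # least-squares line 1.3 + 0.8 * x
summary = regression_summary(observed, fitted, k=1)
print(summary)
```

The identity SYY = SSE + SSR holds for least-squares fits that include the intercept term; for arbitrary fitted values the two sides generally differ.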
Multiple Linear Regression and
Time Series-based forecasting
• The model needs to be linear with respect to the parameters $b_i$, but not necessarily with respect to the
explanatory variables $X_i$. Hence, the factor multiplying the parameter $b_i$ can be any
function $f_i$ of the underlying explanatory variables.
• When the only explanatory variable is just the time variable t, the resulting multiple
linear regression model essentially supports time-series analysis.
• The above approach for time-series analysis enables the study of more complex
dependencies on time than those addressed by the moving average and exponential
smoothing models.
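Using time as the only explanatory variable amounts to building the matrix X from functions of t. The sketch below constructs such a matrix; the function name and the particular basis (a quadratic trend plus a sine term with a period of 4) are hypothetical choices made only to illustrate the idea.

```python
import math


def design_matrix(times, basis_functions):
    """One row [1, f_1(t), ..., f_k(t)] per period t."""
    return [[1.0] + [f(t) for f in basis_functions] for t in times]


basis = [
    lambda t: float(t),                        # linear trend
    lambda t: float(t) ** 2,                   # quadratic trend
    lambda t: math.sin(2.0 * math.pi * t / 4)  # seasonal term, period 4 (assumed)
]
X = design_matrix(range(1, 9), basis)
for row in X[:3]:
    print(row)
```

The model stays linear in the parameters, so the same least-squares machinery applies regardless of how nonlinear the functions of t are.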
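• The quantity
$$T = \frac{[\hat{D}(x_0) - D(x_0)] \Big/ \sqrt{\sigma^2 [1 + x_0^T (X^T X)^{-1} x_0]}}{\sqrt{\dfrac{SSE/\sigma^2}{n-k-1}}} = \frac{\hat{D}(x_0) - D(x_0)}{\sqrt{MSE\,[1 + x_0^T (X^T X)^{-1} x_0]}}$$
follows a t distribution with $n-k-1$ degrees of freedom.
• For large samples, the distribution of $T$ can also be approximated by a standard normal
distribution.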
Adjusting the forecasted demand in
order to achieve a target service level p
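Letting $y$ denote the required adjustment, we essentially need to solve the following
equation:
$$P(D(x_0) \le \hat{D}(x_0) + y) = p$$
$$\Leftrightarrow\ P\left( \frac{D(x_0) - \hat{D}(x_0)}{\sqrt{MSE\,[1 + x_0^T (X^T X)^{-1} x_0]}} \le \frac{y}{\sqrt{MSE\,[1 + x_0^T (X^T X)^{-1} x_0]}} \right) = p$$
$$\Leftrightarrow\ \frac{y}{\sqrt{MSE\,[1 + x_0^T (X^T X)^{-1} x_0]}} = t_{p,n-k-1}$$
$$\Leftrightarrow\ y = t_{p,n-k-1} \sqrt{MSE\,[1 + x_0^T (X^T X)^{-1} x_0]}$$
where $t_{p,n-k-1}$ denotes the $p$-th quantile of the t distribution with $n-k-1$ degrees of freedom. The standardized quantity inside the probability has a t distribution with $n-k-1$ degrees of freedom because the t distribution is symmetric about zero.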
Remark: The two-sided confidence interval that is necessary for monitoring the
model performance can be obtained through a straightforward modification of the
above reasoning.
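The service-level adjustment can be sketched numerically. The Python standard library has no t-distribution quantile, so the code below replaces the t quantile by the standard normal quantile, which is only the large-sample approximation; the function name and the argument `leverage`, standing for the value of x0^T (X^T X)^(-1) x0, are hypothetical, and the input numbers are made up.

```python
from statistics import NormalDist


def service_level_adjustment(p, mse, leverage):
    """Large-sample approximation of y = t_p * sqrt(MSE * (1 + leverage)).

    p: target service level, strictly between 0 and 1.
    mse: mean squared error of the fitted regression model.
    leverage: value of x0^T (X^T X)^(-1) x0, assumed to be computed elsewhere.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z_p = NormalDist().inv_cdf(p)   # normal quantile standing in for the t quantile
    return z_p * (mse * (1.0 + leverage)) ** 0.5


y = service_level_adjustment(0.95, 4.0, 0.25)   # made-up MSE and leverage values
print(y)   # amount to add to the point forecast for a 95% service level
```

For small samples the normal quantile understates the t quantile, so the adjustment from this sketch would be somewhat too small. For a two-sided interval with confidence level 1 - alpha, the same function could be called with p = 1 - alpha/2 and the result applied on both sides of the point forecast.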