
132 reviewer

Forecasting

The predictability of an event depends on several factors:


- How well we understand the factors that contribute to it
- How much data are available
- Whether the forecasts can affect the thing we are trying to forecast

Good forecasts capture the genuine patterns and relationships which exist in the historical
data, but do not replicate past events that will not occur again.
- Know which fluctuations in past data are random and must be ignored
- And which are genuine patterns that should be modelled and extrapolated

A good forecasting model captures the way in which things are changing
- Forecasts rarely assume that the environment is unchanging
- But the forecast assumes that the way the environment is changing will continue in the
future.

The choice of method depends on what data are available and the predictability of the quantity
to be forecasted.

Forecasting, Planning and goals

Forecast – common statistical task in business and should be an integral part of business
decision-making.
- Predicting the future as accurately as possible given data and knowledge of future events
that might impact the forecasts.

Goals – what you want to happen


- Linked to forecasts, but goals do not always happen

Planning
- Responses to goals and forecast
- Appropriate actions to match forecast with goals

Forecasts can be short, medium, or long term

Forecasting methods depend largely on what data are available.


- No data available: qualitative methods
- Numerical information is available, and it is reasonable to assume that some aspects of the
past patterns will continue into the future: quantitative methods
Cross sectional data – single point in time
- Cross sectional models are used when the variable to be forecast exhibit a relationship with
one or more predictor variables.
- The purpose of the cross sectional models is to describe the form of the relationship and
use it to forecast values of the forecast variables that have not been observed.
- Predictors will affect the output in a predictable way – assuming that their relationship
does not change (e.g. regression models)
- Predict – cross-sectional; forecast – time series
Time series data – collected at regular intervals over time
- Time series data are useful when you are forecasting something that is changing over time
(stock prices, sales)
- When forecasting time series data, the aim is to estimate the sequence of observations
that will continue into the future.

Prediction interval - 80% PI: each future value is expected to lie in the dark region with a
probability of 80%
- Useful way of displaying the uncertainty in forecast
- In the example, forecasts are expected to be accurate, hence the prediction intervals are
quite narrow.

Time series forecasting


- Uses only information on the variable to be forecast, and makes no attempt to discover the
factors which affect its behavior.
- It will extrapolate trends and seasonal patterns, but ignores all other information such as
marketing initiatives

Predictor variables (explanatory vars) – explanatory model


- Can be used in time series forecasting
- Relationships will not always be exact; changes in the output cannot be completely
explained by the predictors, hence there is an error term
- The error term allows for random variation and for the effects of relevant variables not
included in the model.

Time series model


- Prediction of the future is based on past values of a variable, but not on external variables
which may affect the system.

A mix of both – dynamic regression models

Steps in forecasting task


1. Problem definition
2. Gathering information
3. Preliminary analysis – graph the data
4. Choosing and fitting models
5. Using and evaluating a forecasting model

The thing we are trying to forecast is unknown and so we treat it as a random variable

The variation associated with the thing we are forecasting will shrink as the event approaches.
When we obtain a forecast, we are estimating the middle of the range of possible values the
random variable could take.
A forecast is accompanied by a prediction interval giving a range of values the random variable
could take with a relatively high probability.

Example: a 95% PI contains a range of values the random variable could take at a 95%
probability.

The set of values that this random variable could take, together with their probabilities, is
known as the probability distribution → the forecast distribution.

When we talk about a forecast, we usually mean the average value of the forecast distribution,
denoted with a "hat". So the forecast of y1 is y1 hat (this is the average of the possible
values the random variable can take)

Chapter 2 – forecaster’s toolbox

The first thing to do with any data set is to plot the graph.


- Graphs enable many features of the data to be visualized, including patterns, unusual
observations, changes over time, and relationships between variables.
- The data determine what graphs to use.

1. Time plots
- For time series data
- Observations are plotted against the time of observation, with consecutive observations
joined by straight lines.

Antidiabetic drug sales time plot – increasing trend, strong seasonal pattern that increases in
size as the level of the series increases.

TIME SERIES PATTERNS


1. Trend – a trend exists when there is a long term increase or decrease in the data.
2. Seasonal – this occurs when a time series is affected by seasonal factors such as the time of
the year or the day of the week.
3. Cycle – occurs when the data exhibit rises and falls that are of no fixed period. These
fluctuations are usually due to economic conditions and are related to the business cycle.
Cyclic vs seasonality

- Seasonal patterns have fixed and known length


- Cyclic patterns have variable and unknown length
- The average length of cycles is usually longer than that of seasonal patterns, and their
magnitude is usually more variable

when choosing a forecasting method, we first need to identify the time series patterns in
the data and then choose a method that is able to capture the patterns properly.

2. seasonal plots
- similar to time plot but instead of time of observation, the data is plotted against the
individual “seasons” in which the data were observed.

Ex. From book


- data from all the seasons overlap in the plot
- instead of showing only the year, the x-axis is more specific (months), although the
label in the book's code is still xlab="Year"
- seasonal plots are useful in identifying years where there are changes in patterns.

Seasonal subseries plots


- data for each season are collected together in separate mini time plots.
- Horizontal lines show the mean for each month.
- Shows the changes in seasonality over time.

3. Scatterplots
- Useful for exploring relationships between variables in cross-sectional data
- Axes are the different variables
- Helps visualize the relationship between variables.

Non-linear relationship: a change in one variable does not correspond to a constant change in
another. Ex.: there is much less benefit in improving fuel economy from 30 to 40 mpg than
there was in moving from 20 to 30 mpg.

**a strong relationship between the variables is good for forecasting

scatterplot matrices
- If there are multiple predictor variables

Numerical data summaries


- A summary number calculated from the data is called a "statistic"
- Univariate statistics (single variable), for a single data set: average and median
- Percentiles
- Bivariate: the most commonly used statistic is the correlation coefficient – measures the
strength of the linear relationship between two variables (-1 to 1)
- The correlation coefficient can be low, indicating a weak linear relationship, while the
two variables actually have a strong non-linear relationship

Ex. -0.97 correlation coefficient: the value is negative because one variable decreases as the
other increases.

Autocorrelation
- Measures linear relationship between lagged values of a time series.
- Different coefficients depending on lag length
- Plot is known as correlogram

White noise
- Time series that show no autocorrelation
- Autocorrelations are close to zero – not exactly zero, as there is some random variation.

- If there are one or more large spikes outside the bounds, or if more than 5% of the spikes
lie outside the bounds, then the series is probably not white noise.

- T = length of the time series

- Bounds: ±2/√T (for white noise, about 95% of the spikes should lie within these bounds)
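A lag-k sample autocorrelation and the ±2/√T bounds can be computed by hand; a minimal Python sketch (the series below is made-up data, not from the notes):

```python
import math

def acf(y, k):
    """Lag-k sample autocorrelation: covariance of the series with its
    lag-k copy, divided by the total sum of squared deviations."""
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den

y = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]  # made-up series
T = len(y)
bound = 2 / math.sqrt(T)  # approximate 95% white-noise bounds
r1 = acf(y, 1)
# If |r_k| stays inside +/- bound for almost all lags,
# the series looks like white noise.
```

Plotting r_1, r_2, … against the lag gives the correlogram described above.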

SIMPLE FORECASTING METHODS


1. Average method – forecasts of all future values are equal to the mean of the historical
data (can be used for both time series and cross-sectional data; the remaining methods
are for time series only!!)
2. Naïve method – forecasts are simply set to be the value of the last observation.
3. Seasonal naïve
- Useful for highly seasonal data
- Set each forecast to be equal to the last observed value from the same season of the year
(same month of the previous year)
- Ex. The forecast for all future feb values is the last observed value of feb
4. Drift method
- A variation of the naïve method is to allow increase or decrease over time, where the
amount of change over time (called the drift) is set to be the average change seen in the
historical data.
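The four methods above can be sketched in a few lines of Python (the series and horizon are made-up examples):

```python
def average_forecast(y, h):
    """Mean method: all h future forecasts equal the historical mean."""
    return [sum(y) / len(y)] * h

def naive_forecast(y, h):
    """Naive method: forecasts equal the last observation."""
    return [y[-1]] * h

def seasonal_naive_forecast(y, h, m):
    """Seasonal naive: each forecast equals the last observed value from
    the same season (m = seasonal period, e.g. 12 for monthly data)."""
    return [y[len(y) - m + (i % m)] for i in range(h)]

def drift_forecast(y, h):
    """Drift: naive forecast plus the average historical change per period."""
    drift = (y[-1] - y[0]) / (len(y) - 1)
    return [y[-1] + (i + 1) * drift for i in range(h)]

y = [10, 12, 14, 16]  # made-up series
# average change per period = (16 - 10) / 3 = 2,
# so the drift forecasts continue 18, 20, ...
```

Note that the drift only depends on the first and last observations: it is the slope of the line joining them.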

Transformations and adjustments


- Adjusting the historical data can often lead to a simpler forecasting model.
- The purpose is to simplify the patterns in the historical data by removing known sources of
variation or by making the pattern more consistent across the whole data set → a simpler
data set usually leads to more accurate forecasts.
1. Mathematical transformations
- If the data show variation that increases or decreases with the level of the series,
transformation can be useful.
- Log transformation: relative changes (percentage change) on the original scale.
- Log transformations constrain the forecasts to stay positive on the original scale.
- Power transformations = square root and cube root transformations

Box-Cox transformation
- Family of transformations that includes log and power transformations
- Depends on the parameter (lambda)
o The log in Box-Cox is always a natural log (base e)
- A good value of lambda is one that makes the size of the seasonal variation about the
same across the whole series, as that makes the forecast model simpler.

1. Forecast the transformed data.

2. Then reverse the transformation to obtain forecasts on the original scale (apply the
reverse Box-Cox transformation).
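The transform-then-reverse pair can be sketched directly from the Box-Cox definition (w = ln(y) when lambda = 0, otherwise w = (y^lambda − 1)/lambda); the data below are made up:

```python
import math

def boxcox(y, lam):
    """Box-Cox transform: natural log when lambda == 0,
    (y**lam - 1) / lam otherwise."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1) / lam for v in y]

def inv_boxcox(w, lam):
    """Reverse Box-Cox: undo the transform to return to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in w]
    return [(lam * v + 1) ** (1 / lam) for v in w]

y = [1.0, 4.0, 9.0]        # made-up data
w = boxcox(y, 0.5)         # step 1: model/forecast on this scale, then...
back = inv_boxcox(w, 0.5)  # step 2: ...reverse-transform the forecasts
```

With lambda = 0.5 the transform is (sqrt(y) − 1)/0.5, and reversing it recovers the original values exactly.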

#2 calendar adjustments
- Some variation seen in seasonal data may be due to simple calendar effects (remove that
variation), e.g. different numbers of days in the month
- Simple patterns are easier to model – which leads to more accurate forecasts.

#3 population adjustments
- Any data that are affected by population changes can be adjusted to give per capita data.
(number of beds per thousand people)

#4 inflation adjustments
- Data that are affected by the value of money are best adjusted before modelling. (financial
time series)
- Use a price index!! (z = price index) then adjusted price = y/z
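A tiny worked example of deflating by a price index. The notes give adjusted price = y/z; multiplying by 100 (the index base) is a common convention, not something stated in the notes, and the numbers are made up:

```python
# Deflate a nominal price series by a price index z, expressing every
# value in base-year dollars: adjusted = y / z * 100 (base year index = 100).
prices = [200.0, 220.0, 260.0]   # made-up nominal prices
cpi    = [100.0, 110.0, 130.0]   # made-up price index

real_prices = [y / z * 100 for y, z in zip(prices, cpi)]
# Here every year has the same real price, so the apparent rise
# in the nominal series was entirely inflation.
```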

EVALUATING FORECAST ACCURACY

Scale-dependent errors
- Forecast error: e = actual minus forecast
- Accuracy measures that are based on e are scale dependent and cannot be compared with
errors on different scales.

Scale-dependent measures: (absolute errors or squared errors)

MAE and RMSE


Mean absolute error (MAE) = mean of the absolute errors
Root mean squared error (RMSE) = square root of the mean of the squared errors

*For a single data set (univariate), MAE is the popular measure to use

percentage errors = p = 100e/y


- Scale independent and so is used when comparing forecasts of different data sets.

MAPE – mean of absolute percentage error


- Percentage errors have the disadvantage of being undefined if y = 0, and of taking
extreme values when y is close to 0.
- Percentage errors assume the measurement scale has a meaningful zero.
- Disadvantage: they place a heavier penalty on negative errors than on positive ones.

sMAPE – symmetric MAPE

lower values of the accuracy measures == better method!!!
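The measures above are easy to compute from scratch; a minimal sketch with made-up actual/forecast values:

```python
import math

def errors(actual, forecast):
    """Forecast errors e = actual - forecast."""
    return [a - f for a, f in zip(actual, forecast)]

def mae(actual, forecast):
    """Mean absolute error (scale-dependent)."""
    e = errors(actual, forecast)
    return sum(abs(v) for v in e) / len(e)

def rmse(actual, forecast):
    """Root mean squared error (scale-dependent)."""
    e = errors(actual, forecast)
    return math.sqrt(sum(v * v for v in e) / len(e))

def mape(actual, forecast):
    """Mean absolute percentage error, p = 100e/y (scale-independent;
    undefined if any actual value is 0)."""
    return sum(abs(100 * (a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

actual   = [100.0, 200.0, 400.0]  # made-up values
forecast = [110.0, 190.0, 400.0]
```

Because MAPE divides each error by the actual value, the same 10-unit error counts for less on the larger observations.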

**it is important to evaluate forecast accuracy using genuine forecasts**


**it is invalid to look at how well a model fits the historical data. The accuracy of forecasts can
only be determined by considering how well a model performs on new data that were not used
when fitting the model**

**training + test data set**

test data set is used to measure how well the model is likely to forecast on new data. (usually
20% of the sample data)

the test data set is also called the hold-out set == held out from fitting (out-of-sample data)
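The split described above can be sketched as follows (the 20% fraction follows the note; the data are made up, and for time series the split must keep time order, never shuffle):

```python
def train_test_split(y, test_frac=0.2):
    """Hold out the last test_frac of a series as the test (hold-out) set.
    The model is fitted on train and evaluated on test only."""
    cut = int(len(y) * (1 - test_frac))
    return y[:cut], y[cut:]

y = list(range(10))  # made-up series of 10 observations
train, test = train_test_split(y)
# train = first 8 observations, test = last 2 (the out-of-sample data)
```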

RESIDUAL DIAGNOSTICS

Residual – difference between the observed value and its forecast (e=y-y hat)

A good forecasting method will yield residuals with the following properties:
- Residuals are uncorrelated.
- Zero mean (if mean is not zero, it is biased)
- Residual properties alone are not a good way of determining which method to use.

*if the residuals have a mean other than zero → e.g. if the mean is m, just add m to all
forecasts to fix the bias (a correlation problem is harder to fix)

prediction intervals assume normal distribution


portmanteau test for autocorrelation (autocorrelation concerns the lagged values)
- Considers the whole set of autocorrelations rather than each one separately.
- We test whether the first h autocorrelations are significantly different from what would
be expected from a white noise process.
- Box-Pierce test
- Ljung-Box test

Large values of Q suggest that the autocorrelations do not come from a white noise series.
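Both statistics are simple sums over the sample autocorrelations r_k; a sketch from the standard formulas (Box-Pierce Q = T Σ r_k², Ljung-Box Q* = T(T+2) Σ r_k²/(T−k)), with acf redefined so the block is self-contained:

```python
def acf(y, k):
    """Lag-k sample autocorrelation (same as the earlier sketch)."""
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k))
    den = sum((v - ybar) ** 2 for v in y)
    return num / den

def box_pierce(y, h):
    """Box-Pierce statistic: Q = T * sum_{k=1..h} r_k^2."""
    T = len(y)
    return T * sum(acf(y, k) ** 2 for k in range(1, h + 1))

def ljung_box(y, h):
    """Ljung-Box statistic: Q* = T (T + 2) * sum_{k=1..h} r_k^2 / (T - k).
    Large values suggest the series is not white noise."""
    T = len(y)
    return T * (T + 2) * sum(acf(y, k) ** 2 / (T - k) for k in range(1, h + 1))
```

In practice Q (or Q*) is compared against a chi-squared distribution; the (T+2)/(T−k) factor makes Ljung-Box slightly larger and better behaved in small samples.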

PREDICTION INTERVAL
- Gives an interval within which we expect y to lie with a specified probability.
- If the method has estimated parameters: the sd of the forecast distribution is larger
than the residual sd.
- If there are no parameters (e.g. naïve): the two are the same.
- The PI widens as the forecast horizon increases → the further ahead we forecast, the
more uncertainty we have.
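For the naive method this widening has a simple form: the h-step forecast sd is the one-step residual sd times √h, so (assuming normally distributed residuals, as the notes state) an approximate 95% PI is ŷ ± 1.96·σ̂·√h. A sketch with a made-up series:

```python
import math

def naive_pi(y, h, z=1.96):
    """Approximate 95% prediction interval for an h-step naive forecast,
    assuming normal residuals. The interval half-width grows with sqrt(h),
    so it widens as the forecast horizon increases."""
    resid = [y[t] - y[t - 1] for t in range(1, len(y))]  # naive residuals
    sd = math.sqrt(sum(r * r for r in resid) / len(resid))
    fc = y[-1]                       # naive point forecast
    half = z * sd * math.sqrt(h)
    return fc - half, fc + half

y = [10.0, 12.0, 11.0, 13.0, 12.0]  # made-up series
lo1, hi1 = naive_pi(y, 1)
lo4, hi4 = naive_pi(y, 4)
# The 4-step interval is exactly twice as wide as the 1-step interval.
```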

Chapter 6
TIME SERIES DECOMPOSITION

- Decomposing time series data into several components to better understand the time
series and improve forecasts.

- We shall think of the time series as comprising three components: trend-cycle, seasonal,
and remainder

Seasonally adjusted data


- If the seasonality is removed from the data, the data are seasonally adjusted.
- A seasonally adjusted series contains the trend-cycle and the remainder; it is therefore
not smooth, and its ups and downs can be misleading.

Moving averages
- The first step in classical decomposition is to use the moving average method to estimate
the trend-cycle

Moving ave smoothing


- The estimate of the trend-cycle at time t is obtained by averaging the values of the time
series within k periods of t.
- The average eliminates some of the randomness in the data, leaving a smooth trend-cycle
component → m-MA: a moving average of order m, where m = 2k+1
- The moving average method does not allow estimates of the trend-cycle where t is close to
the ends of the series; there are no estimates at the endpoints

The order of the MA determines the smoothness of the trend-cycle estimate. A larger order
means a smoother curve.
- The order of the MA is usually odd so that the average is symmetric.
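An odd-order centered MA is just the mean of the m values around each point; a minimal sketch (made-up series):

```python
def moving_average(y, m):
    """Centered m-MA for odd m: the average of the m values within
    k = (m - 1) / 2 periods of t. Returns None near the ends of the
    series, where no trend-cycle estimate is available."""
    k = (m - 1) // 2
    out = []
    for t in range(len(y)):
        if t < k or t > len(y) - 1 - k:
            out.append(None)  # no estimate at the endpoints
        else:
            out.append(sum(y[t - k : t + k + 1]) / m)
    return out

y = [1, 3, 5, 7, 9, 11]  # made-up series
# A 3-MA leaves the two endpoints as None; interior values are local means.
```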

MA of MA – weighted MA
- Used to make an even-order MA symmetric
- 2x4-MA – a 4-MA followed by a 2-MA
- When a 2-MA follows an MA of even order (say 4), it is called a "centered moving average
of order 4"
- Follow an even order with an even order; an odd order with an odd order

- When applied to quarterly data, each quarter of the year is given equal weight, as the
first and last terms apply to the same quarter of consecutive years.

- Even-even: equivalent to weights of 1/m, except at the endpoints, which get 1/(2m)

- *The weights sum to 1 and they are symmetric

CLASSICAL DECOMPOSITION

- additive decomposition and multiplicative decomposition

additive decomp
- 1. Compute the moving average to estimate the trend-cycle (T)
- if m is an even number, use a 2xm-MA
- if odd: an m-MA
- 2. Detrend
- y − T (T is the trend-cycle estimate)
- 3. Estimate the seasonal component: the average of the detrended values for that
month/quarter → the seasonal component is obtained by stringing together all the seasonal
indices for each year of data. This gives S hat
- 4. Get the remainder by subtracting the trend-cycle and seasonal components from the
actual value of the time series

multiplicative decomp
- same as additive, but replace subtraction with division
- remainder: e = y/(T·S)
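The four additive steps can be sketched end to end. This is a simplified version: it assumes an even period m (so a 2xm-MA is used), and it skips the usual adjustment that forces the seasonal indices to sum to zero; the test data are made up:

```python
def additive_decompose(y, m):
    """Classical additive decomposition for an even seasonal period m.
    Returns (trend, seasonal, remainder); entries are None where the
    trend-cycle estimate is unavailable (near the ends of the series)."""
    n = len(y)
    k = m // 2
    # Step 1: 2xm-MA trend-cycle estimate
    # (endpoint weights 1/(2m), interior weights 1/m).
    trend = [None] * n
    for t in range(k, n - k):
        w = y[t - k : t + k + 1]
        trend[t] = (w[0] / 2 + sum(w[1:-1]) + w[-1] / 2) / m
    # Step 2: detrend.
    detr = [y[t] - trend[t] if trend[t] is not None else None for t in range(n)]
    # Step 3: seasonal index = average detrended value for each season.
    seas_idx = []
    for s in range(m):
        vals = [detr[t] for t in range(s, n, m) if detr[t] is not None]
        seas_idx.append(sum(vals) / len(vals))
    seasonal = [seas_idx[t % m] for t in range(n)]
    # Step 4: remainder = y - trend - seasonal.
    remainder = [y[t] - trend[t] - seasonal[t] if trend[t] is not None else None
                 for t in range(n)]
    return trend, seasonal, remainder
```

For a multiplicative decomposition, the subtractions in steps 2 and 4 become divisions, as the notes say.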

comments on classical decomp:


- the trend estimate is unavailable for the first few and last few observations.
- Hence there is no estimate of the remainder there either
- Classical decomposition assumes constant seasonality

STL – "seasonal and trend decomposition using Loess"
