
Contents

Chapter 1 Introduction
1.1 Conceptual understanding: Definition of Time Series
    1.1.1 Forecasting
    1.1.2 Modelling
1.2 Time series methods
    1.2.1 Autoregressive integrated moving average (ARIMA): Box-Jenkins
1.3 New terms
    1.3.1 Stationary process
    1.3.2 Ergodicity
Chapter 2 Applications of forecasting and modelling
Chapter 1 Introduction

1.1 Definition of Time Series

A time series is a sequence of observations of a random variable: a set of ordered observations on a quantitative characteristic of a phenomenon, taken at equally spaced time points. In statistics, signal processing, econometrics and mathematical finance, a time series is a sequence of data points, typically measured at successive times spaced at uniform intervals. A time series is thus the historical record of some activity, with measurements taken at equally spaced intervals (monthly intervals are a near-exception, since months vary slightly in length) and with consistency in both the activity and the method of measurement. Examples include the monthly demand for a product, the annual freshman enrollment in a department of a university, and the daily volume of flows in a river. An example of a time series for 25 periods is plotted in Fig. 1 from the numerical data in Table 1. The data might represent the weekly demand for some product. We use x to indicate an observation and the subscript t to represent the index of the time period, so the observed demand for time t is designated x_t. For the case of weekly demand the time period is measured in weeks. The lines connecting the observations in the figure are provided only to clarify the graph and otherwise have no meaning.

Mathematical Model
Our goal is to determine a model that explains the observed data and allows extrapolation into the future to provide a forecast. The simplest model suggests that the time series in Fig. 1 is a constant value b with variations about b determined by a random variable ε_t:

X_t = b + ε_t .......................................................... (1)

The upper-case symbol X_t represents the random variable that is the unknown demand at time t, while the lower-case symbol x_t is a value that has actually been observed. The random variation ε_t about the mean value is called the noise, and is assumed to have a mean value of zero and a given variance. It is also common to assume that the noise variations in two different time periods are independent.
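As a concrete sketch of model (1), the following snippet simulates a constant-plus-noise series and estimates b by the sample mean, which under this model is also the forecast for every future period. The level b and the noise standard deviation are invented for illustration.

```python
import random

random.seed(42)

# Model (1): X_t = b + eps_t, with b an unknown constant level and
# eps_t zero-mean noise. The values below are invented for illustration.
b = 10.0
noise_sd = 2.0

# Simulate 25 observed values x_1, ..., x_25
x = [b + random.gauss(0.0, noise_sd) for _ in range(25)]

# Under this model the natural estimate of b is the sample mean,
# which also serves as the forecast for any future period.
b_hat = sum(x) / len(x)
print(round(b_hat, 2))
```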

Analysis of Time Series


Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data. Time series data have a natural temporal ordering. This makes time series analysis distinct from other common data analysis problems, in which there is no natural ordering of the observations (e.g. explaining people's wages by reference to their education level, where the individuals' data could be entered in any order). Time series analysis is also distinct from spatial data analysis, where the observations typically relate to geographical locations (e.g. accounting for house prices by the location as well as the intrinsic characteristics of the houses). Time series analysis provides tools for selecting a model that can be used to forecast future events.

There are two main goals of time series analysis: (a) Identifying the nature of the phenomenon represented by the sequence of observations, and (b) Forecasting (predicting future values of the time series variable).

Both of these goals require that the pattern of observed time series data is identified and more or less formally described. Once the pattern is established, we can interpret and integrate it with other data (i.e., use it in our theory of the investigated phenomenon, e.g., seasonal commodity prices). Regardless of the depth of our understanding and the validity of our interpretation (theory) of the phenomenon, we can extrapolate the identified pattern to predict future events.

Notation:
A number of different notations are in use for time-series analysis. A common notation specifying a time series X that is indexed by the natural numbers is X = {X1, X2, ...}. Another common notation is Y = {Y_t : t ∈ T}, where T is the index set.

Different types of Analysis of Time Series


There are several types of data analysis available for time series, appropriate for different purposes:

General exploration
- Graphical examination of data series
- Spectral analysis to examine cyclic behaviour which need not be related to seasonality. For example, sunspot activity varies over 11-year cycles; other common examples include celestial phenomena, weather patterns, neural activity, commodity prices, and economic activity.

Description
- Separation into components representing trend, seasonality, slow and fast variation, and cyclical irregularity
- Simple properties of marginal distributions

Prediction and forecasting

- Fully formed statistical models for stochastic simulation purposes, so as to generate alternative versions of the time series, representing what might happen over non-specific time periods in the future

- Simple or fully formed statistical models to describe the likely outcome of the time series in the immediate future, given knowledge of the most recent outcomes (forecasting)

Forecasts are used in computational procedures to

estimate the parameters of a model being used to allocate limited resources or to describe random processes such as those mentioned above. Time series models assume that observations vary according to some probability distribution about an underlying function of time. Once a model is selected and data are collected, it is the job of the statistician to estimate its parameters, i.e., to find parameter values that best fit the historical data. Statisticians usually assume that all values in a given sample are equally valid. For time series, however, most methods recognize that recent data are more representative than aged data. Influences governing the data are likely to change with time, so a method should be able to deemphasize old data while favoring the new. A model estimate should be designed to reflect changing conditions.
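Of the exploratory tools listed above, spectral analysis is the hardest to picture from prose alone. The sketch below uses only the standard library and an invented series whose cycle length of 11 loosely echoes the sunspot example; it recovers the cycle length from the magnitudes of a naive discrete Fourier transform.

```python
import math
import cmath

# Invented series with an exact 11-period cycle (cf. the sunspot example)
n = 110
period = 11
series = [math.sin(2 * math.pi * t / period) for t in range(n)]

# Naive O(n^2) discrete Fourier transform magnitudes (stdlib only)
def dft_magnitudes(xs):
    m = len(xs)
    return [abs(sum(xs[t] * cmath.exp(-2j * math.pi * k * t / m)
                    for t in range(m)))
            for k in range(m // 2)]

mags = dft_magnitudes(series)
k_peak = max(range(1, len(mags)), key=lambda k: mags[k])
print(n / k_peak)   # → 11.0, the cycle length recovered from the data
```

In practice one would use a fast Fourier transform and a smoothed periodogram, but the principle is the same: a strong cycle shows up as a peak at the corresponding frequency.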

1.1.1 Forecasting

Forecasting is the process of making statements about events whose actual outcomes (typically) have not yet been observed. A commonplace example might be estimation of some variable of interest at some specified future date. Prediction is a similar, but more general, term. Both might refer to formal statistical methods employing time series, cross-sectional or longitudinal data, or alternatively to less formal judgemental methods. The terms "forecast" and "forecasting" are sometimes reserved for estimates of values at certain specific future times, while the term "prediction" is used for more general estimates, such as the number of times floods will occur over a long period. Risk and uncertainty are central to forecasting and prediction; it is generally considered good practice to indicate the degree of uncertainty attaching to forecasts. Forecasting can be described as predicting what the future will look like, whereas planning predicts what the future should look like.
Categories of forecasting methods: Qualitative vs. Quantitative

Qualitative forecasting techniques are subjective, based on the opinion and judgment of consumers and experts; they are appropriate when past data are not available, and are usually applied to intermediate- to long-range decisions. Examples of qualitative forecasting methods:

- Informed opinion and judgment
- Delphi method
- Market research
- Historical life-cycle analogy

Quantitative forecasting models are used to estimate future demand as a function of past data; they are appropriate when past data are available, and are usually applied to short- to intermediate-range decisions. Examples of quantitative forecasting methods:

- Last-period demand
- Arithmetic average
- Simple moving average (N-period)
- Weighted moving average (N-period)
- Simple exponential smoothing
- Multiplicative seasonal indexes
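Two of the listed methods can be sketched in a few lines; the demand figures below are invented for illustration.

```python
# Invented weekly demand history
demand = [20, 22, 19, 25, 30, 28]

# Simple Moving Average (N-period): average of the last N observations
def simple_moving_average(xs, n):
    return sum(xs[-n:]) / n

# Weighted Moving Average (N-period): recent periods weighted more heavily
def weighted_moving_average(xs, weights):
    recent = xs[-len(weights):]                 # oldest -> newest
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)

print(simple_moving_average(demand, 3))            # (25 + 30 + 28) / 3
print(weighted_moving_average(demand, [1, 2, 3]))  # weight 3 on newest period
```

The weighted version is a first step toward deemphasizing old data; exponential smoothing carries the same idea to its limit.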

In making a forecast, it is also important to provide a measure of how accurate one can expect the forecast to be. The use of intuitive methods usually precludes any quantitative measure of confidence in the resulting forecast. The statistical analysis of the individual relationships that make up a model, and of the model as a whole, makes it possible to attach a measure of confidence to the model's forecasts. Once a model has been constructed and fitted to data, a sensitivity analysis can be used to study many of its properties. In particular, the effects of small changes in individual variables in the model can be evaluated. For example, in the case of a model that describes and predicts interest rates, one could measure the effect on a particular interest rate of a change in the rate of inflation. This type of sensitivity study can be performed only if the model is an explicit one.

In time-series models we presume to know nothing about the causality that affects the variable we are trying to forecast. Instead, we examine the past behavior of a time series in order to infer something about its future behavior. The method used to produce a forecast may involve the use of a simple deterministic model such as a linear extrapolation or the use of a complex stochastic model for adaptive forecasting. One example of the use of time-series analysis would be the simple extrapolation of a past trend in predicting population growth. Another example would be the development of a complex linear stochastic model for passenger loads on an airline. Time-series models have been used to forecast the demand for airline capacity, seasonal telephone demand, the movement of short-term interest rates, and other economic variables. Time-series models are particularly useful when little is known about the underlying process one is trying to forecast. The limited structure in time-series models makes them reliable only in the short run, but they are nonetheless rather useful.
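A minimal instance of the adaptive forecasting mentioned above is simple exponential smoothing, which also appears in the list of quantitative methods. In the sketch below the demand data and the smoothing constant alpha are invented; each new observation is blended with the previous smoothed level, so old data are geometrically deemphasized.

```python
# Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}
def exponential_smoothing(xs, alpha=0.5):
    s = xs[0]                 # initialize the level at the first observation
    levels = [s]
    for x in xs[1:]:
        s = alpha * x + (1 - alpha) * s
        levels.append(s)
    return levels

demand = [20, 22, 19, 25, 30, 28, 33]   # invented demand history
levels = exponential_smoothing(demand, alpha=0.5)
forecast = levels[-1]                   # forecast for the next period
print(round(forecast, 2))               # → 30.06
```

A larger alpha makes the forecast track recent observations more closely; a smaller alpha smooths more aggressively.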

1.1.2 Modeling
Modeling the time series is a statistical problem. Models for time series data can have many forms and represent different stochastic processes. When modeling variations in the level of a process, three broad classes of practical importance are the autoregressive (AR) models, the integrated (I) models, and the moving average (MA) models. These three classes depend linearly on previous data points. Combinations

of these ideas produce autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models. The autoregressive fractionally integrated moving average (ARFIMA) model generalizes the former three. Extensions of these classes to deal with vector-valued data are available under the heading of multivariate time-series models, and sometimes the preceding acronyms are extended by including an initial "V" for "vector". An additional set of extensions of these models is available for use where the observed time series is driven by some "forcing" time series (which may not have a causal effect on the observed series): the distinction from the multivariate case is that the forcing series may be deterministic or under the experimenter's control. For these models, the acronyms are extended with a final "X" for "exogenous".

The main purpose of modeling a time series is to make forecasts, which are then used directly for making decisions, such as ordering replenishments for an inventory system or developing staff schedules for running a production facility. They might also be used as part of a mathematical model for a more complex decision analysis. In the analysis, let the current time be T, and assume that the demand data for periods 1 through T are known. Say we are attempting to forecast the demand at time T + τ in the example presented above. The unknown demand is the random variable X_{T+τ}, and its ultimate realization is x_{T+τ}. Our forecast of the realization is x̂_{T+τ}. Of course, the best that we can hope to do is estimate the mean value of X_{T+τ}: even if the time series actually follows the assumed model, the future value of the noise is unknowable. Assuming the model is correct,

X_{T+τ} = E[X_{T+τ}] + ε_{T+τ}, where E[X_{T+τ}] = f(b0, b1, b2, ..., bn, T+τ).

When we estimate the parameters from the data for times 1 through T, we have an estimate of the expected value for the random variable as a function of τ. This is our forecast:

x̂_{T+τ} = f(b̂0, b̂1, b̂2, ..., b̂n, T+τ).
Using a specific value of τ in this formula provides the forecast for period T+τ. When we look at the last T observations as only one of the possible time series that could have been obtained from the model, the forecast is itself a random variable, and we should be able to describe its probability distribution, including its mean and variance.

Modeling for Forecasting Accuracy and Validation Assessments

Forecasting is a necessary input to planning, whether in business or government. Often, forecasts are generated subjectively and at great cost by group discussion, even when relatively simple quantitative methods can perform just as well or, at the very least, provide an informed input to such discussions.

Data Gathering for Verification of Model: Data gathering is often considered "expensive". Indeed, technology "softens" the mind, in that we become reliant on devices; however, reliable data are needed to verify a quantitative model. Mathematical models, no matter how elegant, sometimes escape the appreciation of the decision-maker. In other words, some people think algebraically; others see geometrically. When the data are complex or multidimensional, there is all the more reason for working with equations, though appealing to the intellect has a more down-to-earth undertone: beauty is in the eye of the beholder, and here the beholder is the decision-maker, not the modeler.
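The estimate-then-extrapolate recipe above, x̂_{T+τ} = f(b̂0, ..., b̂n, T+τ), can be sketched for the simplest non-constant case, a linear-trend model f(b0, b1, t) = b0 + b1·t. The demand history and forecast horizon below are invented for illustration.

```python
# Invented demand history for periods 1..T with a roughly linear trend
x = [10.2, 11.1, 11.9, 13.2, 13.8, 15.1, 15.9, 17.2]
T = len(x)
t = list(range(1, T + 1))

# Ordinary least squares estimates of b0 and b1
t_bar = sum(t) / T
x_bar = sum(x) / T
b1_hat = (sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
          / sum((ti - t_bar) ** 2 for ti in t))
b0_hat = x_bar - b1_hat * t_bar

tau = 3                                  # forecast horizon
forecast = b0_hat + b1_hat * (T + tau)   # estimate of E[X_{T+tau}]
print(round(forecast, 2))
```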

The following flowchart highlights the systematic development of the modeling and forecasting phases:

1.2 Time series methods

Time series methods use historical data as the basis of estimating future outcomes. Common examples include:
- Moving average
- Weighted moving average
- Exponential smoothing
- Autoregressive moving average (ARMA)
- Autoregressive integrated moving average (ARIMA), e.g. Box-Jenkins
- Extrapolation
- Linear prediction
- Trend estimation
- Growth curve

Autoregressive integrated moving average (ARIMA)

Box-Jenkins Methodology

Box-Jenkins Forecasting Method: The univariate version of this methodology is a self-projecting time series forecasting method. The underlying goal is to find an appropriate formula so that the residuals are as small as possible and exhibit no pattern. The model-building process involves a few steps, repeated as necessary, to end up with a specific formula that replicates the patterns in the series as closely as possible and also produces accurate forecasts.

Box-Jenkins forecasting models are based on statistical concepts and principles and are able to model a wide spectrum of time series behavior. The approach offers a large class of models to choose from and a systematic procedure for identifying the correct model form. There are both statistical tests for verifying model validity and statistical measures of forecast uncertainty. In contrast, traditional forecasting models offer a limited number of models relative to the complex behavior of many time series, with little in the way of guidelines and statistical tests for verifying the validity of the selected model.

Data: The misuse, misunderstanding, and inaccuracy of forecasts are often the result of not appreciating the nature of the data in hand. The consistency of the data must be ensured, and it must be clear what the data represent and how they were gathered or calculated. As a rule of thumb, Box-Jenkins requires at least 40 or 50 equally spaced periods of data. The data must also be edited to deal with extreme or missing values or other distortions, through the use of functions such as the log or inverse to achieve stabilization.

Preliminary Model Identification Procedure: A preliminary Box-Jenkins analysis with a plot of the initial data should be run as the starting point in determining an appropriate model. The input data must be adjusted to form a stationary series, one whose values vary more or less uniformly about a fixed level over time. Apparent trends can be adjusted by having the model apply a technique of "regular differencing", a process of computing the difference between every two successive values, yielding a differenced series from which the overall trend behavior has been removed. If a single differencing does not achieve stationarity, it may be repeated, although rarely, if ever, are more than two regular differencings required.
Where irregularities in the differenced series continue to be displayed, log or inverse functions can be specified to stabilize the series, such that the remaining residual plot displays values approaching zero, without any pattern. This is the error term, equivalent to pure white noise.

Pure Random Series: On the other hand, if the initial data series displays neither trend nor seasonality, and the residual plot shows essentially zero values within a 95% confidence level and these residual values display no pattern, then there is no real-world statistical problem to solve and we go on to other things.
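The "regular differencing" step described above amounts to one line of code; the trending series here is invented for illustration.

```python
# Regular differencing: the difference between every two successive
# values, which removes a linear trend from the series.
def difference(xs):
    return [b - a for a, b in zip(xs, xs[1:])]

series = [5, 7, 10, 12, 15, 17, 20, 22]   # invented series with upward trend
d1 = difference(series)
print(d1)   # → [2, 3, 2, 3, 2, 3, 2]: trend removed, values about a fixed level
```

Applying `difference` to `d1` again would give a second regular differencing, which, as noted above, is rarely needed more than twice.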

Model Identification Background

Basic Model: With a stationary series in place, a basic model can now be identified. Three basic models exist: AR (autoregressive), MA (moving average) and a combined ARMA, in addition to the previously specified RD (regular differencing); these comprise the available tools. When regular differencing is applied together with AR and MA, the result is referred to as ARIMA, with the I indicating "integrated" and referencing the differencing procedure.

Seasonality: In addition to trend, which has now been provided for, stationary series quite commonly display seasonal behavior, where a certain basic pattern tends to be repeated at regular seasonal intervals. The seasonal pattern may additionally display constant change over time. Just as regular differencing was applied to the overall trending series, seasonal differencing (SD) is applied to seasonal non-stationarity. And as autoregressive and moving average tools are available for the overall series, so too are they available for seasonal phenomena, using seasonal autoregressive (SAR) and seasonal moving average (SMA) parameters.

Establishing Seasonality: The need for seasonal autoregression (SAR) and seasonal moving average (SMA) parameters is established by examining the autocorrelation and partial autocorrelation patterns of a stationary series at lags that are multiples of the number of periods per season. These parameters are required if the values at lags s, 2s, etc. are nonzero and display patterns associated with the theoretical patterns for such models. Seasonal differencing is indicated if the autocorrelations at the seasonal lags do not decrease rapidly.
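The seasonal-lag check described above can be sketched as follows. The sample autocorrelation function is standard; the quarterly series (season length s = 4) is invented for illustration.

```python
# Sample autocorrelation of a series at a given lag
def autocorrelation(xs, lag):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((v - mean) ** 2 for v in xs)
    cov = sum((xs[t] - mean) * (xs[t + lag] - mean) for t in range(n - lag))
    return cov / var

# Invented quarterly series with a repeating seasonal pattern (s = 4)
series = [10, 14, 18, 12] * 6

print(round(autocorrelation(series, 4), 2))   # large at the seasonal lag s
print(round(autocorrelation(series, 1), 2))   # small at a non-seasonal lag
```

A large autocorrelation at lags s, 2s, ... that decays only slowly would instead point to seasonal differencing, as noted above.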

1.3 New terms

1.3.1 Stationary process

In the mathematical sciences, a stationary process (or strict(ly) stationary process or strong(ly) stationary process) is a stochastic process whose joint probability distribution does not change when shifted in time or space. Consequently, parameters such as the mean and variance, if they exist, also do not change over time or position. Stationarity is used as a tool in time series analysis, where the raw data are often transformed to become stationary; for example, economic data are often seasonal and/or dependent on a non-stationary price level. An important type of non-stationary process that nevertheless does not exhibit trend-like behavior is the cyclostationary process. Note that a "stationary process" is not the same thing as a "process with a stationary distribution". There are further possibilities for confusion between uses of "stationary" in the context of stochastic processes; for example, a "time-homogeneous" Markov chain is sometimes described as having stationary transition probabilities.

1.3.2 Ergodicity
An attribute of stochastic systems; generally, a system that tends in probability to a limiting form that is independent of the initial conditions. The term ergodic describes a dynamical system which, broadly speaking, has the same behaviour averaged over time as averaged over space.
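A quick numerical illustration of this idea: for an ergodic process, the time average of a single long realization approaches the ensemble average taken across many independent realizations. The process below, i.i.d. Gaussian noise about an invented mean of 5.0, is a trivially ergodic case.

```python
import random

random.seed(0)

# One realization of the process: i.i.d. Gaussian values about 5.0
def realization(n):
    return [5.0 + random.gauss(0.0, 1.0) for _ in range(n)]

# Time average: one long realization, averaged over time
time_avg = sum(realization(10_000)) / 10_000

# Ensemble average: many independent realizations, observed at a fixed time
ensemble_avg = sum(realization(1)[0] for _ in range(10_000)) / 10_000

print(round(time_avg, 2), round(ensemble_avg, 2))   # both close to 5.0
```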

Chapter 2 Applications of forecasting

Forecasting has application in many situations:
- Weather forecasting, flood forecasting and meteorology
- Transport planning and transportation forecasting
- Economic forecasting
- Egain forecasting
- Technology forecasting
- Earthquake prediction
- Land use forecasting

- Product forecasting
- Player and team performance in sports
- Telecommunications forecasting
- Political forecasting
- Sales forecasting
- Supply chain management: forecasting can be used to make sure that the right product is at the right place at the right time. Accurate forecasting helps retailers reduce excess inventory and thereby increase profit margin, and also helps them meet consumer demand.

Economic forecasting is the process of making predictions about the economy. Forecasts can be carried out at a high level of aggregation (for example, for GDP, inflation, unemployment or the fiscal deficit) or at a more disaggregated level, for specific sectors of the economy or even specific firms.

Demand/Sales Forecasting in Indian Firms


A company can use different time-series analysis methods for forecasting sales. Time-series analysis makes a forecast of future sales on the basis of historical data. The complexity and style of these analyses can differ extensively. The simplest assessment might use last year's figures as next year's forecast, but if the company is new or growing, this simple method cannot be used.

Study of Indian Two-wheeler Automotive Industry


The composition of the two-wheeler industry has witnessed major changes in the post-reform period. In 1991, the share of scooters was about 50 per cent of the total two-wheeler demand in the Indian market, with motorcycles and mopeds holding roughly equal shares of the remainder. In 2003-04, the share of motorcycles increased to 78 per cent of the total, while the shares of scooters and mopeds declined to 16 and 6 per cent respectively.

Demand Forecast: Estimations were based on panel regression, which takes into account both time-series and cross-section variation in the data. A panel of 16 major states over a period of 5 years ending 1999 was used for the estimation of parameters. The models considered a large number of macro-economic, demographic and socio-economic variables to arrive at the best estimations for the different two-wheeler segments. The projections have been made at the all-India and regional levels, and different scenarios have been presented based on different assumptions regarding the demand drivers of the two-wheeler industry. The most likely scenario assumed an annual growth rate of Gross Domestic Product (GDP) of 5.5 per cent during 2002-03, anticipated to increase gradually to 6.5 per cent during 2011-12. The all-India and region-wise projected growth trends for motorcycles and scooters are presented in Table 1. The above-mentioned forecast presents long-term growth over a period of 10 years. The currently high growth rate in the motorcycle segment is forecast to stabilise after a certain point, beyond which a condition of equilibrium is expected to prevail.

ACKNOWLEDGEMENT
We would like to acknowledge the support, valuable advice and guidance of the many people who helped us in carrying out this project. For their invaluable support and guidance we express our deep sense of gratitude to our respected professor Arup Roy sir, and we thank sir for giving us the opportunity to work on such an interesting and intellectually stimulating research-based project, and for guiding us at each step during the course of this study. We also wish to acknowledge the different sources from which we have acquired the facts and figures used in this study; the details of these sources are cited at the end.

CONCLUSION
The selection of a forecasting method is a difficult task that must be based in part on knowledge concerning the quantity being forecast. We can, however, point out some simple characteristics of the methods that have been described. With forecasting procedures, we are generally trying to recognize a change in the underlying process of a time series while remaining insensitive to variations caused by purely random effects. The goal of planning is to respond to fundamental changes, not to spurious effects. With a method based purely on historical data, it is impossible to filter out all the noise; the problem is to set parameters that strike an acceptable tradeoff between tracking the fundamental process and smoothing out the noise. If the process is changing very slowly, both the moving average and the regression approach should be used with a long stream of data.

Bibliography:
1. Brockwell, Peter J. and Davis, Richard A., Introduction to Time Series and Forecasting.
2. http://en.wikipedia.org/wiki/Time_series
3. http://www.nber.org/papers/w11033.pdf?new_window=1
