Crude Oil Review

Econometrics of Crude Oil Markets
Tatyana Gileva
Supervisor: Dominique Guégan, Université Paris 1 Panthéon-Sorbonne
A thesis submitted for the degree of Master in Economics, June 2010
Abstract
This project investigates the dynamics of oil prices (Brent and WTI crude oil markets) and their volatilities. Using different econometric tools, we examine the behavior of crude oil prices and the fundamental factors contributing to it. The performance of different models of oil markets is then compared.
Keywords: Oil markets - Crude oil volatility - GARCH modelling - Oil fundamentals.
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Motivation and background
  1.2 Objectives
  1.3 Literature review
    1.3.1 Crude oil markets and their impact on macroeconomy
    1.3.2 Oil price volatility
    1.3.3 Modelling crude oil prices and their volatility
    1.3.4 Determining fundamental factors
2 Methodology of time series analysis
  2.1 Modelling financial time series
    2.1.1 ARMA model
    2.1.2 ARCH model
    2.1.3 GARCH Model
    2.1.4 EGARCH Model
    2.1.5 GJR-GARCH Model
    2.1.6 APARCH Model
  2.2 General methodology and statistical tools
    2.2.1 Choice and validation of time series model
      2.2.1.1 Jarque-Bera test
      2.2.1.2 Ljung-Box test
    2.2.2 Choice of time series model based on information criteria
    2.2.3 Choice of time series model based on forecasting performance
      2.2.3.1 Forecast performance measures based on loss function
      2.2.3.2 Diebold-Mariano test
3 Empirical Results
  3.1 Properties of the data set
  3.2 GARCH modelling
  3.3 Results of estimation
    3.3.1 WTI crude oil market
    3.3.2 Brent crude oil market
  3.4 Forecasting
    3.4.1 Forecast performance measures
    3.4.2 Diebold-Mariano test
  3.5 Summary
4 Determining fundamental factors
  4.1 Data set and choice of factors
  4.2 Correlations and linear regression
  4.3 Principal component analysis
    4.3.1 Theoretical background
    4.3.2 Empirical results
  4.4 Forecasting and comparison
    4.4.1 Forecasting returns series
    4.4.2 Forecasting volatility
  4.5 Summary
5 Concluding remarks
Bibliography
List of Figures
3.1 Dynamics of daily prices and returns of West Texas Intermediate and Brent crude oil markets (US Dollar/Barrel)
3.2 Histograms and density plots of crude oil prices and returns
3.3 Q-Q plots of crude oil prices and returns
3.4 Autocorrelation and Partial Autocorrelation functions of WTI returns
3.5 Autocorrelation and Partial Autocorrelation functions of Brent returns
4.1 Plot of factors in PC1-PC2 coordinate system
List of Tables
3.1 Descriptive Statistics of WTI and Brent prices and returns series.
3.2 Estimation results of four conditional heteroskedasticity models. WTI returns.
3.3 Estimation results of four conditional heteroskedasticity models. Brent returns.
3.4 Volatility forecasts of WTI and Brent crude oil returns. Forecast performance measures.
3.5 Volatility forecasts of WTI crude oil returns. Diebold-Mariano test statistics.
3.6 Volatility forecasts of Brent crude oil returns. Diebold-Mariano test statistics.
4.1 List of variables used in the analysis of fundamental factors affecting WTI crude oil prices.
4.2 Cross correlation analysis.
4.3 Estimation of linear regression model with the initial set of factors. WTI crude oil spot prices.
4.4 Estimation of linear regression model with the initial set of factors. WTI crude oil spot returns.
4.5 Principal Component Analysis.
4.6 Estimation of linear regression model with the reduced set of fundamental factors. WTI crude oil spot price.
4.7 Estimation of linear regression model with the reduced set of fundamental factors. WTI crude oil returns.
4.8 Forecast of returns series. Results of comparison.
4.9 Modelling weekly WTI returns: mean equation ARMA(1,1) with external regressors. Results of estimation.
4.10 Modelling weekly WTI returns: mean equation ARMA(1,1) without external regressors, four models of conditional variance. Forecast performance measures.
4.11 Modelling weekly WTI returns: mean equation ARMA(1,1) with external regressors, four models of conditional variance. Forecast performance measures.
Introduction
1.1 Motivation and background
Crude oil is a commodity of great importance and one of the most significant production factors in many economies. In the last several years crude oil prices have shown large variations: they rose steadily from about 25 dollars a barrel in August 2003 to over 130 dollars a barrel in May 2008 and then dropped dramatically during the crisis of 2008. Mainly because of the key role of oil in the world economy, predicting the future price of this commodity and managing the risks associated with oil prices have become crucial for governments and businesses in several different respects. Several features of the oil market are well known. We specify some of them.
1. Considerable fluctuations in oil prices often have great impacts on economies in general. High volatility of oil prices creates uncertainty and, as a result, economic instability may be observed in both oil-exporting and oil-importing countries. In particular, resource-based economies, or economies that depend heavily on oil, are characterized by significant uncertainty and high volatility of exports and therefore of government revenues. In such economies oil price fluctuations not only affect the government budget considerably, but also have strong effects on stock markets and macroeconomic variables. For instance, higher crude oil prices contribute to inflation; the result is recession in oil-consuming countries.
2. Variation of oil prices significantly affects financial markets. In particular, daily changes in oil price volatility impact daily stock prices of oil companies and play an important role in pricing various financial instruments based on crude oil. Moreover, there is considerable empirical evidence causally linking oil price changes with stock market variables including stock returns, interest rates, real exchange rates, etc.
3. Understanding the evolution of oil prices and forecasting them is very important for the oil industry. Producers make oil price forecasts for the general purpose of strategic planning and for the specific purpose of evaluating investment decisions related to resource exploration, reserve development and production. A price forecast is the foundation for determining a firm's risk in managing its oil supply and its forward contracts for oil trades. Accurate price forecasting can therefore help reduce portfolio risks. On the other hand, industrial consumers, such as chemical companies, make these forecasts for the same kinds of reasons: oil is an important input cost that can affect investment decisions.
The decision making process on each of the levels described above depends heavily on the behavior of oil prices. Given the effects of oil price volatility and the uncertainty that accompanies these price movements, we conclude that there is a great need for measuring oil price volatility and for the subsequent quantification of risk.
1.2 Objectives
The aim of this project is to investigate the dynamics and volatility of prices in the West Texas Intermediate and Brent crude oil markets. Using econometric tools we will examine, on the one hand, the behavior of crude oil prices and, on the other hand, the fundamental factors contributing to their variation. Then the performance of several volatility models will be compared for forecasting purposes. The results should be useful to members of the oil industry, governments and agents in financial markets who need to understand and forecast oil price movements in these markets.
1.3 Literature review
Recent studies of crude oil markets cover a number of different areas and issues and examine the characteristics of these markets in various respects. Many empirical studies show evidence that time series of crude oil prices, like other financial time series, are characterized by fat-tailed distributions, volatility clustering, asymmetry and mean reversion (Morana, 2001; Bina and Vo, 2007). Concerning the most recent time period covered in different studies, oil price dynamics during 2002-2006 were characterized by high volatility, high-intensity jumps and a strong upward drift, and were concomitant with the underlying fundamentals of oil markets and the world economy (Askari and Krichene, 2008).
1.3.1 Crude oil markets and their impact on macroeconomy
Along with other issues, the relationship between oil price volatility and the macroeconomy is of great interest. Lee et al. (1995) examine the impact of different oil shocks on real GNP and conclude that positive shocks have a powerful effect on growth while negative shocks do not. The impact of oil prices on economic growth was analyzed by Ferderer (1996), who concludes that oil prices have a negative impact. Uri (1998) focuses on whether the price of crude oil affects employment rates. Chen and Chen (2007) examine the relationship between oil prices and real exchange rates in G7 countries using panel analysis. Huang et al. (2005) study the asymmetry of the effects of oil price shocks on economic activity. Another area of research is the efficiency of crude oil markets. In Green and Mork (1991) the generalized method of moments estimation technique is used to analyze efficiency. In one of the recent studies, Tabak and Cajueiro (2007) analyze the efficiency of crude oil markets (Brent and West Texas Intermediate), describe the long-range dependence in both crude oil prices and volatility, and conclude that the markets have become more efficient over time.
1.3.2 Oil price volatility
Why is it important to model oil price volatility? First, considerable fluctuations in oil prices often have great impacts on the economy. For example, most U.S. post World War II recessions were preceded by sharp increases in crude oil prices (Hamilton, 1983). Furthermore, Pindyck (1999) shows that large oil price movements increase uncertainty about future prices and thus cause delays in business investments. Second, volatility plays an important role in pricing derivatives and various financial instruments. Thus, measuring volatility in order to analyze oil price behavior is a very important issue for both policymakers and agents in financial markets. The volatility of oil prices has been compared with the volatility of other commodities by a number of authors (Pindyck, 1999; Regnier, 2007). In particular, Pindyck (1999) examined oil, coal, and natural gas over a long horizon and found that oil presented the highest degree of volatility. More generally, Regnier (2007) shows that oil price volatility is relatively high compared to the volatility of other commodities. Another area of research deals with the relationship between oil price volatility and stock prices. Huang et al. (1996) find that daily changes in oil price volatility do impact the daily stock prices of oil companies, but that there is little impact on the broad stock market. Sadorsky (1999, 2003) estimates vector autoregressions with monthly data on industrial production, interest rates, oil prices, and stock prices and finds that oil price volatility does have a significant impact on stock price volatility. Oberndorfer (2009) focuses on the effects of energy market developments on the energy stock market of the Eurozone. Furthermore, there is considerable empirical evidence connecting oil price changes with variables including gross domestic product (Hamilton, 1996), stock returns (Sadorsky, 1999; Huang et al., 1996), interest rates (Papapetrou, 2001; Ferderer, 1996), and real exchange rates (Chen and Chen, 2007). In some cases, the effects have been shown to be asymmetric, i.e. increasing oil prices depress the economy, but falling oil prices do not stimulate it proportionately (Sadorsky, 1999; Ferderer, 1996).
1.3.3 Modelling crude oil prices and their volatility
In several early papers the standard deviation of price differences is used as a measure of the volatility of commodity prices (Ferderer, 1996; Fleming and Ostdiek, 1999). Recently, the number of papers dealing with measuring and modelling volatility has increased significantly, and more sophisticated techniques are widely used today. The general approach that has been shown to work better for high-frequency time series in financial markets is the class of generalized autoregressive conditional heteroskedasticity (GARCH) models and their modifications (such as TGARCH, EGARCH etc.). Initially, the autoregressive conditional heteroskedasticity (ARCH) model was introduced by Engle (1982); this model was then generalized in the seminal work of Bollerslev (1986), which gained popularity in research on financial time series. The GARCH model assumes that the conditional variance is a deterministic linear function of past squared innovations and past conditional variances. Other techniques such as moving averages, simple autoregressive models or linear regressions have shown worse results (Sadorsky, 2006). In particular, using several different univariate and multivariate models of ARCH type, Sadorsky finds that the single-equation GARCH outperforms more sophisticated models in forecasting the volatility of petroleum futures. Among other recent papers, standard GARCH is used by Yang et al. (2002) for the U.S. oil market, by Oberndorfer (2009) for the oil market of the Eurozone, and by Hwang et al. (2004) for major industrialized countries. Morana (2001) uses a semiparametric approach that exploits the GARCH properties of the oil price volatility of the Brent market. Narayan and Narayan (2007) apply the exponential generalized autoregressive conditional heteroskedasticity (EGARCH) model, which allows estimating two features of crude oil price volatility, namely asymmetry and persistence of shocks. Moreover, volatility is examined over the full data sample and across various sub-samples in order to analyze the robustness of the results. The results indicate that the behavior of oil prices tends to change over short periods of time.
Vo (2009) works with the concept of regime-switching stochastic volatility and explains the behavior of crude oil prices in the WTI market in order to forecast their volatility. More specifically, the volatility of the oil return is modelled as a stochastic volatility process whose mean is subject to shifts in regime. Kang et al. (2009) investigate the efficiency of volatility models with regard to their forecasting ability for three crude oil markets (Brent, Dubai, and WTI). They show that the CGARCH and FIGARCH models are better equipped to capture persistence than the GARCH and IGARCH models.
1.3.4 Determining fundamental factors
In contrast to the popularity of time series analysis applied to crude oil markets, the number of research papers investigating the impact of crude oil determinants on the price series is much smaller. Krichene (2002) examines world markets for crude oil and natural gas and considers demand elasticities as factors that determine oil price changes. Kaufmann et al. (2004) apply statistical models to estimate the causal relationship between crude oil prices and several factors, such as capacity utilization, production quotas, and production levels. Kaufmann et al. (2008) investigate in more detail the factors that might have contributed to the quarterly oil price increase, expanding a model for crude oil prices to include refinery utilization rates, a non-linear effect of OPEC capacity utilization and conditions in futures markets as explanatory variables, and find that this model performs relatively well in terms of forecasting.
The thesis is organized as follows. In Chapter 2 the methodology of time series analysis is introduced. Chapter 3 describes the data set and the results of the time series analysis of WTI and Brent crude oil prices and their volatility. Chapter 4 gives an overview of another approach, based on the identification of fundamental factors that may explain the dynamics of WTI crude oil prices. Chapter 5 concludes.
Methodology of time series analysis

In this chapter the methodology used in this thesis is presented. We briefly introduce the main models used to model financial time series and discuss their main properties, some of their modifications and the statistical tools applied to time series modelling.
2.1 Modelling financial time series
The essential point in financial time series analysis is the following: we need a model that takes previous observations into account in order to extract significant characteristics of the data. Such time series models allow us to forecast future events based on known past observations.
2.1.1 ARMA model
ARMA models are a general class of models that allow us to examine the dynamics of individual time series. Even though ARMA processes have several disadvantages in modelling financial time series, as will be discussed below, I prefer to start with ARMA models as they are often considered a classical technique in time series analysis. Furthermore, squared GARCH processes may be seen as ARMA processes, so understanding the fundamentals of ARMA is useful for the further examination of GARCH processes.
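A general ARMA model consists of two parts: an autoregressive (AR) part and a moving average (MA) part.
The qth-order moving average process:
\[ MA(q): \quad Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \]
The pth-order autoregressive process:
\[ AR(p): \quad Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t \]
Mixed autoregressive moving average processes (note that in the mixed model the moving average coefficients enter with a negative sign):
\[ ARMA(p,q): \quad Y_t = \phi_0 + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \dots - \theta_q \varepsilon_{t-q} = \phi_0 + \sum_{i=1}^{p} \phi_i Y_{t-i} + \varepsilon_t - \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} \]
where $\phi_0, \phi_1, \dots, \phi_p, \theta_1, \dots, \theta_q$ are real coefficients and $\varepsilon_t$ is a white noise process, i.e. $\{\varepsilon_t\}$ is a sequence whose iid elements have mean zero and variance $\sigma^2$ and are uncorrelated across time: $E(\varepsilon_t) = 0$, $E(\varepsilon_t^2) = \sigma^2$, $E(\varepsilon_t \varepsilon_s) = 0$ for $t \neq s$. This representation implies that $Y_t$ is modeled as a weighted average of past observations and a white noise error.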
Appropriate values of p and q in the ARMA(p,q) model are usually found by plotting the partial autocorrelation function for an estimate of p and the autocorrelation function for an estimate of q. Further information can be obtained by considering the same functions for the residuals of a model fitted with an initial selection of p and q. After choosing p and q, the parameters of an ARMA model are usually estimated by least squares, i.e. the parameter values are chosen to minimize the sum of squared errors. The 1-step ahead forecast of $Y_{h+1}$ can be easily obtained from the model as
\[ \hat{Y}_h(1) = E[Y_{h+1} \mid Y_h, Y_{h-1}, \dots] = \phi_0 + \sum_{i=1}^{p} \phi_i Y_{h+1-i} - \sum_{j=1}^{q} \theta_j \varepsilon_{h+1-j} \]
and the associated forecast error is $e_h(1) = Y_{h+1} - \hat{Y}_h(1)$; the variance of the 1-step ahead forecast error is $Var[e_h(1)] = \sigma^2$. For the l-step ahead forecast, we have
\[ \hat{Y}_h(l) = E[Y_{h+l} \mid Y_h, Y_{h-1}, \dots] = \phi_0 + \sum_{i=1}^{p} \phi_i \hat{Y}_h(l-i) - \sum_{j=1}^{q} \theta_j \varepsilon_h(l-j) \]
where $\varepsilon_h(l-j) = 0$ if $l - j > 0$ and $\varepsilon_h(l-j) = \varepsilon_{h+l-j}$ otherwise, and $\hat{Y}_h(l-i) = Y_{h+l-i}$ if $l - i \le 0$. Therefore, the multistep ahead forecast of an ARMA model can be computed recursively.
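The recursion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `arma_forecast`, the argument layout and the numerical values are hypothetical, the coefficients are taken as already estimated, and the fitted innovations are assumed to be available for the whole sample.

```python
def arma_forecast(y, eps, phi0, phi, theta, steps):
    """Recursive multistep forecasts of an ARMA(p, q) model.

    y     : observed values, most recent last (length >= p)
    eps   : fitted innovations aligned with y (length >= q)
    phi   : AR coefficients [phi_1, ..., phi_p]
    theta : MA coefficients [theta_1, ..., theta_q], entering with a minus sign
    steps : number of steps ahead to forecast
    """
    h = len(y)
    path = list(y)      # observations, extended by forecasts as we go
    shocks = list(eps)  # known shocks, extended by zeros for future dates
    for _ in range(steps):
        t = len(path)   # position of the value being forecast
        ar_part = sum(phi[i] * path[t - 1 - i] for i in range(len(phi)))
        ma_part = sum(theta[j] * shocks[t - 1 - j] for j in range(len(theta)))
        path.append(phi0 + ar_part - ma_part)
        shocks.append(0.0)  # future shocks have conditional expectation zero
    return path[h:]


# Illustrative ARMA(1,1) with made-up coefficients and data.
forecasts = arma_forecast([1.0, 2.0], [0.1, -0.2], 0.5, [0.6], [0.3], 2)
print(forecasts)  # roughly [1.76, 1.556]
```

Replacing unknown future observations by their own forecasts and unknown future shocks by zero is exactly what makes the multistep forecast recursive.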
The main advantage of ARMA models is their mathematical tractability. ARMA processes give a good approximation of general stationary processes, it is relatively simple to compute explicitly all the parameters that need to be estimated, and the estimation procedure itself is clear and well understood. However, as has been shown in different studies, ARMA models are not suitable for evaluating the entire distribution of nonlinear processes. Moreover, ARMA models do not take conditional heteroskedasticity into consideration, since the conditional variance is constant over time:
\[ Var[Y_t \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \dots] = Var[\varepsilon_0]. \]
Therefore, the next class of models seems to be more appropriate for analyzing financial time series.
2.1.2 ARCH model
In financial time series analysis, we are interested in forecasting not only the level of the series, but also its variance. Changes in the variance are very important for understanding financial markets, as investors require higher expected returns as compensation for holding riskier assets. AR models imply that the unconditional variance is constant. However, we need to take into account that the conditional variance may behave differently and change significantly over time. One way to deal with this problem is to model the squared residuals $\varepsilon_t^2$ as an AR(m) process:
\[ \varepsilon_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \alpha_2 \varepsilon_{t-2}^2 + \dots + \alpha_m \varepsilon_{t-m}^2. \]
The first model that provided a systematic framework for volatility modelling is the ARCH model of Engle (1982). The main idea behind ARCH modelling is the following: the forecast based on past information is a conditional expectation depending on the values of past observations. Therefore, the variance of such a forecast depends on past information as well and may therefore be a random variable. Thus, in order to improve the model, we need to include this feature in it. The ARCH(p) model initially introduced by Engle has the following form:
\[ Y_t \mid \psi_{t-1} \sim N(x_t \beta, h_t) \]
\[ Y_t = \varepsilon_t \sqrt{h_t} \]
\[ h_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i Y_{t-i}^2 \]
assuming that the mean of $Y_t$ is given as $x_t \beta$, a linear combination of lagged endogenous and exogenous variables included in the information set $\psi_{t-1}$, with $\beta$ a vector of unknown parameters. The Maximum Likelihood Method is used to estimate the parameters of the model:
\[ l = \frac{1}{T} \sum_{t=1}^{T} l_t, \qquad l_t = -\frac{1}{2} \log h_t - \frac{1}{2} \frac{\varepsilon_t^2}{h_t} \]
where $x_t$ includes lagged dependent and exogenous variables. The likelihood function is maximized with respect to the unknown parameters $\alpha$, $\beta$, which we want to estimate. Engle shows that Ordinary Least Squares estimation performs worse than Maximum Likelihood, which is asymptotically superior and more efficient. The following model, with an AR(p) process for the mean and an ARCH(q) process for the conditional variance, is more intuitive when dealing with financial data:
\[ Y_t = \phi_0 + \sum_{i=1}^{p} \phi_i Y_{t-i} + \varepsilon_t \]
\[ \varepsilon_t = \eta_t \sqrt{h_t} \]
\[ h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 \]
where $\eta_t \sim N(0, 1)$ are iid random variables (white noise), independent of the past $\varepsilon_t$, and $\alpha_0 > 0$, $\alpha_i \ge 0$ for $i > 0$. The coefficients of the model must also satisfy some regularity conditions to ensure that the unconditional variance of $\varepsilon_t$ is finite.
It can be seen from the model that large past squared shocks $\varepsilon_{t-i}^2$ imply a large conditional variance $h_t$. Therefore, under the ARCH framework, large shocks tend to be followed by other large shocks. This feature is similar to the volatility clustering phenomenon observed in financial time series. The model defined above has the following properties:
conditional mean: $E[\varepsilon_t \mid \psi_{t-1}] = 0$,
conditional variance: $E[\varepsilon_t^2 \mid \psi_{t-1}] = h_t$,
conditional distribution function of $\varepsilon_t$ is a centered Gaussian distribution: $F(\varepsilon_t \mid \psi_{t-1}) \sim N(0, h_t)$.
The main advantage of this model is that it takes into account the fact that the conditional variance is substantially affected by the squared residual term (which may be a result of significant changes on a market) in any of the previous q periods. Therefore, this approach captures the conditional heteroskedasticity of financial data and explains the persistence in volatility. However, the model assumes that positive and negative shocks have the same effects on volatility, because it depends on the square of the previous shocks. In practice, it is well known that the price of a financial asset responds differently to positive and negative shocks. Moreover, the ARCH model does not provide any new insight for understanding the source of variations in financial time series. It provides only a mechanical way to describe the behavior of the conditional variance and gives no indication about what causes such behavior to occur. In addition, ARCH models are likely to overpredict volatility because they respond slowly to large isolated shocks to the return series.
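As an illustration of the ARCH(q) variance equation and of the Gaussian log-likelihood terms $l_t = -\frac{1}{2}\log h_t - \frac{1}{2}\varepsilon_t^2/h_t$, the following Python sketch evaluates both for a given residual series. It is a minimal sketch, not an estimation routine: the function names are hypothetical, the residuals are assumed to be given, and conditioning on the first q residuals (so that the average runs over T - q terms) is an assumption made here for simplicity.

```python
import math


def arch_variances(eps, alpha0, alpha):
    """Conditional variances h_t = alpha0 + sum_i alpha_i * eps_{t-i}^2.

    Returned for t = q, ..., T-1 (0-based), conditioning on the first q residuals.
    """
    q = len(alpha)
    return [
        alpha0 + sum(alpha[i] * eps[t - 1 - i] ** 2 for i in range(q))
        for t in range(q, len(eps))
    ]


def avg_log_likelihood(eps, alpha0, alpha):
    """Average of l_t = -0.5*log(h_t) - 0.5*eps_t^2/h_t (additive constant dropped)."""
    q = len(alpha)
    h = arch_variances(eps, alpha0, alpha)
    terms = [
        -0.5 * math.log(h_t) - 0.5 * e_t ** 2 / h_t
        for h_t, e_t in zip(h, eps[q:])
    ]
    return sum(terms) / len(terms)


# Made-up residuals and ARCH(1) parameters, for illustration only.
residuals = [1.0, -2.0, 0.5]
print(arch_variances(residuals, 0.2, [0.5]))      # roughly [0.7, 2.2]
print(avg_log_likelihood(residuals, 0.2, [0.5]))
```

The large shock of -2.0 raises the next conditional variance from about 0.7 to about 2.2, which is the clustering mechanism described above. In an actual estimation this function would be passed to a numerical optimizer over the parameters.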
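2.1.3 GARCH Model
Bollerslev (1986) extended Engle's framework by developing a technique that allows the conditional variance to be an ARMA process. GARCH(p,q) therefore has the following form:
\[ Y_t = c_0 + \sum_{i=1}^{p} c_i Y_{t-i} + \varepsilon_t \]
\[ \varepsilon_t = \eta_t \sqrt{h_t} \]
\[ h_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j h_{t-j} \]
where $\eta_t$ are iid random variables with $E[\eta_t] = 0$, $Var[\eta_t] = 1$ (it is often assumed that $\eta_t \sim N(0, 1)$), $\alpha_0 > 0$, $\alpha_i \ge 0$ for $i > 0$, $\beta_j \ge 0$ for $j > 0$ and $\sum_{i=1}^{\max(p,q)} (\alpha_i + \beta_i) < 1$.
The latter constraint is necessary to ensure that the unconditional variance is finite, whereas the conditional variance $h_t$ evolves over time.
Forecasting
Let us now consider the simple case of the GARCH(1,1) model and see how forecasts may be constructed within this framework. Volatility forecasts from GARCH(1,1) can be made by repeated substitution. First, we provide an estimate for the expected squared residuals:
\[ E[\varepsilon_t^2] = h_t E[\eta_t^2] = h_t. \]
The conditional variance $h_{t+1}$, and hence the 1-step ahead forecast, is known at time t:
\[ \hat{h}_{t+1} = h_{t+1} = \alpha_0 + \alpha_1 \varepsilon_t^2 + \beta_1 h_t. \]
Using the fact that $E[\varepsilon_{t+1}^2] = h_{t+1}$, we obtain
\[ \hat{h}_{t+2} = \alpha_0 + \alpha_1 E[\varepsilon_{t+1}^2] + \beta_1 h_{t+1} = \alpha_0 + (\alpha_1 + \beta_1) h_{t+1}. \]
Similarly,
\[ \hat{h}_{t+3} = \alpha_0 + (\alpha_1 + \beta_1) \hat{h}_{t+2} = \alpha_0 + \alpha_0 (\alpha_1 + \beta_1) + (\alpha_1 + \beta_1)^2 h_{t+1} = \alpha_0 + \alpha_0 (\alpha_1 + \beta_1) + \alpha_0 (\alpha_1 + \beta_1)^2 + (\alpha_1 + \beta_1)^2 [\alpha_1 \varepsilon_t^2 + \beta_1 h_t]. \]
Therefore, for a forecasting horizon $\tau$, summing the geometric series gives
\[ \hat{h}_{t+\tau} = \alpha_0 \frac{1 - (\alpha_1 + \beta_1)^{\tau}}{1 - (\alpha_1 + \beta_1)} + (\alpha_1 + \beta_1)^{\tau - 1} [\alpha_1 \varepsilon_t^2 + \beta_1 h_t]. \]
Moreover, if $(\alpha_1 + \beta_1) < 1$, the forecast converges to the unconditional variance as the horizon grows: $\hat{h}_{t+\tau} \to \frac{\alpha_0}{1 - (\alpha_1 + \beta_1)}$. The same reasoning may be applied to GARCH models of higher orders, allowing us to compute multistep ahead forecasts.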
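The repeated substitution can be checked numerically. The Python sketch below computes GARCH(1,1) variance forecasts recursively and compares them with the geometric-series closed form $\alpha_0 \frac{1 - s^{\tau}}{1 - s} + s^{\tau-1}[\alpha_1 \varepsilon_t^2 + \beta_1 h_t]$, where $s = \alpha_1 + \beta_1$. Function names and parameter values are hypothetical and chosen only for illustration; the parameters are assumed to be already estimated.

```python
def garch11_forecasts(alpha0, alpha1, beta1, eps_t, h_t, horizon):
    """Variance forecasts for steps 1..horizon by repeated substitution."""
    s = alpha1 + beta1
    forecasts = [alpha0 + alpha1 * eps_t ** 2 + beta1 * h_t]  # h_{t+1}, known at t
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + s * forecasts[-1])
    return forecasts


def garch11_closed_form(alpha0, alpha1, beta1, eps_t, h_t, tau):
    """Closed-form tau-step forecast obtained by summing the geometric series."""
    s = alpha1 + beta1
    return (alpha0 * (1 - s ** tau) / (1 - s)
            + s ** (tau - 1) * (alpha1 * eps_t ** 2 + beta1 * h_t))


# Made-up parameter values with alpha1 + beta1 = 0.9 < 1.
a0, a1, b1 = 0.1, 0.1, 0.8
path = garch11_forecasts(a0, a1, b1, eps_t=2.0, h_t=1.0, horizon=200)
print(path[:3])                # roughly [1.3, 1.27, 1.243]
print(path[-1])                # close to the unconditional variance
print(a0 / (1 - (a1 + b1)))    # unconditional variance, roughly 1.0
```

With these values the forecasts decay geometrically from 1.3 towards the unconditional variance, at a rate governed by $\alpha_1 + \beta_1$.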
In practice, GARCH models gained popularity because they often give a reasonable fit to financial data and can explain some of the stylized facts. Nevertheless, the model shares the weaknesses of the ARCH model. For instance, it responds equally to positive and negative shocks. In addition, recent empirical studies of high-frequency financial time series indicate that the tails of GARCH models remain too thin even with standardized Student-t innovations.
2.1.4 EGARCH Model
To overcome some weaknesses of the GARCH model in handling financial time series, Nelson (1991) proposes the exponential GARCH (EGARCH) model, in particular to allow for asymmetric effects of positive and negative asset returns. The conditional variance in this case is described by the following process:
\[ \varepsilon_t = \sqrt{h_t}\, \eta_t \]
\[ \log(h_t) = \alpha_0 + \sum_{j=1}^{q} \beta_j \log h_{t-j} + \sum_{k=1}^{p} \left[ \theta_k \frac{\varepsilon_{t-k}}{\sqrt{h_{t-k}}} + \gamma_k \left( \left| \frac{\varepsilon_{t-k}}{\sqrt{h_{t-k}}} \right| - \sqrt{2/\pi} \right) \right] \]
where the logarithm of the conditional variance is used in order to relax the positivity constraint on the model coefficients. Additionally, this specific structure enables the model to respond asymmetrically to positive and negative lagged values of the shocks $\varepsilon_t$.
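The asymmetric response can be illustrated with a single update of an EGARCH(1,1) recursion. The Python sketch below is only an illustration: the function name, the coefficient names `theta1` (sign term) and `gamma1` (magnitude term) and all numerical values are assumptions made here, not estimates from the thesis.

```python
import math


def egarch11_step(h_prev, eps_prev, alpha0, beta1, theta1, gamma1):
    """One EGARCH(1,1) update; returns the next conditional variance h_t."""
    z = eps_prev / math.sqrt(h_prev)  # standardized lagged shock
    log_h = (alpha0
             + beta1 * math.log(h_prev)
             + theta1 * z
             + gamma1 * (abs(z) - math.sqrt(2.0 / math.pi)))
    return math.exp(log_h)            # positive for any coefficient values


# Shocks of equal size and opposite sign, with a negative sign coefficient.
h_after_positive = egarch11_step(1.0, 1.0, 0.0, 0.9, -0.1, 0.2)
h_after_negative = egarch11_step(1.0, -1.0, 0.0, 0.9, -0.1, 0.2)
print(h_after_positive, h_after_negative)
```

Because the variance is obtained by exponentiating, it is positive without any sign restrictions on the coefficients, and with a negative sign coefficient the negative shock produces the larger variance.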
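2.1.5 GJR-GARCH Model
The GJR-GARCH model of Glosten, Jagannathan and Runkle (1993) models the asymmetric consequences of positive and negative innovations in the GARCH process:
\[ \varepsilon_t = \sqrt{h_t}\, \eta_t \]
\[ h_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j h_{t-j} + \sum_{i=1}^{p} \gamma_i \varepsilon_{t-i}^2 I_{\{\varepsilon_{t-i} \ge 0\}} \]
If the leverage effect holds, we expect $\gamma_i < 0$. The non-negativity condition is satisfied provided that $\beta_j \ge 0$ and $\alpha_i + \gamma_i \ge 0$.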
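A one-step GJR-GARCH(1,1) update makes the role of the indicator term concrete. This Python sketch assumes, as in the equation above, that the indicator is switched on for non-negative shocks, so that a negative $\gamma_1$ reduces the impact of good news; the function name and the parameter values are made up for illustration.

```python
def gjr11_step(h_prev, eps_prev, alpha0, alpha1, beta1, gamma1):
    """One GJR-GARCH(1,1) update with the indicator on non-negative shocks."""
    indicator = 1.0 if eps_prev >= 0 else 0.0
    return (alpha0
            + alpha1 * eps_prev ** 2
            + beta1 * h_prev
            + gamma1 * eps_prev ** 2 * indicator)


# gamma1 < 0 and alpha1 + gamma1 >= 0, so the variance stays non-negative.
h_good_news = gjr11_step(1.0, 1.0, 0.1, 0.2, 0.7, -0.1)
h_bad_news = gjr11_step(1.0, -1.0, 0.1, 0.2, 0.7, -0.1)
print(h_good_news, h_bad_news)  # roughly 0.9 and 1.0
```

A shock of the same size therefore raises the variance more when it is negative, which is the leverage effect.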
2.1.6 APARCH Model
One of the more general GARCH models is the APARCH (Asymmetric Power ARCH) model of Ding, Granger, and Engle (1993). In particular, it nests at least seven ARCH-type models, depending on how its parameters are chosen. One specific feature of this model is the possibility of estimating the power coefficient $\delta$, which is assumed to be equal to 2 in all previous models. This model can therefore be much more flexible than the other models considered.
\[ \varepsilon_t = \sqrt{h_t}\, \eta_t \]
\[ h_t^{\delta/2} = \alpha_0 + \sum_{i=1}^{p} \alpha_i \left[ |\varepsilon_{t-i}| - \gamma_i \varepsilon_{t-i} \right]^{\delta} + \sum_{j=1}^{q} \beta_j h_{t-j}^{\delta/2} \]
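The nesting property can be illustrated for the simplest case. The Python sketch below assumes that $h_t$ denotes the conditional variance, so that the power $\delta$ applies to $\sqrt{h_t}$; under that assumption, setting $\delta = 2$ and $\gamma_1 = 0$ reproduces the GARCH(1,1) update. Function names and numbers are hypothetical.

```python
def aparch11_step(h_prev, eps_prev, alpha0, alpha1, gamma1, beta1, delta):
    """One APARCH(1,1) update, written in terms of the conditional variance h."""
    shock_term = (abs(eps_prev) - gamma1 * eps_prev) ** delta
    h_power = alpha0 + alpha1 * shock_term + beta1 * h_prev ** (delta / 2.0)
    return h_power ** (2.0 / delta)


def garch11_step(h_prev, eps_prev, alpha0, alpha1, beta1):
    """Plain GARCH(1,1) update, for comparison."""
    return alpha0 + alpha1 * eps_prev ** 2 + beta1 * h_prev


# delta = 2 and gamma1 = 0: the two updates coincide.
print(aparch11_step(1.5, -0.8, 0.05, 0.1, 0.0, 0.85, 2.0))
print(garch11_step(1.5, -0.8, 0.05, 0.1, 0.85))

# A positive gamma1 makes negative shocks count more, for any power delta.
print(aparch11_step(1.5, -0.8, 0.05, 0.1, 0.5, 0.85, 1.5))
print(aparch11_step(1.5, 0.8, 0.05, 0.1, 0.5, 0.85, 1.5))
```

Estimating $\delta$ and $\gamma_1$ freely, instead of fixing them, is what gives the model its additional flexibility.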
2.2 General methodology and statistical tools
In the previous section the main time series models were introduced. We now present the procedures that are used in this work. In general, modelling financial series involves three main steps.
1. Model identification and model selection.
2. Parameter estimation using maximum likelihood estimation or non-linear least-squares estimation.
3. Model checking by testing whether the estimated model conforms to the specifications of a stationary univariate process. In particular, the residuals should be independent and constant in mean and variance over time.
Properties of models that are taken into account when choosing the best fitting model are presented below.
2.2.1
Choice and validation of time series model
In the validation part, tests are performed to judge whether ARCH eects and autocorrelation have been removed or not. Tests are also performed to see if certain
14
assumptions about the model are fullled, i.e. are the normalized residuals distributed in the way that was assumed in the model. 2.2.1.1 Jarque-Bera test
Jarque-Bera test1 is used to test whether a given distribution is normal or not. H0: data are normally distributed. H1: data are not normally distributed. Jarque-Bera test statistic measures goodness-of-t of departure from normality and is dened as follows JB =
N 2 6 (S
+ 1 K 2 ), 4
where N is the sample size, S is the sample skewness and K is the sample kurtosis at lag k. Null hypothesis is rejected at % signicance level if JB > 2 1,2 , where 2 1,2 is a -quantile of the chi-square distribution with 2 degrees of freedom. 2.2.1.2 Ljung-Box test
The Ljung-Box test² is performed to test whether a series has significant autocorrelation or not.
H0: data are not correlated.
H1: data are correlated (not random).
The Ljung-Box test statistic for a number of tested lags k is
$$Q(k) = N(N + 2)\sum_{i=1}^{k}\frac{\hat{\rho}_i^2}{N - i},$$
where N is the sample size and $\hat{\rho}_i$ is the sample autocorrelation at lag i. The null hypothesis is rejected at the $\alpha$ significance level if $Q(k) > \chi^2_{1-\alpha,k}$, where $\chi^2_{1-\alpha,k}$ is the $(1-\alpha)$-quantile of the chi-square distribution with k degrees of freedom.
¹ Jarque, Carlos M.; Bera, Anil K. (1980). Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Economics Letters 6 (3): 255-259.
² G. M. Ljung; G. E. P. Box (1978). On a Measure of a Lack of Fit in Time Series Models. Biometrika 65, pp. 297-303.
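Both validation statistics can be computed directly from their definitions. The sketch below is a plain-Python illustration and not the routine used in the thesis. It assumes that K is the raw sample kurtosis (so that K - 3 is the excess kurtosis) and uses the closed form exp(-x/2) for the upper tail of the chi-square distribution with 2 degrees of freedom; the function names are hypothetical.

```python
import math

def jarque_bera(x):
    """Jarque-Bera statistic and its chi-square(2) p-value."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                      # raw kurtosis, equal to 3 for a normal law
    jb = n / 6.0 * (skew ** 2 + 0.25 * (kurt - 3.0) ** 2)
    p_value = math.exp(-jb / 2.0)            # survival function of chi-square with 2 df
    return jb, p_value

def ljung_box(x, k):
    """Ljung-Box Q(k) statistic for lags 1..k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    q = 0.0
    for i in range(1, k + 1):
        num = sum((x[t] - mean) * (x[t - i] - mean) for t in range(i, n))
        rho = num / denom                    # sample autocorrelation at lag i
        q += rho ** 2 / (n - i)
    return n * (n + 2) * q
```

The value returned by `ljung_box` is then compared with the chi-square quantile with k degrees of freedom; applying the same function to squared residuals gives the test for remaining ARCH effects.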
2.2.2
Choice of time series model based on information criteria
Another possible approach to comparing different models is the use of information criteria. Given a data set, several competing models may be ranked according to their values of a chosen information criterion.
AIC - Akaike information criterion
AIC is a measure of the goodness-of-fit of an estimated statistical model:
$$AIC = -2\log(L) + 2k,$$
where L is the maximized value of the likelihood function for the estimated model and k is the number of parameters in the statistical model. The term 2k is a penalty that increases with the number of estimated parameters. The preferred model is the one with the lowest AIC value.
BIC - Bayes information criterion
BIC is a criterion for model selection among a class of parametric models:
$$BIC = -2\log(L) + k\log(n),$$
where L is the maximized value of the likelihood function for the estimated model, k is the number of parameters in the statistical model and n is the sample size. Given any two estimated models, the model with the lower value of BIC is the one to be preferred.
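As a small worked illustration of the two criteria, the sketch below evaluates them from a log-likelihood value and ranks candidate models. The helper names and the example numbers are hypothetical; note also that some software reports these criteria divided by the sample size, which does not change the ranking for a fixed data set.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: -2 log(L) + 2k."""
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik, k, n):
    """Bayes information criterion: -2 log(L) + k log(n)."""
    return -2.0 * log_lik + k * math.log(n)

def best_model(candidates, n):
    """candidates maps a model name to (log_lik, k); returns the names
    preferred by AIC and by BIC (lowest value wins)."""
    by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
    by_bic = min(candidates, key=lambda m: bic(candidates[m][0], candidates[m][1], n))
    return by_aic, by_bic
```

Because the BIC penalty k log(n) exceeds 2k once n is larger than about 7, BIC tends to favour more parsimonious models than AIC on samples of the size used here.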
2.2.3
Choice of time series model based on forecasting performance
2.2.3.1
Forecast performance measures based on loss function
A good performance measure can be hard to find since volatility is not directly observable. Therefore it makes sense not to rely on one specific measure but rather to use several measures. In this work four different forecasting performance measures are considered for evaluating the performance of volatility forecasts from different GARCH models:
1. Mean Squared Error
$$MSE = \frac{1}{n}\sum_{t=1}^{n}\left(\sigma_t^2 - h_t^2\right)^2$$
2. Mean Absolute Deviation
$$MAD = \frac{1}{n}\sum_{t=1}^{n}\left|\sigma_t^2 - h_t^2\right|$$
3. Quasi Likelihood
$$QLIKE = \frac{1}{n}\sum_{t=1}^{n}\left(\log(h_t^2) + \sigma_t^2 h_t^{-2}\right)$$
4. Mean Squared Forecast Error
$$R2LOG = \frac{1}{n}\sum_{t=1}^{n}\left[\log\left(\sigma_t^2 h_t^{-2}\right)\right]^2$$
where n is the number of forecast data points, $h_t^2$ is the volatility forecast for date t and $\sigma_t^2$ is the actual volatility on day t. The criteria MSE (1), MAD (2) and R2LOG (4) were suggested by Bollerslev, Engle, and Nelson (1994). QLIKE (3) corresponds to the loss function implied by a Gaussian likelihood. These forecast performance indicators are used in order to estimate forecast accuracy in terms of loss functions. Indeed, given a set of competing models, smaller forecasting error statistics indicate the superior forecasting ability of the corresponding model. In this approach squared residuals $\varepsilon_t^2$ are used as a measure of actual volatility. This choice may be justified using the following equation, in which $h_t$ denotes the conditional variance of the GARCH models and expectations are conditional on the information available at time t - 1:
$$E[\varepsilon_t^2] = E[h_t \eta_t^2] = h_t\,E[\eta_t^2] = h_t,$$
since $\eta_t \sim N(0, 1)$ and $h_t$ is the conditional variance of $\varepsilon_t$ given the information available at time t - 1.
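The four loss functions can be evaluated in a few lines. The sketch below is illustrative only: it assumes the usual convention in which the variance forecast appears inside the logarithm and in the denominator of QLIKE, and in which the actual volatility is proxied by squared residuals. The argument names `actual` and `forecast` are hypothetical.

```python
import math

def forecast_losses(actual, forecast):
    """Loss-function based accuracy measures for variance forecasts.

    actual   : proxy for realised variance (e.g. squared residuals), all > 0
    forecast : variance forecasts from a model, all > 0
    """
    n = len(actual)
    pairs = list(zip(actual, forecast))
    mse = sum((a - f) ** 2 for a, f in pairs) / n
    mad = sum(abs(a - f) for a, f in pairs) / n
    qlike = sum(math.log(f) + a / f for a, f in pairs) / n
    # R2LOG is undefined when the proxy is exactly zero (log of zero)
    r2log = sum(math.log(a / f) ** 2 for a, f in pairs) / n
    return {"MSE": mse, "MAD": mad, "QLIKE": qlike, "R2LOG": r2log}
```

For every measure a smaller value indicates a more accurate forecast; QLIKE can be negative when the variances are small, as is typical for daily returns.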
2.2.3.2
Diebold-Mariano test
The indicators presented above are useful for comparing a number of models. Nevertheless, the main shortcoming of this method is that these indicators contain no information on whether the difference between two models is statistically significant.
A second approach involves a statistical test that compares the forecast accuracy of two sets of forecasts. It was suggested by Diebold and Mariano (1995). Given two different models, forecast errors are computed as the difference between actual and forecasted values: $e_{1,t}$ and $e_{2,t}$, where t = 1, ..., n and n is the number of forecast data points obtained from a model. Then a function g of the forecast errors is chosen and the difference between the losses of the two models is computed: $d_t = g(e_{1,t}) - g(e_{2,t})$.
H0: $E[d_t] = 0$, i.e. the two models provide equal forecast accuracy.
H1: $E[d_t] \neq 0$, i.e. the two models provide different forecast accuracy.
The Diebold-Mariano test statistic is
$$DM = \hat{V}(\bar{d})^{-1/2}\,\bar{d},$$
where $\bar{d}$ is the sample mean of $d_t$, $\hat{V}(\bar{d})$ is an estimate of the asymptotic variance of $\bar{d}$, $V(\bar{d}) \approx n^{-1}\left[\gamma_0 + 2\sum_{k \geq 1}\gamma_k\right]$, and $\gamma_k$ is the k-th autocovariance of $d_t$. Under the null hypothesis the DM test statistic has an asymptotic standard normal distribution. A significantly negative value of the DM statistic suggests that Model 1 statistically dominates Model 2.
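The test statistic can be sketched as follows. This is a simplified illustration under stated assumptions: the inputs are the per-period losses $g(e_{1,t})$ and $g(e_{2,t})$, the autocovariances are truncated at a user-chosen `max_lag` (a hypothetical parameter; 0 is reasonable for one-step-ahead forecasts), and the p-value is two-sided under the asymptotic standard normal distribution.

```python
import math

def diebold_mariano(loss1, loss2, max_lag=0):
    """Diebold-Mariano statistic and two-sided p-value (illustrative sketch)."""
    n = len(loss1)
    d = [a - b for a, b in zip(loss1, loss2)]        # loss differential d_t
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    # estimated variance of the mean loss differential
    # (may be non-positive for unlucky truncation choices; not handled here)
    var_d_bar = (autocov(0) + 2.0 * sum(autocov(k) for k in range(1, max_lag + 1))) / n
    dm = d_bar / math.sqrt(var_d_bar)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))    # 2 * (1 - Phi(|dm|))
    return dm, p_value
```

A negative statistic with a small p-value indicates that the first model has significantly lower average loss than the second.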
Empirical results
In this section we briefly discuss the properties of the WTI and Brent crude oil data set. Then four competing GARCH-related models are estimated and their forecasting accuracy is evaluated.
3.1
Properties of the data set
We analyze West Texas Intermediate (WTI) and Brent crude oil spot prices (in US dollars per barrel). The data sets consist of daily closing prices over the period from January 2, 1995 to March 11, 2010 and contain 3910 and 3909 observations for the WTI and Brent crude oil markets respectively¹. In order to ensure that there is no trend component and that the data are stationary to some extent, we analyze the dynamics of the return series rather than the price series themselves. In general, returns are calculated in the following way:
$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}}.$$
The usual way to proceed is to compute returns on a continuous compounding basis:
$$r_t = \ln(1 + R_t) = \ln\left(\frac{P_t}{P_{t-1}}\right).$$
Here $P_t$ denotes the price of crude oil at time t.
¹ The data set was kindly provided by Total.
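The two return definitions can be illustrated with a short sketch. The function names and the price list used in the usage note are hypothetical and serve only to show the calculation.

```python
import math

def simple_returns(prices):
    """R_t = (P_t - P_{t-1}) / P_{t-1} for consecutive prices."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

def log_returns(prices):
    """r_t = ln(P_t / P_{t-1}), i.e. continuously compounded returns."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
```

For example, for the prices 100, 110 and 99 the simple returns are 0.1 and -0.1, while the log returns are ln(1.1) and ln(0.9). For small daily changes the two definitions are numerically close, and log returns have the advantage of being additive over time.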
Fig. 3.1 illustrates the dynamics of the two markets considered. The behavior of prices and returns is clearly unsteady, and the dynamics of returns give evidence of volatility clustering, i.e. large price changes tend to be followed by large changes and small changes by small changes, so that periods of high volatility alternate with periods of relatively low volatility. Basic descriptive statistics are presented in Table 3.1.
Figure 3.1: Dynamics of daily prices and returns of West Texas Intermediate and Brent crude oil markets (US Dollar/Barrel) - January 2, 1995 to March 11, 2010.
Kurtosis is greater than 3 in all cases, thus the density function is characterized by fat tails compared to the density of the Gaussian distribution N(0, 1). This may also be seen from the graphs of the density functions (Figure 3.2). This behavior is known to occur frequently in financial markets.
The coefficient of skewness is positive for prices of the WTI and Brent markets and for returns of the WTI market, indicating that there is an asymmetry of the probability distribution, namely the data are right skewed (i.e. the right tail is longer and the mass of the distribution is concentrated on the left of the figure), while returns of the Brent market are left skewed since the value of the coefficient is negative (i.e. the left tail is longer and the mass of the distribution is concentrated on the right of the figure). Furthermore, the Jarque-Bera test statistics suggest that neither the oil price series nor the oil returns series of both markets are normally distributed. In particular, Q-Q plots (see Fig. 3.3) of oil prices and oil returns indicate that both large positive and large negative shocks are responsible for the non-normality of the series.

                          WTI                            Brent
                          Prices        Returns          Prices        Returns
Number of observations    3911          3911             3910          3910
Mean                      40.94359      0.0003910        39.73698      0.0004051
Variance                  688.1112      0.0005848455     703.0868      0.0004865956
Standard deviation        26.23187      0.02418358       26.51578      0.02205891
Skewness                  1.324662      0.1301484        1.308778      -0.07286083
Kurtosis                  4.453592      8.970388         4.359510      6.942048
Minimum                   10.5          -0.1643000       9.7           -0.1534000
1st Quartile              20.82         -0.0124200       19.48         -0.0119300
Median                    29.80         0.0007206        28.18         0.0007449
Mean                      40.94         0.0003910        39.74         0.0004051
3rd Quartile              59.27         0.0139300        58.55         0.0130500
Maximum                   145.78        0.2138000        145.61        0.1920000
Jarque-Bera statistic     1490.069      5827.231         1419.2        2539.061
                          (0.0000)      (0.0000)         (0.0000)      (0.0000)

Table 3.1: Descriptive Statistics of WTI and Brent prices and returns series.
Figure 3.2: Histograms and density plots of crude oil prices and returns - WTI and Brent crude oil markets.
Figure 3.3: Q-Q plots of crude oil prices and returns - WTI and Brent crude oil markets
3.2
GARCH modelling
In this section the whole data set is used to fit a model of conditional heteroskedastic variance for each of the two crude oil markets. The form of the mean equation is determined by following the Box-Jenkins approach.
Figure 3.4: Autocorrelation and Partial Autocorrelation functions of WTI returns - high values of autocorrelation indicate the presence of serial dependencies across the series
Figure 3.5: Autocorrelation and Partial Autocorrelation functions of Brent returns - low values of autocorrelation indicate the absence of serial dependencies across the series
Fig. 3.4 and Fig. 3.5 show the autocorrelation (ACF) and partial autocorrelation (PACF) functions, which allow us to determine whether there are serial dependencies in the series across time. In particular, the ACF helps to identify the order q of the MA part, while the PACF is used to settle the order p of the AR part of the corresponding ARMA model. The presence of autocorrelations in the WTI returns series suggests that an adequate model for the mean equation is needed. To simplify the interpretation of the model, ARMA(1,1) is chosen to represent this process. In other words, both past observations and the past random component have an impact on today's data:
$$Y_t = \mu + a_1 Y_{t-1} + b_1 \varepsilon_{t-1} + \varepsilon_t$$
However, in the case of Brent returns, the ACF and PACF graphs reveal no sign of dependencies among observations, which implies that Brent returns most likely behave like white noise around a constant mean, i.e. Brent prices follow a process similar to a random walk:
$$Y_t = \mu + \varepsilon_t$$
Crude oil markets are characterized by persistence of shocks, therefore four different generalized conditional heteroskedasticity models are used in order to capture this feature. Moreover, the GJR-GARCH, EGARCH and APARCH models take into account asymmetric effects, which are also often observed in financial time series. The use of these four models allows us to determine specific features of the two crude oil markets and to compare them.
GARCH(1,1):
$$\varepsilon_t = \sqrt{h_t}\,\eta_t$$
$$h_t = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1}$$
GJR-GARCH(1,1):
$$\varepsilon_t = \sqrt{h_t}\,\eta_t$$
$$h_t = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1} + \gamma_1 \varepsilon_{t-1}^2\, I\{\varepsilon_{t-1} < 0\}$$
EGARCH(1,1):
$$\varepsilon_t = \sqrt{h_t}\,\eta_t$$
$$\log(h_t) = \omega + \alpha_1 \eta_{t-1} + \beta_1 \log(h_{t-1}) + \gamma_1\left(|\eta_{t-1}| - \sqrt{2/\pi}\right)$$
APARCH(1,1):
$$\varepsilon_t = \sqrt{h_t}\,\eta_t$$
$$h_t^{\delta/2} = \omega + \alpha_1\left(|\varepsilon_{t-1}| - \gamma_1 \varepsilon_{t-1}\right)^{\delta} + \beta_1 h_{t-1}^{\delta/2}$$
where $\mu$ is the constant of the mean equation (the unconditional mean in the case of Brent), $a_1$ and $b_1$ are the ARMA parameters, $\omega$, $\alpha_1$ and $\beta_1$ are the GARCH model parameters, $\gamma_1$ is the asymmetry term and $\delta$ is the power term. Innovations are assumed to follow a standard normal distribution: $\eta_t \sim N(0, 1)$.
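To make the difference between the symmetric and the asymmetric specifications concrete, the sketch below implements a single variance update for three of the models. It is an illustration under stated assumptions rather than the estimation code of the thesis: the GJR indicator is taken to be active for negative shocks, $\eta_{t-1}$ is computed as $\varepsilon_{t-1}/\sqrt{h_{t-1}}$, and all function names are hypothetical.

```python
import math

def garch_step(eps_prev, h_prev, omega, alpha, beta):
    """GARCH(1,1): the sign of the shock does not matter."""
    return omega + alpha * eps_prev ** 2 + beta * h_prev

def gjr_step(eps_prev, h_prev, omega, alpha, beta, gamma):
    """GJR-GARCH(1,1): negative shocks receive the extra weight gamma (assumed convention)."""
    indicator = 1.0 if eps_prev < 0 else 0.0
    return omega + (alpha + gamma * indicator) * eps_prev ** 2 + beta * h_prev

def egarch_step(eps_prev, h_prev, omega, alpha, beta, gamma):
    """EGARCH(1,1): the recursion is written for log(h_t), so no sign constraints are needed."""
    eta_prev = eps_prev / math.sqrt(h_prev)          # standardised shock
    log_h = (omega + alpha * eta_prev + beta * math.log(h_prev)
             + gamma * (abs(eta_prev) - math.sqrt(2.0 / math.pi)))
    return math.exp(log_h)
```

Feeding a shock of +0.02 and one of -0.02 into `gjr_step` or `egarch_step` shows how the asymmetry term produces different next-day variances, whereas `garch_step` returns the same value for both.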
3.3
Results of estimation
We estimate the selected models using the maximum likelihood estimation technique, assuming normally distributed errors. Tables 3.2 and 3.3 contain the results of estimation for the WTI and Brent crude oil markets respectively¹. Values in parentheses are standard errors of the corresponding parameter estimates. The numbers in brackets are p-values of the test statistics. Log(L) is the value of the maximized Gaussian log-likelihood function. Additional indicators are also computed in order to detect the best time series model for a given data set.
3.3.1
WTI crude oil market
The coefficient $\alpha$ captures the influence of new shocks on volatility. Estimates of this parameter are statistically significant for all four models and positive for GARCH, GJR-GARCH and APARCH; the sign is negative in the case of the EGARCH model, which can be explained by the construction of the EGARCH model, which does not impose any constraints on the signs of parameters. The estimate of $\alpha$ from the fitted GARCH(1,1) model is equal to 0.04979, which is the closest value to the benchmark commonly used in GARCH modelling (in general, we expect $\alpha \approx 0.05$). The lowest positive value ($\alpha$ = 0.03039) corresponds to the GJR-GARCH model, since this model takes into consideration the so-called leverage effect (not all shocks have the same influence on volatility). The parameter $\beta$, which measures the persistence of volatility shocks, is positive and statistically significant at the 1% level. For all models considered, the value of $\beta$ is close to 1 (around 0.94), indicating that old shocks to crude oil price volatility tend to persist rather than die out rapidly. This implies that shocks have long-lasting, nearly permanent effects on crude oil price volatility.
¹ The estimation procedure based on the maximum likelihood method was performed in R.
WTI
                GARCH                  GJR-GARCH              EGARCH                  APARCH
Coefficients
μ               0.00074 (0.00034)      0.00057 (0.00030)      0.00033 (0.00036)       0.00059 (0.00028)
a1              0.67221 (0.17280)      0.70188 (0.15214)      0.67573 (0.04063)       0.70962 (0.1482)
b1              -0.69405 (0.17254)     -0.72541 (0.15003)     -0.70479 (0.03986)      -0.73293 (0.14613)
ω               0.00001 (0.0000)       0.00001 (0.0000)       -0.12046 (0.02686)      0.00004 (0.00001)
α1              0.04979 (0.001)        0.03039 (0.00242)      -0.02284 (0.00731)      0.04063 (0.0037)
β1              0.93466 (0.00061)      0.93961 (0.00358)      0.98341 (0.00357)       0.93843 (0.0073)
γ1              -                      0.02848 (0.00603)      0.10799 (0.01358)       0.1513 (0.0507)
δ               -                      -                      -                       2.18424 (0.09115)
Log(L)          9314.446               9319.03                9294.452                9319.283
Residuals. Test statistics
Q[10]           6.994 [0.7260]         6.697 [0.7537]         7.69 [0.6590]           6.586 [0.7639]
Q[15]           9.861 [0.8284]         9.598 [0.8442]         10.60 [0.7805]          9.491 [0.8505]
Q[20]           10.637 [0.9551]        10.231 [0.9638]        11.06 [0.9446]          10.150 [0.9654]
JBstat          5688.695 [0.0000]      5668.97 [0.0000]       5639.815 [0.0000]       5667.795 [0.0000]
Q2[10]          15.47 [0.1159]         21.95 [0.01537]        67.17 [1.554e-10]       18.05 [0.05412]
Q2[15]          19.53 [0.1908]         26.10 [0.03700]        69.64 [5.182e-09]       22.27 [0.10099]
Q2[20]          22.43 [0.3176]         29.23 [0.08332]        71.55 [1.013e-07]       25.59 [0.17967]
JBstat2         9315841 [0.0000]       9410604 [0.0000]       9540951 [0.0000]        9400984 [0.0000]
Information criteria
AIC             -4.7608                -4.7632                -4.7506                 -4.7628
BIC             -4.7496                -4.7520                -4.7394                 -4.7500

Table 3.2: Estimation results of four conditional heteroskedasticity models. WTI returns.
The asymmetry coefficient $\gamma$ is positive and significant at the 1% level. This indicates that shocks have asymmetric effects on the volatility of crude oil prices; specifically, the positive sign suggests that positive shocks add less to volatility than negative shocks of the same size. The power coefficient $\delta$ in the APARCH model is significant at the 1% level and equal to 2.12. The Jarque-Bera test suggests that none of the models presented above has normally distributed residuals, while the results of the Ljung-Box test for different lags imply that the residuals are not autocorrelated, which is necessary in order to conclude that a model is valid. The Ljung-Box test on squared residuals indicates that the squared residuals are to a considerable extent not independent. The GARCH model demonstrates the highest p-values of the test; for example, at lag 20 the p-value is about 0.32, so the hypothesis of no autocorrelation in squared residuals cannot be rejected. The APARCH model also shows fairly acceptable p-values, suggesting that heteroskedasticity was partially removed. The p-values of the test applied to the EGARCH model, on the contrary, are very low (close to 0), suggesting that this model can hardly be used to adequately describe the WTI data set. Both AIC and BIC indicate that GJR-GARCH provides the best fit to the data.
3.3.2
Brent crude oil market
The value of the parameter $\beta$ is about 0.94 for GARCH, GJR-GARCH and APARCH and is slightly greater in the case of the EGARCH model (0.987), similarly to the WTI market, meaning that shocks are highly persistent on this market. It is remarkable that the estimates of $\alpha$ and $\beta$ for Brent crude oil prices are slightly greater than the estimates computed for the WTI market. This may indicate that Brent oil prices are more liable to shocks, both new and persistent, compared to WTI oil prices. The asymmetry coefficient $\gamma$ is significant at the 1% level only in the EGARCH model, while not being significant in GJR-GARCH and APARCH. This suggests that the asymmetry effect of shocks on the Brent market is not considerable, in contrast to the WTI crude oil market.
Brent
                GARCH                  GJR-GARCH              EGARCH                  APARCH
Coefficients
μ               0.00072 (0.00031)      0.00066 (0.00029)      0.00065 (0.00032)       0.00065 (0.0003)
ω               0.00001 (0.0000)       0.00001 (0.0000)       -0.09682 (0.02378)      0.00002 (0.00001)
α1              0.04801 (0.00081)      0.04117 (0.00674)      -0.01623 (0.00725)      0.05051 (0.00632)
β1              0.94249 (0.00189)      0.943588 (0.002764)    0.98699 (0.00308)       0.944199 (0.006942)
γ1              -                      0.01022 (0.00781)      0.10641 (0.01297)       0.079957 (0.053276)
δ               -                      -                      -                       1.688617 (0.024619)
Log(L)          9622.186               9622.79                9616.751                9623.637
Residuals. Test statistics
Q[10]           4.031 [0.9460]         4.155 [0.9401]         5.064 [0.8869]          4.422 [0.9263]
Q[15]           6.952 [0.9590]         7.125 [0.9541]         8.191 [0.9159]          7.424 [0.9448]
Q[20]           10.575 [0.9565]        10.713 [0.9533]        11.471 [0.9331]         10.928 [0.9481]
JBstat          2539.061 [0.0000]      2539.061 [0.0000]      2539.061 [0.0000]       2539.061 [0.0000]
Q2[10]          6.664 [0.7567]         7.168 [0.7095]         11.81 [0.2982]          7.926 [0.6360]
Q2[15]          12.991 [0.6030]        13.752 [0.5444]        17.30 [0.3015]          14.390 [0.4962]
Q2[20]          15.219 [0.7637]        16.095 [0.7107]        19.13 [0.5135]          16.698 [0.6725]
JBstat2         12880908 [0.0000]      12933561 [0.0000]      12939426 [0.0000]       12938239 [0.0000]
Information criteria
AIC             -4.9205                -4.9208                -4.9178                 -4.9208
BIC             -4.9125                -4.9128                -4.9097                 -4.9111

Table 3.3: Estimation results of four conditional heteroskedasticity models. Brent returns.
The power coefficient $\delta$ in the APARCH model is significant at the 1% level and equal to 1.69, which suggests that the usual representation used in the GARCH model may not be appropriate.
According to the Jarque-Bera and Ljung-Box test statistics, residuals are not normally distributed for any of the models, but they are not autocorrelated. The Ljung-Box test on squared residuals indicates that squared residuals are independent, which is a necessary condition to conclude that the models are valid. The p-values of this test are the highest in the case of the GARCH and GJR-GARCH models and the lowest, similarly to the WTI market, in the case of the EGARCH model. The BIC statistic chooses GJR-GARCH as the best fitting model, whereas AIC gives equal values for the GJR-GARCH and APARCH models.
3.4
Forecasting
Considering out-of-sample forecasts, we analyze the forecast performance of four conditional heteroskedasticity models: GARCH, GJR-GARCH, EGARCH and APARCH. The whole data set is divided into two parts: the first one is used to estimate the parameters of the models, while the remaining part is used to test the forecasts. The out-of-sample measures are computed with one-step-ahead predictions (without re-estimating the coefficients): for each following day, when new information becomes available, the prediction is made again. We consider different time horizons for the volatility forecasts and then compare those values with the actual realized volatility of the time series by means of different statistical indicators and tests.
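The rolling one-step-ahead scheme with fixed coefficients can be sketched for the GARCH(1,1) case as follows. This is a simplified illustration: the parameters are passed in as if they had already been estimated on the first part of the sample, the recursion is started from the sample variance of that part (an assumption), and the names `one_step_forecasts` and `n_train` are hypothetical.

```python
def one_step_forecasts(returns, n_train, mu, omega, alpha, beta):
    """One-step-ahead GARCH(1,1) variance forecasts for returns[n_train:].

    Parameters are held fixed; only the information set grows by one
    observation per step.
    """
    train = returns[:n_train]
    mean = sum(train) / n_train
    h = sum((r - mean) ** 2 for r in train) / n_train   # assumed starting variance
    forecasts = []
    for t, r in enumerate(returns):
        if t >= n_train:
            forecasts.append(h)          # forecast for day t uses data up to day t-1
        eps = r - mu                     # shock observed on day t
        h = omega + alpha * eps ** 2 + beta * h
    return forecasts
```

The resulting forecasts can then be compared with the squared residuals of the hold-out period over the first 50, 100 or 200 days to obtain horizon-specific accuracy measures.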
3.4.1
Forecast performance measures
Since there is no unique criterion for selecting the model giving the best forecast, we use different forecast performance indicators in order to estimate forecast accuracy in terms of loss functions, as indicated in the previous sections. These forecast performance measures are presented in Table 3.4. MAD and QLIKE suggest that the most accurate forecast model for WTI oil prices is APARCH. According to the MSE criterion, the APARCH model suits shorter forecasts better than longer predictions. In particular, the GJR-GARCH model is claimed to give the most accurate forecast for the 200-day time horizon. R2LOG is however less consistent, and the best model varies if we change the forecasting horizon.
The Brent market may be characterized as more homogeneous in the sense that almost all criteria across different horizons suggest GJR-GARCH as the most accurate model. Only the value of the R2LOG indicator (n=200) speaks in favour of EGARCH.
These indicators are useful for comparing a number of models with each other. Nevertheless, the main shortcoming of this method is that there is no information on whether the difference between two models is statistically significant.

                        GARCH         GJR-GARCH     EGARCH        APARCH
WTI
MSE
- 50-day horizon        2.2943e-07    2.1831e-07    2.4925e-07    2.1533e-07
- 100-day horizon       2.7318e-07    2.6552e-07    3.0198e-07    2.6199e-07
- 200-day horizon       4.5561e-07    4.5042e-07    4.931e-07     4.4717e-07
MAD
- 50-day horizon        0.00040608    0.00038819    0.00043628    0.00038342
- 100-day horizon       0.00045093    0.00043692    0.00048796    0.00043030
- 200-day horizon       0.00051869    0.0005055     0.00056152    0.00049793
QLIKE
- 50-day horizon        -7.0761       -7.1004       -7.0404       -7.1072
- 100-day horizon       -6.8619       -6.8758       -6.8193       -6.8826
- 200-day horizon       -6.626        -6.6339       -6.5813       -6.6394
R2LOG
- 50-day horizon        12.127        12.122        11.639        12.093
- 100-day horizon       8.3866        8.4543        8.4945        8.427
- 200-day horizon       9.6335        9.4142        9.9537        9.3491
BRENT
MSE
- 50-day horizon        2.1127e-07    2.1000e-07    2.1928e-07    2.1306e-07
- 100-day horizon       2.2817e-07    2.2418e-07    2.4266e-07    2.3141e-07
- 200-day horizon       3.6943e-07    3.6592e-07    3.8328e-07    3.7233e-07
MAD
- 50-day horizon        0.0003693     0.00036643    0.00038480    0.00037312
- 100-day horizon       0.00040797    0.00040149    0.00042851    0.00041265
- 200-day horizon       0.00046157    0.0004519     0.00047877    0.00046487
QLIKE
- 50-day horizon        -7.1206       -7.1246       -7.0985       -7.1154
- 100-day horizon       -6.9394       -6.9482       -6.9115       -6.9327
- 200-day horizon       -6.7241       -6.7308       -6.7012       -6.7189
R2LOG
- 50-day horizon        7.7922        7.7434        8.05          7.8567
- 100-day horizon       6.3026        6.3328        6.2558        6.384
- 200-day horizon       4.9534        4.9606        4.439         4.9706

Table 3.4: Volatility forecasts of WTI and Brent crude oil returns. Forecast performance measures.
3.4.2
Diebold-Mariano test
Results of DM test for WTI and Brent crude oil are presented in Tables 3.5 and 3.6 (forecasting horizon was taken as n=100). As it was mentioned in the previous chapter, negative value of the DM statistic suggests that Model 1 (left side of the table) statistically dominates Model 2 (upper side of the table).
Model 1 \ Model 2    GJR-GARCH           EGARCH               APARCH
GARCH                2.7623 [0.0057]     -8.4513 [0.0000]     4.3856 [0.0000]
GJR-GARCH                                -11.2135 [0.0000]    1.6233 [0.1045]
EGARCH                                                        12.8368 [0.0000]
APARCH                                                        -
Table 3.5: Volatility forecasts of WTI crude oil returns. Diebold Mariano test statistics.
WTI
GARCH produces a more accurate forecast compared to EGARCH, however it is less accurate than APARCH and GJR-GARCH. The EGARCH model is significantly outperformed by GARCH, GJR-GARCH and APARCH. APARCH statistically dominates GARCH and EGARCH according to the significant values of the corresponding DM statistics. However, comparing APARCH and GJR-GARCH does not allow us to reject the null hypothesis. Therefore, we conclude that, due to the possible leverage effect that is taken into account in the GJR-GARCH and APARCH models, they generate the most accurate predictions.
Model 1 \ Model 2    GJR-GARCH           EGARCH               APARCH
GARCH                2.1402 [0.03234]    -3.6495 [0.00026]    -0.7153 [0.4744]
GJR-GARCH                                -5.7898 [0.0000]     -2.8556 [0.004296]
EGARCH                                                        2.9342 [0.00334]
APARCH                                                        -
Table 3.6: Volatility forecasts of Brent crude oil returns. Diebold Mariano test statistics.
Brent
GJR-GARCH statistically dominates all the other models. As in the case of WTI crude oil prices, the EGARCH model is significantly outperformed by GARCH, GJR-GARCH and APARCH. Thus, GJR-GARCH provides the best volatility forecast for Brent oil prices.
3.5
Summary
Four GARCH-related models were used to model returns of the WTI and Brent crude oil markets. Based on the estimation, the following remarks can be made. First of all, GARCH modelling applied to Brent crude oil returns is appropriate, since the heteroskedasticity of the returns time series was successfully removed through GARCH modelling. As for the WTI crude oil market, there is evidence that autocorrelations in squared residuals were not completely eliminated. However, the WTI market can also be described using GARCH models; in particular, GJR-GARCH gives a good fit. In terms of forecast accuracy, however, APARCH is superior to the other models.
The GJR-GARCH model gives the best fit for the Brent returns series and outperforms the other competing models in forecast accuracy. In general, our findings suggest that shocks have long-lasting, nearly permanent effects on volatility and that these effects are stronger for the volatility of Brent crude oil prices. At the same time, asymmetric effects prevail in the WTI series rather than in the Brent series.
Determining fundamental factors

In this section we examine a set of factors, which may explain oil price dynamics. Fundamental factors are to be revealed by means of Principal Component Analysis and evaluated using regression models.
4.1
Data set and choice of factors
In this section weekly WTI crude oil prices are used; the data set contains 1012 observations from November 11, 1990 to March 26, 2010. Weekly frequency is chosen since there are no daily statistics available for most of the explanatory factors. The length of the data sample is also subject to data availability. We assume that there exist some fundamental factors that contribute to the price dynamics. Weekly WTI crude oil prices and the corresponding weekly returns are considered as explained variables. In general, fluctuations in oil prices can be caused by supply and demand imbalances arising from the specific way these markets function, as well as by a number of particular events like wars, changes in political regimes, economic crises, trading tactics etc. Therefore, from the point of view of general supply-demand economic theory, it makes sense to include factors related to fundamental supply and demand drivers. However, crude oil prices are also severely affected by a number of factors that cannot be observed and quantified directly, such as consumers' expectations, political aspects etc. Therefore, we include in the analysis some additional factors that describe the general economic situation of the US crude oil market.
The following factors are considered as fundamental factors affecting WTI crude oil prices¹:
U.S. Market Demand (Thousand Barrels per Day).
An increased need for crude oil shifts the demand curve in a price-quantity diagram to the right and thereby increases the oil price, given a supply curve that is not completely elastic.
U.S. Crude Oil Stocks (Thousand Barrels).
Crude oil stocks are created as a buffer against unforeseen circumstances that cut down oil supply. According to Kaufmann et al. (2004), a negative impact of stocks on the price is expected, since an increase in stocks reduces the real oil price by diminishing reliance on current production and thereby reducing the risk premium associated with a supply disruption. However, positive coefficients can also be expected. If petroleum stocks are filled (released), then demand increases (decreases) and crude oil prices might rise (fall).
Days of Forward Consumption of Crude Oil Stock (Number of Days).
This variable was constructed by Kaufmann et al. (2004) and is the ratio of crude oil stocks in million barrels to crude oil demand in million barrels per day. It can be interpreted as an indicator of the independence from price shocks and should have a negative impact on oil prices.
U.S. Refinery Operable Capacity (Thousand Barrels per Day).
Refining capacity is defined as the maximum amount of crude oil that can be processed in a calendar year divided by the number of days in the corresponding year, and it characterizes how well the downstream sector is developed. A lack of spare refining capacity is seen as one cause of the on-going rise in the price of crude oil.
Number of U.S. Rotary Rigs in Operation (Count)
A rotary rig is a machine used for drilling wells. Presumably, crude oil supply directly depends on the number of rigs in operation; therefore, we assume a negative relationship between the number of rigs and crude oil spot prices.
WTI Crude Oil Future 4 Months Contract (Dollars per Barrel)
Agents' expectations in the long run may also influence oil prices. These futures prices indicate the price of a contract specifying the earliest delivery date as 4 months ahead. A greater price for such a contract would suggest that consumers expect prices to grow in the future as well, thus the assumed sign is positive.
Spread between Crude Oil Future Contract 4 and Crude Oil Future Contract 1 traded on the New York Mercantile Exchange (Dollars per Barrel).
The conditions on the futures markets might also affect the behavior of stocks and, in turn, oil prices. The market is in contango when the price of crude oil for four-month contracts is greater than the price for near-month contracts, and in backwardation when the price for four-month contracts is less than the price for the near-month contract. We expect the corresponding coefficient to be positive, as contango is expected to have a positive effect on prices because the higher price for future deliveries provides an incentive to build and hold stocks, which bolsters demand.
U.S. Interest Rate (Percent per Year).
Decreasing interest rates make lending more accessible and stimulate crude oil demand, which, in turn, contributes to an increase in oil prices. Thus we expect a negative effect of interest rates.
S&P 500 Index.
This index, characterizing the state of financial markets, is included here to represent the general financial situation and its possible effect on oil prices. According to a number of previous studies, there is a strong relation between crude oil prices and stock markets. We assume that an increase of the S&P 500 index will have a positive effect on oil prices.
U.S. Gross Domestic Product and Growth Rate (percent change with respect to the last period).
These two factors can serve as indicators of the level of economic development as well. Economic growth bolsters demand for crude oil and therefore supports crude oil prices.
¹ Data source: U.S. Energy Information Administration. http://www.eia.doe.gov/
Explanatory variables used in the current analysis of crude oil prices and the corresponding hypotheses are summarized in Table 4.1.
4.2
Correlations and linear regression
The first step of the analysis is the examination of correlations. Results are presented in Table 4.2. Correlations greater than 0.75 are highlighted. Basically, we are interested in whether there are simple correlations between prices and some of the factors presented above. Table 4.2 illustrates that price returns are weakly correlated with all the factors considered, while the price series is characterized by an extremely high correlation with the futures contract, which is perfectly explainable (correlation close to 1). The correlation between price and demand is moderate (0.554); however, the positive sign indicates a certain positive dependence between these two variables: according to supply-demand theory, increasing demand pushes prices up. The correlation between prices and refinery capacity is surprisingly high and positive (0.8), even though it was assumed that growth of refinery capacity would stimulate production and supply and therefore lead to a decrease in oil prices. A fairly high correlation is observed between oil prices and the S&P 500 index (0.543), indicating that the level of economic development supports an increase in crude oil prices.
Analyzed variables
V1     PRICE     Weekly WTI crude oil spot price
V2     RET       WTI crude oil spot returns

Explanatory variables (expected sign in parentheses)
Demand factors
V3     DEM       Weekly US market demand                                       (+)
V4     DAYS      Weekly days of forward consumption of crude oil stock         (-)
Supply factors
V5     STOCKS    Weekly US crude oil stocks                                    (+/-)
V6     CAP       Weekly US refinery operable capacity                          (-)
V7     RIGS      Number of operating rigs                                      (-)
Trading factors
V8     F4        4 month NYMEX crude oil future contract                       (+)
V9     F41       Spread between crude oil future contract 4 and contract 1     (+)
General economical development factors
V10    INT       Weekly US interest rate                                       (-)
V11    SP        Weekly S&P 500 Index                                          (+)
V12    GR        Weekly US real growth rate                                    (+)
V13    GDP       Weekly US real GDP                                            (+)

Table 4.1: List of variables used in the analysis of fundamental factors affecting WTI crude oil prices.
         RET     PRICE   DEM     DAYS    STOCK   CAP     RIGS    F4      F41     INT     SP      GR      GDP
RET      1
PRICE    0.05    1
DEM      0.07    0.554   1
DAYS     -0.04   -0.37   -0.86   1
STOCK    0.01    -0.04   -0.41   0.809   1
CAP      0.04    0.798   0.74    -0.55   -0.15   1
RIGS     -0.1    -0.4    -0.55   0.44    0.171   -0.75   1
F4       0.04    0.997   0.55    -0.35   -0      0.81    -0.41   1
F41      0.11    -0.24   -0.15   -0.19   -0.53   -0.35   0.23    -0.31   1
INT      -0.04   -0.33   -0.23   0.212   0.122   -0.51   0.69    -0.35   0.25    1
SP       0.07    0.543   0.82    -0.72   -0.35   0.74    -0.54   0.54    -0.09   -0.07   1
GR       0.08    -0.31   -0.09   0.026   -0.04   -0.34   0.13    -0.32   0.23    0.184   -0.01   1
GDP      0.05    0.769   0.84    -0.69   -0.28   0.95    -0.72   0.78    -0.31   -0.46   0.839   -0.22   1
Table 4.2: Cross correlation analysis.
At this point a linear regression model is estimated in order to see whether the chosen set of factors can explain price variations over the time period considered. Results of the estimation are shown in Tables 4.3 and 4.4.
WTI crude oil spot price

Variable   Estimator   Standard error   t-statistic   p-value
Const      2.00367     1.18988          1.68393       0.09251
DEM        -8.9E-05    5.8E-05          -1.54546      0.12255
DAYS       -0.08757    0.06298          -1.39035      0.16473
STOCK      3.8E-06     3.3E-06          1.12544       0.26067
CAP        1.9E-05     4.7E-05          0.3971        0.69138
RIGS       -3.5E-05    3.5E-05          -1.01504      0.31033
F4         1.00064     0.0008           1246.16       0***
F41        1.04316     0.00745          140.082       0***
INT        0.00888     0.00939          0.94525       0.34476
SP         2.4E-05     7.3E-05          0.3304        0.74117
GROWTH     -0.00057    0.0042           -0.13543      0.8923
GDP        -2.9E-05    3E-05            -0.98274      0.32597

Number of observations   1012
R²                       0.999873

Table 4.3: Estimation of linear regression model with the initial set of factors. WTI crude oil spot prices.
As can be seen from Tables 4.3 and 4.4, these two models can hardly be used to explain the dynamics of crude oil prices and returns. When modelling crude oil prices, even though R² is very high, only two factors appear to be significant (F4 and F41), which reflects the very close link between spot and futures prices (the correlation between PRICE and F4 is 0.997). Therefore, we cannot draw any conclusion from this model.
WTI crude oil returns

Variable   Estimator   Standard error   t-statistic   p-value
Const      8.06619     17.5902          0.45856       0.64665
DEM        -0.0003     0.00086          -0.3694       0.71189
DAYS       -0.7804     0.93108          -0.8382       0.40213
STOCK      9.4E-05     4.9E-05          1.90467       0.05711**
CAP        -0.0012     0.0007           -1.6854       0.09222**
RIGS       -0.0012     0.00051          -2.415        0.01591***
F4         0.01156     0.01187          0.97384       0.33037
F41        0.69486     0.11009          6.31188       4.1E-10***
INT        -0.0823     0.13882          -0.5929       0.55339
SP         7.4E-05     0.00108          0.06856       0.94536
GROWTH     0.05424     0.06208          0.87379       0.38244
GDP        0.00032     0.00044          0.72937       0.46595

Number of observations   1012
R²                       0.070652706

Table 4.4: Estimation of linear regression model with the initial set of factors. WTI crude oil returns.
As for the returns, three supply-related factors are statistically significant. The effect of stocks appears negligible, since the corresponding coefficient is close to zero, while refinery capacity and the number of operating rigs have a negative impact. This conclusion is in line with our initial assumptions. Finally, the futures spread (F41) has a significant positive influence on crude oil price returns. However, the low value of R² (7 %) suggests that this model does not fit the data particularly well. In the next section an attempt is made to improve these results using principal component analysis.
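The quantities reported in Tables 4.3 and 4.4 (estimator, standard error, t-statistic, R²) can be illustrated with a one-regressor least-squares fit. This is a minimal sketch under stated assumptions: the function `simple_ols` and the toy data are hypothetical, and the thesis regressions use eleven regressors rather than one.

```python
import math

def simple_ols(x, y):
    """Fit y = a + b*x by least squares.

    Returns the intercept a, the slope b, the t-statistic of b, and R^2.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    se_b = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return a, b, b / se_b, 1.0 - sse / sst

# Hypothetical toy data (NOT the thesis data): a spot price regressed on
# a futures price that tracks it closely.
futures = [20.5, 22.4, 25.3, 24.6, 30.2]
spot = [20.0, 22.0, 25.0, 24.0, 30.0]
a, b, t_stat, r2 = simple_ols(futures, spot)

# A t-statistic is the estimator divided by its standard error. With the
# rounded F41 values from Table 4.3 the ratio is close to the reported 140.082.
t_f41 = 1.04316 / 0.00745
```

A regressor that nearly duplicates the dependent variable yields an R² close to 1 and a very large t-statistic, which is why a high R² alone says little about the remaining factors.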
4.3 Principal component analysis

4.3.1 Theoretical background
PCA is a multivariate statistical technique which calculates the principal directions of variability in data, and transforms the original set of correlated variables into a new set of uncorrelated variables. The new uncorrelated variables are linear combinations of the original variables. These principal components represent the most important directions of variability in a dataset.
Given a data matrix with $p$ variables and $n$ samples, the data are first centered on the means of each variable. This ensures that the cloud of data is centered on the origin of the principal components, but affects neither the spatial relationships of the data nor the variances along the variables. The first principal component $Y_1$ is given by the linear combination of the variables $X_1, X_2, \ldots, X_p$:

$$Y_1 = a_{11} X_1 + a_{12} X_2 + \ldots + a_{1p} X_p,$$

and is calculated in such a way that it accounts for the greatest possible variance in the data set. Of course, one could make the variance of $Y_1$ as large as possible by choosing large values for the weights $a_{11}, a_{12}, \ldots, a_{1p}$. To prevent this, the weights are calculated under the constraint that their sum of squares is 1:

$$a_{11}^2 + a_{12}^2 + \ldots + a_{1p}^2 = 1.$$

The second principal component is calculated in the same way, with the condition that it is uncorrelated with (i.e., perpendicular to) the first principal component and that it accounts for the next highest variance:

$$Y_2 = a_{21} X_1 + a_{22} X_2 + \ldots + a_{2p} X_p.$$

This continues until a total of $p$ principal components have been calculated, equal to the original number of variables. At this point, the sum of the variances of all of the principal components equals the sum of the variances of all of the variables; that is, all of the original information has been accounted for. Collectively, these transformations of the original variables to the principal components can be written as

$$Y = AX.$$

The rows of matrix $A$ are the eigenvectors of the variance-covariance matrix of the original data. The elements of an eigenvector are the weights $a_{ij}$, also known as loadings. The elements on the diagonal of matrix $S_Y$, the variance-covariance matrix of the principal components, are known as the eigenvalues. Eigenvalues are the variances explained by each principal component and decrease monotonically from the first principal component to the last.
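The construction described above can be sketched in a few lines of code. This is an illustrative implementation, not the procedure used in the thesis: it assumes the eigenvectors of the variance-covariance matrix are found by power iteration with deflation (a simple technique chosen here to avoid external libraries), and the function names and toy data are hypothetical.

```python
import math

def covariance_matrix(rows):
    """Variance-covariance matrix of data given as a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def power_iteration(S, iters=500):
    """Dominant eigenvalue and unit-norm eigenvector of a symmetric matrix S."""
    p = len(S)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]  # enforces the sum-of-squares constraint
    eigenvalue = sum(v[i] * S[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v

def pca(rows):
    """Eigenvalues (variances) and loadings (rows of A), largest first."""
    S = covariance_matrix(rows)
    p = len(S)
    eigenvalues, loadings = [], []
    for _ in range(p):
        lam, v = power_iteration(S)
        eigenvalues.append(lam)
        loadings.append(v)
        # Deflation: remove the component just found so that the next
        # iteration converges to the next-largest direction of variance.
        S = [[S[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return eigenvalues, loadings

# Hypothetical toy data (NOT the thesis data): 6 observations of 3 variables.
rows = [
    [2.0, 1.0, 0.5],
    [3.0, 2.5, 0.1],
    [4.0, 2.9, 0.9],
    [5.0, 4.2, 0.3],
    [6.0, 5.1, 0.8],
    [7.0, 5.8, 0.2],
]
eigenvalues, loadings = pca(rows)
total_variance = sum(covariance_matrix(rows)[i][i] for i in range(3))
```

For this toy data the eigenvalues sum to the total variance of the original variables, each loading vector has unit length, and the first two loading vectors are perpendicular, as stated in the text.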
4.3.2 Empirical results
The ideas of principal component analysis can be applied to determine which factors explain variations in crude oil prices. To do so, the factors described above are used to build 11 principal components, each of which is a linear combination of the initial factors. Table 4.5 shows the results for the 11 principal components. According to the method, each principal component is constructed so that its variance is maximized. In our case the first principal component PC1 explains 71 % of the variance of the data set, while the principal components PC1, PC2 and PC3 together account for 88 % of the total variance.
        Standard deviation   Proportion of variance   Cumulative proportion
PC1     2.637                0.711                    0.711
PC2     1.512                0.093                    0.804
PC3     1.099                0.078                    0.882
PC4     1.007                0.0544                   0.9360
PC5     0.8408               0.0323                   0.9683
PC6     0.6479               0.0145                   0.9828
PC7     0.4344               0.0080                   0.9917
PC8     0.3397               0.0047                   0.9964
PC9     0.24812              0.0026                   0.999
PC10    0.1843               0.0008                   0.9998
PC11    0.0985               0.0002                   1.0000

Table 4.5: Principal Component Analysis
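The cumulative proportions in Table 4.5 are running sums of the per-component proportions of variance. The short sketch below reproduces the first few cumulative values from the reported proportions; the helper `components_needed` is a hypothetical name introduced for illustration, and later cumulative values may differ from the table in the last digits because the reported proportions are rounded.

```python
from itertools import accumulate

# Proportion of variance per component as reported in Table 4.5 (PC1..PC11).
proportions = [0.711, 0.093, 0.078, 0.0544, 0.0323, 0.0145,
               0.0080, 0.0047, 0.0026, 0.0008, 0.0002]

# Running sum, rounded to four decimals as in the table.
cumulative = [round(c, 4) for c in accumulate(proportions)]

def components_needed(cumulative_shares, threshold):
    """Smallest number of leading components whose cumulative share
    of variance reaches the given threshold."""
    for k, share in enumerate(cumulative_shares, start=1):
        if share >= threshold:
            return k
    return len(cumulative_shares)

k_88 = components_needed(cumulative, 0.88)
```

With these inputs, three components are needed to reach 88 % of the total variance, in agreement with the text.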
Figure 4.1 presents the factors in the PC1-PC2 coordinate system.
Based on the results of the principal component analysis, 7 main factors are chosen among the 11 factors considered. The choice is made by visual inspection: the distance between a factor and the chosen axis should be minimal. Since the first principal component explains around 70 % of the total variance of the data set, the 7 factors located closest to the PC1 axis are chosen. In what follows, the analysis is based on these 7 fundamentals rather than on all the factors selected initially. Once the dimension of the data set is reduced, we examine the impact of the selected factors on crude oil prices and returns. Results of the regression estimation are presented in Tables 4.6 and 4.7.

Figure 4.1: Plot of factors in the PC1-PC2 coordinate system. Factors corresponding to the points lying along the horizontal axis are assumed to be fundamental factors.

WTI crude oil spot prices

Variable   Coefficient   Standard error   t-statistic   p-value
const      18.43185      3.807135         4.841397      1.49E-06
DEM        0.000291      8.02E-05         3.627588      0.0003
CAP        -0.000874     0.000252         -3.472991     0.000537
RIGS       -0.000313     0.000207         -1.508919     0.131634
F4         1.014414      0.004647         218.3132      0
INT        -0.18246      0.056276         -3.242213     0.001225
SP         0.003778      0.000428         8.819145      5.02E-18
GDP        -0.001129     0.000163         -6.914172     8.36E-12

Number of observations   1012
R²                       0.995

Table 4.6: Estimation of linear regression model with the reduced set of fundamental factors. WTI crude oil spot price.

Table 4.6 gives evidence that the use of principal component analysis substantially improves the results. Almost all estimators are now significant (the exception is RIGS, with a p-value of 0.13). The positive signs of demand, the 4-month futures contract price and the S&P index confirm our hypotheses about the positive impact of these variables on the crude oil spot price. Supply variables (refinery capacity and number of rigs) have a negative effect on prices, as does the interest rate, as discussed previously. The effect of GDP is somewhat surprising: its estimator is negative, implying a negative relationship between gross domestic product and oil prices.
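The selection rule used above (keep the factors lying closest to the PC1 axis) was applied by visual inspection, but it can also be stated as a simple ranking. The sketch below is one possible formalization, not the procedure used in the thesis: it assumes each factor is a point in the PC1-PC2 plane and measures its distance to the horizontal axis by the absolute PC2 coordinate. The factor names X1 to X5 and their coordinates are hypothetical.

```python
def closest_to_pc1_axis(coords, k):
    """Return the k factor names with the smallest distance to the PC1 axis.

    coords maps a factor name to its (PC1, PC2) coordinates; the distance
    from a point to the horizontal PC1 axis is the absolute PC2 coordinate.
    """
    ranked = sorted(coords, key=lambda name: abs(coords[name][1]))
    return ranked[:k]

# Hypothetical coordinates for illustration only (NOT read off Figure 4.1).
hypothetical = {
    "X1": (0.90, 0.05),
    "X2": (-0.80, -0.10),
    "X3": (0.10, 0.95),
    "X4": (0.70, 0.30),
    "X5": (-0.20, -0.60),
}
selected = closest_to_pc1_axis(hypothetical, 3)
```

A ranking of this kind makes the choice reproducible, whereas the cut-off (7 out of 11 factors in the thesis) remains a judgment call.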
WTI crude oil price returns

Variable   Coefficient   Standard error   t-statistic   p-value
const      28.6469       9.16471          3.12579       0.00182
DEM        0.00024       0.00019          1.25768       0.2088
CAP        -0.00156      0.00061          -2.57946      0.01004
RIGS       -0.00205      0.0005           -4.11467      4.2E-05
F4         0.03556       0.01119          3.17935       0.00152
INT        0.02843       0.13547          0.20984       0.83384
SP         0.00152       0.00103          1.47861       0.13956
GDP        -0.00046      0.00039          -1.18171      0.2376

Number of observations   1012
R²                       0.0285

Table 4.7: Estimation of linear regression model with the reduced set of fundamental factors. WTI crude oil returns.
As for the dynamics of oil price returns, the estimation results suggest that mainly supply factors influence their behavior: refinery capacity and number of rigs are two significant supply variables with a negative impact on crude oil returns. The 4-month futures price (F4) is also significant, with a positive sign.
4.4 Forecasting and comparison

4.4.1 Forecasting returns series
The estimated parameters of the linear regression can be used to construct a forecast, assuming that all the necessary factors are known. Parameters are estimated on in-sample data; the forecast is then made for the remaining observations and compared with the actual data for that period. In our analysis, forecasts for n = 40, 20 and 10 are built and compared with the corresponding forecasts obtained from the time series models presented above. Since the data used to determine the fundamental factors are weekly, the same frequency is used for the parameter estimation of the time series models. The forecast performance measures MSE and MAD are used as the basis for comparison. Results can be found in Table 4.8.
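The two performance measures can be sketched as follows. The sketch assumes that MSE is the mean of squared forecast errors, that MAD is the mean of absolute forecast errors, and that the horizon n refers to the first n out-of-sample observations; these readings, the function names and the toy series are assumptions made for illustration.

```python
def mse(actual, forecast):
    """Mean squared error of the forecasts."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mad(actual, forecast):
    """Mean absolute deviation of the forecasts from the actual values."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def evaluate(actual, forecast, horizons=(10, 20, 40)):
    """(MSE, MAD) over the first n out-of-sample points for each horizon n."""
    return {
        n: (mse(actual[:n], forecast[:n]), mad(actual[:n], forecast[:n]))
        for n in horizons
    }

# Hypothetical toy series (NOT the thesis data): returns alternating between
# +1 % and -1 %, compared with a constant zero forecast.
actual = [0.01 * (-1) ** t for t in range(40)]
forecast = [0.0] * 40
results = evaluate(actual, forecast)
```

In this toy case every error has magnitude 0.01, so the MSE is 0.0001 and the MAD is 0.01 at every horizon.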
Returns     GARCH          GJR-GARCH      EGARCH         APARCH         Factors
MSE   10    0.0009755459   0.0009753049   0.0009749707   0.0009752605   0.000717371
      20    0.001556476    0.001556359    0.001556191    0.001556337    0.001045355
      40    1.481004e-05   1.481004e-05   1.481004e-05   1.481004e-05   0.001504633
MAD   10    0.02378043     0.02376722     0.02374860     0.02376477     0.020689527
      20    0.03128204     0.03127564     0.03126629     0.03127441     0.024970522
      40    0.04246252     0.04245880     0.04245410     0.04245805     0.029540268

Table 4.8: Forecast of returns series. Results of comparison.
According to Table 4.8, the forecasts based on the linear regression model have the lowest values of MSE and MAD for almost all forecasting horizons considered (the exception is MSE at n = 40), which indicates that the linear regression model of the returns series performs better than the time series models. However, we are interested in forecasting not only the returns series but also the conditional volatility.
4.4.2 Forecasting volatility
Based on the results of the previous section, we conclude that it may make sense to include the selected factors as external regressors in order to improve the accuracy of forecasts. This is done in the following way: first, the 7 selected factors are added to the mean equation described by the ARMA(1,1) model (see Table 4.9); then the parameters of the four models of conditional volatility are estimated using the usual procedure and forecast performance measures are computed (see Table 4.11).
ARMA(1,1) with external regressors

ar1         -0.552***   (0.0899)
ma1         0.693***    (0.0764)
intercept   0.001       (0.0014)
V1          0.004**     (0.0028)
V2          -0.015***   (0.0056)
V3          -0.013***   (0.0031)
V4          0.011***    (0.0031)
V5          0.0006      (0.0028)
V6          0.006**     (0.0040)
V7          -0.009**    (0.0074)

Table 4.9: Modelling weekly WTI returns: mean equation ARMA(1,1) with external regressors. Results of estimation
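One way to write the mean equation behind Table 4.9 is sketched below. The exact specification is not stated in this section, so the form (regressors entering linearly alongside the ARMA(1,1) terms) and the symbols $\mu$, $\phi_1$, $\theta_1$, $\beta_k$, $V_{k,t}$ and $\eta_t$ are assumptions chosen to match the row labels intercept, ar1, ma1 and V1 to V7.

```latex
r_t = \mu + \phi_1 r_{t-1} + \theta_1 \varepsilon_{t-1}
      + \sum_{k=1}^{7} \beta_k V_{k,t} + \varepsilon_t,
\qquad
\varepsilon_t = \sqrt{h_t}\,\eta_t
```

Here $r_t$ is the weekly WTI return, $V_{k,t}$ is the value of the $k$-th selected factor, $h_t$ is the conditional variance given by one of the four volatility models (GARCH, GJR-GARCH, EGARCH or APARCH), and $\eta_t$ is a standardized innovation.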
              GARCH          GJR-GARCH      EGARCH         APARCH
MSE     10    1.517961e-06   1.500587e-06   1.725124e-06   1.532729e-06
        20    2.093317e-06   2.058361e-06   2.178011e-06   2.075371e-06
        40    6.252697e-06   6.227854e-06   6.242845e-06   6.229685e-06
MAD     10    0.001136795    0.001130764    0.001219098    0.001144428
        20    0.001216589    0.001202223    0.001256519    0.001209904
        40    0.001740749    0.001704235    0.001713302    0.001703925
QLIKE   10    7.876753       7.717529       8.998919       7.902191
        20    39.88312       38.70198       40.56881       38.82546
        40    27.77909       26.16065       26.69147       26.06326
R2LOG   10    5.638722       5.612724       5.951330       5.665262
        20    7.150143       7.090884       7.308943       7.117136
        40    6.272863       6.159161       6.215167       6.159177

Table 4.10: WTI returns. Mean equation ARMA(1,1) without external regressors and four models of conditional variance. Forecast performance measures
              GARCH          GJR-GARCH      EGARCH         APARCH
MSE     10    1.544896e-06   1.540868e-06   1.740230e-06   1.518638e-06
        20    2.091572e-06   2.033588e-06   2.135187e-06   2.025813e-06
        40    6.273180e-06   6.162839e-06   6.176375e-06   6.163844e-06
MAD     10    0.001149439    0.001149370    0.001226605    0.001139979
        20    0.001225348    0.001202221    0.001249980    0.001198018
        40    0.001790515    0.001695993    0.001685312    0.001699521
QLIKE   10    8.139567       8.053292       9.169439       7.926220
        20    40.55618       38.10286       39.50314       38.10412
        40    30.24834       26.03483       25.75115       26.20963
R2LOG   10    5.699623       5.698564       5.978434       5.663473
        20    7.207705       7.106061       7.284573       7.092730
        40    6.447323       6.152935       6.136302       6.163711

Table 4.11: Modelling weekly WTI returns: mean equation ARMA(1,1) with external regressors, four models of conditional variance. Forecast performance measures
Comparing the results in Tables 4.10 and 4.11 for the WTI crude oil market, we conclude that, contrary to our expectations, forecast accuracy was not consistently improved by adding external factors to the time series models. However, the superiority of the APARCH model for the WTI crude oil market in terms of forecast accuracy is consistent with the results obtained in the previous chapter.
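Tables 4.10 and 4.11 also report the QLIKE and R2LOG measures, which compare a variance forecast with a proxy for the realized variance. The sketch below assumes the forms commonly used in the volatility forecasting literature, namely the average of log(h) + s/h for QLIKE and the average of the squared log ratio log(s/h) for R2LOG, where h is the forecast variance and s the realized-variance proxy (for example, squared returns). These forms, the function names and the toy numbers are assumptions and may differ from the definitions and scaling used in Chapter 2.

```python
import math

def qlike(realized_var, forecast_var):
    """Average of log(h) + s/h over the forecast period (assumed form)."""
    terms = [math.log(h) + s / h for s, h in zip(realized_var, forecast_var)]
    return sum(terms) / len(terms)

def r2log(realized_var, forecast_var):
    """Average squared log ratio of realized to forecast variance (assumed form)."""
    terms = [math.log(s / h) ** 2 for s, h in zip(realized_var, forecast_var)]
    return sum(terms) / len(terms)

# Hypothetical toy values (NOT the thesis data).
realized = [1.0, 1.0]
perfect = [1.0, 1.0]  # forecast equal to the realized variance
biased = [2.0, 2.0]   # forecast twice the realized variance

qlike_perfect, qlike_biased = qlike(realized, perfect), qlike(realized, biased)
r2log_perfect, r2log_biased = r2log(realized, perfect), r2log(realized, biased)
```

Both measures are smallest when the forecast equals the realized variance, so lower values in the tables indicate more accurate volatility forecasts.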
4.5 Summary
As a result of the analysis presented in this chapter, 7 fundamental factors explaining the dynamics of weekly WTI crude oil spot prices were selected. It was shown that the volume of US demand for crude oil, the price of futures contracts and the S&P index have a positive impact on the spot price series, while refinery capacity and the interest rate have a negative influence on its dynamics. For the returns series, the fundamental factors are mainly related to the supply side, such as refinery capacity and the number of operating rigs. Use of these fundamental factors allows us to model the dynamics of crude oil returns; moreover, the forecast performance of such a model is shown to be superior to that of four commonly used models of conditional variance (GARCH, GJR-GARCH, EGARCH and APARCH). However, the model based on these fundamental factors is not able to capture the volatility of crude oil markets as successfully as the time series models.
5 Concluding remarks
Commodity prices, especially prices of crude oil, have considerable macroeconomic and microeconomic impacts, which makes the examination of their dynamics particularly significant. The goal of this project was to examine crude oil price behavior, compare different models for forecasting its volatility and determine the fundamental factors explaining its dynamics. In order to capture specific features of crude oil markets, we first analyzed different conditional heteroskedasticity models, which allow us to model the dynamics of oil price returns. In addition to the traditional GARCH(1,1) model, three other models were used: GJR-GARCH(1,1), EGARCH(1,1) and APARCH(1,1), with the aim of examining whether or not shocks have asymmetric and persistent effects on oil price volatility. Our findings confirm these hypotheses and suggest that shocks have permanent and asymmetric effects on volatility for both markets considered. Among the four models, the GJR-GARCH and APARCH models capture persistence better than the GARCH and EGARCH models. Moreover, the APARCH model for the WTI market and the GJR-GARCH model for the Brent market provide superior performance in out-of-sample volatility forecasts. Furthermore, a set of possible factors affecting WTI oil prices was examined and the most important fundamental factors were identified. The model based on these factors gives a good fit for crude oil returns; however, it is outperformed by models of conditional volatility in terms of out-of-sample volatility forecasts.
