
A Comparison of Neural Network, ARIMA and

GARCH Forecasting of Exchange Rate Data

Namrata Devi Jory


UNIVERSITY OF MAURITIUS

Dissertation submitted to the Department of Mathematics, Faculty of Science,
University of Mauritius, in partial fulfilment of the requirements for the degree
of BSc (Hons) Mathematics with Computer Science.
April 2012

Contents

List of Figures                                                          iii

List of Tables                                                            iv

Acknowledgements

Abstract                                                                  vi

1  Introduction                                                            1
   1.1  Forecasting Models
   1.2  Aims and Organization of the Project

2  Artificial Neural Networks
   2.1  Introduction
   2.2  Feedforward Network
   2.3  Backpropagation Learning Algorithm
   2.4  Training set and Testing set

3  Linear and Non-linear Time Series                                      10
   3.1  Stationary Processes                                              10
        3.1.1  Autoregressive process                                     11
        3.1.2  Moving Average Process                                     11
        3.1.3  Random Walk                                                12
   3.2  ARIMA (p, d, q) Processes                                         12
        3.2.1  Autocorrelation and Partial Autocorrelation Functions      13
   3.3  GARCH Model                                                       14
        3.3.1  ARCH Model                                                 14
        3.3.2  GARCH model                                                15

4  Analysis of Exchange Rate Data                                         16
   4.1  Analysis of Period I                                              16
        4.1.1  Analysis of Period II data                                 21
   4.2  Forecasting Accuracy                                              22
   4.3  Fitting Neural Network Model to the Exchange Rates Data           23
   4.4  Fitting ARIMA Model to the Foreign Exchange Rates Data            24
   4.5  Fitting the GARCH model                                           25
   4.6  Empirical findings                                                26
        4.6.1  Forecasts performance of Period I                          26
        4.6.2  Results and Discussions for Period I                       32
        4.6.3  Forecasts performance of Period II                         34
        4.6.4  Results and Discussions for Period II                      39

5  Conclusion                                                             41

Bibliography                                                              42

List of Figures

2.1   Three-Layer feedforward Network
2.2   The neuron weight adjustment
4.1   Daily MUR/USD data and its corresponding first differences of the logs     17
4.2   Daily MUR/EU data and its corresponding first differences of the logs      17
4.3   Daily MUR/GBP data and its corresponding first differences of the logs     18
4.4   Histogram and Kernel Density estimate of daily log return of MUR/USD       19
4.5   Histogram and Kernel Density estimate of daily log return of MUR/GBP       19
4.6   Histogram and Kernel Density estimate of daily log return of MUR/EU        19
4.7   Scatter plot of $y_t$ against $y_{t-1}$ for the MUR/USD log return data    20
4.8   Scatter plot of $y_t$ against $y_{t-1}$ for the MUR/EU log return data     20
4.9   Scatter plot of $y_t$ against $y_{t-1}$ for the MUR/GBP log return data    20
4.10  Daily MUR/USD Jan 03 - Dec 11                                              21
4.11  Daily MUR/EU Jan 03 - Dec 11                                               22
4.12  Daily MUR/GBP Jan 02 - Dec 11                                              22
4.13  In-sample and out-of-sample forecast for MUR/USD                           33
4.14  In-sample and out-of-sample forecast for MUR/EU                            33
4.15  In-sample and out-of-sample forecast for MUR/GBP                           34

List of Tables

3.1   Behaviour of ACF and PACF                                                  14
4.1   Summary statistics for the daily exchange rates: log first difference      18
4.2   Summary statistics of log first difference daily exchange rate             21
4.3   In-sample performance of MUR/USD for data Jan 2003-Dec 2008                26
4.4   In-sample performance of MUR/EU for data Jan 2003-Dec 2008                 27
4.5   In-sample performance of MUR/GBP for data Jan 2002-Dec 2008                28
4.6   Out-of-sample performance of MUR/USD for data Jan 2003-Dec 2008            29
4.7   Out-of-sample performance of MUR/EU for data Jan 2003-Dec 2008             30
4.8   Out-of-sample performance of MUR/GBP for data Jan 2002-Dec 2008            31
4.9   First 10 forecast values of MUR/GBP for ANN, ARIMA and GARCH models        31
4.10  In-sample performance MUR/USD data                                         34
4.11  In-sample performance MUR/EU data                                          35
4.12  In-sample performance MUR/GBP data                                         36
4.13  Out-of-sample performance MUR/USD data                                     37
4.14  Out-of-sample performance of MUR/EU data                                   38
4.15  Out-of-sample performance of MUR/GBP data                                  39
4.16  In-sample and Out-of-sample forecasts of MUR/USD                           40
4.17  In-sample and Out-of-sample forecasts of MUR/EU                            40
4.18  In-sample and Out-of-sample forecasts of MUR/GBP                           40

Acknowledgements
I am deeply indebted to my supervisor, Professor Muddun Bhuruth, for his exceptional guidance and encouragement, which helped me to carry out this research and write this dissertation. It was a wonderful experience to work with such a mentor, and I am very grateful for all the criticism that made my work better each time. I am also thankful to my co-supervisor, Associate Professor Ravindra Boojhawon, who has helped and motivated me throughout the project.

I am very grateful to my parents for their love and for always being by my side in good and bad times, and special thanks to my dearest sister for her understanding and help. I would like to thank my friends Khush, Prish and Bho for always being there for me.

I also extend my gratitude to all those who helped me directly or indirectly to make this
work a success.

Finally, I thank GOD for his blessings from the bottom of my heart.

Abstract
We consider linear and nonlinear models for forecasting exchange rates of the Mauritian Rupee against the US Dollar, the Euro and the British Pound. The linear models considered are the ARIMA processes, and the nonlinear models considered are artificial neural networks (ANNs) and the GARCH model. Since no guidelines were available for choosing the parameters of the neural network, they were chosen through extensive experimentation. Two periods of analysis were carried out, the first from January 2002 to December 2008 and the second from January 2002 to December 2011, and in-sample and out-of-sample forecasts were produced. The reason for this choice is that we wanted to test the ability of our forecasting procedures during the financial crisis.
Using three forecast evaluation criteria, the RMSE, MAE and MAPE, we found that the ARMA-GARCH model performs slightly better than the ARIMA model. These two models give better performance than the ANN model for the in-sample forecasts of the first period data. However, the ANN was found to outperform the ARIMA and ARMA-GARCH models in the out-of-sample forecasting for both periods.


Chapter 1

Introduction
Foreign exchange rates are among the most important determinants of the economic health of a country. They describe the price of one currency in terms of another, and they play a vital role in the trading relationship between countries, which in turn affects the world economy. For this reason, they are the most watched, analysed and governmentally manipulated economic measures. Understanding the evolution of exchange rates is important for many essential issues in international economics and finance, such as international trade, capital flows, international portfolio management, currency option pricing, and foreign exchange risk management. The foreign exchange market has experienced many unexpected periods of growth and decline over the last few decades. The dynamics of the exchange market depend entirely on the exchange rates. Thus the appropriate prediction of exchange rates is a very crucial factor for the success of many businesses in the global market. The exchange market is itself well known for being extremely unpredictable and volatile. A volatile exchange market makes international trade and investment decisions more difficult, because volatility increases exchange rate risk, which may result in potential losses due to changes in the rates.
Forecasting any time series accurately is very difficult and, to make matters worse, exchange rate prediction is one of the most challenging applications of modern time series forecasting. Exchange rates are generally noisy, non-stationary and deterministically chaotic (Yaser & Atiya 1996), which suggests that there is no past behaviour information which can produce a relation between past and future behaviour. However, despite all the constraints mentioned, numerous techniques have been devised by researchers to forecast exchange rates, and the search for a reliable model to predict exchange rates is still ongoing.

1.1 Forecasting Models

Forecasting foreign exchange rates is a very important issue in the economic world. Over the past years, different models have been developed using both linear and nonlinear frameworks. A linear model is one which predicts the future values of exchange rates by identifying and exploiting the existing linear structure in the data. The most commonly used linear models for exchange rate forecasting are the Box-Jenkins Autoregressive Integrated Moving Average (ARIMA) models. One of the attractive features of the Box-Jenkins approach to forecasting is that there is a very rich class of possible models, and it is usually possible to find a process which provides an adequate description of the data. The ARIMA model is also a very powerful instrument for constructing accurate forecasts over short forecasting horizons. Since it is extremely popular, this model is used as a benchmark to evaluate new modelling approaches. ARIMA models are very effective techniques for forecasting when the dynamics of the time series are linear and stationary (Cao & Tay 2001). However, there is considerable evidence of nonlinearities in exchange rates. Thus approximation by ARIMA models may not be adequate, since they cannot capture nonlinear patterns in the exchange rate data.

Ever since the inadequacy of linear models was observed (Racine 2001), there has been considerable development in modelling nonlinearities. Thus nonlinear models such as the Autoregressive Conditional Heteroscedasticity (ARCH) model and the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model were developed. They have been found to be useful in capturing certain nonlinear features of financial time series, such as clusters of outliers. Bollerslev & Ghysels (1996) showed a successful application of the GARCH model in describing the dynamic behaviour of exchange rates. However, there exists no single theory that can be applied to all such nonlinear models, as they require specific assumptions concerning the precise form of the nonlinearity. Moreover, there exist too many possible nonlinear patterns in a particular data set, so a specific nonlinear model may not be sufficient to capture all of them.
In response to all the constraints of the linear and nonlinear models, artificial neural networks (ANNs) have been used to forecast exchange rates. ANNs resemble and operate in a way inspired by our biological neural system. Due to their unique non-parametric, assumption-free, noise-tolerant and adaptive properties (Haoffi & Han 2007), ANNs can deal better with non-stationary and volatile data. ANNs provide flexible nonlinear function mappings which can approximate any continuous measurable function with arbitrary desired accuracy. They have been found to have an upper hand over various traditional linear and nonlinear models in exchange rate forecasting. Many researchers, such as Wang & Leu (1996), Tang & Fishwick (1993) and T. Hill & Remus (1996), have shown that ANNs perform better than ARIMA models (linear models), especially for more irregular series and for multiple-period-ahead forecasting. Gencay (1999) and R. K. Bissoondeeal & Mootanah (2008) find that forecasts generated by neural networks are superior to those of ARIMA and GARCH models. Panda & Narasimhan (2007) have shown that the performance of ANNs is better than the linear autoregressive model and the random walk model for one-step-ahead prediction, thus suggesting that there always exists a possibility of forecasting exchange rates. However, other studies have reported inconsistent results: for example, W. R. Foster & Ungar (1992) have shown that ANNs are inferior to linear regression, and Meade (2002) finds no evidence that foreign exchange rate behaviour is better represented by ANNs than by the linear model.

1.2 Aims and Organization of the Project

This project studies neural network methods for modelling the Mauritian Rupee (MUR) against three major currencies: the US Dollar, the Euro and the British Pound. We focus on various neural network models and assess their performance for in-sample and out-of-sample forecasts, and the results are compared against forecasts produced by ARIMA and GARCH models.
The organization of the project is as follows. In Chapter 2, we discuss how an artificial neural network works and the learning algorithm used. Chapter 3 describes linear and non-linear time series models. In Chapter 4, we analyse our data and compare the forecasting accuracies of the different models.

Chapter 2

Artificial Neural Networks


2.1 Introduction

Artificial neural networks, which imitate the human brain's ability to classify patterns and make predictions based on past experience, have found applications in different areas such as financial forecasting, medical diagnostics, flight control and product inspection. Artificial neural networks have been widely used in applied forecasting due to their ability to model complex relationships between input and output variables, and also because of the presence of nonlinearities in many time series.

2.2 Feedforward Network

The feedforward neural network is the most commonly used network in applied work due to its capability of resolving a large number of problems. It consists of a considerable number of simple processing units known as neurons, which are organised in layers. A feedforward neural network begins with an input layer which is connected to a hidden layer. The hidden layer can then be connected to another hidden layer or directly to the output layer. In this architecture, data enter at the input layer and pass through the network layer by layer until they arrive at the output layer. The hidden layer is known to be a very important layer in the neural network, since it is responsible for approximating a continuous function and achieving the desired accuracy. Since a single hidden layer is less complex and is often enough to produce the desired result, researchers have preferred single-hidden-layer to two-hidden-layer models. There is a transfer function in both the hidden layer and the output layer which transforms the signals received into an output. In a feedforward neural network, each neuron in one layer has a directed connection to the neurons in the following layer. The communication links between the neurons have an associated numerical value known as the weight, which is a very important parameter in an ANN since it stores the knowledge of a specific problem. In a feedforward neural network two stages can be taken into consideration: firstly the running stage, where the input pattern is presented to the network and transmitted through all the layers of neurons until it reaches an output, and secondly the training or learning stage, where the free parameters such as the weights and the biases are iteratively modified in order to finally obtain an output which does not deviate too much from the user's desired output.

Figure 2.1: Three-Layer feedforward Network


The given network in Figure 2.1 has $n$ input neurons, $k$ neurons in the hidden layer and only one output $y$. The input vector $X = (x_1, x_2, \ldots, x_n)^T$ is connected to each hidden neuron through the weight matrix

$$ W = \begin{pmatrix} w_{1,1} & \cdots & w_{1,n} \\ \vdots & \ddots & \vdots \\ w_{k,1} & \cdots & w_{k,n} \end{pmatrix} \tag{2.1} $$

The hidden layer vector is $Z = (z_1, z_2, \ldots, z_k)^T$. Each hidden neuron receives the weighted sum of all the inputs,

$$ \text{net input}_j = \sum_{i=1}^{n} w_{i,j} x_i \tag{2.2} $$

to which a bias $\theta_j$ is added. The bias may be viewed as simply shifting the transfer function to the left by an amount $\theta_j$; it is much like a weight, except that it has a constant input of 1. The net input is the argument of the transfer function $f$, which is applied to construct the output of a specific neuron:

$$ z_j = f\!\left(\theta_j + \sum_{i=1}^{n} w_{i,j} x_i\right), \qquad j = 1, 2, \ldots, k \tag{2.3} $$

In the output layer, the output neuron receives the weighted sum of the processed signals obtained from the hidden layer neurons. Another transfer function $g$ is applied to produce the final output:

$$ y = g\!\left(\beta_0 + \sum_{j=1}^{k} \beta_j z_j\right) \tag{2.4} $$

where $g$ is the output-layer transfer function, $\beta_0$ is the bias unit and $\beta_j$ is the weight from hidden neuron $j$ to the output unit.
Replacing $z_j$ in equation (2.4), we get

$$ y = g\!\left(\beta_0 + \sum_{j=1}^{k} \beta_j \, f\!\left(\theta_j + \sum_{i=1}^{n} w_{i,j} x_i\right)\right) \tag{2.5} $$

In our feedforward network, a hyperbolic tangent sigmoid transfer function is used in the hidden layer and a linear transfer function is used in the output layer. This is because these two transfer functions have proved to be successful earlier.

The hyperbolic tangent sigmoid is given by $f(x) = \tanh(x)$ and the linear transfer function is $f(x) = x$. A node's transfer function serves the purpose of controlling the output signal strength for the node; the hyperbolic tangent sigmoid sets the output signal strength between $-1.0$ and $1.0$. It was also noticed that the hyperbolic tangent sigmoid function can accelerate learning for some models and also has an impact on predictive accuracy. The learning rule commonly used in this type of network is the backpropagation algorithm.

2.3 Backpropagation Learning Algorithm

The process of learning is implemented by modifying the weights iteratively until the desired response is achieved at the output node. Backpropagation is known to be the most
popular supervised learning algorithm. A supervised learning algorithm is one which accepts input values, computes the output values and compares it with the desired output
values, then adjusts the weights to minimize the deviance. This process is carried out until
the network cannot further reduce the error.

Figure 2.2: The neuron weight adjustment


Backpropagation learning updates the network weights and biases in the direction in which the performance function decreases the most. The gradient is computed and the weights are updated after each input in the incremental mode. Backpropagation starts at the output layer with the following equation:

$$ w_{ij} = w'_{ij} + l \, e_j \, x_i \tag{2.6} $$

where, for the $i$-th input of the $j$-th neuron, $w_{ij}$ is the updated weight, $w'_{ij}$ is the previous weight value, $l$ is the learning rate, $e_j$ is the error term and $x_i$ is the $i$-th input.
The backpropagation algorithm searches for the global minimum of the error function in the weight space using the method of gradient descent. In gradient descent, weights are changed in proportion to the negative of the error derivative with respect to each weight. In this specific algorithm the network follows the curvature of the error surface, with weight updates moving in the direction of steepest descent. However, there is a high probability that the network does not reach the global minimum, since it may get stuck in a local minimum which does not represent an optimal solution.
Momentum is a technique that can help the network out of local minima. It is an extension of the backpropagation algorithm which can be helpful in speeding up convergence and avoiding local minima. With momentum, if the weights are moving in a particular direction they tend to continue moving in the same direction; momentum also smooths the weight changes. The momentum factor determines the effect of past changes on the current change of the weights and also increases the speed of learning. The momentum factor commonly used has a value very close to 1. The ratio which influences the speed and quality of learning is called the learning rate. The learning rate plays a very important role in the learning process of a network, as it controls the size of the changes in the weights at each iteration. Thus the right choice of learning rate is extremely important, since too small or too large a change in the size of the weights will affect the result of the network. A learning rate between 0.05 and 0.5 was found to provide good results in many practical cases.
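The following Python fragment sketches the gradient-descent-with-momentum update on a toy quadratic error surface; the learning rate 0.25 and momentum factor 0.8 mirror the values adopted later in our experiments, but the error function itself is only an assumption made for illustration.

```python
import numpy as np

target = np.array([1.0, -2.0])

def grad(w):                            # gradient of E(w) = 0.5 * ||w - target||^2
    return w - target

w = np.zeros(2)
delta = np.zeros_like(w)                # previous weight change
lr, mu = 0.25, 0.8                      # learning rate and momentum factor
for _ in range(200):
    delta = mu * delta - lr * grad(w)   # new change keeps a fraction of the old one
    w = w + delta                       # weights keep moving in the same direction
print(w)                                # converges near the target [1, -2]
```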

2.4 Training set and Testing set

In neural network forecasting, we divide our data into a training set and a test set. The training set consists of the input data and the target data which need to be presented to the network. However, some transfer functions need the input and target data to be scaled so that they fall within a specified range. Thus, in order to meet this requirement, we pre-process our data by normalising the inputs and targets so that they fall in the interval $[-1, 1]$. When the input data are presented to the network, the latter makes a guess at the correct answer and compares it with the target data. The network goes through the data again and again, depending on the number of epochs used, adjusting the weight values so as to reach a value close to the target value. The training set is used to build up the model, whereas the test set, which is independent of the training set, is used to measure the performance of the model; more precisely, it is used to evaluate the out-of-sample performance. After the forecasts are obtained from the network, we need to convert the data back to the original scale by the process called post-processing. Moreover, we find from previous studies that there is no precise rule on the optimum sizes of the two data sets.
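A minimal Python sketch of this pre- and post-processing, assuming simple min-max scaling onto $[-1, 1]$ (the helper names scale and unscale are ours, not the dissertation's):

```python
import numpy as np

def scale(x):
    """Normalise a series onto [-1, 1] and keep the bounds for later."""
    lo, hi = x.min(), x.max()
    return 2.0 * (x - lo) / (hi - lo) - 1.0, (lo, hi)

def unscale(y, bounds):
    """Post-processing: map network outputs back to the original scale."""
    lo, hi = bounds
    return (y + 1.0) * (hi - lo) / 2.0 + lo

x = np.array([28.1, 29.4, 30.2, 31.0])      # illustrative exchange rates
y, bounds = scale(x)
assert np.allclose(unscale(y, bounds), x)   # the round trip recovers the data
```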

Chapter 3

Linear and Non-linear Time Series


In this chapter we consider widely used time series models for forecasting. We first briefly describe the Autoregressive Integrated Moving Average (ARIMA) model, which is a linear model, and we then consider the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) model as the non-linear model.

3.1 Stationary Processes

A stochastic process $\{X_t,\, t \in \mathbb{Z}\}$ is a family of random variables defined on a probability space. The joint cumulative distribution function of $X_t$ is defined as

$$ F_{t_1, t_2, \ldots, t_n}(x_1, x_2, \ldots, x_n) = P(X_{t_1} \le x_1,\, X_{t_2} \le x_2,\, \ldots,\, X_{t_n} \le x_n) \tag{3.1} $$

and the process $X_t$ is said to be strictly stationary if

$$ F_{t_1+s, t_2+s, \ldots, t_n+s}(x_1, x_2, \ldots, x_n) = F_{t_1, t_2, \ldots, t_n}(x_1, x_2, \ldots, x_n) \tag{3.2} $$

for all $n$-tuples $(x_1, x_2, \ldots, x_n)$, $(t_1, t_2, \ldots, t_n)$ and for any shift $s$. The mean function of $X_t$ is given by

$$ \mu_t = E(X_t) = \int_{\mathbb{R}} x \, dF_t(x) \tag{3.3} $$

and the autocovariance function is given by

$$ \gamma(t, \ell) = E[(X_t - \mu_t)(X_{t-\ell} - \mu_{t-\ell})] \tag{3.4} $$

The process is said to be weakly stationary if $\mu_t = \mu$ for all $t$ and $\gamma(t, \ell) = \gamma_\ell$. In this case we have $\gamma_\ell = \gamma_{-\ell}$.
A white noise process $\epsilon_t$ is a process such that $\mu = 0$, $\gamma_0 = \sigma^2$ and $\gamma_\ell = 0$ for $\ell \ne 0$. We denote such a process by $\epsilon_t \sim WN(0, \sigma^2)$.

3.1.1 Autoregressive process

The autoregressive process of order $p$ is denoted by AR($p$) and it is defined as follows:

$$ X_t = \sum_{r=1}^{p} \phi_r X_{t-r} + \epsilon_t \tag{3.5} $$

where $\epsilon_t \sim WN(0, \sigma^2)$; equivalently $\phi(B) X_t = \epsilon_t$, where $B$ is the backshift operator and

$$ \phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \tag{3.6} $$

The process is invertible since $\sum_{r=1}^{p} |\phi_r| < \infty$, and for the process to be stationary the roots of $\phi(B) = 0$ must lie outside the unit circle.
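As a quick illustration of this condition, the Python sketch below finds the roots of $\phi(z) = 1 - 0.5z - 0.3z^2$ for a hypothetical AR(2) and checks that they lie outside the unit circle:

```python
import numpy as np

phi = [0.5, 0.3]                     # illustrative AR(2) coefficients
# phi(z) = 1 - 0.5 z - 0.3 z^2; np.roots wants coefficients from highest degree down
roots = np.roots([-phi[1], -phi[0], 1.0])
print(roots)                         # approximately 1.17 and -2.84
print(np.all(np.abs(roots) > 1))     # True, so the process is stationary
```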

3.1.2 Moving Average Process

The moving average process of order $q$ is denoted by MA($q$) and is defined by

$$ X_t = \sum_{s=0}^{q} \theta_s \epsilon_{t-s} \tag{3.7} $$

or the process can also be written as $X_t = \theta(B)\epsilon_t$, where

$$ \theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q \tag{3.8} $$

Because MA processes consist of a finite sum of stationary white noise terms, they are stationary and have mean zero. The process is invertible when the roots of $\theta(B) = 0$ all exceed unity in absolute value.

3.1.3 Random Walk

In a random walk model, at each point of time the series moves randomly away from its current position. The model can then be written as

$$ X_t = X_{t-1} + \epsilon_t \tag{3.9} $$

We see that the random walk model has the same form as an AR(1) process but, since $\phi = 1$, it is not stationary.
Repeatedly substituting for past values gives

$$ X_t = X_0 + \sum_{j=0}^{t-1} \epsilon_{t-j} \tag{3.10} $$

We find that the first difference of the random walk is stationary, as it is just white noise:

$$ X_t - X_{t-1} = \epsilon_t \tag{3.11} $$

3.2 ARIMA (p, d, q) Processes

An ARIMA model is a combination of the autoregressive (AR), differencing and moving average (MA) processes.
If the original process $X_t$ is not stationary, we can look at the first order difference process

$$ Y_t = X_t - X_{t-1} \tag{3.12} $$

or the second order differences, and so on.
A general ARIMA model of order $(p, d, q)$ can be represented as follows:

$$ \phi(B)(1 - B)^d X_t = \theta(B)\epsilon_t \tag{3.13} $$

where $\phi(B)$ is the AR operator of order $p$, $\theta(B)$ is the MA operator of order $q$, $X_t$ is the observed value at time $t$, $\epsilon_t \sim WN(0, \sigma^2)$ and $d$ is the number of times the data series must be differenced to produce a stationary time series.
Fitting an ARIMA model to the raw data involves a four-step iterative cycle: model identification (the choice of $p$, $d$ and $q$), parameter estimation, diagnostic checking and forecasting.
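A compressed Python sketch of this cycle, using statsmodels on a simulated series (the 5% ADF significance level and the candidate order $(1, d, 1)$ are illustrative choices, not the models fitted in Chapter 4):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=500))          # a random walk, so one difference is needed

d = 0
while adfuller(np.diff(x, n=d))[1] > 0.05:   # difference until a unit root is rejected
    d += 1

fit = ARIMA(x, order=(1, d, 1)).fit()        # estimation stage for one tentative model
print(d, fit.aic)                            # AIC is compared across candidate models
```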

3.2.1 Autocorrelation and Partial Autocorrelation Functions

The autocorrelation function (ACF) can be used to detect non-randomness in data and also to identify an appropriate time series model if the data are not random.
Given the data $x_1, x_2, \ldots, x_N$ at times $t_1, t_2, \ldots, t_N$, the lag-$k$ autocorrelation function is defined as

$$ r_k = \frac{\sum_{i=1}^{N-k} (x_i - \bar{x})(x_{i+k} - \bar{x})}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{3.14} $$

It is assumed that the observations are equi-spaced, thus the time variable $t$ is not used in the formula for the autocorrelation. The correlation is between two values of the same variable at times $t_i$ and $t_{i+k}$. When the autocorrelation is used to identify an ARIMA model, the autocorrelations are plotted for many lags.
The partial autocorrelation function (PACF) was introduced to time series modelling so that one can easily determine the appropriate lag $p$ in an AR($p$) model, or in the extended ARIMA($p, d, q$) model, by just plotting the PACF.
The PACF is the conditional correlation

$$ \mathrm{Corr}(x_t, x_{t+k} \mid x_{t+1}, \ldots, x_{t+k-1}) \tag{3.15} $$

Given a time series, the partial autocorrelation of lag $k$ is the autocorrelation between $x_t$ and $x_{t+k}$ with the linear dependence of $x_{t+1}$ through to $x_{t+k-1}$ removed.
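For illustration, the Python sketch below computes the lag-$k$ sample autocorrelation directly from equation (3.14) and checks it against the statsmodels implementation on a simulated series:

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(2)
x = rng.normal(size=300)
xbar = x.mean()

def r(k):
    """Lag-k sample autocorrelation, equation (3.14)."""
    return np.sum((x[:-k] - xbar) * (x[k:] - xbar)) / np.sum((x - xbar) ** 2)

print(r(1), acf(x, nlags=1)[1])   # the two values agree
print(pacf(x, nlags=5)[1:])       # first five partial autocorrelations
```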

Process       ACF                                        PACF
AR(p)         Dies down exponentially or sinusoidally    Cuts off after lag p
MA(q)         Cuts off after lag q                       Dies down exponentially or sinusoidally
ARMA(p, q)    Dies down exponentially or sinusoidally    Dies down exponentially or sinusoidally

Table 3.1: Behaviour of ACF and PACF

3.3 GARCH Model

The GARCH model, which is a generalisation of the autoregressive conditional heteroskedasticity (ARCH) model, was introduced by Bollerslev (1986) and has been used by many researchers in modelling financial time series. It has been found that a wide range of financial data exhibits time-varying volatility clustering, which is the property that there are periods of high and low variance. In response to this, Engle (1982) suggested the ARCH model as an alternative to the usual time series processes.

3.3.1 ARCH Model

The autoregressive conditional heteroskedasticity (ARCH) model is the very first model of conditional heteroskedasticity. It is a forecasting model which forecasts the error variance at time $t$ on the basis of information known at time $t-1$, and this variance is expressed as a moving average of the past squared error terms. The following conditional variance defines an ARCH model of order $q$:

$$ \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 \tag{3.16} $$

where $\alpha_0 > 0$ and $\alpha_i \ge 0$, and the $\alpha_i$ must be estimated from the data.
The error term has the form

$$ \epsilon_t = \sigma_t z_t \tag{3.17} $$

where $z_t$ is a sequence of independent and identically distributed random variables with zero mean and unit variance.
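To see equations (3.16)-(3.17) at work, the following Python sketch simulates an ARCH(1) process with illustrative parameters $\alpha_0 = 0.1$ and $\alpha_1 = 0.6$; large shocks inflate the next period's variance, producing the volatility clustering discussed earlier:

```python
import numpy as np

rng = np.random.default_rng(3)
a0, a1, T = 0.1, 0.6, 1000                   # illustrative ARCH(1) parameters
eps = np.zeros(T)
for t in range(1, T):
    sigma2 = a0 + a1 * eps[t - 1] ** 2       # conditional variance, equation (3.16)
    eps[t] = np.sqrt(sigma2) * rng.normal()  # eps_t = sigma_t z_t, equation (3.17)
print(eps.std(), np.abs(eps).max())
```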

3.3.2 GARCH model

The GARCH model is an extended version of the ARCH model. In the GARCH($p, q$) model the variance is a linear function of its own lags and has the form

$$ \sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \cdots + \alpha_q \epsilon_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \cdots + \beta_p \sigma_{t-p}^2 \tag{3.18} $$

$$ \phantom{\sigma_t^2} = \alpha_0 + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 \tag{3.19} $$

The rate of decay of the ARCH model is considered to be too fast compared with the usual financial series, unless the value of $q$ in (3.16) is large. Thus the GARCH model is preferred, since it enables very complicated heteroscedasticity patterns to be modelled at low orders of $p$ and $q$. The most popular GARCH model in applications has been the GARCH(1, 1) model, that is, $p = q = 1$ in (3.18).
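As a sketch of how such a model is estimated in practice, the Python code below fits an AR(1)-GARCH(1,1) model to simulated returns with the arch package; the dissertation's own fits were produced with other software, so this only illustrates the model form:

```python
import numpy as np
from arch import arch_model

rng = np.random.default_rng(4)
r = 100 * 0.005 * rng.standard_t(df=5, size=1000)   # fat-tailed toy returns, in percent

# AR(1) mean equation with a GARCH(1,1) conditional variance
model = arch_model(r, mean="AR", lags=1, vol="GARCH", p=1, q=1)
res = model.fit(disp="off")
print(res.params)
```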


Chapter 4

Analysis of Exchange Rate Data


We study the daily rates of the MUR against the US Dollar (MUR/USD), the Euro (MUR/EU) and the British Pound (MUR/GBP). The data sets were obtained from the Bank of Mauritius for the period January 2003 to December 2011 for MUR/USD and MUR/EU, and for the period January 2002 to December 2011 for MUR/GBP.
For our analysis we choose two periods: the first period, January 2003 to December 2008 for MUR/USD and MUR/EU and January 2002 to December 2008 for MUR/GBP, represents the data up to the financial crisis; for the second period we take the full data sets obtained.
The daily returns are calculated as the log differences of the levels. Let $x_t$ be a given exchange rate time series; then the exchange rate return series $y_t$ is given by

$$ y_t = \ln\!\left(\frac{x_t}{x_{t-1}}\right) \tag{4.1} $$
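In Python, for example, the return series of equation (4.1) is a one-liner (the rate values below are illustrative):

```python
import numpy as np

x = np.array([28.10, 28.15, 28.05, 28.20])   # illustrative daily exchange rates
y = np.diff(np.log(x))                       # y_t = ln(x_t / x_{t-1})
print(y)
```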

4.1 Analysis of Period I

Figures 4.1, 4.2 and 4.3 show the daily exchange rate data and the corresponding returns of the MUR against the USD, EU and GBP:

Figure 4.1: Daily MUR/USD data and its corresponding first differences of the logs

Figure 4.2: Daily MUR/EU data and its corresponding first differences of the logs

Figure 4.3: Daily MUR/GBP data and its corresponding first differences of the logs
In each of the three daily log return series, we observe large random fluctuations. However, all three series appear to be stationary, which means that the random variation is constant over time. There is volatility clustering in each of the log return series, since we can see periods of high and low variation.

           Mean        Median      Max      Min       S.D      Skewness   Kurtosis
MUR/USD    6.46e-005   0.0000      0.0119   -0.0101   0.0018    1.0889    11.9486
MUR/EU     2.70e-004   1.32e-004   0.0311   -0.0279   0.0061    0.0698     5.1469
MUR/GBP    3.80e-005   0.0000      0.0462   -0.0416   0.0058   -0.1038     9.8907

Table 4.1: Summary statistics for the daily exchange rates: log first difference
The standard deviations indicate that the MUR/EU return data are more volatile than the MUR/USD and MUR/GBP return series. The skewness coefficients for MUR/USD and MUR/EU are positive, which indicates that the right tail is longer than the left; for MUR/GBP the bulk of the values lie to the right of the mean. Kurtosis measures whether the data are peaked or flat relative to a normal distribution. For all three return series the kurtosis is larger than that of the normal distribution (which is equal to 3), which indicates leptokurtosis. The leptokurtosis indicates that the series are clustered during certain periods and the volatility changes at a relatively low rate, that is, large changes tend to be followed by large changes and small changes by small changes.

Figure 4.4: Histogram and Kernel Density estimate of daily log return of MUR/USD

Figure 4.5: Histogram and Kernel Density estimate of daily log return of MUR/GBP

Figure 4.6: Histogram and Kernel Density estimate of daily log return of MUR/EU

The plots indicate that the normality assumption is questionable for all three daily log return series.

Figure 4.7: Scatter plot of $y_t$ against $y_{t-1}$ for the MUR/USD log return data.

Figure 4.8: Scatter plot of $y_t$ against $y_{t-1}$ for the MUR/EU log return data.

Figure 4.9: Scatter plot of $y_t$ against $y_{t-1}$ for the MUR/GBP log return data.

In the scatter plots shown above, the data at time $t$ are plotted against the values at time $t-1$. This is one way of showing the degree of correlation in the data.


4.1.1 Analysis of Period II data


           Mean        Median      Max      Min       S.D      Skewness   Kurtosis
MUR/USD    8.91e-006   0.0000      0.0122   -0.0101   0.0021    0.7574     7.871
MUR/EU     1.07e-004   1.04e-004   0.0311   -0.0279   0.0061    0.0843     5.044
MUR/GBP    2.02e-005   0.0000      0.0462   -0.0416   0.0060   -0.2152     8.379

Table 4.2: Summary statistics of log first difference daily exchange rate
We find that the data sets for the period January 2003 to December 2011 for MUR/USD and MUR/EU, and for the period January 2002 to December 2011 for MUR/GBP, produce the same characteristics as described above for the Period I data sets. We observe that the minimum and maximum values remain almost the same for the two periods, showing that in the Period I data sets the three series had already attained their maximum and minimum values.

Figure 4.10: Daily MUR/USD Jan 03 - Dec 11


Figure 4.11: Daily MUR/EU Jan 03 - Dec 11

Figure 4.12: Daily MUR/GBP Jan 02 - Dec 11

4.2 Forecasting Accuracy

For comparing forecasts produced by different models, the root mean square error, the mean absolute error and the mean absolute percentage error are calculated. Let

$$ e_t = Y_t - \hat{Y}_t $$

denote the forecast error, where $\hat{Y}_t$ is the forecasted value of the observed value $Y_t$.

1. Root Mean Square Error

$$ \text{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} e_t^2} $$

2. Mean Absolute Error

$$ \text{MAE} = \frac{1}{n} \sum_{t=1}^{n} |e_t| $$

3. Mean Absolute Percentage Error

$$ \text{MAPE} = \left(\frac{1}{n} \sum_{t=1}^{n} \left|\frac{e_t}{Y_t}\right|\right) \times 100 $$

The MSE, which is the most common measure of forecasting accuracy, indicates the degree of spread; however, large errors are given additional weight. The RMSE is considered since the forecast error is then expressed in the same units as the actual and forecast values themselves. It is found to be most informative for errors with a near-normal distribution. The mean absolute error is a very popular measure of forecast error since it compares forecasts with their eventual outcomes. However, previous research has emphasized that the MAPE is the most useful measure for comparing the accuracy of forecasts, since it measures relative performance.
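The three criteria translate directly into code; the following Python functions implement them as written above (the sample values are illustrative):

```python
import numpy as np

def rmse(Y, Yhat):
    return np.sqrt(np.mean((Y - Yhat) ** 2))

def mae(Y, Yhat):
    return np.mean(np.abs(Y - Yhat))

def mape(Y, Yhat):
    return 100.0 * np.mean(np.abs((Y - Yhat) / Y))

Y = np.array([57.1, 57.2, 57.3])       # actual values
Yhat = np.array([57.0, 57.3, 57.2])    # forecasts
print(rmse(Y, Yhat), mae(Y, Yhat), mape(Y, Yhat))
```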

4.3 Fitting Neural Network Model to the Exchange Rates Data

The choice of an appropriate architecture is in itself a very difficult task and, in addition, there are a large number of parameters which must be estimated.
For our neural network model we used the feedforward neural network with a single hidden layer. The number of neurons in the hidden layer was varied between 1 and 5. The activation function in the hidden layer neurons is the tan-sigmoid (tansig) and that in the output layer is the linear function (purelin). To set the learning rate we ran the network with a large number of different learning rates between 0.05 and 0.5 before settling on 0.25, which gave us the best results. The momentum value was analysed and varied between 0.1 and 0.9, and we found that 0.4 and 0.8 gave us better results; however, we use 0.8 during the experiments since it is common to choose the momentum value close to 1. The training algorithm traingdm was used since it provides faster convergence in our feedforward network. The number of epochs used while training was between 30000 and 50000, depending on the performance graph which shows us the point where the network was sufficiently trained.
After having estimated these parameters, we focus our case study on the following issues: firstly the number of input variables, and secondly the number of hidden neurons in the hidden layer. All the computations regarding the neural network were carried out in Matlab.
The number of input variables is based on the number of lagged past observations. The first input set consists of training the data using only the first previous value, Lag(1); that is, $Y_t$ is the target value and we use $Y_{t-1}$ as the input. When we have 2 input variables we use Lag(1-2), that is, we use $Y_{t-1}$ and $Y_{t-2}$ as inputs and $Y_t$ as target. The experiment is carried out using Lag(1) to Lag(1-12), as sketched below.
Using the daily data of MUR/USD, MUR/EU and MUR/GBP, we divide the data into a training set (in-sample) and a testing set (out-of-sample). The test set for all the data sets consists of the values of the whole year 2008, which we shall try to forecast. For each input variable set we only experiment with 1-5 hidden nodes.
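The lagged input/target sets can be sketched as follows (in Python, although our networks were built in Matlab); the helper make_lagged is a hypothetical name of ours:

```python
import numpy as np

def make_lagged(y, p):
    """Return inputs [Y_{t-1}, ..., Y_{t-p}] and targets Y_t for a Lag(1-p) set."""
    X = np.column_stack([y[p - j - 1 : len(y) - j - 1] for j in range(p)])
    return X, y[p:]

y = np.arange(10.0)                # stand-in for a daily exchange rate series
X, target = make_lagged(y, 3)      # Lag(1-3): three input variables
print(X[0], target[0])             # [2. 1. 0.] -> 3.0
```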

4.4 Fitting ARIMA Model to the Foreign Exchange Rates Data

ARIMA modelling consists of three stages, which are the identification stage, the estimation and diagnostic checking stage, and the forecasting stage.
During the identification stage, we take our data, convert it into a time series and find the ACF and PACF. Stationarity tests can be performed to determine whether differencing is needed. The analysis of the ACF and PACF graphs usually suggests one or more tentative models that can be fitted to our data.
In the estimation and diagnostic checking stage, the diagnostic statistics help us to judge the adequacy of the models found in the identification stage. The goodness-of-fit statistics aid in comparing the model to others. The model with the lowest Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) is retained.
After inspecting the ACF and PACF to identify the best model for ARIMA forecasting, different values of p, d and q are fitted and compared; if any of these models proves to be better, we then use that model for the forecasting stage. Moreover, there exists an auto.arima function in R which can help us to find a fitted model for the data set.
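A Python counterpart of that R function is auto_arima from the pmdarima package, sketched below on a simulated series; it searches over values of (p, d, q) and keeps the model with the best information criterion:

```python
import numpy as np
from pmdarima import auto_arima

rng = np.random.default_rng(5)
x = np.cumsum(rng.normal(size=400))   # simulated non-stationary series

model = auto_arima(x, information_criterion="aic", seasonal=False)
print(model.order)                    # the selected (p, d, q)
```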

4.5 Fitting the GARCH model

The first step in fitting a GARCH model is to select the parameters of the specific model; however, we first have to check for the ARCH effect in the exchange rate data. From the log return plots of the MUR/USD, MUR/EU and MUR/GBP we find that there are heavy fluctuations in the data along with the presence of volatility clustering, which gives us a hint that the data may not be independently and identically distributed (iid). The ACF and PACF of the log returns, the absolute log returns and the squared log returns are taken, and if they show significant autocorrelation then the data are not iid and thus we can say that the ARCH effect is present.
The ARMA-GARCH model is fitted to the data and the parameters of the model equations are estimated with the help of the ACF, PACF and EACF plots; the parameter values are then varied and different models are fitted to the in-sample data. Both the in-sample and out-of-sample forecast performances are produced.
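One standard way to test for the ARCH effect described above is a Ljung-Box test on the squared returns, sketched below in Python with statsmodels; the two-regime series is simulated purely so that the squares are autocorrelated:

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(6)
r = rng.normal(size=500) * np.repeat([0.5, 2.0], 250)   # two volatility regimes

# A small p-value means the squared returns are autocorrelated,
# i.e. the data are not iid and an ARCH effect is present.
print(acorr_ljungbox(r ** 2, lags=[10]))
```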


4.6 Empirical findings

4.6.1 Forecasts performance of Period I

Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(3,1,0)                                      0.0273   0.0172   0.0586
ARIMA(2,1,3)                                      0.0270   0.0172   0.0587
AR(1)-GARCH(1,1)                                  0.0271   0.0172   0.0587
1                  50000                          0.0384   0.0249   0.1000
1-2                50000                          0.0321   0.0219   0.0744
1-3                50000                          0.0308   0.0194   0.0658
1-4                50000                          0.0303   0.0199   0.0673
1-5                50000                          0.0322   0.0207   0.0706
1-6                50000                          0.0327   0.0216   0.0735
1-7                50000                          0.0322   0.0210   0.0712
1-8                50000                          0.0338   0.0224   0.0760
1-9                50000                          0.0310   0.0197   0.0672
1-10               50000                          0.0306   0.0197   0.0670
1-11               50000                          0.0310   0.0203   0.0689
1-12               50000                          0.0354   0.0264   0.0894

Table 4.3: In-sample performance of MUR/USD for data Jan 2003-Dec 2008


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                      0.1993   0.1553   0.4219
ARIMA(0,1,0)                                      0.1993   0.1552   0.4214
AR(1)-GARCH(1,2)                                  0.1990   0.1551   0.4213
1                  40000                          0.2026   0.1595   0.4336
1-2                50000                          0.2007   0.1569   0.4260
1-3                50000                          0.2008   0.1579   0.4282
1-4                50000                          0.2007   0.1573   0.4273
1-5                50000                          0.1996   0.1558   0.4230
1-6                50000                          0.1995   0.1559   0.4233
1-7                50000                          0.2010   0.1574   0.4269
1-8                40000                          0.2029   0.1599   0.4343
1-9                50000                          0.2020   0.1578   0.4278
1-10               50000                          0.1996   0.1563   0.4235
1-11               50000                          0.2006   0.1563   0.4320
1-12               50000                          0.2001   0.1564   0.4243

Table 4.4: In-sample performance of MUR/EU for data Jan 2003-Dec 2008


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                      0.2710   0.2032   0.3828
ARIMA(0,1,0)                                      0.2709   0.2030   0.3823
AR(1)-GARCH(1,1)                                  0.2700   0.2026   0.3943
1                  40000                          0.2776   0.2097   0.3960
1-2                50000                          0.2733   0.2050   0.3868
1-3                40000                          0.2732   0.2052   0.3872
1-4                50000                          0.2739   0.2052   0.3872
1-5                40000                          0.2773   0.2096   0.3957
1-6                50000                          0.2750   0.2067   0.3888
1-7                50000                          0.2716   0.2057   0.3879
1-8                50000                          0.2710   0.2041   0.3848
1-9                50000                          0.2752   0.2095   0.3945
1-10               50000                          0.2736   0.2058   0.3865
1-11               50000                          0.2725   0.2056   0.3865
1-12               50000                          0.2718   0.2057   0.3874

Table 4.5: In-sample performance of MUR/GBP for data Jan 2002-Dec 2008


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(3,1,0)                                      2.011    1.588    5.331
ARIMA(2,1,3)                                      2.331    1.615    5.250
AR(1)-GARCH(1,1)                                  2.161    1.562    5.131
1                  50000                          0.0979   0.0682   0.2338
1-2                50000                          0.0704   0.0493   0.1687
1-3                30000                          0.0754   0.0548   0.1871
1-4                50000                          0.0760   0.0537   0.1835
1-5                30000                          0.0746   0.0554   0.1896
1-6                30000                          0.0864   0.0639   0.2188
1-7                30000                          0.0843   0.0621   0.2121
1-8                50000                          0.0786   0.0576   0.1970
1-9                30000                          0.0791   0.0589   0.2014
1-10               50000                          0.0815   0.0570   0.1942
1-11               50000                          0.0864   0.0638   0.2178
1-12               50000                          0.0784   0.0574   0.1964

Table 4.6: Out-of-sample performance of MUR/USD for data Jan 2003-Dec 2008


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                      1.271    0.8503   1.955
ARIMA(0,1,0)                                      1.119    0.8037   1.868
AR(1)-GARCH(1,2)                                  1.265    0.8439   1.940
1                  30000                          0.3646   0.2642   0.6196
1-2                50000                          0.3640   0.2654   0.6229
1-3                30000                          0.3647   0.2643   0.6200
1-4                30000                          0.3637   0.2636   0.6182
1-5                50000                          0.3634   0.2632   0.6174
1-6                30000                          0.3642   0.2642   0.6197
1-7                30000                          0.3650   0.2649   0.6219
1-8                50000                          0.3631   0.2646   0.6213
1-9                30000                          0.3650   0.2653   0.6224
1-10               50000                          0.3638   0.2658   0.6237
1-11               30000                          0.3638   0.2657   0.6236
1-12               30000                          0.3631   0.2655   0.6230

Table 4.7: Out-of-sample performance of MUR/EU for data Jan 2003-Dec 2008


Lags                   No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                          4.429    3.755    7.244
ARIMA(0,1,0)                                          4.537    3.853    7.431
ARMA(1,1)-GARCH(1,1)                                  4.137    3.477    6.708
1                      30000                          0.4796   0.3431   0.6497
1-2                    30000                          0.4772   0.3410   0.6456
1-3                    50000                          0.4772   0.3405   0.6447
1-4                    30000                          0.4794   0.3402   0.6443
1-5                    50000                          0.4776   0.3438   0.6498
1-6                    30000                          0.4845   0.3433   0.6501
1-7                    50000                          0.4838   0.3477   0.6581
1-8                    50000                          0.4825   0.3450   0.6530
1-9                    30000                          0.4841   0.3431   0.6498
1-10                   30000                          0.4848   0.3435   0.6506
1-11                   50000                          0.4807   0.3430   0.6495
1-12                   30000                          0.4850   0.3439   0.6512

Table 4.8: Out-of-sample performance of MUR/GBP for data Jan 2002-Dec 2008

                            ANN                    ARIMA                  GARCH
Year/daily   Actual values  Forecast    Error      Forecast    Error      Forecast    Error
2008/1       57.1351        57.4654    -0.3303     57.2970    -0.1619     57.3049    -0.1698
2008/2       57.1221        57.2047    -0.0826     57.2961    -0.174      57.3041    -0.182
2008/3       57.1823        57.1134     0.0689     57.2953    -0.113      57.3032    -0.1209
2008/4       57.3500        57.1434     0.2066     57.2944     0.0556     57.3024     0.0476
2008/5       56.9235        57.253     -0.3295     57.2936    -0.3701     57.3106    -0.3871
2008/6       57.2858        56.8477     0.4381     57.2928    -0.007      57.3007    -0.0149
2008/7       57.2978        57.2938     0.004      57.2919     0.0059     57.2999    -0.0021
2008/8       57.2383        57.1821     0.0562     57.2911    -0.0528     57.2991    -0.0608
2008/9       57.3006        57.2182     0.0824     57.2903     0.0103     57.2982     0.0024
2008/10      57.4102        57.235      0.1752     57.2894     0.1208     57.2974     0.1128

Table 4.9: First 10 forecast values of MUR/GBP for ANN, ARIMA and GARCH models


4.6.2 Results and Discussions for Period I

Tables 4.3 to 4.8 show the minimum error obtained when we vary the number of neurons (between 1 and 5) in the hidden layer for each of Lag(1) to Lag(1-12). We observe from our data that as the number of neurons in the hidden layer increases, the number of epochs needed to train the network must also increase, so as to reach a point where the forecast values do not change anymore. Thus increasing the number of epochs when the network has already reached the global minimum is useless and very time consuming. We reach the conclusion that the number of epochs used depends on the number of neurons in the hidden layer. Moreover, we also notice that as the number of neurons increases the performance of the network changes. We tend to achieve the best forecasts when we have 2 or 3 neurons in the hidden layer, and as the number of neurons increases beyond 5 the forecast values tend to move away from the target values, since large increases in the RMSE, MAE and MAPE values are observed.
In the tables above the in-sample performance is presented, followed by the out-of-sample performance. Here the in-sample data set for MUR/USD and MUR/EU is taken from January 2003 to December 2007, and that for MUR/GBP from January 2002 to December 2007. The out-of-sample data set in all three cases covers January 2008 to December 2008.
As for the number of inputs presented to the network, we cannot find any trend in how it affects the performance. In our experiments, for both the in-sample and out-of-sample forecasts, we observe that each data set has a different number of inputs for which the neural network works best.
Considering the in-sample forecasts, we find that for MUR/USD the ARIMA(2, 1, 3) model outperforms the ANN models; however, the AR(1)-GARCH(1, 1) model gives almost the same forecast accuracy. As for the MUR/EU and MUR/GBP data, the GARCH models perform much better than the other models. Thus we can conclude that for in-sample forecasts the GARCH model produces better forecasts than the ARIMA and ANN models.
For out-of-sample forecasts, the GARCH models used for both MUR/USD and MUR/GBP give superior results to the ARIMA models when the RMSE, MAE and MAPE are considered as evaluation criteria. However, when compared with the ANN model, their forecast performance is found to be far behind.
Table 4.9 shows the first 10 out-of-sample forecasts of MUR/GBP for the ANN, ARIMA and GARCH models. We notice that the three models perform very accurately for the first 10 forecasts; since the MAPE values for the year-2008 forecasts of the ARIMA and GARCH models are much worse than that of the ANN model, we can say that as the ANN continues to predict more values it outperforms the other two models.
Figures 4.13, 4.14 and 4.15 show the in-sample and out-of-sample forecasts produced by the ANN models:

Figure 4.13: In-sample and out-of-sample forecast for MUR/USD

Figure 4.14: In-sample and out-of-sample forecast for MUR/EU


Figure 4.15: In-sample and out-of-sample forecast for MUR/GBP

4.6.3 Forecasts performance of Period II

Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(5,1,0)                                      0.0489   0.0313   0.1025
ARIMA(2,1,3)                                      0.0489   0.0314   0.1026
AR(1)-GARCH(1,1)                                  0.0499   0.0314   0.1031
1                  50000                          0.0671   0.0454   0.1505
1-2                50000                          0.0547   0.0368   0.1214
1-3                50000                          0.0551   0.0364   0.1200
1-4                50000                          0.0537   0.0363   0.1198
1-5                50000                          0.0544   0.0374   0.1236
1-6                40000                          0.0562   0.0396   0.1312
1-7                50000                          0.0536   0.0363   0.1200
1-8                50000                          0.0534   0.0359   0.1186
1-9                40000                          0.0561   0.0392   0.1299
1-10               50000                          0.0544   0.0362   0.1193
1-11               50000                          0.0581   0.0389   0.1284
1-12               50000                          0.0536   0.0357   0.1177

Table 4.10: In-sample performance MUR/USD data


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                      0.2445   0.1806   0.4571
ARIMA(0,1,0)                                      0.2445   0.1805   0.4568
AR(1)-GARCH(1,1)                                  0.2245   0.1806   0.4571
1                  50000                          0.2442   0.1817   0.4587
1-2                50000                          0.2445   0.1818   0.4591
1-3                50000                          0.2453   0.1833   0.4629
1-4                50000                          0.2461   0.1841   0.4646
1-5                50000                          0.2436   0.1807   0.4560
1-6                50000                          0.2448   0.1815   0.4581
1-7                50000                          0.2437   0.1813   0.4575
1-8                50000                          0.2431   0.1805   0.4553
1-9                50000                          0.2445   0.1820   0.4589
1-10               50000                          0.2439   0.1811   0.4565
1-11               50000                          0.2442   0.1809   0.4560
1-12               50000                          0.2452   0.1826   0.4611

Table 4.11: In-sample performance MUR/EU data


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                      0.3181   0.2318   0.4443
ARIMA(0,1,0)                                      0.3181   0.2318   0.4441
AR(1)-GARCH(1,1)                                  0.2704   0.2026   0.3942
1                  50000                          0.3120   0.2270   0.4383
1-2                50000                          0.3118   0.2273   0.4358
1-3                30000                          0.3098   0.2262   0.4263
1-4                50000                          0.3113   0.2284   0.4405
1-5                50000                          0.3088   0.2260   0.4363
1-6                50000                          0.3115   0.2287   0.4410
1-7                50000                          0.3110   0.2275   0.4391
1-8                50000                          0.3092   0.2251   0.4392
1-9                50000                          0.3143   0.2316   0.4477
1-10               50000                          0.3113   0.2278   0.4396
1-11               50000                          0.3105   0.2278   0.4398
1-12               50000                          0.3098   0.2271   0.4381

Table 4.12: In-sample performance MUR/GBP data


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(5,1,0)                                      1.234    1.220    4.082
ARIMA(2,1,3)                                      1.233    1.220    4.080
AR(1)-GARCH(1,2)                                  1.233    1.221    4.082
1                  30000                          0.0501   0.0356   0.1189
1-2                30000                          0.0539   0.0422   0.1441
1-3                30000                          0.0502   0.0404   0.1349
1-4                30000                          0.0494   0.0394   0.1317
1-5                50000                          0.0495   0.0386   0.1291
1-6                30000                          0.0501   0.0387   0.1295
1-7                50000                          0.0512   0.0410   0.1371
1-8                30000                          0.0505   0.0396   0.1324
1-9                50000                          0.0512   0.0406   0.1357
1-10               30000                          0.0472   0.0368   0.1229
1-11               30000                          0.0518   0.0405   0.1352
1-12               30000                          0.0222   0.0166   0.0553

Table 4.13: Out-of-sample performance MUR/USD data


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                      1.772    1.735    4.408
ARIMA(0,1,0)                                      1.780    1.742    4.426
AR(1)-GARCH(1,1)                                  1.776    1.739    4.419
1                  30000                          0.1427   0.1099   0.2783
1-2                30000                          0.1424   0.1103   0.2791
1-3                50000                          0.1381   0.1047   0.2645
1-4                30000                          0.1424   0.1102   0.2790
1-5                30000                          0.1423   0.1107   0.2802
1-6                30000                          0.1437   0.1122   0.2841
1-7                50000                          0.1433   0.1120   0.2835
1-8                30000                          0.1452   0.1147   0.2904
1-9                50000                          0.1590   0.1237   0.3133
1-10               50000                          0.1470   0.1105   0.2798
1-11               30000                          0.1410   0.1107   0.2803
1-11               50000                          0.1394   0.1170   0.2961
1-12               30000                          0.1389   0.1079   0.2730

Table 4.14: Out-of-sample performance of MUR/EU data


Lags               No of epochs   No of neurons   RMSE     MAE      MAPE
ARIMA(1,0,0)                                      1.993    1.981    4.249
ARIMA(0,1,0)                                      1.934    1.921    4.119
AR(1)-GARCH(1,2)                                  1.996    1.985    4.257
1                  30000                          0.2396   0.1807   0.3694
1-2                30000                          0.2440   0.1831   0.3745
1-3                50000                          0.1985   0.1624   0.3332
1-4                30000                          0.2048   0.1648   0.3381
1-5                50000                          0.1990   0.1463   0.3003
1-6                30000                          0.1998   0.1470   0.3007
1-7                50000                          0.1994   0.1506   0.3089
1-8                50000                          0.2025   0.1593   0.3270
1-9                50000                          0.1955   0.1535   0.3149
1-10               30000                          0.2141   0.1689   0.3466
1-11               50000                          0.2142   0.1711   0.3511
1-12               50000                          0.2029   0.1506   0.3087

Table 4.15: Out-of-sample performance of MUR/GBP data

4.6.4 Results and Discussions for Period II

In this experiment the in-sample set for MUR/USD and MUR/EU consists of the values from January 2003 to November 2011, whereas the in-sample set for MUR/GBP consists of the values from January 2002 to November 2011. The out-of-sample set for each of the three series consists of the values of December 2011.
Considering the in-sample forecasts of the MUR/USD data, we find that the ARIMA models outperform all the other models. As for the MUR/EU data, the ANN model using Lag(1-8) produces superior results, since its RMSE, MAE and MAPE values are small compared to the random walk, ARIMA and GARCH models. For the MUR/GBP forecasts the GARCH model used produces better forecasts.
For the out-of-sample forecasts we find that the ANN models outperform the ARIMA, GARCH and random walk models for all three series.


                           Period I                         Period II
                  ANN       ARIMA     GARCH       ANN       ARIMA     GARCH
In-sample
  RMSE            0.0308    0.0273    0.0271      0.0536    0.0489    0.0499
  MAE             0.0194    0.0172    0.0172      0.0357    0.0313    0.0314
  MAPE            0.0658    0.0586    0.0587      0.1177    0.1025    0.1031
Out-of-sample
  RMSE            0.0704    2.011     2.161       0.0222    1.233     1.233
  MAE             0.0493    1.588     1.562       0.0166    1.220     1.221
  MAPE            0.1687    5.331     5.131       0.0553    4.080     4.082

Table 4.16: In-sample and Out-of-sample forecasts of MUR/USD

                           Period I                         Period II
                  ANN       ARIMA     GARCH       ANN       ARIMA     GARCH
In-sample
  RMSE            0.1996    0.1993    0.1990      0.2436    0.2445    0.2245
  MAE             0.1563    0.1553    0.1551      0.1807    0.1806    0.1806
  MAPE            0.4235    0.4219    0.4213      0.4560    0.4571    0.4571
Out-of-sample
  RMSE            0.3634    1.271     1.265       0.1389    1.772     1.776
  MAE             0.2632    0.8503    0.8439      0.1079    1.735     1.739
  MAPE            0.6174    1.955     1.940       0.2730    4.408     4.419

Table 4.17: In-sample and Out-of-sample forecasts of MUR/EU

                           Period I                         Period II
                  ANN       ARIMA     GARCH       ANN       ARIMA     GARCH
In-sample
  RMSE            0.2710    0.2710    0.2700      0.3098    0.3181    0.2704
  MAE             0.2041    0.2032    0.2026      0.2262    0.2318    0.2026
  MAPE            0.3848    0.3828    0.3843      0.4263    0.4443    0.3942
Out-of-sample
  RMSE            0.4794    4.429     4.137       0.1990    1.993     1.996
  MAE             0.3402    3.755     3.477       0.1463    1.981     1.985
  MAPE            0.6443    7.244     6.708       0.3003    4.249     4.257

Table 4.18: In-sample and Out-of-sample forecasts of MUR/GBP


Chapter 5

Conclusion
In this project, linear and nonlinear models were used to forecast the exchange rates of the Mauritian Rupee against three foreign currencies: the US Dollar, the Euro and the British Pound. The accuracy of the forecasting models was assessed for two periods of data, the first from January 2002 to December 2008 and the second from January 2002 to December 2011. The reason for this choice is that we wanted to test the ability of our forecasting procedures to provide accurate out-of-sample forecasts during the financial crisis. The artificial neural network (ANN), which is a nonlinear model, was used as an alternative to the linear ARIMA processes and the nonlinear GARCH models. The empirical results show that the ANN models have a superior out-of-sample forecasting performance for both periods of data when compared to the other models. For the in-sample forecasts we observed that the ARIMA and ARMA-GARCH models provided better goodness-of-fit than the ANN models.
One of the reasons behind this is that there are still no well-defined guidelines for building an ANN model to solve a specific problem. Thus, to obtain the best possible ANN forecasting model, rigorous experiments were carried out to determine the different parameters needed to build the model. Considering the out-of-sample performance of the ANN models, we can say that they proved to be successful when fitted to the foreign exchange rate data, provided that extreme care is taken in designing the network. Thus we can conclude that the ANN model can be used as a complementary tool to different time-series models, and the forecast accuracies can be further improved.


Bibliography
Bollerslev, T. (1986), 'Generalized autoregressive conditional heteroscedasticity', Journal of Econometrics 31, 307-327.

Bollerslev, T. & Ghysels, E. (1996), 'Periodic autoregressive conditional heteroscedasticity', Journal of Business & Economic Statistics 14, 139-151.

Cao, L. & Tay, F. (2001), 'Financial forecasting using support vector machines', Neural Computing & Applications 10, 184-192.

Engle, R. F. (1982), 'Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation', Econometrica 50, 987-1007.

Gencay, R. (1999), 'Linear, non-linear and essential foreign exchange rate prediction with simple technical trading rules', Journal of International Economics 47, 91-107.

Haoffi, Z., X. G., Y. F. & Han, Y. (2007), 'A neural network model based on the multistage optimization approach for short term food price forecasting in China', Expert Systems with Applications 33, 347-356.

Meade, N. (2002), 'A comparison of the accuracy of short term foreign exchange forecasting methods', International Journal of Forecasting 18, 67-83.

Panda, C. & Narasimhan, V. (2007), 'Forecasting exchange rate better with artificial neural network', Journal of Policy Modeling 29, 227-236.

R. K. Bissoondeeal, J. M. Binner, M. B. A. G. & Mootanah, V. P. (2008), 'Forecasting exchange rates with linear and nonlinear models', Global Business and Economics Review 10, 414-429.

Racine, J. (2001), 'On the nonlinear predictability of stock returns using financial and economic variables', Journal of Business & Economic Statistics 19, 380-382.

T. Hill, M. O. & Remus, W. (1996), 'Neural network models for time series forecasts', Management Science 42, 1082-1092.

Tang, Z. & Fishwick, P. A. (1993), 'Backpropagation neural nets as models for time series forecasting', ORSA Journal on Computing 5, 374-385.

W. R. Foster, F. C. & Ungar, L. H. (1992), 'Neural network forecasting of short, noisy time series', Computers and Chemical Engineering 16, 293-297.

Wang, J. H. & Leu, J. Y. (1996), 'Stock market trend prediction using ARIMA-based neural networks', Proc. of IEEE Int. Conf. on Neural Networks 4, 2160-2165.

Yaser, S. & Atiya, A. (1996), 'Introduction to financial forecasting', Applied Intelligence 6, 205-213.
