
A New Approach to Time Series Forecasting using Simulated Annealing Algorithms

Mai Thai Son¹, Nguyen Luong Anh Tuan¹, Nguyen Tran Cao Tan Khoa², Le Quang Loc³, Lu Nhat Vinh⁴

¹ Department of Information Technology, Ho Chi Minh City University of Transport
² Department of Information Technology, Ho Chi Minh City University of Industry
³ Faculty of Computer Science & Engineering, Ho Chi Minh City University of Technology
⁴ Department of Information Technology, Thu Dau Mot University, Binh Duong

mtson@hcmutrans.edu.vn, sondung1671@yahoo.com

Abstract. Time series forecasting is an important problem that has recently attracted considerable research effort, owing to its wide variety of real-life applications in finance, power generation, medicine, water resources and environmental science. One of the most popular techniques used in prediction is the genetic algorithm (GA). However, to our knowledge, similar novel optimization techniques such as simulated annealing algorithms (SAs), which are much simpler, more efficient and more suitable than GAs in many cases, have received little attention. In this work, we combine simulated annealing and genetic algorithms in a two-level learning algorithm that finds models and parameters for the autoregressive moving average (ARMA) model, one of the most common time series forecasting models. Our experiments on diverse datasets, compared against conventional forecasting methods such as ARIMA, ES and Meta-GAs, have shown very encouraging results, especially on trended and nonlinear time series.

Keywords: Simulated Annealing, Time Series Forecasting, Model Selection.

1 Introduction
A time series forecasting system predicts future values of time series variables by examining values collected in the past. The importance of time series forecasting and analysis in science and business has grown rapidly and continues to attract effort from engineers and scientists. In some domains, such as the process and production industries, forecasting future values is particularly important for production monitoring and optimal process control, where the predicted time series values help in deciding the subsequent control action to be taken.
Dozens of methods have been proposed for forecasting future values of time series, from model-based methods such as the autoregressive integrated moving average (ARIMA) to model-free techniques such as Exponential Smoothing (ES) [5]. Recent work on artificial neural networks (ANNs) suggests that they can be a promising method for forecasting time series [8], [9]. Other new methods include fuzzy logic, genetic programming, support vector machines and wavelet networks [5].
However, despite its age, the autoregressive moving average (ARMA) model is still one of the most popular models for predicting time series. Kizilkaya et al. [3] investigated the relation between the parameters of an ARMA model and its equivalent moving average (EMA) model, and proposed a method for determining the ARMA model parameters from the coefficients of a finite-order EMA model. Valenzuela et al. [8] developed a hybrid ARMA-ANN model that increases prediction accuracy compared with using each model separately.
Genetic algorithms (GAs) are optimization techniques based on the natural evolution process [4]. Motivated by their ability to find globally optimal solutions over the feature space, GAs are often used in forecasting to learn parameters for ARMA models [2] or parameters and configurations for ANNs [9]. Simulated annealing algorithms (SAs) [1] are similar techniques, based on the manner in which liquids freeze or metals recrystallize during annealing. However, they have seen little use in forecasting: to our knowledge, SAs have never been applied to time series forecasting, although they are widely used in many other domains such as graph coloring and timetabling.
Cortez et al. [2] proposed a meta-genetic algorithm (Meta-GAs) to learn models and parameters for ARMA models. It is a two-level algorithm: the meta-level (high level) uses a genetic algorithm with binary encoding to represent ARMA models, and the low level uses a genetic algorithm with real-value encoding to estimate the ARMA parameters for each model produced by the meta-level. Meta-GAs is a very interesting algorithm. However, the use of GAs at the meta-level is inefficient due to the complexity of GAs themselves; in this setting we found SAs to be much simpler and more efficient. Moreover, Meta-GAs does not use any kind of statistical data analysis or prior knowledge about the behavior of the series, which makes its model selection process essentially blind. In our opinion, the predicting model should be chosen based on the behavior of the time series. With the crossover operator of GAs, however, we cannot easily exploit the special characteristics of each kind of time series when building predicting models. The use of SAs is thus more suitable here, because their neighborhood selection mechanism readily allows analysis techniques to derive a more suitable model from the previous one.
In this work, we combine SAs and GAs in a two-level algorithm that finds ARMA models and parameters for time series forecasting. The high level (meta-level) uses SA to find predicting models; for each model found, the low-level GA is called to find optimized model parameters. Our algorithm is an improved version of Meta-GAs, with SA at the meta-level and a more efficient strategy for the GA at the low level. Our experiments, conducted on various kinds of time series, from seasonal and trended to nonlinear, and compared with traditional methods including ARIMA, ES and Meta-GAs itself, have shown very encouraging results, especially on trended and nonlinear time series.
This paper is organized as follows: first, the basic concepts of time series analysis are defined; second, we discuss local search techniques; then, we describe the use of SA and GA in a two-level algorithm to learn ARMA models; finally, the obtained results are presented and compared with other conventional methods.

2 Time series analysis


A time series (TS) is a sequence of data points, typically measured at successive times and spaced at (often uniform) time intervals. Time series can be seen everywhere, from stock prices in business and electrocardiograms in medicine to product monitoring in industry. Time series forecasting models assume that past patterns will recur in the future. The forecasting error e_t of a model is the difference between the actual value x_t of the time series and the predicted value y_t:

    e_t = x_t - y_t
The performance of a forecasting model is evaluated by accuracy measures such as the sum of squared errors (SSE), root mean squared error (RMSE) and normalized mean squared error (NMSE) [2], defined as follows:

    SSE = \sum_{i=1}^{l} e_i^2,    RMSE = \sqrt{SSE / l},    NMSE = SSE / \sum_{i=1}^{l} (x_i - \bar{x})^2

where l is the number of predicted values and \bar{x} is the mean of the time series.
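For concreteness, the three measures amount to the following minimal Python sketch; the function name and NumPy usage are ours, not from the paper:

    import numpy as np

    def forecast_errors(actual, predicted):
        """Compute SSE, RMSE and NMSE for a forecast."""
        x = np.asarray(actual, dtype=float)
        e = x - np.asarray(predicted, dtype=float)   # e_i = x_i - y_i
        sse = float(np.sum(e ** 2))                  # sum of squared errors
        rmse = float(np.sqrt(sse / len(e)))          # root mean squared error
        # NMSE normalizes SSE by the spread of the series around its mean
        nmse = sse / float(np.sum((x - x.mean()) ** 2))
        return sse, rmse, nmse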
The autoregressive moving average (ARMA) model is one of the most common methods for time series forecasting; it combines an autoregressive (AR) model and a moving average (MA) model. We denote by ARMA(P, Q) a linear combination of the P past values, the Q past errors and white noise, as shown in the equation below:

    y_t = \varepsilon_t + \sum_{i=1}^{P} A_i x_{t-i} + \sum_{i=1}^{Q} M_i e_{t-i}

where \varepsilon_t is white noise, and A_i and M_i are the AR and MA coefficients.
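A point forecast from such a model could be computed as in the sketch below; this is an illustrative implementation under our naming, with the white-noise term dropped because its expected value is zero:

    def arma_one_step(x_past, e_past, ar_coef, ma_coef):
        """One-step-ahead ARMA(P, Q) forecast.

        x_past:  the P most recent series values, most recent first
        e_past:  the Q most recent forecast errors, most recent first
        ar_coef: A_1..A_P    ma_coef: M_1..M_Q
        """
        ar_part = sum(a * x for a, x in zip(ar_coef, x_past))
        ma_part = sum(m * e for m, e in zip(ma_coef, e_past))
        # the white-noise term is omitted: its expected value is zero
        return ar_part + ma_part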

3 Simulated Annealing and genetic algorithms

3.1 Simulated annealing


Simulated Annealing (SA) as an optimization technique was first suggested by S. Kirkpatrick et al. in 1983. In an annealing process, if a solid is heated past its melting point and then cooled, the structural properties of the solid depend on the rate of cooling. SA simulates this cooling process by gradually lowering the temperature of the system until it converges to a steady, frozen state. Applied to optimization problems, simulated annealing searches among feasible solutions and converges to an optimal one.

With SAs, potential moves are sampled randomly and all improving moves are accepted automatically. Other moves are accepted with probability exp(-Δ/t), where Δ is the change in the cost function and t is a temperature control parameter, as shown in Figure 1. The important features of an SA are the number of annealing phases maxphase, the number of repetitions at each temperature nrep, the starting temperature t0 and the temperature reduction coefficient α. These values are problem-specific, as they must be chosen with respect to the shape of the solution landscape.
3.2 Genetic algorithms
The genetic algorithm (GA) is a population-based search method. Genetic algorithms are acknowledged as good solvers for tough problems, although no standard GA takes constraints into account. GAs mimic natural evolution processes, including the selection of parents, mating, mutation of children and a generational reproduction phase.
Figure 1 shows the genetic framework used in our algorithm: at each generation, we perform the recombination process num_mate times; all children are evaluated and only satisfactory individuals are selected for the next generation.
Select an initial solution s0
Select an initial temperature t = t0 > 0
Select the number of phases maxphase
Select a temperature reduction coefficient α
while phase < maxphase
    while iteration_count < nrep
        /* s is a neighbor solution of s0 */
        Randomly select s ∈ N(s0)
        /* compute the change in the cost function */
        Δ = f(s) - f(s0)
        if Δ < 0 then s0 = s
        else
            generate random x ∈ [0,1]
            if x < exp(-Δ/t) then s0 = s
    t = t * α

Initialize population with random candidate solutions
Evaluate each candidate
repeat
    repeat
        Select parents
        Recombine pairs of parents
        Mutate the resulting children
    until iteration_count = num_mate
    Evaluate children
    Select individuals for the next generation
until Termination-Condition is satisfied

Figure 1. Simulated annealing (left) and the genetic framework (right) used in our proposed algorithm.
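For readers who prefer runnable code, the left frame of Figure 1 could be realized as the Python sketch below; the parameter defaults anticipate Table 2, and the best-so-far bookkeeping is our addition, not part of the original pseudocode:

    import math
    import random

    def simulated_annealing(s0, cost, neighbor,
                            t0=10000.0, alpha=0.98, maxphase=5000, nrep=5):
        """Generic SA loop following Figure 1 (left)."""
        s, fs = s0, cost(s0)
        best, fbest = s, fs                    # best-so-far (our addition)
        t = t0
        for _ in range(maxphase):
            for _ in range(nrep):
                cand = neighbor(s)             # randomly selected s' in N(s)
                delta = cost(cand) - fs        # change in the cost function
                # improving moves always accepted; worse moves with prob exp(-delta/t)
                if delta < 0 or random.random() < math.exp(-delta / t):
                    s, fs = cand, fs + delta
                    if fs < fbest:
                        best, fbest = s, fs
            t *= alpha                         # geometric cooling
        return best, fbest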

4 Our proposed algorithm


When predicting future values of a time series with ARMA models, the important questions are: which ARMA coefficients should be used, and what value should each coefficient take? These questions are difficult to answer because each has a very large search space.

Two-level algorithm approach


To solve these problems, we used a two-level approach. The high level is responsible for finding suitable predicting models; for each model it finds, the low level is responsible for finding optimal parameters for that model. The high level uses an array of binary (boolean) values to represent an ARMA model: each element of the array represents a possible coefficient, which is included in the model if the value is true and ignored otherwise. The low-level array contains a real-valued parameter for each coefficient represented in the high-level structure; if a coefficient is absent at the meta-level, the corresponding low-level value is 0.0 ("don't care"), as shown in Figure 2.

HIGH-LEVEL:  A1   A2   A3  ...  A13   M1   M2  ...  M13
LOW-LEVEL:   0.2  0.0  0.3 ...  0.0   0.1  0.0 ...  0.5

Figure 2. The structure of a solution used in our algorithm.

Given a training set containing past values of the time series, the algorithm must find the most suitable predicting model and a value for each of its coefficients. This predicting model is then used to forecast future values of the time series; a sketch of the representation follows.
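The two-level structure of Figure 2 could be represented as a boolean mask plus a real-valued vector, as in this hypothetical sketch (the array size comes from Section 5):

    import random

    MAX_ORDER = 13               # maximum AR and MA order (Section 5)
    N_COEF = 1 + 2 * MAX_ORDER   # constant + 13 AR + 13 MA coefficients = 27

    def random_model():
        """HIGH-LEVEL: boolean mask marking which coefficients the model uses."""
        return [random.choice((True, False)) for _ in range(N_COEF)]

    def random_params(model):
        """LOW-LEVEL: a real value per active coefficient; inactive
        coefficients are fixed at 0.0 ('don't care')."""
        return [random.uniform(-1.0, 1.0) if used else 0.0 for used in model]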
The low-level algorithm
Given an ARMA model found by the high-level algorithm, we use a GA to search for its optimal parameters. GAs are multi-point search processes that can search for optimal solutions in different parts of the solution space simultaneously. Given the enormous search space over the real-valued parameters at the low level, using a GA here is an efficient way to ensure that good solutions are found; our choice matches that of other authors, namely Cortez et al. [2].
We used the GA framework shown in Figure 1. The initial population's genes are randomly assigned values within the range [-1, 1]. The fitness of each individual is measured by the RMSE over the training set. At each epoch, parents are selected by a tournament mechanism [4]. To enhance the diversity of the search, the reproduction process is performed several times per epoch and only some of the good children are selected for the next generation. Recombination is done by randomly choosing between arithmetic crossover [6] and arithmetic heuristic crossover [4]; mutation is done by randomly applying zero-preserving Gaussian perturbation [6] or relative Gaussian perturbation [6]. Using more than one technique in the recombination and mutation processes improves the ability to find optimal solutions; we call these the variant recombination and variant mutation processes, as sketched below.
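The variant operators might look like the following sketch; the even split between the two crossovers, the per-gene mutation rate and the Gaussian width are our illustrative assumptions, not values fixed by the paper:

    import random

    def arithmetic_crossover(p1, p2):
        """Child as a convex combination of both parents [6]."""
        lam = random.random()
        return [lam * a + (1.0 - lam) * b for a, b in zip(p1, p2)]

    def heuristic_crossover(better, worse):
        """Step from the worse parent through and past the better one [4]."""
        r = random.random()
        return [b + r * (b - w) for b, w in zip(better, worse)]

    def variant_recombine(p1, p2, rmse):
        """Variant recombination: pick one of the two operators at random.
        rmse is the fitness function (lower is better)."""
        if random.random() < 0.5:                       # assumed 50/50 split
            return arithmetic_crossover(p1, p2)
        better, worse = (p1, p2) if rmse(p1) < rmse(p2) else (p2, p1)
        return heuristic_crossover(better, worse)

    def variant_mutate(genes, rate=0.2, sigma=0.1):     # rate/sigma: our assumptions
        """Zero-preserving Gaussian perturbation: 0.0 genes stay 'don't care'."""
        return [g + random.gauss(0.0, sigma)
                if g != 0.0 and random.random() < rate else g
                for g in genes]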

The high-level algorithm


Cortez et al. [2] used GAs to learn ARMA models. However, GAs are not efficient in this role because of their inherent complexity; compared with GAs, SAs are much simpler and more efficient algorithms to use. Moreover, it is hard to construct a better model from parent models with a crossover operator: Meta-GAs, for example, uses a simple two-point crossover, which carries little meaning for model selection. In SA, each new solution is generated from the previous one by a neighborhood selection mechanism, which makes it easy and efficient to apply analysis techniques when selecting the next model to consider. We therefore chose SA as the search algorithm at the high level.
At the high level, the basic simulated annealing algorithm shown in Figure 1 is responsible for finding suitable ARMA predicting models. To evaluate the fitness of each model found, we used the well-known Bayesian Information Criterion (BIC) [2] as the cost function; it adds to the model a penalty that is a function of its complexity, as described below:

    BIC = N \ln(SSE / N) + p \ln(N)

where N denotes the number of training examples and p is the number of model parameters.
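In code, the criterion is a one-liner (a sketch; the function name is ours):

    import math

    def bic(sse, n, p):
        """Bayesian Information Criterion: N * ln(SSE / N) + p * ln(N)."""
        return n * math.log(sse / n) + p * math.log(n)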
The initial model is generated randomly with boolean values. From each model, a neighbor model is generated via one of three mechanisms:
- Perturbation: each coefficient has a chance to switch from true to false and vice versa.
- Swap: exchange the values of two random coefficients.
- Flip: switch the value of one random coefficient from true to false or vice versa.
The perturbation mechanism lets us jump to different regions of the search space, ensuring diversity and increasing the ability to escape local optima, while the swap and flip mechanisms let us explore the search space carefully. We call this technique the variant neighborhood selection mechanism; a sketch follows.
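A sketch of variant neighborhood selection, using the 20% perturbation share from Table 2; the per-coefficient flip chance and the even swap/flip split are our assumptions:

    import random

    def neighbor(model):
        """Generate a neighbor model by perturbation, swap or flip."""
        m = list(model)
        r = random.random()
        if r < 0.2:                              # perturbation (20%, Table 2)
            for i in range(len(m)):
                if random.random() < 0.1:        # assumed per-coefficient chance
                    m[i] = not m[i]
        elif r < 0.6:                            # swap two random coefficients
            i, j = random.sample(range(len(m)), 2)
            m[i], m[j] = m[j], m[i]
        else:                                    # flip one random coefficient
            i = random.randrange(len(m))
            m[i] = not m[i]
        return m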

5 Experiments
To conduct our experiments, we used the same time series as [2]. These series, shown in Table 1, range from financial markets to natural processes and are classified into four main categories: Seasonal, Trended, Seasonal & Trended, and Nonlinear. They can be found at the Time Series Data Library (source: www-personal.buseco.monash.edu.au/~hyndman/TSFL/).
Each TS was divided into a training set holding the first 90% of the values and a test set holding the last 10%, as sketched below. Only the training set is used for model selection and parameter optimization; the test set is used to compare the forecasting ability of our proposed algorithm with that of other forecasting methods.
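A minimal sketch of this split (the helper name is ours):

    def split_series(series):
        """First 90% of values for training, last 10% held out for testing."""
        cut = int(len(series) * 0.9)
        return series[:cut], series[cut:]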

Table 1. Time series data [2].

Series      Type                Domain        Description
Passengers  Seasonal & Trended  Tourist       Monthly international airline passengers
Paper       Seasonal & Trended  Sale          Monthly sales of French paper
Deaths      Seasonal            Traffic       Monthly deaths & injuries in UK roads
Maxtemp     Seasonal            Meteorology   Maximum temperature in Melbourne
Chemical    Trended             Chemical      Chemical concentration readings
Prices      Trended             Economy       Daily IBM common stock closing prices
Sunspots    Nonlinear           Physics       Annual Wolf's Sunspot Number
Kobe        Nonlinear           Geology       Seismograph of the Kobe earthquake

Figure 3. Time series of Table 1 (plots of the passengers, paper, deaths, maxtemp, chemical, prices, sunspots and kobe series).

In this work, the maximum AR and MA orders (P and Q) were set to 13, a value considered sufficient to encompass seasonal and trended effects. The size of the array at both the low and high levels is therefore 27 (1 for the constant and 13 each for the AR and MA coefficients). The most difficult task in our experiments was choosing suitable input parameters for both SA and GA. To do so, we tried many parameter sets, some suggested by other authors, and chose the most suitable one. Because our results are compared with Meta-GAs [2], the parameters used must guarantee a fair comparison between our algorithm and Meta-GAs. Table 2 describes the parameters used in our experiments.
Table 2. Our proposed algorithm parameter value setup.

SA  Encoding                               Boolean value
    Neighborhood selection                 Perturbation (20%), swap, flip
    Cost function                          BIC
    Initial solution                       Random in {true, false}
    Number of phases (maxphase)            5000
    Number of iterations (nrep)            5
    Starting temperature (t0)              10000
    Temperature reduction coefficient (α)  0.98

GA  Encoding                               Real value
    Cost function                          RMSE
    Population size                        50
    Initialization                         Random in [-1, 1]
    Crossover                              Arithmetic, arithmetic heuristic (80%)
    Mutation                               Gaussian perturbation (20%)
    Maximum generations                    1000
    Number of matings at each epoch        20
    Number of children selected per epoch  10
    Tournament size                        10

The best ARMA models obtained by our algorithm are shown in Table 3. For each TS, we show the set of AR and MA coefficients used by the best model, as well as the presence of white noise and the total number of parameters p.
Table 3. Best ARMA model found by our proposed algorithm.

Series      Noise  AR                MA             p
passengers  false  2                 3,11,12,13     5
paper       false  2                 2,4,6,8        5
deaths      false  3,13              1,11           4
maxtemp     false  2,3,9,13          2,3,9          7
chemical    false  7,12              1,7,12,13      6
prices      false  13                8              2
sunspots    true   4,12,13           1,5,10         7
kobe        false  1,2,5,8,11,12,13  3,4,7,8,11,13  13

As an example, Figure 4 shows the forecast values for the last 10% of the Kobe series, averaged over ten runs of the optimized ARMA model. The two curves show a good fit between the real and forecast values.

Figure 4. One-step-ahead Kobe forecasts (real values vs. forecasts).

Table 4. Comparison between the different time series forecasting approaches.

Series      ES             ARIMA          Meta-GAs       SAGA
passengers  16.5 (0.70%)   17.8 (0.81%)   17.2 (0.75%)   17.74 (0.83%)
paper       49.2 (4.4%)    61.0 (6.8%)    52.5 (5%)      49.17 (4.39%)
deaths      135 (37%)      144 (42%)      137 (38%)      142 (41%)
maxtemp     0.72 (2.5%)    1.07 (5.6%)    0.93 (4.3%)    0.85 (3.6%)
chemical    0.35 (51%)     0.36 (53%)     0.34 (48%)     0.33 (44.89%)
prices      7.54 (0.39%)   7.72 (0.41%)   7.48 (0.38%)   7.54 (0.39%)
sunspots    28.4 (35%)     21.4 (20%)     17.6 (14%)     16.57 (12%)
kobe        3199 (105%)    582 (3.5%)     492 (2.5%)     491 (2.5%)

Table 4 compares our algorithm (SAGA) with Meta-GAs and conventional models. The error values in the table are given in terms of two measures, RMSE and NMSE (in brackets); our results are compared with the results reported for ES, ARIMA and Meta-GAs in [2].
Our experiments show that SAGA achieves better performance than Meta-GAs in many cases. Although they use the same underlying ARMA model, the higher flexibility of the SAGA and Meta-GAs algorithms allowed them to exceed the forecast performance of the ARIMA approach on all series considered. For seasonal series, ES still gives the better overall performance, since it was developed specifically for that kind of series.

6 Conclusions and Future Works


The use of bio-inspired optimization techniques such as GAs in time series forecasting has recently become a trend, with many proposed algorithms. However, other novel optimization techniques like SAs have still not received proper attention, even though SAs are much simpler, more efficient and more suitable than GAs in many cases. Our algorithm, named SAGA, is a two-level algorithm combining SA at the high level with an efficient GA at the low level; it shows better results than Meta-GAs, which uses only GAs. Although our algorithm does not beat ES on seasonal TSs, it shows its strength over this conventional method on more complex TSs such as nonlinear and trended series.
Through this work, the use of SA has proved to be a new and promising approach to forecasting problems. Our future work includes using time series analysis techniques in the neighborhood selection mechanism to select more appropriate models in the high-level algorithm, as well as using SAs in place of GAs in many other approaches.
Acknowledgements
We would like to express our gratitude to the members of our research group, including Nguyen Nam Giang, Phan Thanh Huy, Huynh Huu Nghia, Nguyen Tuan Phong, Tran Duy Dong, Luong Thi Mai Nhan, Trinh Duy Sam, Pham Thi Hieu, Do Hoang Dinh Tien, Nguyen Huyen Thao Vy, Bui Sy Vuong and Le Xuan Vuong. Without their support, suggestions and endless efforts, this project would not have been completed. We also especially thank Vo Thi Minh Hanh for her help during the preparation of this paper.

References
1. Anagnostopoulos, A., Michel, L., Van Hentenryck, P., Vergados, Y.: A Simulated Annealing Approach to the Traveling Tournament Problem. In: Proc. of the International Conference on Integration of Artificial Intelligence (AI) and Operations Research (OR) Techniques in Constraint Programming for Combinatorial Optimization Problems (CPAIOR'03) (2003)
2. Cortez, P., Rocha, M., Neves, J.: A Meta-Genetic Algorithm for Time Series Forecasting. In: Proc. of the Workshop on Artificial Intelligence Techniques for Financial Time Series Analysis (AIFTSA-01), 10th Portuguese Conference on Artificial Intelligence (EPIA'01), Springer (2001)
3. Kizilkaya, A., Kayran, A. H.: ARMA Model Parameter Estimation Based on the Equivalent MA Approach. Digital Signal Processing, vol. 16, no. 6, pp. 670-681 (2006)
4. Michalewicz, Z.: Genetic Algorithms + Data Structures = Evolution Programs. 3rd edition, Springer-Verlag, USA (1996)
5. Palit, A. K., Popovic, D.: Computational Intelligence in Time Series Forecasting: Theory and Engineering Applications. Springer-Verlag, London (2005)
6. Pardo, L. M., Pinkus, A., Süli, E., Todd, M. J.: Foundations of Computational Mathematics, Santander 2005. Cambridge University Press (2006)
7. Qian, G., Zhao, X.: On Time Series Model Selection Involving Many Candidate ARMA Models. Computational Statistics & Data Analysis, vol. 51, no. 12 (2007)
8. Valenzuela, O., Rojas, I., Rojas, F., Pomares, H., Herrera, L. J., Guillen, A., Marquez, L., Pasadas, M.: Hybridization of Intelligent Techniques and ARIMA Models for Time Series Prediction. Fuzzy Sets and Systems, vol. 159, no. 7, pp. 821-845 (2008)
9. Versace, M., Bhatt, R., Hinds, O., Shiffer, M.: Predicting the Exchange Traded Fund DIA with a Combination of Genetic Algorithms and Neural Networks. Expert Systems with Applications (2004)
