GP Forecast Programing

Transactions of the Institute of Measurement and Control 28, 3 (2006) pp. 285–297
Genetic programming with wavelet-based indicators for financial forecasting
Jin Li (1), Zhu Shi (2) and Xiaoli Li (1)
(1) The Centre of Excellence for Research in Computational Intelligence and Applications
(CERCIA), School of Computer Science, The University of Birmingham, Edgbaston,
Birmingham B15 2TT, UK
(2) School of Software Engineering, The University of Science and Technology of China,
P.R. China
Wavelet analysis is a promising technique that has been applied to numerous problems in
science and engineering. Recent years have witnessed its novel application in economics and
finance. This paper investigates whether features (or indicators) extracted using wavelet
analysis can improve financial forecasting by means of Financial Genetic Programming (FGP),
a genetic programming-based forecasting tool. More specifically, to predict whether the
Dow Jones Industrial Average (DJIA) Index will rise by 2.2% or more within the next 21
trading days, we first extract indicators based on wavelet coefficients of the DJIA time
series using a discrete wavelet transform; we then feed FGP with those wavelet-based
indicators to generate decision trees and make predictions. A comparison with the
prediction performance of our previous study suggests that wavelet analysis is capable of
providing promising indicators and of improving the forecasting performance of FGP.
Key words: financial forecasting; genetic programming; stock data; wavelet analysis.
Address for correspondence: Jin Li, CERCIA, School of Computer Science, The University of
Birmingham, Edgbaston, Birmingham B15 2TT, UK. E-mail: J.li@cs.bham.ac.uk
© 2006 The Institute of Measurement and Control
DOI: 10.1191/0142331206tim177oa

1. Introduction
In the past decade, researchers in the fields of applied mathematics and electrical
engineering have developed the useful wavelet analysis methods for the multi-scale
representation and analysis of complicated signals. Examples of wavelet applications
include turbulence analysis, image compression, earthquake prediction and biomedical
signal processing (eg, Daubechies, 1992; Li and Yao, 2005; Li et al., 2000;
Meyer, 1993; Percival and Walden, 2000). More recent years have witnessed novel
applications in economics and finance (eg, Pan and Wang, 1998; Ramsey and Lampart,
1998; Ramsey and Zhang, 1997). An overview of wavelets in economics and finance can
be found in Gencay et al. (2002) and Ramsey (2002). Often, the wavelet transform is
applied as a decomposition tool to analyse financial time series. Applications of
wavelets usually focus on studying the dynamics and correlation of financial time
series, including scaling properties of foreign exchange volatility (Gencay et al.,
2001), systematic risk in a capital asset pricing model (Gencay et al., 2003), and the
relationship between financial variables and real economic activity (Kim and In, 2003).
Apart from these, there are also wavelet applications for financial forecasting in
which wavelet coefficients are used directly as features and input to neural networks
for prediction (eg, Arino, 1996; Aussem et al., 1998; Murtagh et al., 2003).
The interest in wavelet analysis in empirical finance is attributed to its advantages.
As opposed to traditional Fourier techniques, wavelet analysis is able to reveal
localized information within the data in the time-scale plane. More specifically, it is
capable of decomposing an observed time series into a set of multi-scale or
multi-resolution constituent time series. This makes it suitable for the analysis of
non-linear and non-stationary financial time series. The time-scale decomposition leads
to a number of benefits for financial analysis. Firstly, in theory, one is able to study
a financial time series at as many time-scales as desired, rather than at a few
traditional time-scales such as the long run and the short run. In practice, signals are
usually decomposed into a number of constituent signals by discrete wavelet transforms.
Secondly, through the decomposition, many anomalies or much of the noise in the data can
be revealed and can therefore be treated (eg, removed) separately if necessary.
Finally, from the forecasting point of view, it becomes possible to tailor specific
computational forecasting techniques to different constituent time scales and thereby
gain forecasting efficiency (Ramsey, 2002).
This study applies a discrete wavelet transform to decompose a financial time
series. A number of features are derived from the wavelet coefficients. The differences
between this study and the existing studies in the literature (eg, Arino, 1996; Aussem
et al., 1998; Murtagh et al., 2003) are as follows. Firstly, the forms of the features
are different. The features are derived from the wavelet coefficients, rather than being
the values of the coefficients themselves. The indicators extracted in this study
reflect the dynamic and statistical properties of the time series. Secondly, our
features are generated using coefficients at a certain level, rather than at all levels,
in an attempt to remove possible noise from the original data. Finally, our approach
adopts the genetic programming technique, rather than neural networks, and is therefore
able to generate comprehensible decision trees. This gives our method an advantage in
interpretability, since solutions need to be understood by human beings for decision
making in finance and economics.
The purpose of this study is to investigate whether wavelet analysis could be
exploited to improve financial forecasting. We carry out this investigation through a
prediction task addressed previously by a genetic programming-based tool, Financial
Genetic Programming (FGP; Li, 2001). The task is to predict whether an index will
rise by r% or more within the next n periods. Our earlier studies (Li and Tsang, 1999a,
b, 2000; Tsang and Li, 2002; Tsang et al., 1998; Tsang et al., 2004) made predictions
using indicators derived from technical analysis rules found in textbooks.
In this study, predictions are made using a number of novel indicators extracted
by means of the wavelet analysis technique. It is worth pointing out that such
novel indicators may have more predictive merit for future price movements because
they could remove, to some extent, potential noise in the original data. Provided all
experimental settings are the same, any improvement in the forecasting performance
reported in this study would suggest that wavelet analysis is of value to FGP.
The structure of the paper is as follows. Section 2 describes the FGP system
used. Section 3 introduces wavelet analysis and discusses how the wavelet-based
indicators are generated using the wavelet analysis technique. In section 4, experiments and results are reported. The discussions and conclusions are given in the
final section.
2. FGP for financial forecasting
This section reviews the history of FGP and briefly presents its technical detail for
financial forecasting. The measures of its prediction performance are also given.
2.1 Overview of FGP
FGP is a major implementation of the Evolutionary Dynamic Data Investment
Evaluator (EDDIE; Tsang and Li, 2002; Tsang et al., 1998; Tsang et al., 2004),
which is an interactive genetic programming-based financial forecasting tool. It aims
to help analysts search the space of interactions and make financial decisions.
Given a set of indicators (or features from the point of view of data mining),
FGP attempts to find interactions among indicators and to discover appropriate
thresholds for them. Using genetic programming, FGP generates
Genetic Decision Trees (GDTs), which can be understood by human experts. Human
expertise is channelled into FGP through the indicators supplied as input to the system.
In this way, experts can experiment with a variety of indicators more
easily. The forecasting performance of FGP crucially depends on the quality of the
indicators chosen. This study aims to examine the effectiveness of alternative
indicators based on wavelets, instead of the technical analysis indicators used in
our previous studies.
The FGP system has two versions, namely FGP-1 and FGP-2. FGP-1 is designed to
improve forecasting accuracy by combining experts' forecasts from different
sources. FGP-2 is designed to improve prediction precision by means of a constraint
handle. The handle allows users to choose a constraint, ie, the percentage of
buy recommendations (see section 2.2), in order to improve precision, though possibly
at the price of missing opportunities. FGP-2 extends FGP-1 by providing the constraint
handle by means of a novel fitness function (see the detail in section 2.2). Both FGP-1
and FGP-2 have been applied to a variety of financial forecasting problems with
demonstrated accuracy (Tsang and Li, 2002).
In particular, the efficacy of FGP-2 has been examined intensively through a set of
prediction tasks: whether an index will rise by r% or more within the next n periods,
denoted by $P_n^r$. In this study, FGP-2 is used to attack one such task, $P_{21}^{2.2}$,
on the prices of the Dow Jones Industrial Average (DJIA). As in our previous study
(Li and Tsang, 2000), we focus on the prediction precision of FGP-2 (see the
definition in section 2.2).
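The prediction task can be made concrete with a small labelling routine. The following is a minimal sketch, not the authors' code: the function name `label_positions` is hypothetical, and it assumes that "rise by r% or more within the next n periods" means that the highest price in the next n periods is at least r% above the current price.

```python
def label_positions(prices, r=2.2, n=21):
    """Label each day 1 (positive) or 0 (negative) for the task P^r_n.

    Assumed reading: day t is positive if the highest price over the next
    n periods is at least r% above the price on day t. The final n days
    have no complete look-ahead window and are therefore not labelled.
    """
    threshold = 1.0 + r / 100.0
    labels = []
    for t in range(len(prices) - n):
        future_max = max(prices[t + 1 : t + n + 1])
        labels.append(1 if future_max >= prices[t] * threshold else 0)
    return labels


# Toy series (not DJIA data), with r = 2.2 and a 2-period horizon.
toy_labels = label_positions([100.0, 101.0, 103.0, 100.0, 99.0], r=2.2, n=2)
```

With the defaults r = 2.2 and n = 21, the labels correspond to the task studied in this paper; only the first day of the toy series qualifies as a positive position.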
FGP-2 generates GDTs to make predictions. An example of a GDT is shown below,
where a Positive prediction means that the goal can be achieved; Negative means
otherwise.
(IF (MV_50 < 18.45) THEN Positive
 ELSE (IF ((TRB_5 > 19.48) AND (Filter_63 < 36.24)) THEN Negative
       ELSE Positive))
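To illustrate how such a tree is read, the example GDT can be written as an ordinary function. This is only a sketch: the function name `gdt_example` is hypothetical, and the comparison operators are an assumed reading of the rule (`<` for MV_50 and Filter_63, `>` for TRB_5), with the thresholds taken as printed.

```python
def gdt_example(mv_50, trb_5, filter_63):
    """Evaluate the example GDT for one trading day.

    Assumption: the comparisons are '<' for MV_50 and Filter_63 and '>'
    for TRB_5; the thresholds are copied from the example in the text.
    """
    if mv_50 < 18.45:
        return "Positive"
    if trb_5 > 19.48 and filter_63 < 36.24:
        return "Negative"
    return "Positive"
```

Because the tree is a nested if-then-else over indicator thresholds, an analyst can trace exactly which condition produced a given recommendation.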
MV_50, TRB_5 and Filter_63 in the GDT belong to three types of technical
indicators. They were derived from three simple technical analysis rules in
the financial literature (eg, Alexander, 1964; Fama and Blume, 1966; Brock et al.,
1992), namely moving average rules, filter rules and trading range break rules. These
indicators have been argued to have merit for financial forecasting (Brock et al., 1992;
Sweeney, 1988). Our previous study (ie, Li and Tsang, 2000) adopted the following six
indicators to attack $P_n^r$ on the DJIA.
1) MV_12 = Today's price − the average price of the last 12 trading days;
2) MV_50 = Today's price − the average price of the last 50 trading days;
3) Filter_5 = Today's price − the minimum price of the last 5 trading days;
4) Filter_63 = Today's price − the minimum price of the last 63 trading days;
5) TRB_5 = Today's price − the maximum price of the last 5 trading days;
6) TRB_50 = Today's price − the maximum price of the last 50 trading days.
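The six technical indicators can be sketched in a few lines. This is an illustrative reading rather than the original implementation: the function name `technical_indicators` is hypothetical, each indicator is taken to be today's price minus the stated statistic, and "the last k trading days" is assumed to mean the k days before today.

```python
def technical_indicators(prices, t):
    """Return the six indicators for day index t (requires t >= 63)."""
    if t < 63:
        raise ValueError("need at least 63 earlier trading days")
    today = prices[t]

    def last(k):
        # Assumption: "the last k trading days" are the k days before today.
        return prices[t - k : t]

    return {
        "MV_12": today - sum(last(12)) / 12,
        "MV_50": today - sum(last(50)) / 50,
        "Filter_5": today - min(last(5)),
        "Filter_63": today - min(last(63)),
        "TRB_5": today - max(last(5)),
        "TRB_50": today - max(last(50)),
    }
```

On a steadily rising toy series, for instance, the moving average and filter indicators are positive because today's price sits above both the recent average and the recent minimum.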
Nevertheless, finding alternative promising indicators is one of the important
motivations of this study. The hope is that newly derived wavelet-based indicators
will have more merit for the prediction, so that the performance of
FGP can be improved in terms of prediction precision.
For brevity, how FGP works is explained in the pseudo code below. For more
details of the genetic programming technique, interested readers can refer to Koza (1992).
Procedure FGP ( )
Begin
  Partition whole data into training data and testing data;
  /* The training data is employed to train FGP to find the best-so-far rule; the testing
     data is used to determine the predictive performance of the best-so-far rule */
  Pop ← InitializePopulation(Pop); /* randomly create a population of decision trees */
  Evaluation (Pop); /* calculate fitness of each individual in Pop */
  Repeat
    Pop ← Reproduction (Pop) + Crossover (Pop);
    /* a new population is created by the genetic operators of reproduction, which
       reproduces M*Pr individuals, and crossover, which creates M*(1-Pr) individuals.
       In our case Pr = 0.1; M is the population size */
    Pop ← Mutation (Pop); /* apply mutation to population */
    Evaluation (Pop);
  Until (TerminationCondition()) /* determine if the termination condition is fired */
  Apply the best-so-far rule to the testing data;
End
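The procedure above can be sketched as a deliberately simplified program. This is not FGP itself: the tree encoding, the helper names and the small default population are all assumptions made for illustration, the fitness is plain RC (as for FGP-1), and no depth limit is enforced. Only the reproduction share (Pr = 0.1), the mutation rate (0.01) and the tournament size (4) follow the settings given in this paper.

```python
import random

# Hypothetical encoding: a tree is a label (0 or 1) or a tuple
# (indicator_index, threshold, subtree_if_below, subtree_otherwise).

def random_tree(n_features, depth, rng):
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(0, 1)
    return (rng.randrange(n_features), rng.uniform(-1.0, 1.0),
            random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))

def predict(tree, row):
    while isinstance(tree, tuple):
        feat, thr, below, otherwise = tree
        tree = below if row[feat] < thr else otherwise
    return tree

def fitness(tree, rows, labels):
    # Rate of correctness on the training data.
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

def paths(tree, path=()):
    # Enumerate the positions of all subtrees (children sit at slots 2 and 3).
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[2], path + (2,))
        yield from paths(tree[3], path + (3,))

def get_at(tree, path):
    for p in path:
        tree = tree[p]
    return tree

def replace_at(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(a, b, rng):
    # Graft a random subtree of b onto a random position in a.
    donor = get_at(b, rng.choice(list(paths(b))))
    return replace_at(a, rng.choice(list(paths(a))), donor)

def mutate(tree, n_features, rng):
    return replace_at(tree, rng.choice(list(paths(tree))),
                      random_tree(n_features, 2, rng))

def tournament(pop, scores, rng, size=4):
    picks = rng.sample(range(len(pop)), size)
    return pop[max(picks, key=lambda i: scores[i])]

def run_gp(rows, labels, pop_size=60, generations=15,
           p_repro=0.1, p_mut=0.01, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0])
    pop = [random_tree(n_features, 4, rng) for _ in range(pop_size)]
    best, best_score = None, -1.0
    for _ in range(generations):
        scores = [fitness(t, rows, labels) for t in pop]
        top = max(range(pop_size), key=lambda i: scores[i])
        if scores[top] > best_score:  # keep the best-so-far rule
            best, best_score = pop[top], scores[top]
        new_pop = [tournament(pop, scores, rng)
                   for _ in range(int(pop_size * p_repro))]
        while len(new_pop) < pop_size:
            new_pop.append(crossover(tournament(pop, scores, rng),
                                     tournament(pop, scores, rng), rng))
        pop = [mutate(t, n_features, rng) if rng.random() < p_mut else t
               for t in new_pop]
    return best, best_score
```

The returned tree would then be applied once to held-out test data, mirroring the last step of the procedure. Thresholds are drawn from [-1, 1], so this sketch assumes indicators scaled to that range.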
2.2 Performance measures of FGP
The prediction problem $P_n^r$ can be treated as a binary classification problem. Each
day can be classified as either a positive position or a negative position. A positive
position predicted by the GDT is sometimes called a buying signal or a recommendation to
buy; both terms are used in the remainder of this paper. For each GDT,
we define the Rate of Correctness (RC), the Rate of Missing Chance (RMC) and the
Rate of Failure (RF) as its prediction performance measures. The Rate of Precision (RP)
is also given as an important reference measure for the user, as it measures
the accuracy of buying signals. Formulae for each measure are given through a
contingency table (Table 1).
As mentioned earlier, FGP-1 generates GDTs aimed at making predictions as
accurately as possible. Thus, RC on its own is an appropriate fitness function for
FGP-1. In contrast, FGP-2 attempts to improve prediction precision, ie, RP, which is
equivalent to reducing RF. A lower RF means that each positive recommendation
made by the GDT is more likely to be a correct opportunity for the investor
to make a bid. FGP-2 achieves this target by means of a novel constrained fitness
function, which is defined as follows.
$$f = w\_rc \cdot RC - w\_rmc \cdot RMC - w\_rf \cdot RF, \quad \text{where } 0 \le w\_rc,\; w\_rmc,\; w\_rf \le 1 \tag{1}$$
It involves three performance measures, ie, RC, RMC and RF, and three weights, ie,
w_rc, w_rmc and w_rf. The goodness of a GDT is no longer assessed only by its RC, but
by a composite value, which is a weighted combination of its three performance rates.
The user can express a preference for any measure by adjusting the weights.
Table 1 A contingency table for the binary classification, where a specific prediction rule
is invoked

                              Predicted negative          Predicted positive          Actual total
Actual negative positions     # of True Negative          # of False Positive         $O_-$ = TN + FP
                              Positions [TN]              Positions [FP]
Actual positive positions     # of False Negative         # of True Positive          $O_+$ = FN + TP
                              Positions [FN]              Positions [TP]
Predicted total               $N_-$ = TN + FN             $N_+$ = FP + TP             Number of Cases

$$RC = \frac{TP + TN}{O_+ + O_-} = \frac{TP + TN}{N_+ + N_-}; \qquad RMC = \frac{FN}{O_+}; \qquad RF = (1 - RP) = \frac{FP}{N_+}$$
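The measures and the fitness value can be computed directly from the four cell counts of the contingency table. The sketch below is illustrative only: the function name `performance` and the default weights are hypothetical (the weights actually used are not stated here), and the fitness is assumed to reward RC while penalizing RMC and RF.

```python
def performance(tp, fp, tn, fn, w_rc=1.0, w_rmc=0.0, w_rf=0.0):
    """Compute RC, RMC, RF, RP and the weighted fitness from the counts.

    Assumptions: default weights are placeholders, and each rate is set to
    0.0 when its denominator is zero (e.g. no positive predictions).
    """
    total = tp + fp + tn + fn
    n_pos = tp + fp  # N+: number of positive positions predicted
    o_pos = tp + fn  # O+: actual number of positive positions
    rc = (tp + tn) / total
    rmc = fn / o_pos if o_pos else 0.0
    rf = fp / n_pos if n_pos else 0.0
    rp = tp / n_pos if n_pos else 0.0
    fitness = w_rc * rc - w_rmc * rmc - w_rf * rf
    return {"RC": rc, "RMC": rmc, "RF": rf, "RP": rp, "fitness": fitness}


# Counts reported for GDT 1 in Table 3.
gdt1 = performance(tp=273, fp=98, tn=445, fn=319)
```

For these counts the function returns RP ≈ 0.7358, RMC ≈ 0.5389 and RC ≈ 0.6326, matching the first row of Table 3.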
Due to the brittleness of the fitness function (cf., Tsang and Li, 2002), a novel constraint
parameter, R = [Pmin, Pmax], is introduced into Function 1. It defines the minimum
and maximum percentage of recommendations that FGP is forced to make on
the training data (like most machine learning methods, the assumption is that the test
data exhibits similar characteristics). The effectiveness of the constraint in the fitness
function for achieving more reliable and accurate predictions has been demonstrated
in a number of our previous studies. In general, FGP-2 allows the user to tune a
parameter, ie, constraint R, in order to improve RP without affecting RC
significantly, though at the price of increasing RMC. Such a scenario recurs in this
study as well (see section 4).
It is worth emphasizing again that the performance of FGP crucially depends on the
quality of the indicators chosen by the users. We argue that higher-quality
indicators will generally lead to better performance of FGP; the results of this
study are consistent with this view.
3. Wavelet-based indicators
In this section, a brief introduction of wavelet analysis is given. We then describe the
indicators used in this study, which are derived from wavelet coefficients.
3.1 Discrete wavelet transform
Successful applications in science and engineering demonstrate that the wavelet
transform is a powerful signal or image processing method. The wavelet transform
overcomes the shortcomings of the STFT (short time Fourier transform) by performing
a multi-resolution analysis of signals (eg, Daubechies, 1992; Meyer, 1993). The wavelet
transform can be used to describe how the frequency content of a non-stationary time
series changes over time in the time-scale space. Thus, some of the transients that are
hidden in the time series can be highlighted. The wavelet transform of a time series
x(t) is defined as

$$W(a, b) = \frac{1}{\sqrt{a}} \int x(t)\, \psi\!\left(\frac{t - b}{a}\right) dt \tag{2}$$

where t is the time; a > 0 and b are the scale and translation parameters, respectively;
and $\psi(\cdot)$ is a mother wavelet. W(a, b) are the coefficients of the wavelet transform
of x(t). 1/a is proportional to the frequency of the wavelet function. For a small value
of a, the wavelet coefficient corresponds roughly to a high-frequency component of the
time series, whereas a large value corresponds to a low-frequency component.
By adjusting the scale parameter a, the wavelet transform can flexibly decompose a time
series x(t) into multiple-resolution constituent time series. Since the wavelet
coefficients obtained can indicate local characteristics of a non-stationary time series
in the time-scale space, in practice one often extracts features based on wavelet
coefficients in order to identify system states.
To derive wavelet-based indicators, we apply a discrete wavelet transform to
decompose the time series. The discrete wavelet decomposition of a discrete time
series x(n) (n = 1, 2, ..., N), where N is the number of data points in the time series,
can be defined as follows:

$$x(n) = \sum_{k} C_J(k)\, \phi_J(2^{-J} n - k) + \sum_{j=1}^{J} \sum_{k} d_j(k)\, \psi_j(2^{-j} n - k) \tag{3}$$

where $d_j(k)$ are called the wavelet coefficients at level j (j = 1, 2, ..., J), and $C_J(k)$
are the coefficients at the maximum resolution level J. Both sets of coefficients vary
with position, as indicated by the value of k. The value of J can be set by the user from
1 up to the maximum integer that N can sustain (ie, $2^J < N$). $\phi(\cdot)$ is called the
father wavelet whereas $\psi(\cdot)$ is called the mother wavelet, both of which are derivable
from a basic wavelet (eg, the Haar, the Daublet and the Morlet). The father wavelet
provides an approximate version of the time series at successive resolutions, whilst the
mother wavelet captures the detail at each resolution.
In summary, given a time series, a basis wavelet and a parameter J, both sets of wavelet
coefficients, ie, $C_J(k)$ and $d_j(k)$ (j = 1, 2, ..., J), can be calculated by a fast
recursive scheme (Meyer, 1993). $C_J(k)$ represents the smooth coefficients that capture
the trend of the time series, whereas $d_j(k)$, representing increasingly fine resolution
deviations from the smooth trend, capture higher-frequency oscillations. The extent to
which the resultant coefficients $C_J(k)$ smooth the time series is determined by the
value of J selected. The larger J is, the smoother the part of the time series captured
by $C_J(k)$. The choice of J is crucial in applications of wavelet analysis to finance
and shall be discussed further in section 4.
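The recursive scheme can be illustrated with the simplest basis. The sketch below uses the Haar wavelet purely for brevity (the experiments in this paper use Daubechies 4, which needs longer filters and boundary handling); the function names `haar_step` and `haar_dwt` are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the Haar transform: pairwise scaled sums and differences."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_dwt(signal, J):
    """Return (C_J, [d_1, ..., d_J]) for a signal whose length is a multiple of 2**J."""
    if J < 1 or len(signal) % (2 ** J) != 0:
        raise ValueError("signal length must be a positive multiple of 2**J")
    approx = list(signal)
    details = []
    for _ in range(J):
        # Each pass halves the smooth part and stores the detail removed.
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Toy series decomposed to level J = 2.
c2, d = haar_dwt([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0], J=2)
```

Here `c2` holds the smooth coefficients (approximately [16, 12]) and `d[0]`, `d[1]` hold the detail coefficients at levels 1 and 2. Each additional level halves the number of smooth coefficients, which is why a larger J yields a smoother representation.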
3.2 Deriving wavelet-based indicators
In this paper, the energy, entropy and other features of $C_J(k)$, the coefficients at
level J, are calculated and taken as indicators for FGP-2. The reason for choosing
$C_J(k)$, rather than $d_j(k)$, is that $C_J(k)$ captures the major trends of a time series,
whereas $d_j(k)$ only captures deviations from them. Some of the derived indicators
describe the dynamics of a financial time series whilst others are simple statistics of
it. Given that a financial time series could reflect the dynamics of the movements of
financial markets, all indicators could have financial meaning to some extent. The
formulae for extracting those features are given below:
1) Energy feature. This feature is based on the amplitude of the time series at different
frequencies. The energy of the wavelet coefficients at each resolution level j = 1, 2, ..., J,
computed over a sliding window (l is the window size) ending at index i, is written as:

$$(BE_j)_i = \sum_{k=i-l+1}^{i} C_j^2(k) \tag{4}$$
2) Entropy feature. This feature measures the uncertainty of the wavelet coefficients at
the different levels j. Here $p_{k,j}$ is the probability distribution of the wavelet
coefficients at scale j. The entropy at each resolution level j = 1, 2, ..., J, based on
the wavelet coefficients in a sliding window of size l ending at index i, is given by

$$(H_j)_i = -\sum_{k=i-l+1}^{i} p_{k,j} \ln(p_{k,j}) \tag{5}$$
3) Curve length. This feature computes the length of the trajectory of the wavelet
coefficients. If the curve length is long, the change of the system is severe; otherwise
it is slight. The curve length is computed as:

$$CL[j] = \sum_{i=1}^{l} \left| C_j(i-1) - C_j(i) \right| \tag{6}$$
4) Non-linear energy. This feature describes the local change of energy information,
which can be used to extract the spikes in the wavelet coefficients. The non-linear
energy is computed as:

$$NE[j] = \sum_{i=1}^{l} \left[ C_j(i)\, C_j(i) - C_j(i-1)\, C_j(i+1) \right] \tag{7}$$
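5) Statistic features. Some basic statistics can also be applied to extract features
from the wavelet coefficients. They are listed as follows:

$$
\begin{aligned}
\text{Mean:}\quad & Mean[j] = \frac{1}{l} \sum_{i=1}^{l} C_j(i) \\
\text{Maximum:}\quad & f\,max(j) = \max\big(C_j(1), \ldots, C_j(i), \ldots, C_j(l)\big) \\
\text{Minimum:}\quad & f\,min(j) = \min\big(C_j(1), \ldots, C_j(i), \ldots, C_j(l)\big) \\
\text{Median:}\quad & f\,med(j) = \mathrm{median}\big(C_j(1), \ldots, C_j(i), \ldots, C_j(l)\big) \\
\text{Standard deviation:}\quad & STD(j) = \sqrt{\frac{1}{l} \sum_{i=1}^{l} \big(C_j(i) - Mean[j]\big)^2}
\end{aligned}
\tag{8}
$$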
All nine indicators are adopted by FGP. Note that any feature above at a time
index, i, is calculated using a fixed sliding window that covers the preceding l
coefficient values (ie, window size l), because only previous coefficients are available
at time index i. We did not conduct any feature selection process in this study, as
genetic programming itself has the capability of selecting the more promising indicators
adaptively via its genetic operators, such as reproduction, crossover and mutation,
while evolving decision trees.
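The nine indicators can be sketched for a single window of coefficients. This is an illustration under stated assumptions, not the authors' implementation: the function name `wavelet_indicators` is hypothetical; the probabilities in the entropy are assumed to be each coefficient's share of the window energy (the paper does not spell out how they are estimated); and the curve length and non-linear energy sums are restricted to neighbours that lie inside the window.

```python
import math
import statistics

def wavelet_indicators(coeffs, i, l):
    """Nine indicators at index i from the l coefficients ending at i."""
    if i - l + 1 < 0 or i >= len(coeffs):
        raise ValueError("window does not fit inside the coefficient series")
    w = coeffs[i - l + 1 : i + 1]
    energy = sum(c * c for c in w)
    # Assumption: p_k is the share of window energy carried by coefficient k.
    probs = [c * c / energy for c in w if c != 0.0] if energy > 0 else []
    entropy = -sum(p * math.log(p) for p in probs)
    # Only differences and products of neighbours inside the window are used.
    curve_length = sum(abs(w[k] - w[k - 1]) for k in range(1, l))
    nonlinear_energy = sum(w[k] * w[k] - w[k - 1] * w[k + 1] for k in range(1, l - 1))
    return {
        "BE": energy,
        "H": entropy,
        "CL": curve_length,
        "NE": nonlinear_energy,
        "Mean": sum(w) / l,
        "f_max": max(w),
        "f_min": min(w),
        "f_med": statistics.median(w),
        "STD": statistics.pstdev(w),  # population form, dividing by l
    }


# Toy coefficients (not derived from DJIA data), window size l = 4.
toy = wavelet_indicators([1.0, 2.0, 4.0, 3.0], i=3, l=4)
```

Sliding i forward one step at a time over the smooth coefficients would yield one nine-element feature vector per position, which is the form of input FGP consumes.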
4. Experiments and results
As mentioned earlier, this study examines whether wavelet-based indicators
bring any benefit to FGP in forecasting. In particular, we are keen
to see whether FGP-2 improves in reducing RF or, equivalently, in increasing RP.
For a fair comparison, we follow our earlier study of Li and Tsang (2000). The
major parameter settings for FGP, such as population size, crossover rate, mutation
rate, selection strategy and termination criteria, are the same and are listed in
Table 2. The termination criteria are similar as well: FGP stops either when it
reaches the maximum number of generations (ie, 30) or when it exceeds the
30 min time limit that we set (as opposed to the maximum running time of 2 h in our
previous study), whichever occurs first. Except for the nine novel indicators
used here, which replace the six technical analysis indicators used previously, there
are no other differences. Indeed, the dataset used is the same, namely the
DJIA closing index data from 07/04/1969 to 09/04/1981 (a total of 3035 trading
days). We split the whole dataset into a training dataset from 07/04/1969
to 11/10/1976 (1900 trading days) and a test dataset from 12/10/1976 to
09/04/1981 (1135 trading days). Both the training period and the test period
contain roughly 50% of positive positions. GDTs, generated by FGP-2 on the training
dataset, are tested on the test dataset. We report performance results of the GDTs
on the test dataset.
A basis wavelet of Daubechies 4 is used in this study. To calculate the nine
indicators, we set J = 2 and window size 64 (ie, l = 64), which experimentally deliver
better results than other levels and/or other window sizes. Our
extensive experiments, which are not reported in this paper, suggest a rule of thumb
for selecting the level J and window size l. In general, J should preferably be a small
number greater than 1, eg, 2 or 3: J = 1 usually results in indicators of lower
quality because noisy information in the data may not be filtered out, whilst a higher
J (eg, 4 or 5) usually leads to ineffective indicators because useful information in the
data is probably eliminated. Window size l should preferably be about three times as
long as the number of future days that the prediction covers (here, 21 trading days).
Similarly, any bigger or smaller l would potentially worsen the derived indicators
because improper information may be taken into account.

Table 2 Tableau for the parameters of FGP-2 experiments

Input terminals (9 wavelets-based indicators)   BE3, H3, CL[3], NE[3], Mean[3], f max(3),
                                                f min(3), f med(3), STD(3), and Real values
Prediction terminals                            {0, 1}: 1 means Positive; 0 means Negative
Non-terminals                                   If-then-else, And, Or, Not, >, ≥, <, ≤, =
Crossover rate                                  0.9
Mutation rate                                   0.01
Population size                                 1200
Maximum no. of generations                      30
Termination criterion                           The maximum number of generations has been
                                                run or FGP-2 has run for more than 0.5 h
Selection strategy                              Tournament selection, size = 4
Max depth of individual programs                17
Max depth of initial individual programs        4
Run times                                       0–0.5 h
Hardware and operating system                   Intel Xeon PC 2.8 GHz running Windows 2000
                                                with 2G RAM

Table 3 The results of 10 GDTs generated by FGP-2 using R1 = [35, 50]

GDTs      Precision   RMC      RC       TP      FP      TN      FN
GDT 1     0.7358      0.5389   0.6326   273     98      445     319
GDT 2     0.7124      0.5355   0.6229   275     111     432     317
GDT 3     0.7128      0.5304   0.6247   278     112     431     314
GDT 4     0.7437      0.5000   0.6493   296     102     441     296
GDT 5     0.7557      0.5557   0.6352   263     85      458     329
GDT 6     0.6659      0.5118   0.6053   289     145     398     303
GDT 7     0.7564      0.5541   0.6361   264     85      458     328
GDT 8     0.7304      0.5743   0.6185   252     93      450     340
GDT 9     0.7169      0.5338   0.6256   276     109     434     316
GDT 10    0.7396      0.5490   0.6308   267     94      449     325
MEAN      0.7270      0.5383   0.6281   273.3   103.4   439.6   318.7
STD       0.0254      0.0206   0.0112   12.2    16.7    16.7    12.2
As in the earlier study, we take four non-overlapping R = [Pmin, Pmax] values (ie,
R1 = [35, 50]; R2 = [20, 35]; R3 = [15, 20]; R4 = [10, 15]) for the constrained fitness
function and run FGP-2 with each in turn. For each R, 10 independent runs were completed,
producing 10 GDTs in total. Table 3 lists the performances of
all 10 GDTs and their mean and standard deviation for R1 = [35, 50]. The mean RP, RC
and RMC are 0.7270, 0.6281 and 0.5383, compared with 0.5994, 0.5372
and 0.6574, respectively, obtained in the earlier study (see the R = [35, 50] results
for the technical analysis indicators in Table 5). This corresponds to increases of
12.76 and 9.09 percentage points in RP and RC, respectively, and a reduction of 11.91
percentage points in RMC.
A two-tailed unpaired t-test has been applied to determine whether
the differences between the two groups of results are statistically significant.
Table 4 shows the t-values and their corresponding p-values for each measure,
indicating that the generated GDTs exhibit statistically better performance
under all three criteria at a significance level of α < 1.0 × 10^-7. Notably,
the performance improvement does not sacrifice the overall number of buying
recommendations (ie, TP + FP): on average, 377 recommendations are made here versus
339 previously.
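The scale of the reported t-values can be checked from the summary statistics alone. The sketch below is an approximation under stated assumptions: the function name `unpaired_t` is hypothetical, the test is taken to be the equal-variance (pooled) two-sample form with 10 runs per group, and the inputs are the rounded means and standard deviations for R = [35, 50], so the result will not match the reported figures exactly.

```python
import math

def unpaired_t(mean_a, std_a, mean_b, std_b, n_a=10, n_b=10):
    """Pooled two-sample t statistic and degrees of freedom from summary statistics.

    Assumption: std_a and std_b are sample standard deviations and the two
    groups are treated as having equal variance.
    """
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * std_a ** 2 + (n_b - 1) * std_b ** 2) / dof
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, dof


# RP for R = [35, 50]: wavelet-based indicators versus technical indicators.
t_rp, dof = unpaired_t(0.7270, 0.0254, 0.5994, 0.0141)
```

For RP this gives a t-value of roughly 13.9 with 18 degrees of freedom, in line with (though not identical to) the reported 13.32; the gap is expected because the inputs are rounded summaries rather than the individual run results.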
Table 4 t-statistics for comparing mean performances of two groups (the results using the
wavelet-based indicators versus the results using the technical indicators for R = [35%, 50%])

            For RP      For RC      For RMC
t values    13.32       22.59       10.21
p values    2.42E-09    3.34E-11    2.06E-08
Table 5 The results of means and standard deviations for 4 Rs in terms of two distinct types of indicators used in FGP-2

R          Statistic   Precision   RMC      RC       TP      FP      TN      FN
Mean performance using wavelets-based indicators
[10, 15]   Mean        0.8539      0.9919   0.4813   4.8     1.5     541.5   587.2
[10, 15]   STD         0.1206      0.0160   0.0051   9.5     3.9     3.9     9.5
[15, 20]   Mean        0.7680      0.9331   0.5012   39.6    13.7    529.3   552.4
[15, 20]   STD         0.0892      0.0366   0.0131   21.7    7.7     7.7     21.7
[20, 35]   Mean        0.7646      0.7620   0.5665   140.9   40.9    502.1   451.1
[20, 35]   STD         0.0407      0.1231   0.0469   72.9    20.6    20.6    72.9
[35, 50]   Mean        0.7270      0.5383   0.6281   273.3   103.4   439.6   318.7
[35, 50]   STD         0.0254      0.0206   0.0112   12.2    16.7    16.7    12.2
Mean performance using technical analysis indicators
[10, 15]   Mean        0.7140      0.9405   0.4970   35.2    14.1    528.9   556.8
[10, 15]   STD         0.0622      0.0165   0.0076   9.8     5.2     5.2     9.8
[15, 20]   Mean        0.6898      0.8569   0.5174   84.7    40.4    502.6   507.3
[15, 20]   STD         0.0521      0.0641   0.0167   38.0    26.5    26.5    38.0
[20, 35]   Mean        0.6400      0.7525   0.5341   146.5   83.3    459.7   445.5
[20, 35]   STD         0.0259      0.0550   0.0119   32.6    23.4    23.4    32.6
[35, 50]   Mean        0.5994      0.6574   0.5372   202.8   136.1   406.9   389.2
[35, 50]   STD         0.0141      0.0299   0.0048   17.7    17.5    17.5    17.7
For brevity, for the other three Rs we only list their means and standard deviations
in Table 5. The results suggest that RP has improved for all four Rs, though RMC and
RC have not improved for all four Rs. The experimental result is encouraging because
the improvement in RP is what FGP-2 aims to achieve. Therefore, in terms of prediction
precision (ie, RP), the wavelet-based indicators are superior to the technical analysis
indicators in this study.
5. Discussions and conclusions
Motivated by the potential benefits that wavelet analysis could bring to financial
forecasting, in this paper we have applied the wavelet analysis technique to extract
indicators from wavelet coefficients at a certain level, reflecting the dynamics
and statistics of the time series concerned. In particular, we have examined whether
wavelet analysis could be used to improve the forecasting performance of our
genetic programming-based tool, FGP, through the indicators extracted. The
effectiveness of those novel indicators has been demonstrated by tackling a specific
prediction task with FGP-2. In comparison with the results of earlier work, our
experimental results on DJIA index data suggest that the novel indicators are
capable of producing better forecasting performance in terms of prediction
precision. We may argue that the wavelet-based indicators have more
merit for the prediction problem studied here than the technical
analysis indicators.
We are still far from understanding the role that wavelet analysis could play in
financial forecasting. Further research is worth conducting to answer
the following questions. Firstly, should we derive indicators at several different
levels, rather than at one level as in this study, and then make predictions using all
indicators from the different levels; or should we derive indicators, make a prediction
at each level separately, and finally make ultimate predictions by combining those
individual forecasts? If we do, could we achieve better forecasting results?
Secondly, how can we be better equipped with wavelet analysis techniques to handle
the prediction task here? For example, are our forecasting results sensitive to the
choice of wavelet type? Finally, what financial and economic problems are more
suitable for wavelet analysis to tackle? If all the above questions could be
answered, we believe that we would be in a better position to make wavelet analysis
more successful in finance and economics.
Given the successes of wavelets in other diverse fields and in finance and
economics so far, we would expect positive returns from financial applications of
wavelets in the future.
References
Alexander, S.S. 1964: Price movement in speculative markets: trend or random walks,
No. 2. In: P. Cootner, editor, The random character of stock market prices. MIT Press,
338–72.
Arino, M.A. 1996: Forecasting time series via discrete wavelet transform. Working paper,
Universidad de Navarra, Barcelona.
Aussem, A., Campbell, J. and Murtagh, F. 1998: Wavelet-based feature extraction and
decomposition strategies for financial forecasting. Journal of Computational Intelligence
in Finance, March/April, 5–12.
Brock, W., Lakonishok, J. and LeBaron, B. 1992: Simple technical trading rules and the
stochastic properties of stock returns. Journal of Finance 47, 1731–64.
Daubechies, I. 1992: Ten lectures on wavelets. Conference Series in Applied Math. SIAM.
Fama, E.F. and Blume, M.E. 1966: Filter rules and stock-market trading. Journal of
Business 39, 226–41.
Gencay, R., Selcuk, F. and Whitcher, B. 2001: Scaling properties of foreign exchange
volatility. Physica A 289, 249–66.
Gencay, R., Selcuk, F. and Whitcher, B. 2002: An introduction to wavelets and other
filtering methods in finance and economics. Academic Press.
Gencay, R., Selcuk, F. and Whitcher, B. 2003: Systematic risk and time scales.
Quantitative Finance 3, 108–16.
Kim, S. and In, F. 2003: The relationship between financial variables and real economic
activity: evidence from spectral and wavelet analyses. Studies in Nonlinear Dynamics and
Econometrics 7, 1183201.
Koza, J.R. 1992: Genetic programming: on the programming of computers by means of
natural selection. MIT Press.
Li, J. and Tsang, E.P.K. 1999a: Improving technical analysis predictions: an application
of genetic programming. Proceedings of the 12th International Florida AI Research Society
Conference, Orlando, FL, 108–12.
Li, J. and Tsang, E.P.K. 1999b: Investment decision making using FGP: a case study.
Proceedings of the 1999 Congress on Evolutionary Computation, IEEE Press, 1253–59.
Li, J. 2001: FGP: A genetic programming based tool for financial forecasting. PhD thesis,
University of Essex.
Li, J. and Tsang, E.P.K. 2000: Reducing failures in investment recommendations using
genetic programming. Proceedings of the Sixth International Conference on Computing in
Economics and Finance. Society for Computational Economics, Barcelona.
Li, X. and Yao, X. 2005: Multi-scale statistical process monitoring in machining. IEEE
Transactions on Industrial Electronics 52, 922–24.
Li, X., Tso, S.K. and Wang, J. 2000: Real time tool condition monitoring using wavelet
transforms and fuzzy techniques. IEEE Transactions on Systems, Man, and Cybernetics Part
C: Applications and Reviews 31, 352–57.
Murtagh, F., Starck, J.L. and Renaud, O. 2003: On neuro-wavelet modeling. The Journal of
Decision Support Systems 37, 475–84.
Percival, D.B. and Walden, A.T. 2000: Wavelet methods for time series analysis. Cambridge
University Press.
Meyer, Y. 1993: Wavelets, applications and algorithms. SIAM.
Pan, Z. and Wang, X. 1998: A stochastic nonlinear regression estimator using wavelets.
Computational Economics 11, 89–102.
Ramsey, J.B. and Zhang, Z. 1997: The analysis of foreign exchange data using waveform
dictionaries. Journal of Empirical Finance 4, 341–72.
Ramsey, J.B. and Lampart, C. 1998: The
decomposition of economic relationships by
time scale using wavelets: expenditure and
income. Studies in Nonlinear Dynamics and
Econometrics 3, 2342.
Ramsey, J.B. 2002: Wavelets in economics and
finance: past and future. Studies in Nonlinear
Dynamics and Econometrics 6, 127.
Sweeney, R.J. 1988: Some new filter rule tests:
methods and results. Journal of Financial and
Quantitative Analysis 23, 285300.
Tsang, E.P.K. and Li, J. 2002: EDDIE for
financial forecasting, in S.-H. Chen editor,
Genetic Algorithms and Programming in Computational Finance. Kluwer Series in Computational Finance, Chapter 7, 16174.
Tsang, E.P.K., Li, J. and Butler, J.M. 1998:
EDDIE beats the bookies. International Journal
of Software, Practice and Experience 28, 1033
43.
Tsang, E.P.K., Yung, P. and Li, J. 2004: EDDIEautomation, a decision support tool for
financial forecasting. Journal of Decision Support Systems 37, 55965.
