
Neural Network Time Series Prediction

With Matlab
By
Thorolf Horn Tonjum
School of Computing and Technology,
University of Sunderland, The Informatics Centre,
St Peter's Campus, St Peter's Way,
Sunderland, SR6 0DD,
United Kingdom
Email: thorolf.tonjum@sunderland.ac.uk
Introduction
This paper describes a neural network time series prediction project, applied to forecasting the American S&P 500 stock index.
679 weeks of raw data are preprocessed and used to train a neural network.
The project is built with Matlab (The MathWorks, Inc.), which is used for both preprocessing and processing the data.
A prediction error of 0.00446 (mean squared error) is achieved.
One of the major goals of the project is to visualize how the network adapts to the real index price by approximation. This is achieved by training the network in series of 500 epochs each, showing the change in the approximation (green) after each round of training.
Remember to push the "Train 500 Epochs" button at least 4 times to get good results and a feel for the training. You might have to restart the whole program several times before it "lets loose" and achieves a good fit;
roughly one out of five reruns produces a good fit.
To run/rerun the program in matlab, type:
>> preproc
>> tx
Dataset
679 weeks of American S&P 500 index data.
14 basic forecasting variables.
The 14 basic variables are:
1. S&P weekly highest index.
2. S&P weekly lowest index.
3. NYSE weekly volume.
4. NYSE advancing volume / declining volume.
5. NYSE advancing / declining issues.
6. NYSE new highs / new lows.
7. NASDAQ weekly volume.
8. NASDAQ advancing volume / declining volume.
9. NASDAQ advancing / declining issues.
10. NASDAQ new highs / new lows.
11. 3-month treasury bill.
12. 30-year treasury bond yield.
13. Gold price.
14. S&P weekly closing price.
These are all strong economic indicators.
The indicators have not been subject to re-indexation or other alterations of the measurement procedures, so the dataset covers an unbroken span from January 1980 to December 1992.
Interest rates and inflation are not included, as they are reflected in the 30-year treasury bond yield and the price of gold. The dataset provides an ample model of the macro economy.
Preprocessing
The weekly change in the closing price is used as the output target for the network.
The 14 basic variables are transformed into 54 features by taking the first 13 variables and producing:
I. The change since last week (delta).
II. The second power (x^2).
III. The third power (x^3).
Adding last week's price change as an input variable for the following week then gives 54 feature variables in total (the 14 original variables included), as sketched below.
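The construction can be sketched in a few lines of Matlab. This is an illustration, not code from Preproc.m: the matrix name raw and the exact row alignment are assumptions, with raw holding one week per row and the closing price in column 14.

% Hypothetical sketch of the feature construction (names assumed).
% raw : N-by-14 matrix, one week per row, closing price in column 14.
N = size(raw, 1);
d = raw(2:N, 14) - raw(1:N-1, 14);             % weekly change in closing price
basics = raw(2:N-1, :);                        % the 14 basic variables
delta  = raw(2:N-1, 1:13) - raw(1:N-2, 1:13);  % I.   change since last week
sq     = raw(2:N-1, 1:13).^2;                  % II.  second power (x^2)
cub    = raw(2:N-1, 1:13).^3;                  % III. third power (x^3)
lagged = d(1:N-2);                             % last week's price change as input
p = [basics, delta, sq, cub, lagged]';         % 14 + 3*13 + 1 = 54 features per case
t = d(2:N-1)';                                 % target: next week's price change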
All input variables are then standardized, giving every input a mean of zero and a standard deviation of 1. [Matlab command: prestd]
The dimensionality of the data is then reduced to 28 variables by a principal component analysis with 0.001 as the threshold. The threshold is set low since we want to preserve as much information as possible for the Elman network to work on. [Matlab command: prepca]
We then scale the variables (including the target data) to fit the [-1,1] range, as we use tansig output functions. [Matlab command: premnmx]
See the matlab file "Preproc.m" for further details.
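Chained together, the three steps look roughly as follows; p and t are the feature and target matrices from the sketch above (columns as cases, per the toolbox convention), and the output variable names are illustrative.

% Sketch of the preprocessing chain (variable names assumed).
[pn, meanp, stdp, tn, meant, stdt] = prestd(p, t);          % zero mean, unit std. dev.
[ptrans, transMat] = prepca(pn, 0.001);                     % PCA, 0.001 threshold -> 28 variables
[pin, minp, maxp, tout, mint, maxt] = premnmx(ptrans, tn);  % rescale inputs and targets to [-1,1]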
Choice of Network architecture and algorithms.
We are doing time series prediction, but since we are forecasting a stock index we rely on current economic data just as much as on the lagged data from the time series being forecasted. This gives us a wider spectrum of neural model options:
Multilayer Perceptron networks (MLP), Tapped Delay-line networks (TDNN), and recurrent network models can all be used.
In our case, detecting cyclic patterns becomes a priority, together with good multivariate pattern approximation ability.
The Elman network is selected because of its ability to detect both temporal and spatial patterns. Choosing a recurrent network is favorable, as it accumulates historic data in its recurrent connections.
Using an Elman network for this problem domain demands a high number of hidden neurons; 35 is found to be the best trade-off in our example, whereas with a normal MLP network around 16 hidden neurons would be enough.
The Elman network needs more hidden nodes to respond to the complexity in the data, as well as having to approximate both temporal and spatial patterns.
We train the network with a gradient descent training algorithm enhanced with momentum and an adaptive learning rate. This lets the network climb past points where gradient descent without an adaptive learning rate would get stuck.
We use the matlab learning function learnpn, as we need robustness to deal with some quite large outliers in the data.
The maximum number of validation failures [net.trainParam.max_fail = 25] is set arbitrarily high; this gives the learning algorithm a greater ability to escape local minima and continue to improve where it would otherwise stop early.
The momentum is set high (0.9) to ensure a high impact of the previous weight change. This speeds up the gradient descent, helps keep us out of local minima, and resists memorization.
The learning rate is initially set relatively high, at 0.2. This is possible because of the high momentum, and because the rate is adjusted automatically by the adaptive learning rate rules of the matlab training method traingdx.
We choose purelin as the transfer function for the output of the hidden layer, as this provided more approximation power, and tansig for the output layer, since we scaled the target data to fit the [-1,1] range.
The weight initialization scheme initzero is used to start the weights off from zero. This provides the best end results, but heightens the trial-and-error factor, resulting in having to restart the program between five and eight times to get a "lucky" fit.
Once you have a "lucky" fit, training the network for three to five rounds of 500 epochs usually yields results in the 0.004 mse range.
Maximum performance increase is set to 1.24 [net.trainParam.max_perf_inc = 1.24], giving the algorithm some leeway to test out alternative routes before being called back onto the path.
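The choices above can be collected in a setup sketch like the following. It is reconstructed from the text rather than copied from Tx.m, and the exact initzero wiring of the individual weight matrices is an assumption.

% Sketch of the network setup described in the text (reconstructed, not Tx.m).
net = newelm(minmax(pin), [35 1], {'purelin', 'tansig'}, ...
             'traingdx', 'learnpn', 'mse');
net.trainParam.epochs       = 500;   % one "Train 500 Epochs" round
net.trainParam.lr           = 0.2;   % initial learning rate, adapted by traingdx
net.trainParam.mc           = 0.9;   % momentum
net.trainParam.max_fail     = 25;    % maximum validation failures, set high
net.trainParam.max_perf_inc = 1.24;  % leeway before the learning rate is cut back
% Start all weights and biases from zero with initzero:
net.layers{1}.initFcn = 'initwb';
net.layers{2}.initFcn = 'initwb';
net.inputWeights{1,1}.initFcn = 'initzero';
net.layerWeights{1,1}.initFcn = 'initzero';  % recurrent hidden connections
net.layerWeights{2,1}.initFcn = 'initzero';  % hidden-to-output weights
net.biases{1}.initFcn = 'initzero';
net.biases{2}.initFcn = 'initzero';
net = init(net);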
With well over 400 training cases to work with, 35 hidden neurons, and 28 input variables, we get 980 hidden layer weights, which is well below the rule-of-thumb limit of 4000 (10 x cases).
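A quick check of that arithmetic, reusing net from the sketch above:

% 35 hidden neurons x 28 inputs = 980 input-to-hidden weights.
numel(net.IW{1,1})   % returns 980
% (The 35x35 recurrent and 1x35 output weight matrices come on top of this.)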
Results in the 0.004 mse range support the conclusion that the model choice was not the worst possible. Additional improvement could have come from adding lagged data, like exponentially smoothed averages over different time frames and with different smoothing factors, efficiently accumulating memory over large time scales.
Integrating a tapped delay-line setup could also have been beneficial.
But these alternatives would have added to the curse of dimensionality, probably without yielding great benefits in return, especially as long as the recurrent memory of the Elman network seemed to perform with ample sufficiency.
The training set's 400 weeks were taken from the start of the data, followed by the 140 weeks of the test set and finally 139 weeks of validation data.
In effect, the model approximates data at the 0.004 mse level more than 5 years (279 weeks) into the future.
Training & Visualization.
The data is, as described above, divided in the classic 60/20/20 manner into training set, test set, and validation set, as sketched below.
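A sketch of the chronological split and one training round; pin and tout are from the preprocessing sketch above, and the indices ignore the one-week shift introduced by the lagged features.

% Chronological 400/140/139 split (illustrative indices).
Ptr  = pin(:, 1:400);    Ttr  = tout(:, 1:400);    % training set
TV.P = pin(:, 401:540);  TV.T = tout(:, 401:540);  % test set
VV.P = pin(:, 541:679);  VV.T = tout(:, 541:679);  % validation set (early stopping)
[net, tr] = train(net, Ptr, Ttr, [], [], VV, TV);  % one "Train 500 Epochs" round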
The approximation is visualized by plotting the actual price course (blue) against the approximation (green). This is done for the training set, the test set, and the validation set, and clearly demonstrates how the neural net approximates the data.
Errors are displayed as red bars at the bottom of the charts.
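This plotting can be sketched as follows; the paper overlays the red error bars at the bottom of the same chart, while the sketch uses a subplot for simplicity.

% Actual (blue) vs. approximation (green), with errors in red (sketch).
a = sim(net, Ptr);                    % network approximation on the training set
figure
subplot(2,1,1)
plot(Ttr, 'b'); hold on; plot(a, 'g'); hold off
title('actual (blue) vs. approximation (green)')
subplot(2,1,2)
bar(Ttr - a, 'r')                     % errors as red bars
title('errors')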
The training is done by running 500 epochs, displaying the results, then training another 500 epochs, and so forth. Seeing the approximation "live" gives interesting insights into how the algorithm adapts, and how changes in the model affect adaptation.

Push the button to train another 500 epochs.
The effect of the adaptive learning rate is quite intriguing, specifically its effect on the performance: the learning rate changes dynamically, controlled by the adaptation rules, and the performance changes vividly with it.
The correlation plot gives ample insight into how closely the model maps the data; to see it, push the "correlation plot" button.
The "Sum(abs(errors))" display shows the sum of the absolute values of all errors, as a steadfast and unfiltered measurement.
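Both diagnostics can be reproduced in a few lines; postreg from the legacy toolbox draws a regression plot of outputs against targets with the correlation coefficient R, which is assumed here to be what the "correlation plot" button shows.

% Correlation plot and unfiltered error sum (sketch).
a = sim(net, pin);               % approximation over all cases
[m, b, r] = postreg(a, tout);    % regression slope, intercept, and correlation R
sumAbsErr = sum(abs(tout - a))   % the Sum(abs(errors)) measure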
Bibliography
Valluru Rao (1993), "C++ Neural Networks and Fuzzy Logic".
"Neural Network Toolbox User's Guide", 4th edition, The MathWorks, Inc.
Appendix A. The Matlab code:
Preproc.m : Preprocessing the data.
Tx.m : Setting up the network.
Gui.m : Training and displaying the network.
