CS 476: Networks of Neural Computation, CSD, UOC, 2009

Sequence Learning
Contents

Sequences
Time Delayed Networks I: Implicit Representation
Time Delayed Networks II: Explicit Representation
Recurrent Networks I
Recurrent Networks II
Conclusions
Sequence Learning I

Dynamic networks learn a mapping from a single input signal to a sequence of response signals, for an arbitrary number of (signal, sequence) pairs.

Typically the input signal to a dynamic network is an element of the sequence, and the network then produces the rest of the sequence as its response.

To learn sequences we need to include some form of memory (short-term memory) in the network.
Sequence Learning II
Time Delayed II

In the case of a whole network, for example assuming a single output node and a linear output layer, the response is given by:

y(n) = \sum_{j=1}^{m_1} w_j\, y_j(n) + b_0 = \sum_{j=1}^{m_1} w_j\, \varphi\!\left( \sum_{l=0}^{p} w_j(l)\, x(n-l) + b_j \right) + b_0

where m_1 is the number of hidden neurons, \varphi(\cdot) is their activation function, p is the order of the tapped-delay-line memory, and b_j, b_0 are the biases.
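As an illustration, a minimal NumPy sketch of this forward pass; the function name, the tanh choice for \varphi, and the array shapes are assumptions made for the example:

import numpy as np

def tlfn_response(x, W_taps, b, w_out, b0, p):
    # Response y(n) of a focused time-lagged feedforward network
    # with a single linear output node.
    #   x      : input signal, shape (N,)
    #   W_taps : tap weights w_j(l), shape (m1, p+1)
    #   b      : hidden biases b_j, shape (m1,)
    #   w_out  : output weights w_j, shape (m1,)
    #   b0     : output bias (scalar)
    N = len(x)
    y = np.zeros(N)
    for n in range(p, N):
        # Tapped delay line: x(n), x(n-1), ..., x(n-p)
        window = x[n - p:n + 1][::-1]
        v = W_taps @ window + b        # induced local fields v_j(n)
        y_hidden = np.tanh(v)          # y_j(n) = phi(v_j(n)); tanh assumed
        y[n] = w_out @ y_hidden + b0   # linear output layer
    return y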
We can define an instantaneous value for the sum of squared errors produced by the network as follows:

E(n) = \frac{1}{2} \sum_j e_j^2(n)
Time Delayed I
The idea is a minimise an overall cost function,
Time Delayed II calculated over all time:
Recurrent I
Etotal E (n)
n
Recurrent II
Conclusions
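As a quick numeric illustration of these two definitions (the error values are made up):

import numpy as np

# Errors e_j(n) for a toy run: rows = time steps n, columns = output units j
e = np.array([[0.5, -0.2],
              [0.1,  0.3]])

E_n = 0.5 * np.sum(e**2, axis=1)   # instantaneous errors E(n)
E_total = E_n.sum()                # overall cost, summed over all time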
We could proceed as usual by calculating the gradient of the cost function with respect to the weights. This implies that we need to calculate the instantaneous gradient:

\frac{\partial E_{total}}{\partial w_{ji}} = \sum_n \frac{\partial E(n)}{\partial w_{ji}}
Note that in general the following holds:

\frac{\partial E_{total}}{\partial v_j(n)} \frac{\partial v_j(n)}{\partial w_{ji}} \neq \frac{\partial E(n)}{\partial w_{ji}}

because v_j(n) influences the cost not only at time n but also at later times, through the delays. The equality is correct only when we take the sum over all time:

\frac{\partial E_{total}}{\partial w_{ji}} = \sum_n \frac{\partial E_{total}}{\partial v_j(n)} \frac{\partial v_j(n)}{\partial w_{ji}}
The weights are then updated by gradient descent on the total cost:

\Delta w_{ji} = -\eta \frac{\partial E_{total}}{\partial w_{ji}}

where \eta is the learning rate.

We calculate the terms in the above relation as follows:

\frac{\partial v_j(n)}{\partial w_{ji}(n)} = x_i(n)
We need to calculate the local gradient \delta_j(n) for the cases of the output and hidden layers.

For a hidden layer we assume that neuron j is connected to a set A of neurons in the next layer (hidden or output). Then we have:

\delta_j(n) = -\frac{\partial E_{total}}{\partial v_j(n)} = -\sum_{r \in A} \sum_k \frac{\partial E_{total}}{\partial v_r(k)} \frac{\partial v_r(k)}{\partial v_j(n)}
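For a neuron j in the output layer the chain is much shorter. As a worked case, filled in here because the slide only names the two cases (this is the standard backpropagation result):

% Output layer: E_total depends on v_j(n) only through the error at the
% same time step, e_j(n) = d_j(n) - y_j(n) with y_j(n) = phi(v_j(n)):
\delta_j(n) = -\frac{\partial E_{total}}{\partial v_j(n)} = e_j(n)\, \varphi'\!\left(v_j(n)\right)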
Recurrent I

A first example of a recurrent architecture is the Simple Recurrent Network (SRN). The network is trained by using the Backpropagation algorithm.

A schematic is shown in the next figure:
[Figure: schematic of the Simple Recurrent Network]
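To make the architecture concrete, a minimal NumPy sketch of the SRN forward pass; all sizes and names here are assumptions for the illustration. The context units simply hold a copy of the previous hidden activations:

import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 2               # assumed sizes

W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
W_ch = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # context -> hidden
W_hy = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output

def srn_step(x, context):
    # One step of the Simple Recurrent (Elman) Network: the context
    # units feed the hidden layer alongside the current input.
    h = np.tanh(W_xh @ x + W_ch @ context)    # new hidden state
    y = W_hy @ h                              # output (linear readout)
    return y, h                               # h becomes the next context

context = np.zeros(n_hidden)                  # context starts empty
for x in rng.normal(size=(4, n_in)):          # a toy input sequence
    y, context = srn_step(x, context)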
Recurrent II

In Backpropagation Through Time the network is trained over an epoch, from time n_0 to time n_1. The cost is the squared error summed over the whole epoch and over the set A of neurons for which desired responses are specified:

E_{total}(n_0, n_1) = \frac{1}{2} \sum_{n=n_0}^{n_1} \sum_{j \in A} e_j^2(n)

After the local gradients \delta_j(n) have been computed backwards through the unfolded network, the weight update is:

\Delta w_{ji} = \eta \sum_{n=n_0+1}^{n_1} \delta_j(n)\, x_i(n-1)
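A compact sketch of these two formulas in code; the architecture (one tanh hidden layer with a linear readout) and all names are assumptions for the illustration:

import numpy as np

def bptt_epoch(x_seq, d_seq, W_in, W_rec, W_out, eta=0.01):
    # One epochwise BPTT update for a simple recurrent net:
    #   h(n) = tanh(W_in x(n) + W_rec h(n-1)),  y(n) = W_out h(n)
    # x_seq: (T, n_in) inputs, d_seq: (T, n_out) desired responses.
    T = len(x_seq)
    n_h = W_rec.shape[0]
    h = np.zeros((T + 1, n_h))             # h[0] is the initial state
    y = np.zeros((T, W_out.shape[0]))

    # Forward pass over the whole epoch n = 1..T
    for n in range(1, T + 1):
        h[n] = np.tanh(W_in @ x_seq[n - 1] + W_rec @ h[n - 1])
        y[n - 1] = W_out @ h[n]

    e = d_seq - y                          # errors e_j(n); E = 0.5 * sum(e**2)

    # Backward pass: local gradients delta(n) through the unfolded network
    delta = np.zeros((T + 2, n_h))         # delta[T+1] stays zero
    for n in range(T, 0, -1):
        back = W_out.T @ e[n - 1] + W_rec.T @ delta[n + 1]
        delta[n] = (1.0 - h[n] ** 2) * back   # tanh'(v) = 1 - h^2

    # Weight changes: eta * sum_n delta_j(n) x_i(n-1), as in the update above
    W_in = W_in + eta * sum(np.outer(delta[n], x_seq[n - 1]) for n in range(1, T + 1))
    W_rec = W_rec + eta * sum(np.outer(delta[n], h[n - 1]) for n in range(1, T + 1))
    W_out = W_out + eta * sum(np.outer(e[n - 1], h[n]) for n in range(1, T + 1))
    return W_in, W_rec, W_out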