Andrew Parkes
Neural Networks 1
Neural Networks
AIMA
Section 20.5 of 2003 edition
Fundamentals of Neural Networks: Architectures, Algorithms and Applications. Fausett, L., 1994
An Introduction to Neural Networks (2nd Ed). Morton, I.M., 1995
Brief History
Try to create artificial intelligence based on the natural intelligence we know: the brain, with its massively interconnected neurons
Modelling a Neuron
a_j : activation value of unit j
w_{j,i} : weight on the link from unit j to unit i
in_i : weighted sum of inputs to unit i, in_i = Sum_j w_{j,i} a_j
a_i : activation value of unit i, a_i = g(in_i)
g : activation function
Activation Functions
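As a sketch of this slide (my own illustration, not from the deck): the step (threshold) function is the activation used by the units in this lecture; the sigmoid is shown only as a common smooth alternative for comparison.

```python
import math

def step(x, threshold=0.0):
    """Binary step: fires (1) iff the input reaches the threshold."""
    return 1 if x >= threshold else 0

def sigmoid(x):
    """Smooth, differentiable activation: squashes input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(step(0.5))    # 1
print(step(-0.5))   # 0
print(sigmoid(0.0)) # 0.5
```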
Hebb (1949) developed the first learning rule, on the premise that if two neurons are active at the same time, the strength of the connection between them should be increased
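Hebb's premise can be sketched in code (the rule itself is not spelled out on these slides; the learning rate eta is an assumed parameter of my illustration):

```python
def hebb_update(w, x, y, eta=0.1):
    """Strengthen weight w in proportion to the joint activity of the
    pre-synaptic (x) and post-synaptic (y) neuron."""
    return w + eta * x * y

w = 0.0
w = hebb_update(w, x=1, y=1)  # both neurons active: weight grows
w = hebb_update(w, x=1, y=0)  # one neuron inactive: weight unchanged
print(w)  # 0.1
```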
During the 50s and 60s many researchers worked, amidst great excitement, on a particular net structure called the perceptron. Minsky & Papert (1969) demonstrated a strong limit on the power of perceptrons, which saw the death of neural network research for about 15 years. Only in the mid 80s (Parker and LeCun) was interest revived, because of their learning algorithm for a better design of net (in fact Werbos had discovered the algorithm in 1974).
[Figure: example neuron Y with inputs X1 and X2 (weight 2 each) and inhibitory input X3 (weight -1)]
The activation of a neuron is binary. That is, the neuron either fires (activation of one) or does not fire (activation of zero).
For the network shown here, the activation function for unit Y is
f(y_in) = 1 if y_in >= t, else 0
where y_in is the total input signal received and t is the threshold for Y
Originally, all excitatory connections into a particular neuron had the same weight, although differently weighted connections could feed into different neurons. Later, weights were allowed to be arbitrary.
Each neuron has a fixed threshold. If the net input into the neuron is greater than or equal to the threshold, the neuron fires
The threshold is set such that any non-zero inhibitory input will prevent the neuron from firing
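The firing rules above can be sketched as code (my own illustration; the excitatory weight 2, inhibitory weight -1 and threshold 2 follow the AND NOT example on a later slide):

```python
def fires(inputs, weights, threshold):
    """The unit fires (returns 1) iff the weighted input sum
    reaches its fixed threshold."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= threshold else 0

# Excitatory input alone reaches the threshold, so the unit fires:
print(fires((1, 0), (2, -1), 2))  # 1
# A non-zero inhibitory input drops the sum below the threshold:
print(fires((1, 1), (2, -1), 2))  # 0
```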
AND Function
X1  X2 | Y
 1   1 | 1
 1   0 | 0
 0   1 | 0
 0   0 | 0
Threshold(Y) = 2
OR Function
X1  X2 | Y
 1   1 | 1
 1   0 | 1
 0   1 | 1
 0   0 | 0
Threshold(Y) = 2
AND NOT Function (Y = X1 AND NOT X2)
X1  X2 | Y
 1   1 | 0
 1   0 | 1
 0   1 | 0
 0   0 | 0
Threshold(Y) = 2
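The three gate tables above can be checked in code. Only the thresholds appear in the text; the weights below (1 and 1 for AND, 2 and 2 for OR, 2 and -1 for AND NOT) are my assumed values for the lost slide figures, chosen so each unit reproduces its table:

```python
def unit(inputs, weights, threshold=2):
    """Binary threshold unit: fires iff the weighted sum reaches the threshold."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= threshold else 0

cases = [(1, 1), (1, 0), (0, 1), (0, 0)]
AND_out    = [unit(c, (1, 1))  for c in cases]
OR_out     = [unit(c, (2, 2))  for c in cases]
ANDNOT_out = [unit(c, (2, -1)) for c in cases]
print(AND_out)     # [1, 0, 0, 0]
print(OR_out)      # [1, 1, 1, 0]
print(ANDNOT_out)  # [0, 1, 0, 0]
```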
Simple Networks

AND:
Input 1  Input 2 | Output
   0        0    |   0
   0        1    |   0
   1        0    |   0
   1        1    |   1

OR:
Input 1  Input 2 | Output
   0        0    |   0
   0        1    |   1
   1        0    |   1
   1        1    |   1

NOT:
Input | Output
  0   |   1
  1   |   0
Simple Networks
[Figure: a single unit y with input x (weight W = 1) and bias input -1 (weight W = 1.5), threshold t = 0.0]
Perceptron
Synonym for single-layer, feed-forward network. First studied in the 50s. Other networks were known about, but the perceptron was the only one capable of learning, and thus all research was concentrated in this area.
Perceptron
A single weight only affects one output, so we can restrict our investigations to a model as shown on the right. Notation can then be simpler, e.g.
O = Step_0( Sum_j W_j I_j )
[Figure: the inputs (0,0), (0,1), (1,0), (1,1) plotted in the plane for AND and for XOR; for AND a single line separates the 1-outputs from the 0-outputs, but for XOR no such line exists]
Functions which can be separated in this way are called linearly separable. Only linearly separable functions can be represented by a perceptron; XOR cannot be represented by a perceptron.
Linear separability is also possible in more than 3 dimensions, but it is harder to visualise.
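The slides say the perceptron is "capable of learning" but do not give the rule; the following is a sketch of the standard perceptron learning rule (my assumptions: a learning rate eta, a fixed epoch count, and a learnable bias weight in place of an explicit threshold). On a linearly separable function like AND it converges:

```python
def train_perceptron(samples, epochs=20, eta=0.25):
    """Standard perceptron rule: nudge weights by eta * error * input."""
    w = [0.0, 0.0]  # input weights
    b = 0.0         # bias weight (acts as a learnable threshold)
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0
            err = target - out
            w[0] += eta * err * x1
            w[1] += eta * err * x2
            b    += eta * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
preds = [1 if w[0] * x1 + w[1] * x2 + b >= 0 else 0 for (x1, x2), _ in AND]
print(preds)  # [0, 0, 0, 1]
```

The same procedure can never succeed on XOR, since no single line separates its outputs.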
XOR
XOR is not linearly separable, so it cannot be represented by a perceptron. What can we do instead?
1. Convert to logic gates that can be represented by perceptrons
2. Chain together the gates
Make sure you understand the following; check it using truth tables:
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
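"Check it using truth tables" can be done mechanically; this sketch compares both sides of the identity over all four input pairs:

```python
matches = []
for x1 in (0, 1):
    for x2 in (0, 1):
        lhs = x1 ^ x2  # XOR
        rhs = int((x1 == 1 and x2 == 0) or (x2 == 1 and x1 == 0))
        print(x1, x2, lhs, rhs)
        matches.append(lhs == rhs)
print(all(matches))  # True: the identity holds on every row
```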
[Figure: two-layer XOR network with hidden units Z1 and Z2 feeding output Y; weights 2 and -1, thresholds 2]
XOR Function
X1  X2 | Y
 1   1 | 0
 1   0 | 1
 0   1 | 1
 0   0 | 0
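The two-layer XOR network can be sketched directly from the decomposition: hidden unit Z1 computes X1 AND NOT X2, hidden unit Z2 computes X2 AND NOT X1, and the output unit ORs them (weights 2 and -1 with threshold 2, as in the gate examples above):

```python
def unit(inputs, weights, threshold=2):
    """Binary threshold unit: fires iff the weighted sum reaches the threshold."""
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= threshold else 0

def xor(x1, x2):
    z1 = unit((x1, x2), (2, -1))   # Z1 = X1 AND NOT X2
    z2 = unit((x1, x2), (-1, 2))   # Z2 = X2 AND NOT X1
    return unit((z1, z2), (2, 2))  # Y  = Z1 OR Z2

outputs = [xor(x1, x2) for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]]
print(outputs)  # [0, 1, 1, 0] -- matches the XOR truth table
```

This is exactly why a hidden layer solves XOR: each hidden unit carves out one linearly separable piece, and the output unit combines them.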
Expectations
Be able to explain the terminology used, e.g.
- activation functions: step and threshold functions
- perceptron
- feed-forward
- multi-layer, hidden layers
- linear separability
- XOR: why perceptrons cannot cope with XOR, and how XOR is possible with hidden layers
Questions?