
G51IAI Introduction to AI

Andrew Parkes

Neural Networks 1

Neural Networks
AIMA
Section 20.5 of the 2003 edition

Fundamentals of Neural Networks: Architectures, Algorithms and Applications. L. Fausett, 1994
An Introduction to Neural Networks (2nd Ed). Morton, I. M., 1995

Brief History
Try to create artificial intelligence based on the natural intelligence we know: the brain, with its massively interconnected neurons


Natural Neural Networks


Signals move via electrochemical signals
Synapses release a chemical transmitter; the sum of these can cause a threshold to be reached, causing the neuron to fire
Synapses can be inhibitory or excitatory

Natural Neural Networks


We are born with about 100 billion neurons
A neuron may connect to as many as 100,000 other neurons

Natural Neural Networks


McCulloch & Pitts (1943) are generally recognised as the designers of the first neural network Many of their ideas still used today e.g.
many simple units, neurons combine to give increased computational power the idea of a threshold


Modelling a Neuron

in_i = Σ_j w_{j,i} a_j      a_i = g(in_i)

a_j : activation value of unit j
w_{j,i} : weight on the link from unit j to unit i
in_i : weighted sum of the inputs to unit i
a_i : activation value of unit i
g : activation function
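As a minimal sketch in Python (the function names are illustrative, not from the slides):

import math

def activation(inputs, weights, g):
    # in_i: weighted sum of the incoming activations a_j
    in_i = sum(w * a for w, a in zip(weights, inputs))
    # a_i = g(in_i)
    return g(in_i)

# example: two incoming units and a sigmoid activation function
a_i = activation([1.0, 0.0], [0.5, -0.3], lambda x: 1 / (1 + math.exp(-x)))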


Activation Functions

Step_t(x) = 1 if x ≥ t, else 0    (threshold = t)
Sign(x) = +1 if x ≥ 0, else -1
Sigmoid(x) = 1 / (1 + e^(-x))
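The three functions written out in Python for concreteness:

import math

def step(x, t):
    # Step_t(x) = 1 if x >= t, else 0
    return 1 if x >= t else 0

def sign(x):
    # Sign(x) = +1 if x >= 0, else -1
    return 1 if x >= 0 else -1

def sigmoid(x):
    # Sigmoid(x) = 1 / (1 + e^(-x))
    return 1 / (1 + math.exp(-x))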

Building a Neural Network


1. Select structure: design the way that the neurons are interconnected
2. Select weights: decide the strengths with which the neurons are interconnected
weights are selected so as to get a good match to a training set
training set: a set of inputs and desired outputs
often done using a learning algorithm

Neural Networks
Hebb (1949) developed the first learning rule
on the premise that if two neurons were active at the same time, the strength of the connection between them should be increased

Neural Networks
During the 50s and 60s many researchers worked, amidst great excitement, on a particular net structure called the perceptron.
Minsky & Papert (1969) demonstrated a strong limit on the power of perceptrons, which saw the death of neural network research for about 15 years

Only in the mid 80s (Parker and LeCun) was interest revived, thanks to their learning algorithm for a better design of net
(in fact Werbos had discovered the algorithm in 1974)

Basic Neural Networks


We will first look at the simplest networks: feed-forward
Signals travel in one direction through the net
The net computes a function of the inputs


The First Neural Networks


[Diagram: inputs X1 and X2 connect to unit Y with weight 2 each; input X3 connects with weight -1]

Neurons in a McCulloch-Pitts network are connected by directed, weighted paths


The First Neural Networks



If the weight on a path is positive the path is excitatory, otherwise it is inhibitory


The First Neural Networks



The activation of a neuron is binary. That is, the neuron either fires (activation of one) or does not fire (activation of zero).


The First Neural Networks



For the network shown here, the activation function for unit Y is

f(y_in) = 1 if y_in ≥ θ, else 0

where y_in is the total input signal received and θ is the threshold for Y
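A sketch of this rule as a reusable Python helper (the name mp_fire and the example threshold of 2 are my choices, not from the slides):

def mp_fire(inputs, weights, theta):
    # y_in = total weighted input; the unit fires (1) iff y_in >= theta
    y_in = sum(w * x for w, x in zip(weights, inputs))
    return 1 if y_in >= theta else 0

# example: the unit from the diagram, weights 2, 2, -1, assumed threshold 2
assert mp_fire([1, 0, 0], [2, 2, -1], theta=2) == 1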


The First Neural Networks



Originally, all excitatory connections into a particular neuron had the same weight, although differently weighted connections could be input to different neurons
Later, weights were allowed to be arbitrary


The First Neural Networks



Each neuron has a fixed threshold. If the net input into the neuron is greater than or equal to the threshold, the neuron fires


The First Neural Networks



The threshold is set such that any non-zero inhibitory input will prevent the neuron from firing

Building Logic Gates


Computers are built out of logic gates
Can we use neural nets to represent logical functions?
Use the threshold (step) function as the activation function
All activation values are 0 (false) or 1 (true)


The First Neural Networks


[Diagram: X1 and X2 each connect to Y with weight 1]

AND Function

X1  X2 | Y
1   1  | 1
1   0  | 0
0   1  | 0
0   0  | 0

Threshold(Y) = 2
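A quick Python check of this unit (the helper name is mine), using the firing rule above:

def AND(x1, x2):
    # weights 1 and 1, threshold 2: fires only when both inputs are 1
    return 1 if 1 * x1 + 1 * x2 >= 2 else 0

assert [AND(1, 1), AND(1, 0), AND(0, 1), AND(0, 0)] == [1, 0, 0, 0]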


The First Neural Networks


[Diagram: X1 and X2 each connect to Y with weight 2]

OR Function

X1  X2 | Y
1   1  | 1
1   0  | 1
0   1  | 1
0   0  | 0

Threshold(Y) = 2
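The same kind of check for the OR unit (again a sketch):

def OR(x1, x2):
    # weights 2 and 2, threshold 2: fires when at least one input is 1
    return 1 if 2 * x1 + 2 * x2 >= 2 else 0

assert [OR(1, 1), OR(1, 0), OR(0, 1), OR(0, 0)] == [1, 1, 1, 0]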


The First Neural Networks


[Diagram: X1 connects to Y with weight 2; X2 connects with weight -1]

AND NOT Function

X1  X2 | Y
1   1  | 0
1   0  | 1
0   1  | 0
0   0  | 0

Threshold(Y) = 2
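And for AND NOT, where the inhibitory weight blocks firing whenever X2 is on (sketch):

def AND_NOT(x1, x2):
    # weight 2 on X1, inhibitory weight -1 on X2, threshold 2
    return 1 if 2 * x1 - 1 * x2 >= 2 else 0

assert [AND_NOT(1, 1), AND_NOT(1, 0), AND_NOT(0, 1), AND_NOT(0, 0)] == [0, 1, 0, 0]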


Simple Networks

AND
Input 1  Input 2  Output
0        0        0
0        1        0
1        0        0
1        1        1

OR
Input 1  Input 2  Output
0        0        0
0        1        1
1        0        1
1        1        1

NOT
Input  Output
0      1
1      0


Simple Networks

[Diagram: inputs x and y each connect to the output unit with weight W = 1; an extra input fixed at -1 connects with weight W = 1.5; threshold t = 0.0]

Replacing the threshold by a weight on an input fixed at -1: the unit fires iff x + y - 1.5 ≥ 0, i.e. it computes AND
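A sketch of this construction in Python, assuming the two inputs x and y with weight 1 each and a bias weight of 1.5, as reconstructed above:

def and_with_bias(x, y):
    # the input fixed at -1 with weight 1.5 plays the role of the threshold
    return 1 if 1 * x + 1 * y + 1.5 * (-1) >= 0.0 else 0

assert [and_with_bias(1, 1), and_with_bias(1, 0), and_with_bias(0, 1), and_with_bias(0, 0)] == [1, 0, 0, 0]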


Perceptron
Synonym for single-layer, feed-forward network
First studied in the 50s
Other networks were known about, but the perceptron was the only one capable of learning, so all research was concentrated in this area


Perceptron
A single weight only affects one output, so we can restrict our investigations to a model as shown on the right
Notation can then be simpler, i.e.

O = Step_0( Σ_j W_j I_j )
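As a minimal Python sketch (the function name is mine):

def perceptron_output(weights, inputs):
    # O = Step_0(sum_j W_j * I_j): output 1 iff the weighted sum is >= 0
    return 1 if sum(w * i for w, i in zip(weights, inputs)) >= 0 else 0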


What can perceptrons represent?


AND
Input 1  Input 2  Output
0        0        0
0        1        0
1        0        0
1        1        1

XOR
Input 1  Input 2  Output
0        0        0
0        1        1
1        0        1
1        1        0


What can perceptrons represent?


[Plots: the four input points (0,0), (0,1), (1,0), (1,1), once for AND and once for XOR. For AND, a single straight line can separate the inputs with output 1 from those with output 0; for XOR no such line exists.]

Functions which can be separated in this way are called linearly separable
Only linearly separable functions can be represented by a perceptron
XOR cannot be represented by a perceptron


What can perceptrons represent?

Linear separability is also possible in more than 3 dimensions, but it is harder to visualise

XOR
XOR is not linearly separable
It cannot be represented by a perceptron
What can we do instead?
1. Convert to logic gates that can be represented by perceptrons
2. Chain together the gates
Make sure you understand the following; check it using truth tables (a sketch follows below):
X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
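A brute-force check of this identity over all four input pairs, as the slide suggests (a Python sketch):

for x1 in (0, 1):
    for x2 in (0, 1):
        # right-hand side: (X1 AND NOT X2) OR (X2 AND NOT X1)
        rhs = (x1 == 1 and x2 == 0) or (x1 == 0 and x2 == 1)
        assert (x1 ^ x2) == int(rhs)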


The First Neural Networks


[Diagram: X1 and X2 feed hidden units Z1 and Z2, which feed output Y.
Z1 gets X1 with weight 2 and X2 with weight -1; Z2 gets X2 with weight 2 and X1 with weight -1; Y gets Z1 and Z2 with weight 2 each. Each unit has threshold 2.]

XOR Function

X1  X2 | Y
1   1  | 0
1   0  | 1
0   1  | 1
0   0  | 0

X1 XOR X2 = (X1 AND NOT X2) OR (X2 AND NOT X1)
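Putting the pieces together, a sketch of this two-layer net with the weights and thresholds from the diagram:

def fire(total, theta=2):
    # McCulloch-Pitts rule: output 1 iff the input sum reaches the threshold
    return 1 if total >= theta else 0

def xor_net(x1, x2):
    z1 = fire(2 * x1 - 1 * x2)    # Z1 = X1 AND NOT X2
    z2 = fire(2 * x2 - 1 * x1)    # Z2 = X2 AND NOT X1
    return fire(2 * z1 + 2 * z2)  # Y = Z1 OR Z2

assert [xor_net(1, 1), xor_net(1, 0), xor_net(0, 1), xor_net(0, 0)] == [0, 1, 1, 0]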

Single- vs. Multiple-Layers


Once we chain together the gates we have hidden layers: layers that are hidden from the output lines
We have just seen that hidden layers allow us to represent XOR
The perceptron is single-layer
Multiple layers increase the representational power, so e.g. they can represent XOR
Generally useful nets have multiple layers, typically 2-4

Expectations
Be able to explain the terminology used, e.g.:
activation functions
step and threshold functions
perceptron
feed-forward
multi-layer, hidden layers
linear separability
XOR: why perceptrons cannot cope with XOR
how XOR is possible with hidden layers

Questions?
