
Comparison of LMS and Neural Network Algorithms for Reducing Inter Symbol Interference

ABSTRACT
Protecting digital data from impairments such as channel noise and inter symbol interference during transmission has become an important aspect of communication system design. Adaptive equalization is an efficient technique for minimizing these distortions: bandwidth-efficient data transmission over radio channels is made possible by using adaptive equalization to compensate for the time dispersion introduced by the channel. Adaptive equalizers are capable of minimizing the distortions in digital data caused by noise mixed in from the environment; besides this, they can also perform echo cancellation and noise cancellation. In this project, we design a neural network based on the general least mean square algorithm. A stream of information is applied as input to the network. A linear or nonlinear function produces an output, which is compared with the desired signal to generate an error, and the back propagation algorithm updates all the weights. This process continues until the error becomes optimal. We compare the results of a multilayered feed forward network trained with the back propagation algorithm, in terms of bit error rate and mean square error, with the original LMS algorithm used to achieve equalization.

CONTENTS

CHAPTER 1: INTRODUCTION PAGE NO.
1.1 INTRODUCTION 1
1.2 INTER SYMBOL INTERFERENCE 1
1.3 OUTLINE OF THE THESIS 5

CHAPTER 2: EQUALIZATION
2.1 DEFINITION OF EQUALIZATION 6
2.2 TYPES OF EQUALIZERS 7

CHAPTER 3: ADAPTIVE CHANNEL EQUALIZATION
3.1 CHANNEL EQUALIZATION 10
3.2 ADAPTIVE CHANNEL EQUALIZATION 10

CHAPTER 4: LMS ALGORITHM
4.1 LMS ADAPTIVE FILTER 13
4.2 LMS ALGORITHM 15
4.3 MATLAB CODE FOR LMS ALGORITHM 16
4.3.1 MEAN SQUARE ERROR 20
4.3.2 BIT ERROR RATE 21

CHAPTER 5: ARTIFICIAL NEURAL NETWORK
5.1 INTRODUCTION TO ARTIFICIAL NEURAL NETWORKS 22
5.2 NEURAL NETWORKS VERSUS CONVENTIONAL COMPUTERS 23
5.3 ADVANTAGES OF NEURAL NETWORKS 24
5.4 DISADVANTAGES OF NEURAL NETWORKS 25
5.5 APPLICATIONS OF NEURAL NETWORKS 25
5.5.1 NEURAL NETWORKS IN MEDICINE 25
5.5.2 MODELLING AND DIAGNOSING THE CARDIOVASCULAR SYSTEM 25
5.5.3 ELECTRONIC NOSES 26
5.5.4 INSTANT PHYSICIAN 26
5.5.5 NEURAL NETWORKS IN BUSINESS 27
5.5.6 NEURAL NETWORKS IN SIGNAL PROCESSING (NNSP) 27
5.5.7 MODEL-BASED NEURAL NETWORKS FOR IMAGE PROCESSING 28
5.6 ARCHITECTURE OF NEURAL NETWORKS 29
5.6.1 NETWORK LAYERS 29
5.6.2 FEED FORWARD NETWORKS 30
5.6.3 FEEDBACK NETWORKS 31
5.7 LEARNING IN NETWORKS 32
5.7.1 SUPERVISED LEARNING 33
5.7.2 UNSUPERVISED LEARNING 33
5.7.3 REINFORCEMENT LEARNING 33
5.8 MULTILAYER PERCEPTRONS 34

CHAPTER 6: BACK PROPAGATION ALGORITHM
6.1 BACK PROPAGATION 37
6.2 THE DELTA RULE 38
6.3 THE GENERALISED DELTA RULE 39
6.4 TRAINING MULTI-LAYER NEURAL NETWORK USING BACK PROPAGATION ALGORITHM 42
6.5 LEARNING RATE AND MOMENTUM 51
6.6 MATLAB CODE FOR BACK PROPAGATION ALGORITHM 53
6.6.1 MEAN SQUARE ERROR 60
6.6.2 BIT ERROR RATE 60

CHAPTER 7: COMPARISON
7.1 FACTORS AFFECTING PERFORMANCE OF THE SYSTEM 61
7.1.1 CONVERGENCE 61
7.1.2 SETTLING TIME 61
7.1.3 MEAN SQUARE ERROR 61
7.1.4 BIT ERROR RATE 62
7.2 COMPARISON BETWEEN LMS AND BACK PROPAGATION 63
7.3 COMPARISON OF MEAN SQUARE ERROR RESULTS 63
7.4 COMPARISON OF BIT ERROR RATE RESULTS 64

CONCLUSION 65
FUTURE SCOPE 65
BIBLIOGRAPHY 66

LIST OF ABBREVIATIONS

PAM Pulse Amplitude Modulation
QAM Quadrature Amplitude Modulation
ASK Amplitude Shift Keying
FSK Frequency Shift Keying
PSK Phase Shift Keying
SISO Single Input Single Output
SIMO Single Input Multiple Output
MISO Multiple Input Single Output
MIMO Multiple Input Multiple Output
STTC Space-Time Trellis Code
STBC Space-Time Block Code
OSTBC Orthogonal Space-Time Block Code
QOSTBC Quasi-Orthogonal Space-Time Block Code
NSTBC Non-Orthogonal Space-Time Block Code

LIST OF FIGURES

FIGURES PAGE NO.
1.1 Transmitted Signal 2
1.2 Received Signal 2
1.3 Sequence 101101 to be sent; the dashed line is the shape that is actually sent 3
1.4 Inter Symbol Interference 3
1.5 Received Signal vs. Transmitted Signal 4
2.1 Output Signal at the Receiver 6
2.2 Decision Feedback Equalizer 8
2.3 Adaptive Equalizers 8
3.1 Adaptive Channel Equalization 11
4.1 Typical Adaptive Filter 15
5.1 Simple Neural Network 29
5.2 Multi-Layered Feed Forward Network 31
5.3 Feedback Neural Network 32
5.4 Multilayer Perceptron 35
6.1 The Descent in Weight Space 51

LIST OF SYMBOLS

x(n) Channel input
h(n) Channel transfer function
n(n) Noise signal
C(z) Equalizer output
Ck Equalizer coefficients
e(k) Adaptive equalizer error signal
s(n) Adaptive equalizer channel input
z^-1 Delay function
w(n) Weight function
u(n) Filter input vector
μ Step size of the adaptive filter

Comparison of LMS and neural network algorithms for reducing ISI
Department of ECE, GMRIT. Page 1

CHAPTER-1 INTRODUCTION
1.1 Introduction:
The growth in communication services during the past five decades has been phenomenal. Satellite and fibre optic networks provide high-speed communication services around the world. Currently, most wired-line communication systems are being replaced by fibre optic cables, which provide extremely high bandwidth and make possible the transmission of a wide variety of information sources, including voice, data, and video. With the rapid development of Internet technologies, efficient high-speed data transmission over communication channels has become a necessity. As the data transmission rate increases to meet the needs of users, the channel introduces distortions into the data, and one major cause of distortion is Inter Symbol Interference (ISI). In digital communication, the transmitted signals are generally in the form of multilevel rectangular pulses, whose absolute bandwidth is infinite. If these pulses pass through a band-limited communication channel, they spread in time, and the pulse for each symbol may smear into adjacent time slots and interfere with the adjacent symbols. This is referred to as inter symbol interference (ISI).
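The smearing mechanism described above is easy to reproduce numerically. The following sketch (written in Python/NumPy for illustration, although the implementations in this thesis use MATLAB) passes a short BPSK symbol stream through a hypothetical dispersive channel; the three channel taps are invented for the demo, not taken from any measured channel:

```python
import numpy as np

# Hypothetical three-tap dispersive channel: each symbol leaks into the
# next two symbol slots with gains 0.5 and 0.2 (illustrative values).
h = np.array([1.0, 0.5, 0.2])

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, 8)
symbols = 2.0 * bits - 1.0          # BPSK mapping: 0 -> -1, 1 -> +1

# The band-limited channel acts as a convolution, smearing each pulse.
received = np.convolve(symbols, h)

# Sample k of the received signal mixes the current symbol with the
# tails of the two previous ones -- that residual is the ISI.
k = 3
isi = 0.5 * symbols[k - 1] + 0.2 * symbols[k - 2]
print(received[k], "=", symbols[k], "+", isi)
```

The interference term grows with the length of the channel's impulse response, which is why time dispersion directly limits the usable symbol rate.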

1.2 Inter Symbol Interference (ISI):


Inter-symbol interference (ISI) is an unavoidable consequence of both wired and wireless communication systems. Because of ISI, the information that is transmitted is corrupted and the original signal cannot be retrieved exactly, so the receiver's efficiency is decreased. ISI is mostly seen in digital communication, where the information is transmitted in the form of binary digits. It was first noticed on the transatlantic telegraph cables transmitting Morse messages as dots and dashes, and it has not gone away since. During those early attempts at transmission, it was observed that the received signals tended to get elongated and smeared into each other; the problem was handled simply by slowing down the transmission.

Fig 1.1: Transmitted signal. Fig 1.2: Received signal.

Fig 1.3 shows a data sequence, 101101, which we wish to transmit through a channel. This sequence is in the form of square pulses. Square pulses are nice as an abstraction, but in practice they are hard to create and also require far too much bandwidth, so we shape them as shown in the dotted line. The shaped version looks essentially like a square pulse, and we can quickly tell, even visually, what was sent. The advantage of (an arbitrary) shaping at this point is that it reduces the bandwidth requirement and the pulse can actually be created by the hardware.
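The bandwidth saving from shaping can be seen directly in the spectrum. This hedged sketch (Python/NumPy for illustration; the smooth Hann shape is a simple stand-in for the raised-cosine pulses used in practice) compares how much energy a square pulse and a shaped pulse leave at high frequencies:

```python
import numpy as np

sps = 16                       # samples per symbol (assumed for the demo)
square = np.ones(sps)          # ideal rectangular pulse
shaped = np.hanning(sps)       # a simple smooth shape (stand-in for raised cosine)

def high_freq_fraction(p, n=1024):
    """Fraction of the pulse energy above 1/8 of the sample rate."""
    spec = np.abs(np.fft.rfft(p, n)) ** 2
    cut = n // 8
    return spec[cut:].sum() / spec.sum()

# The rectangular pulse's sinc-shaped spectrum has large sidelobes, while the
# smooth pulse concentrates its energy at low frequencies.
assert high_freq_fraction(shaped) < high_freq_fraction(square)
```

Less out-of-band energy is exactly what allows the shaped pulse to fit through a band-limited channel with less distortion.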



Fig 1.3: Sequence 101101 to be sent; the dashed line is the shape that is actually sent.

Fig 1.4 shows each symbol as it is received. We can see that the transmission medium creates a tail of energy that lasts much longer than intended: the energy from symbols 1 and 2 goes all the way into symbol 3, and each symbol interferes with one or more of the subsequent symbols. The circled areas show areas of large interference.

Fig 1.4: Inter symbol interference (interference into symbol 3 from symbols 1 and 2, and into symbol 4 from symbols 2 and 3).

Fig 1.5 shows the actual signal seen by the receiver. It is the sum of all these distorted symbols. Compared to the dashed line that was the transmitted signal, the received signal looks quite indistinct. The receiver does not actually see this whole waveform; it sees only the little dots, the values of the amplitude at the sampling instants. Notice that for symbol 3 this value is approximately half of the transmitted value, which makes this particular symbol more susceptible to noise and incorrect interpretation; this is the result of the symbol delay and smearing.

Fig 1.5: Received signal vs. transmitted signal.

This spreading and smearing of symbols, such that the energy from one symbol affects the following ones and the received signal has a higher probability of being interpreted incorrectly, is what is meant by Inter Symbol Interference, or ISI. Other factors like thermal noise, impulse noise, and cross talk cause further distortion of the received symbols. Signal processing techniques used at the receiver to overcome these interferences, so as to restore the transmitted symbols and recover their information, are referred to as channel equalization, or simply equalization. In principle, if the characteristics of the channel are precisely known, then it is always possible to design a pair of transmitting and receiving filters that minimize the effect of ISI and additive noise. In general, however, the characteristics of the channel are random, in the sense that the actual channel is one of an ensemble of possible channels; a fixed pair of transmitting and receiving filters designed on the basis of average channel characteristics may therefore not adequately reduce inter symbol interference. To overcome this problem adaptive equalization is widely used, which provides precise control over the time response of the channel. Adaptive equalizers have therefore been playing a crucial role in the design of high-speed communication systems.

The data transmitted through a band-limited communication channel suffers from linear, nonlinear, and additive distortions. To reduce the effects of these distortions, an equalizer is used at the receiver end; its function is to reconstruct the transmitted symbols by observing the received noisy signal. In the present work a generalized neuron model has been used to develop an adaptive equalizer for digital communication channels. The generalized neuron model overcomes the problems of the common neuron, such as the large number of neurons and layers required for complex function approximation. This reduces the training time of the network and hence improves performance in terms of speed. Nonlinear equalizers using artificial neural networks, such as a multilayer perceptron trained with the error back propagation method, give sufficiently good performance in terms of bit error rate and mean square error when compared with the original LMS algorithm.

1.3 Outline of the Thesis

After this introductory chapter, Chapter 2 defines equalization and describes the types of equalizers. Chapter 3 introduces adaptive channel equalization. Chapter 4 explains the least mean square algorithm in detail, along with its MATLAB code. Chapter 5 introduces artificial neural networks. Chapter 6 explains the back propagation algorithm in detail, along with its MATLAB code. Chapter 7 compares the LMS and back propagation algorithms in terms of simulation results. Finally, the conclusions and future enhancements are presented.


CHAPTER-2

EQUALIZATION

2.1 Definition of Equalization:


Equalization is a process that can reduce ISI, CCI, and noise effects introduced by the environment. In other words, it is the process of using passive or active electronic elements or digital algorithms to alter (originally, to flatten) the frequency response characteristics of a system. ISI has been recognized as a major obstacle to high-speed data transmission over wireless channels, and equalization is the technique used to combat it; the term can describe any signal processing operation that minimizes ISI. An equalizer within a receiver compensates for the average range of expected channel amplitude and delay characteristics; it thus attempts to mitigate ISI and hence improve the receiver's performance.

At the receiver, the channel output can be written as the sum of the desired signal, the inter symbol interference, and the noise:

    y(n) = h(0)x(n) + Σ(k≠0) h(k)x(n−k) + n(n)

where the first term is the desired symbol, the summation is the ISI contributed by the neighbouring symbols, and n(n) is the noise.

Fig 2.1: Output Signal at the Receiver

An equalization system compensates for transmission channel impairments such as frequency-dependent phase and amplitude distortion. Besides correcting for channel frequency response anomalies, the equalizer can cancel the effect of multipath signal components, which can manifest themselves in the form of voice echoes, video ghosts, or Rayleigh fading conditions in mobile communication channels. Equalizers specifically designed for multipath correction are often termed echo cancellers.

2.2 TYPES OF EQUALIZERS:

a) Linear Equalizer: The current and past values of the received signal are linearly weighted by the equalizer coefficients and summed to produce the output:

    y(n) = Σk ck x(n−k)

b) Zero Forcing Equalizer: This equalizer removes the ISI completely, without taking into consideration the resulting noise enhancement; its transfer function is the inverse of the channel,

    C(z) = 1 / H(z)

so there can be a substantial increase in the noise power.

c) Mean-Square Error Equalizer: This type of equalizer attempts to minimize the total error between the slicer input and the transmitted data symbol.

d) Decision Feedback Equalizer: A simple nonlinear equalizer which is particularly useful for channels with severe amplitude distortion. It uses decision feedback to cancel the interference from symbols which have already been detected. The basic idea is that if the values of the symbols already detected are known (past decisions are assumed correct), then the ISI contributed by these symbols can be cancelled exactly. In a decision feedback equalizer architecture, the forward and feedback coefficients may be adjusted simultaneously to minimize the mean square error. The main blocks in a decision feedback equalizer are:
Feed Forward Filter (FFF)
Feed Back Filter (FBF)
Decision Device
Adder or Subtractor

Fig 2.2: Decision Feedback Equalizer

e) Adaptive Equalizers: These equalizers adapt their coefficients so as to minimize the noise and inter symbol interference (depending on the type of equalizer) at the output, driving the mean square error E[e(k)^2] between the equalizer output and the desired symbol towards its minimum.

Fig 2.3: Adaptive Equalizers

There are two modes in which adaptive equalizers work:

a) Decision Directed Mode: The receiver decisions are used to generate the error signal. Decision directed equalizer adjustment is effective in tracking slow variations in the
channel response. However, this approach is not effective during initial acquisition.

b) Training Mode: To make the equalizer suitable for the initial acquisition period, a training signal is needed. In this mode of operation, the transmitter generates a data symbol sequence known to the receiver. Once an agreed training time has elapsed, the slicer output is used as the reference signal and the actual data transmission begins.
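To make one of the equalizer types above concrete, here is a hedged Python/NumPy sketch of the zero-forcing relation C(z) = 1/H(z) from section 2.2 for a toy two-tap channel H(z) = 1 + 0.5 z^-1 (the taps are assumptions chosen for the demo, not from the thesis):

```python
import numpy as np

rng = np.random.default_rng(1)
symbols = 2.0 * rng.integers(0, 2, 50) - 1.0        # BPSK symbols
received = np.convolve(symbols, [1.0, 0.5])[:50]    # channel output, no noise

# C(z) = 1 / (1 + 0.5 z^-1) is the IIR recursion y[n] = x[n] - 0.5 y[n-1].
equalized = np.zeros(50)
prev = 0.0
for n in range(50):
    prev = received[n] - 0.5 * prev
    equalized[n] = prev

# With no noise the ISI is removed exactly. With noise present, the same
# 1/H(z) gain would also amplify the noise -- the drawback noted in the text.
assert np.allclose(equalized, symbols)
```

The final assertion holds because the recursion is exactly the channel inverse; the noise-enhancement drawback is why MMSE and decision feedback equalizers are preferred on noisy channels.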


CHAPTER-3 ADAPTIVE CHANNEL EQUALIZATION


3.1 Channel equalization:
The two principal causes of distortion in a digital communication channel are Inter Symbol Interference (ISI) and additive noise. The ISI can be characterized by a Finite Impulse Response (FIR) filter; the noise can be internal or external to the system. At the receiver, the distortion must therefore be compensated in order to reconstruct the transmitted symbols. This process of suppressing channel-induced distortion is called channel equalization. The task is to combat the distortive channel effects, in other words to invert the FIR filter (time-invariant or time-varying) representing the channel. For example, in a system where the transmitter sends information over an RF channel, the signal is inevitably distorted before it is received; it is then the receiver's task to figure out what signal was transmitted and to judge whether the received signal represents the original information. The purpose of an equalizer is to reduce the ISI as much as possible so as to maximize the probability of correct decisions.

3.2 Adaptive Channel Equalization:


Communication systems transmit a signal from one point to another across a communication channel, such as an electrical wire, a fibre-optic cable, or a wireless radio link. During the transmission process, the signal that contains the information might become distorted. Modern digital communication systems demand high-speed, efficient transmission over bandwidth-limited channels, which distort the signal and cause inter symbol interference (ISI). In addition, the digital signal is subject to other impairments such as noise, nonlinear distortion, time-variant channels, etc. At the receiver, an equalizer is used to mitigate these effects and restore the transmitted symbols. Bandwidth-efficient data transmission over telephone and radio channels is made possible by the use of adaptive equalization. The purpose of adaptive channel equalization is to compensate for signal distortion in a communication

channel. To compensate for this distortion, you can apply an adaptive filter to the communication channel. The adaptive filter works as an adaptive channel equalizer which automatically updates the weights so that the output changes according to the environment. The following figure shows a diagram of an adaptive channel equalization system.

Fig 3.1: Adaptive Channel Equalization

In figure 3.1, s(n) is the signal that is transmitted through the communication channel, and x(n) is the distorted output signal. To compensate for the signal distortion, the adaptive channel equalization system operates in the following two modes:

a) Training mode: This mode helps you determine the appropriate coefficients of the adaptive filter. When you transmit the signal s(n) to the communication channel, you also apply a delayed version of the same signal to the adaptive filter. In the figure, z denotes the delay function and d(n) is the delayed signal; y(n) is the output signal from the adaptive filter and e(n) is the error signal between d(n) and y(n). The adaptive filter iteratively adjusts its coefficients to minimize e(n). After the power of e(n) converges, y(n) is almost identical to d(n), which means that you can use the resulting adaptive filter coefficients to compensate for the signal distortion.

b) Decision-directed mode: After you determine the appropriate coefficients of the adaptive filter, you can switch the adaptive channel equalization system to decision-directed mode. In this mode, the system decodes the signal y(n) and produces a new signal ŝ(n−Δ), which is an estimate of the signal s(n) except for a delay of Δ taps.
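The difference between the two modes comes down to which reference the error signal uses. This minimal Python sketch (names and numbers are illustrative, not from the thesis) contrasts the known delayed reference of training mode with the slicer decision of decision-directed mode:

```python
import numpy as np

def training_error(y, s, n, D):
    """Training mode: the error uses the known delayed reference d(n) = s(n - D)."""
    return s[n - D] - y[n]

def decision_directed_error(y, n):
    """Decision-directed mode: the slicer output sign(y(n)) acts as the reference."""
    return np.sign(y[n]) - y[n]

y = np.array([0.9, -1.1, 0.8])     # equalizer outputs near the BPSK levels +/-1
s = np.array([1, -1, 1, -1])       # transmitted training symbols

print(training_error(y, s, 2, 2))        # compares y[2] against the known s[0]
print(decision_directed_error(y, 2))     # compares y[2] against its own decision
```

Once the equalizer has converged, the decisions are almost always correct, so the two error signals coincide and the known training sequence is no longer needed.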


CHAPTER-4 LMS ALGORITHM


4.1 Least Mean Square (LMS) Adaptive Filter:
In adaptive signal processing, one of the most popular algorithms for coping with the time-varying characteristics of a wireless channel is the least mean square (LMS) algorithm, owing to its simplicity of implementation. The performance of the equalizer improves consistently as the step size is increased (within its stable range), with the result that the BER approaches zero. The linear filter can be of different types, such as finite impulse response (FIR) or infinite impulse response (IIR); an adaptive algorithm adjusts the coefficients of the linear filter iteratively to minimize the power of the error signal e(n). The LMS algorithm is one such adaptive algorithm, which adjusts the coefficients of an FIR filter iteratively. FIR filters are inherently stable, but adaptive FIR filters are not always stable: because the adaptive algorithm adjusts the filter coefficients iteratively, the coefficients can grow without bound, and when the filter coefficients become infinite the adaptive filter is unstable. The algorithm uses the error signal to minimize a cost function and updates the equalizer weights in a manner that iteratively reduces that cost function. The computationally efficient LMS adaptive algorithm is therefore often used in the implementation of the equalizer. The general operating modes of an adaptive equalizer include training and tracking. A fixed, known pseudorandom binary training sequence is sent by the transmitter so that the equalizer may adapt to the proper settings for minimum-BER detection. After that the user data is sent, and the equalizer uses the LMS algorithm to evaluate the channel and estimate the filter coefficients that compensate for the distortion, even in the worst possible channel conditions. When the equalizer has been properly trained, it is said to have converged.
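The "stable range" of the step size deserves a concrete sketch. A commonly used sufficient guideline (an assumption stated here for illustration, not a result derived in this thesis) is to keep 0 < μ < 2/(L·Px), where L is the filter length and Px the average input power:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(1000)      # stand-in for the equalizer input signal
L = 10                             # filter length (assumed)

power = np.mean(x ** 2)            # estimate of the input power Px
mu_max = 2.0 / (L * power)         # rough upper bound on the step size

# A conservative practical choice is a small fraction of the bound:
mu = 0.1 * mu_max
assert 0 < mu < mu_max
```

A step size near the bound converges fastest but risks divergence and large excess error; a small fraction of the bound trades speed for a smoother, safer adaptation.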


In the LMS algorithm, the adaptive filter adjusts the filter coefficients so as to minimize a cost function based on the instantaneous value of the squared error:

    J(n) = e^2(n)

where e(n) is the error signal, measured using the following equation:

    e(n) = d(n) − w^T(n) x(n)

The weights are then updated according to

    w(n+1) = w(n) + μ e(n) x(n)

where μ is the learning rate parameter, w(n) is the weight vector, d(n) is the desired signal, and x(n) is the input vector.

An adaptive filter is a computational device that iteratively models the relationship between the input and output signals of a filter. It self-adjusts its coefficients according to an adaptive algorithm: adaptive filters are digital filters capable of self-adjusting or updating their filter coefficients in accordance with their input signals. The adaptive filter requires two inputs: the input signal x(n) and the reference input d(n). The new coefficients are sent to the filter from a coefficient generator, which is an adaptive algorithm that modifies the coefficients in response to the incoming signal. Adaptive filters have uses in a number of applications, including noise cancellation, linear prediction, adaptive signal enhancement, and adaptive control.

The following figure shows the diagram of a typical adaptive filter.

Fig 4.1: Typical Adaptive Filter

Here x(n) is the input signal to the linear filter, y(n) is the corresponding output signal, d(n) is an additional input signal to the adaptive filter, and e(n) is the error signal that denotes the difference between d(n) and y(n). During the training phase the filter is trained using the LMS algorithm: a known set of data is given both to the channel to be estimated and to the FIR filter, the error between the two outputs is calculated, and the weights of the filter are adjusted using this error. Since the LMS algorithm is used, training is very simple and also fast. In normal channel conditions, training the FIR filter for just 20-30 iterations is sufficient, compared to a minimum of more than 200 training iterations required by either gradient-based or stochastic algorithms.

4.2 LMS Algorithm:


The LMS algorithm performs the following operations to update the coefficients of an adaptive FIR filter:

1. Calculate the output signal y(n) of the FIR filter:

    y(n) = w^T(n) u(n)

where u(n) is the filter input vector,

    u(n) = [u(n), u(n−1), ..., u(n−K+1)]^T

and w(n) is the filter coefficients vector,

    w(n) = [w0(n), w1(n), ..., wK−1(n)]^T

2. Calculate the error signal e(n):

    e(n) = d(n) − y(n)

3. Update the filter coefficients:

    w(n+1) = w(n) + μ e(n) u(n)

where μ is the step size of the adaptive filter, w(n) is the filter coefficients vector, and u(n) is the filter input vector.

The LMS algorithm is very simple, tractable, and computationally efficient. Besides this, it is robust and model-independent, which means that small model uncertainty and small disturbances can only result in small estimation errors. The LMS algorithm typically requires a number of iterations equal to about 10 times the dimensionality of the input space to reach a steady-state condition. Operating in an environment with temporally correlated interference, the LMS algorithm gives a low BER. The algorithm executes quickly but converges slowly, and its complexity grows linearly with the number of weights. The biggest limitations of the LMS algorithm are its slow rate of convergence and its sensitivity to variations of the input. The slow convergence becomes particularly serious when the dimensionality of the input space is high, which leads to a higher bit error rate and affects the performance of the system.
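The three steps above can be sketched compactly. This Python/NumPy version (the thesis implementation is in MATLAB; the channel taps, delay, and step size here are assumptions chosen for the demo) trains an LMS equalizer against a delayed copy of the transmitted sequence:

```python
import numpy as np

# Equation sketch: y(n) = w^T(n) u(n), e(n) = d(n) - y(n),
#                  w(n+1) = w(n) + mu * e(n) * u(n)
rng = np.random.default_rng(3)
N, L, D, mu = 2000, 10, 5, 0.01
s = 2.0 * rng.integers(0, 2, N) - 1.0         # BPSK training sequence
x = np.convolve(s, [1.0, 0.4, 0.2])[:N]       # distorted channel output (toy taps)

w = np.zeros(L)
mse = []
for n in range(L, N):
    u = x[n - L + 1:n + 1][::-1]              # filter input vector u(n)
    y = w @ u                                 # step 1: filter output
    e = s[n - D] - y                          # step 2: error vs. delayed reference
    w = w + mu * e * u                        # step 3: LMS weight update
    mse.append(e ** 2)

# After convergence the squared error settles at a small steady-state value.
assert np.mean(mse[-200:]) < 0.1
```

Plotting `mse` would reproduce the kind of learning curve shown in section 4.3.1: a rapid initial descent followed by a small residual floor set by the step size and the equalizer length.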

4.3 MATLAB CODE FOR LMS ALGORITHM:

clc;
clear all;
close all;

% Number of symbols
N = 1000;

% Channel length
M = 5;

% The source block transmits Binary Phase Shift Keying (BPSK) symbols with
% equal probability. The total number of transmitted symbols is N. The
% channel block introduces inter symbol interference (ISI).
% (randi replaces the older randint function.)
u = randi([0 1],1,N);

% Channel to be equalized
c = randi([0 1],M,1);
c = c / norm(c);

% Channel output
z = filter(c,1,u);

% Additive noise added to the channel output
% (note: randn(1,N) would give Gaussian noise instead of a 0/1 sequence)
SNR = 60;
var_v = var(z) * 10^(-SNR/10);
v = var_v^0.5 * randi([0 1],1,N);

% Input to the equalizer
x = z + v;

% Adding delay by simply inserting zeroes at the beginning of the data sequence
xn = [zeros(1,9)];
for i = 1:N
    xn(9+i) = x(i);
end

% Initialisation of weights
w = [-.12 .30 -.11 .35 .53 -.18 .01 .28 -.36 -.201];
% w = [zeros(1,10)];

% Initialisation of parameters used for calculation of the MSE
epsilon = 10^-3;
E = 0;
Eavg = [];
yy = [];
kk = 1000;
err2 = [];

% Calculating the output and the error for every iteration
for j = 1:kk
    for i = 10:N+9
        xx(1) = xn(i);
        xx(2) = xn(i-1);
        xx(3) = xn(i-2);
        xx(4) = xn(i-3);
        xx(5) = xn(i-4);
        xx(6) = xn(i-5);
        xx(7) = xn(i-6);
        xx(8) = xn(i-7);
        xx(9) = xn(i-8);
        xx(10) = xn(i-9);
        y = xx * w';
        yy = [yy y];
        err = u(i-9) - y;
        % Weight is updated for every iteration
        for k = 1:10
            % w(k) = w(k) + .001 * xx(k) * err / (xx(k) + epsilon);
            w(k) = w(k) + .0001 * (err * xx(k));
        end
        err2 = [err2 err];
        % Mean square error is accumulated using the cost function
        E = E + (0.5 * ((err)^2));
        ylast(j,i-9) = y;
        elast(j,i-9) = err;
    end
    Eav = E / N;
    Eavg = [Eavg Eav];
    E = 0;
end

% Plot the graph for mean square error
plot(Eavg);

% Calculation of the bit error rate
for i = 1:kk
    for j = 1:N
        % Outputs greater than the threshold value are taken as 1,
        % otherwise as 0
        if ylast(i,j) > 0.48
            yn(i,j) = 1;
        else
            yn(i,j) = 0;
        end
    end
end

% The variable count stores the bit error count for every iteration
% (c is re-used as an error counter; the channel vector is no longer needed)
count = [];
c = 0;
k = 1;
for i = 1:kk
    for j = 1:N
        % The transmitted input signal is compared with the output signal
        % to count the bits received in error
        if (u(k) == yn(i,j))
            c = c;
        else
            c = c + 1;
        end
        k = k + 1;
    end
    count = [count c];
    c = 0;
    k = 1;
end

% Plot the graph for bit error rate over the required number of iterations
figure, plot(count);

4.3.1 Mean Square Error:



4.3.2 Bit Error Rate:


CHAPTER-5 ARTIFICIAL NEURAL NETWORKS


5.1 Introduction to Artificial Neural Networks:
Adaptive or optimising learning algorithms may be used in the neural network learning or machine learning domains. Instead of being fixed, these algorithms improve their own learning performance over time; this kind of learning is known as metalearning, or the learning-to-learn approach. An artificial neural network (ANN) represents a new generation of information processing system. An ANN is a massively parallel distributed processing system made up of highly interconnected neural computing elements, called neurons, that have the ability to learn and thereby acquire knowledge and make it available for use. Connections can be made from the units of one layer to the units of another layer, among the units within a layer, or in a combination of both. A neural network is a powerful data-modelling tool that is able to capture and represent complex input/output relationships. The motivation for the development of neural network technology stemmed from the desire to build an artificial system that could perform "intelligent" tasks similar to those performed by the human brain. Neural networks resemble the human brain in the following two ways: a neural network acquires knowledge through learning.

A neural network's knowledge is stored within the inter-neuron connection strengths known as synaptic weights.

An Artificial Neural Network is an adaptive, most often nonlinear, system that learns to perform a function (an input/output map) from data. Adaptive means that the system parameters are changed during operation, normally called the training phase; after the training phase the network parameters are fixed and the system is deployed to solve the problem at hand (the testing phase). The

Artificial Neural Network is built with a systematic step-by-step procedure to optimize a performance criterion or to follow some implicit internal constraint, which is commonly referred to as the learning rule. The input/output training data are fundamental in neural network technology, because they convey the necessary information to "discover" the optimal operating point. The nonlinear nature of the neural network processing elements (PEs) provides the system with great flexibility to achieve practically any desired input/output map; indeed, some Artificial Neural Networks are universal mappers. An input is presented to the neural network and a corresponding desired or target response is set at the output (in this case the training is called supervised). An error is computed from the difference between the desired response and the system output. This error information is fed back to the system, which adjusts its parameters in a systematic fashion (the learning rule). The process is repeated until the performance is acceptable.
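The supervised loop just described can be shown in miniature. This hedged Python sketch (the task, learning rate, and epoch count are illustrative choices, not from the thesis) trains a single sigmoid neuron with the error-feedback rule: present the input, compare the output with the desired response, and feed the error back to adjust the weights:

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0.0, 0.0, 0.0, 1.0])      # desired responses (logical AND)

rng = np.random.default_rng(4)
w = rng.standard_normal(2) * 0.1        # initial synaptic weights
b = 0.0
lr = 0.5                                # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10000):
    y = sigmoid(X @ w + b)              # forward pass: system output
    err = d - y                         # error against the desired response
    grad = err * y * (1 - y)            # error fed back through the sigmoid slope
    w += lr * X.T @ grad                # weight update (the learning rule)
    b += lr * np.sum(grad)

# After training, thresholding the output reproduces the target function.
assert np.all((sigmoid(X @ w + b) > 0.5) == (d > 0.5))
```

The multilayer networks of section 5.8 repeat exactly this cycle, with back propagation (Chapter 6) carrying the error back through the hidden layers.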

5.2 Neural Networks versus Conventional Computers:


Neural networks take a different approach to problem solving than conventional computers. Conventional computers use an algorithmic approach, i.e. the computer follows a set of instructions in order to solve a problem. Unless the specific steps the computer needs to follow are known, the computer cannot solve the problem. That restricts the problem-solving capability of conventional computers to problems that we already understand and know how to solve. But computers would be much more useful if they could do things that we don't exactly know how to do. Neural networks, on the other hand, process information in a way similar to the human brain. The network is composed of a large number of highly interconnected processing elements (neurons) working in parallel to solve a specific problem. Neural networks learn by example; they cannot be programmed to perform a specific task.

The disadvantage of neural networks is that, because the network finds out how to solve the problem by itself, its operation can be unpredictable. Conventional computers, by contrast, use a cognitive approach to problem solving: the way the problem is to be solved must be known and stated in small, unambiguous instructions. These instructions are then converted to a high-level language program and then into machine code that the computer can understand. These machines are totally predictable; if anything goes wrong, it is due to a software or hardware fault. Neural networks and conventional algorithmic computers are not in competition but complement each other. There are tasks that are more suited to an algorithmic approach, like arithmetic operations, and tasks that are more suited to neural networks. Moreover, a large number of tasks require systems that use a combination of the two approaches (normally a conventional computer is used to supervise the neural network) in order to perform at maximum efficiency.

5.3 Advantages of Neural Networks


Adaptive learning: An ability to learn how to do tasks based on the data given for training or initial experience.
Self-organization: An ANN can create its own organization or representation of the information it receives during learning time.
Real-time operation: ANN computations may be carried out in parallel, and special hardware devices are being designed and manufactured which take advantage of this capability.
Fault tolerance via redundant information coding: Partial destruction of a network leads to a corresponding degradation of performance. However, some network capabilities may be retained even with major network damage; owing to its parallel nature, a neural network can continue operating when one of its elements fails.
In addition, a neural network learns and does not need to be reprogrammed, and it can perform tasks that a linear program cannot.

5.4 Disadvantages of Neural Networks:


The architecture of a neural network is different from the architecture of a microprocessor and therefore needs to be emulated. In addition, large neural networks require high processing time.

5.5 Applications of Neural Networks:


Artificial neural networks are an advanced signal processing technology and play a key role in adaptive techniques. The following are some of the applications of neural networks in various fields:

5.5.1 Neural Networks in Medicine:


Artificial Neural Networks (ANNs) are currently a 'hot' research area in medicine, and it is believed that they will receive extensive application to biomedical systems in the next few years. At the moment, the research is mostly on modelling parts of the human body and recognizing diseases from various scans (e.g. cardiograms, CAT scans, ultrasonic scans, etc.). Neural networks are ideal for recognizing diseases using scans, since there is no need to provide a specific algorithm on how to identify the disease. Neural networks learn by example, so the details of how to recognize the disease are not needed. What is needed is a set of examples that are representative of all the variations of the disease. The quantity of examples is not as important as the quality. The examples need to be selected very carefully if the system is to perform reliably and efficiently.

5.5.2 Modelling and Diagnosing the Cardiovascular System:


Neural networks are used experimentally to model the human cardiovascular system. Diagnosis can be achieved by building a model of the cardiovascular system of an individual and comparing it with real-time physiological measurements taken from the patient. If this routine is carried out regularly, potentially harmful medical conditions can be detected at an early stage, thus making the process of combating the disease much easier.

A model of an individual's cardiovascular system must mimic the relationship among physiological variables (i.e., heart rate, systolic and diastolic blood pressures, and breathing rate) at different physical activity levels. If a model is adapted to an individual, then it becomes a model of the physical condition of that individual. The simulator will have to be able to adapt to the features of any individual without the supervision of an expert. This calls for a neural network. Another reason that justifies the use of ANN technology is the ability of ANNs to provide sensor fusion, which is the combining of values from several different sensors. Sensor fusion enables ANNs to learn complex relationships among the individual sensor values, which would otherwise be lost if the values were individually analyzed. In medical modelling and diagnosis, this implies that even though each sensor in a set may be sensitive only to a specific physiological variable, ANNs are capable of detecting complex medical conditions by fusing the data from the individual biomedical sensors.

5.5.3 Electronic Noses:


ANNs are used experimentally to implement electronic noses. Electronic noses have several potential applications in telemedicine. Telemedicine is the practice of medicine over long distances via a communication link. The electronic nose would identify odours in the remote surgical environment. These identified odours would then be electronically transmitted to another site, where an odour-generation system would recreate them. Because the sense of smell can be an important sense to the surgeon, telesmell would enhance telepresent surgery.

5.5.4 Instant Physician:


An application developed in the mid-1980s called the "instant physician" trained an autoassociative memory neural network to store a large number of medical records, each of which includes information on symptoms, diagnosis, and treatment for a particular case. After training, the net can be presented with input consisting of a set of symptoms; it will then find the full stored pattern that represents the "best" diagnosis and treatment.

5.5.5 Neural Networks in Business:


Business is a diversified field with several general areas of specialization, such as accounting or financial analysis. Almost any neural network application would fit into one business area or financial analysis. There is some potential for using neural networks for business purposes, including resource allocation and scheduling. There is also a strong potential for using neural networks for database mining, that is, searching for patterns implicit in the explicitly stored information in databases. Most of the funded work in this area is classified as proprietary; thus, it is not possible to report on the full extent of the work going on. Most work applies neural networks, such as the Hopfield-Tank network, for optimization and scheduling.

5.5.6 Neural Networks in Signal Processing (NNSP):


Engineering in general, and signal processing in particular, still rely on linear models, Gaussian assumptions, and stationarity, although the world is non-linear, non-Gaussian, and non-stationary. Neural networks are nonlinear adaptive systems that have the potential to push the technology barrier beyond conventional approaches. Growing numbers of neural-network solutions to real-world signal processing problems have been reported. Some recent examples in engineering include:

a) Automobile control:


Engine idle and misfiring controls are being developed at Ford using a recurrent neural network trained with multi-streaming, which is an adaptation of Decoupled Extended Kalman Filter training. Ford researchers believe that recurrent networks are the most promising technology to help meet the stringent new emission levels of the Clean Air Act.

b) Self-Organizing Feature Maps:


The self-organizing map (SOM) converts complex, non-linear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. It thereby compresses information while preserving the most important topological and metric relationships of the primary data elements. SOMs have been applied to hundreds of practical systems.

c) Aircraft Control:
c) Aircraft Control:
Neural controllers for aircraft are being developed by several groups, both extending optimal control ideas to nonlinear systems and pursuing new ideas based on dynamic programming (adaptive critics). The LoFLYTE hypersonic waverider, built by Accurate Automation Inc., is a joint project of NASA and the US Air Force to design experimental aircraft controls.

d) Neural Vision System:


By integrating a model-based network as a low-level vision subsystem and a hierarchically structured neural array for higher-level analysis and recognition tasks, the neural vision system has been used for a number of complex real-world image processing applications, such as computer-assisted diagnosis of breast cancers in digital mammograms; biomedical vision to analyze neurological disorders; and identifying and classifying underwater mines through 3-D sonar image processing and visualization.

5.5.7 Model-Based Neural Networks for Image Processing:


In recent years, model-based neural networks (MNNs) have been playing more and more important roles in architecture design for image processing applications. The biological and computational facts lead us to believe that designing appropriate network architectures for a particular task is at least as important as, if not more important than, twiddling with the connection weights through training in a fully connected network. The biological argument is that natural networks have a hierarchically clustered architecture that is locally dense but globally sparse. The computational argument is that fully connected artificial neural networks may not generalize well and are difficult to implement, especially when the problems at hand become large and complicated. MNNs have found broad applications in image processing/analysis/coding and pattern recognition. MNNs have been used in image segmentation, edge detection and enhancement, texture analysis, image regularization and recursive low-level vision modelling, neural-based gradual perception of structure from motion, category and object perception, colour constancy and colour induction, and model-based adaptive transform coding. MNNs have also demonstrated their capability in pattern recognition, a field closely related to image processing/analysis.

5.6 Architecture of Neural Networks:

5.6.1 Network Layers:


The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of "output" units. The activity of the input units represents the raw information that is fed into the network. The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units. The behaviour of the output units depends on the activity of the hidden units and the weights between the hidden and output units.

Fig 5.1 : Simple Neural Network (input layer, hidden layer, output layer)

In general, two types of ANN architectures, feed-forward and feedback, are widely used. In feed-forward networks the outputs of the neurons of one layer are connected to the inputs of the neurons of the next layer, but there are no feedback connections from the outputs of the neurons of a layer to the inputs of the neurons of a previous layer. Generally, feed-forward networks are static, i.e. they produce only one set of output values rather than a sequence of values from a given input. A feedback neural network distinguishes itself from feed-forward networks in that it has at least one feedback loop. Feedback networks are dynamic systems. In a feedback neural network the outputs of the neurons are recomputed whenever new input patterns are presented to the network.

5.6.2 Feed-Forward Networks:


A feed-forward network has a layered structure. Each layer consists of units which receive their input from units in the layer directly below and send their output to units in the layer directly above. There are no connections within a layer. The Ni inputs are fed into the first layer of Nh,1 hidden units. The input units are merely fan-out units; no processing takes place in these units. The output of the hidden units is distributed over the next layer of Nh,2 hidden units, and so on until the last layer of hidden units, whose outputs are fed into a layer of No output units. The network consists of a set of sensory units (source nodes) that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal propagates through the network in a forward direction, on a layer-by-layer basis. These neural networks are commonly referred to as multilayer perceptrons (MLPs). Feed-forward ANNs allow signals to travel one way only, from input to output. There is no feedback (loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition.
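As a sketch of this one-way, layer-by-layer signal flow, a minimal forward pass might look like the following; the layer sizes, random weights, and sigmoid activation are assumptions for illustration, not this project's configuration:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward(x, weights, biases):
    """Propagate the input signal forward, layer by layer; there are no
    connections within a layer and no feedback loops."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)    # each layer feeds only the layer above it
    return a

# Ni = 4 source nodes, two hidden layers (3 and 2 units), No = 1 output node.
rng = np.random.default_rng(1)
sizes = [4, 3, 2, 1]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]
y = forward(np.array([1.0, 0.5, -0.5, 0.2]), weights, biases)
print(y.shape)   # (1,)
```

Note that the input units do no processing: the input vector is simply fanned out to the first hidden layer.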

Fig 5.2 : Multi-layered Feed Forward Network

5.6.3 Feedback Networks:


Feedback networks can have signals travelling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organizations.


Fig 5.3 : Feedback Neural Network

5.7 Learning in Neural Networks:


Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place. Learning in a neural network is called training. Like training in athletics, training in a neural network requires a coach, someone who describes to the neural network what it should have produced as a response. From the difference between the desired response and the actual response, the error is determined and a portion of it is propagated backward through the network. Learning is the determination of the weights. At each neuron in the network the error is used to adjust the weights and threshold values of the neuron, so that the next time the error in the network response will be less for the same inputs.

Following the way learning is performed, we can distinguish two major categories of neural networks: FIXED NETWORKS, in which the weights cannot be changed (dW/dt = 0); in such networks the weights are fixed a priori, according to the problem to solve. ADAPTIVE NETWORKS, which are able to change their weights (dW/dt ≠ 0). All learning methods used for adaptive neural networks can be classified into two major categories:

5.7.1 Supervised learning:


It incorporates an external teacher, so that each output unit is told what its desired response to input signals ought to be. Paradigms of supervised learning include error-correction learning (back propagation algorithm), reinforcement learning and stochastic learning. An important issue concerning supervised learning is the problem of error convergence, i.e. the minimization of error between the desired and computed unit values. The aim is to determine a set of weights, which minimizes the error. One well-known method, which is common to many learning paradigms, is the least mean square (LMS) convergence.
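The error-convergence idea can be illustrated by tracking the mean square error while an LMS-style rule adjusts the weights; the single linear unit and the synthetic data below are assumptions for the sketch:

```python
import numpy as np

# Minimal sketch: a single linear unit trained per pattern while the mean
# square error between desired and computed outputs is recorded.
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))
d = X @ np.array([1.0, -2.0, 0.5])      # targets from an assumed linear map
w = np.zeros(3)
mse_history = []
for _ in range(30):
    mse_history.append(np.mean((d - X @ w) ** 2))
    for x_i, d_i in zip(X, d):
        w += 0.05 * (d_i - w @ x_i) * x_i   # error-correction update
print(mse_history[0] > mse_history[-1])     # True: the error converges
```

The aim stated above, a set of weights that minimizes the error, shows up here as a monotonically shrinking mean square error.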

5.7.2 Unsupervised learning:


It uses no external teacher and is based upon only local information. It is also referred to as self-organization, in the sense that the network organizes itself with the data presented to it and detects their emergent collective properties. Paradigms of unsupervised learning are Hebbian learning and competitive learning.

5.7.3 Reinforcement Learning:


This type of learning may be considered an intermediate form of the above two types of learning. Here the learning machine does some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response and accordingly adjusts its parameters. Generally, parameter adjustment is continued until an equilibrium state occurs, following which there will be no more changes in its parameters. Self-organizing neural learning may be categorized under this type of learning.

5.8 Multilayer Perceptrons:


The network consists of a set of sensory units (source nodes) that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal propagates through the network in a forward direction, on a layer-by-layer basis. These neural networks are commonly referred to as multilayer perceptrons (MLPs). Multilayer perceptrons have been applied successfully to solve some difficult and diverse problems by training them in a supervised manner with a highly popular algorithm known as the error back propagation algorithm; this algorithm is based on the error-correction learning rule. A multilayer perceptron has three distinctive characteristics: The model of each neuron in the network includes a non-linear activation function. A commonly used form of nonlinearity that satisfies this requirement is a sigmoidal nonlinearity defined by the logistic function

y_j = 1 / (1 + exp(−v_j))

Here v_j is the induced local field (i.e., the weighted sum of all synaptic inputs plus the bias) of neuron j, and y_j is the output of the neuron. The presence of nonlinearities is important because otherwise the input-output relation of the network could be reduced to that of a single-layer perceptron. Moreover, the use of the logistic function is biologically motivated, since it attempts to account for the refractory phase of real neurons. The network contains one or more layers of hidden neurons that are not part of the input or output of the network. These hidden neurons enable the network to learn complex tasks by extracting progressively more meaningful features from the input patterns (vectors). The network exhibits high degrees of connectivity, determined by the synapses of the network. A change in the connectivity of the network requires a change in the population of synaptic connections or their weights. The figure shown below is the architectural graph of a multilayer perceptron with two hidden layers and one output layer. This is a fully connected network in

which a neuron in any layer of the network is connected to all the nodes/neurons in the previous layer. Signal flow through the network progresses in a forward direction, from left to right and on a layer-by-layer basis.

Fig 5.4: Multilayer Perceptron

Two kinds of signals are identified in this network. a) Function signals: A function signal is an input signal (stimulus) that comes in at the input end of the network, propagates forward (neuron by neuron) through the network, and emerges at the output end of the network as an output signal. It is referred to as a function signal for two reasons. First, it is presumed to perform a useful function at the output of the network. Second, at each neuron of the network through which a function signal passes, the signal is calculated as a function of the inputs and associated weights applied to that neuron. The function signal is also referred to as the input signal. b) Error signals: An error signal originates at an output neuron of the network and propagates backward (layer by layer) through the network. It is referred to as an error signal because its computation by every neuron of the network involves an error-dependent function in one form or another. The output neurons (computational nodes) constitute the output layer of the network. The remaining neurons (computational nodes) constitute hidden layers of the network. Thus the hidden units are not part of the output or input of the network, hence their designation as hidden. The first hidden layer is fed from the input layer made up of sensory units (source nodes); the resulting outputs of the first hidden layer are in turn applied to the next hidden layer; and so on for the rest of the network. Each hidden or output neuron of a multilayer perceptron is designed to perform two computations: The computation of the function signal appearing at the output of a neuron, which is expressed as a continuous nonlinear function of the input signal and synaptic weights associated with that neuron. The computation of an estimate of the gradient vector (i.e., the gradients of the error surface with respect to the weights connected to the inputs of a neuron), which is needed for the backward pass through the network.
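The logistic activation and the derivative needed for that gradient estimate can be sketched as follows (the input values are arbitrary illustrations):

```python
import math

def logistic(v):
    """Logistic function y = 1 / (1 + exp(-v)) of the induced local field v."""
    return 1.0 / (1.0 + math.exp(-v))

def logistic_derivative(v):
    """F'(v) = F(v) * (1 - F(v)), the slope used when estimating the
    gradient vector for the backward pass."""
    y = logistic(v)
    return y * (1.0 - y)

print(logistic(0.0))             # 0.5 when the induced local field is zero
print(logistic_derivative(0.0))  # 0.25, the maximum slope of the logistic
```

The product form F(v)(1 − F(v)) is what makes the logistic function convenient: the derivative is available from the already-computed output.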


CHAPTER 6: BACK PROPAGATION ALGORITHM


6.1 Back Propagation:
It is a supervised learning method, and is an implementation of the delta rule. It requires a teacher that knows, or can calculate, the desired output for any given input. It is most useful for feed-forward networks. The term is an abbreviation for "backwards propagation of errors". Back propagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable. Basic back propagation is still the most widely used variant. Its two primary virtues are that it is simple and easy to understand, and it works for a wide range of problems. The algorithm cycles through two distinct passes, a forward pass followed by a backward pass through the layers of the network. The algorithm alternates between these passes several times as it scans the training data. Typically, the training data has to be scanned several times before the network "learns" to make good classifications.

Forward pass: Computation of the outputs of all the neurons in the network. The algorithm starts with the first hidden layer, using as input values the independent variables of a case from the training data set. The neuron outputs are computed for all neurons in the first hidden layer by performing the relevant sum and activation function evaluations. These outputs are the inputs for neurons in the second hidden layer. Again the relevant sum and activation function calculations are performed to compute the outputs of the second-layer neurons. This continues layer by layer until we reach the output layer and compute the outputs for this layer.

Backward pass: Propagation of error and adjustment of weights. During the backward pass, the synaptic weights are all adjusted in accordance with an error-correction rule. Specifically, the actual response of the network is subtracted from a desired (target) response to produce an error signal. This error signal is then propagated backward through the network, against the direction of synaptic connections, hence the name error back-propagation. The
synaptic weights are adjusted to make the actual response of the network move closer to the desired response in a statistical sense.
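A minimal sketch of the two alternating passes, assuming a hypothetical two-layer sigmoid network and toy training data (none of these choices come from the thesis design):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# Illustrative network: 2 inputs, 3 hidden units, 1 output; assumed data.
rng = np.random.default_rng(3)
W1 = rng.normal(scale=0.5, size=(3, 2))   # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(1, 3))   # hidden -> output weights
X = np.array([[0.0, 1.0], [1.0, 0.0]])    # two training cases
D = np.array([[1.0], [0.0]])              # desired (target) responses
lr = 0.5

for _ in range(5000):                     # scan the training data repeatedly
    for x, d in zip(X, D):
        # Forward pass: compute the outputs of all neurons, layer by layer.
        h = sigmoid(W1 @ x)
        y = sigmoid(W2 @ h)
        # Backward pass: propagate the error and adjust the weights.
        delta_out = (d - y) * y * (1 - y)
        delta_hid = (W2.T @ delta_out) * h * (1 - h)
        W2 += lr * np.outer(delta_out, h)
        W1 += lr * np.outer(delta_hid, x)

y0 = sigmoid(W2 @ sigmoid(W1 @ X[0]))[0]  # should move toward the target 1
y1 = sigmoid(W2 @ sigmoid(W1 @ X[1]))[0]  # should move toward the target 0
print(round(float(y0), 2), round(float(y1), 2))
```

Each scan performs one forward pass per training case followed by one backward pass, exactly the alternation described above.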

6.2 The Delta rule


For a single-layer network with an output unit with a linear activation function, the output is simply given by

y = Σ_j w_j x_j + θ  ------------------(1)

Such a simple network is able to represent a linear relationship between the value of the output unit and the values of the input units. By thresholding the output value, a classifier can be constructed (such as Widrow's Adaline), but here we focus on the linear relationship and use the network for a function-approximation task. In high-dimensional input spaces the network represents a (hyper)plane, and it will be clear that multiple output units may also be defined. Suppose we want to train the network such that a hyperplane is fitted as well as possible to a set of training samples consisting of input values x^p and desired (or target) output values d^p. For every given input sample, the output of the network differs from the target value d^p by (d^p − y^p), where y^p is the actual output for this pattern. The delta rule now uses a cost or error function based on these differences to adjust the weights. The error function, as indicated by the name least mean square, is the summed squared error. That is, the total error E is defined to be

E = Σ_p E^p = (1/2) Σ_p (d^p − y^p)²  ------------------(2)

where the index p ranges over the set of input patterns and E^p represents the error on pattern p. The LMS procedure finds the values of all the weights that minimise the error function by a method called gradient descent. The idea is to make a change in the weight proportional to the negative of the derivative of the error as measured on the current pattern with respect to each weight:

Δ_p w_j = −γ ∂E^p/∂w_j  ------------------(3)

where γ is a constant of proportionality. The derivative is

∂E^p/∂w_j = (∂E^p/∂y^p) · (∂y^p/∂w_j)  ------------------(4)

Because of the linear units in eq. (1),

∂y^p/∂w_j = x_j^p  ------------------(5)

And,

∂E^p/∂y^p = −(d^p − y^p)  ------------------(6)

Such that,

Δ_p w_j = γ δ^p x_j^p  ------------------(7)

where δ^p = d^p − y^p is the difference between the target output and the actual output for pattern p. The delta rule modifies weights appropriately for target and actual outputs of either polarity and for both continuous and binary input and output units. These characteristics have opened up a wealth of new applications.

6.3 The Generalised Delta Rule:

Since we are now using units with nonlinear activation functions, we have to generalise the delta rule, which was presented for linear functions, to the set of nonlinear activation functions. The activation is a differentiable function of the total input, given by

y_k^p = F(s_k^p)  --------------------(8)

in which

s_k^p = Σ_j w_jk y_j^p + θ_k  --------------------(9)

To get the correct generalisation of the delta rule, we must set

Δ_p w_jk = −γ ∂E^p/∂w_jk  ------------------(10)

The error measure E^p is defined as the total quadratic error for pattern p at the output units:

E^p = (1/2) Σ_o (d_o^p − y_o^p)²  ------------------(11)

where d_o^p is the desired output for unit o when pattern p is clamped. We further set E = Σ_p E^p as the summed squared error. We can write

∂E^p/∂w_jk = (∂E^p/∂s_k^p) · (∂s_k^p/∂w_jk)  ------------------(12)

By equation (9) we see that the second factor is

∂s_k^p/∂w_jk = y_j^p  -------------------(13)

When we define

δ_k^p = −∂E^p/∂s_k^p  -------------------(14)

we will get an update rule which is equivalent to the delta rule as described in the previous topic, resulting in a gradient descent on the error surface if we make the weight changes according to:

Δ_p w_jk = γ δ_k^p y_j^p  ------------------(15)

The trick is to figure out what δ_k^p should be for each unit k in the network. The interesting result, which we now derive, is that there is a simple recursive computation of these δ's which can be implemented by propagating error signals backward through the network. To compute δ_k^p we apply the chain rule to write this partial derivative as the product of two factors, one factor reflecting the change in error as a function of the output of the unit and one reflecting the change in the output as a function of changes in the input. Thus, we have

δ_k^p = −∂E^p/∂s_k^p = −(∂E^p/∂y_k^p) · (∂y_k^p/∂s_k^p)  -------------------(16)

Let us compute the second factor.

∂y_k^p/∂s_k^p = F′(s_k^p)  ------------------(17)

which is simply the derivative of the squashing function F for the kth unit, evaluated at the net input s_k^p to that unit. To compute the first factor of equation (16), we consider two cases. First, assume that unit k is an output unit k = o of the network. In this case, it follows from the definition of E^p that

∂E^p/∂y_o^p = −(d_o^p − y_o^p)  ------------------(18)

which is the same result as we obtained with the standard delta rule. Substituting this and equation (17) in equation (16) we get

δ_o^p = (d_o^p − y_o^p) F′(s_o^p)  -----------------(19)

for any output unit o. Secondly, if k is not an output unit but a hidden unit k = h, we do not readily know the contribution of the unit to the output error of the network. However, the error measure can be written as a function of the net inputs from the hidden to the output layer, E^p = E^p(s_1^p, …, s_No^p), and we use the chain rule to write

∂E^p/∂y_h^p = Σ_o (∂E^p/∂s_o^p) · (∂s_o^p/∂y_h^p) = Σ_o (∂E^p/∂s_o^p) · w_ho = −Σ_o δ_o^p w_ho  ----(20)

Substituting this in eq. (16) yields

δ_h^p = F′(s_h^p) Σ_o δ_o^p w_ho  ----------------(21)

Equations (19) and (21) give a recursive procedure for computing the δ's for all units in the network, which are then used to compute the weight changes according to equation (15). This procedure constitutes the generalised delta rule for a feed-forward network of non-linear units. The basic back propagation algorithm is implemented in three steps. The input pattern is presented to the input layer of the network. These inputs are propagated through the network until they reach the output units. This forward pass produces the actual or predicted output pattern. Because back propagation is a supervised learning algorithm, the desired outputs are given as part of the training vector. The actual network outputs are subtracted from the desired outputs and an error signal is produced. This error signal is then the basis for the back propagation step, whereby the errors are passed back through the neural network by computing the contribution of each hidden processing unit and deriving the corresponding adjustment needed to produce the correct output. The connection weights are then adjusted and the neural network has just learned from an experience. The backward propagation of weight adjustments along these lines continues until we reach the input layer.
Department of ECE, GMRIT. Page 42
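As a sketch of this recursive procedure, the following Python/NumPy fragment computes the deltas of equations (19) and (21) for a small one-hidden-layer network with a logistic activation. The network size and the random initial weights are illustrative only, not taken from the thesis code.

```python
import numpy as np

def sigmoid(s):
    # Logistic activation F(s) = 1 / (1 + e^-s), so F'(s) = F(s)(1 - F(s))
    return 1.0 / (1.0 + np.exp(-s))

def backprop_deltas(x, d, W_h, W_o):
    """One forward/backward pass for a one-hidden-layer network.
    Returns the network output and the deltas of equations (19) and (21)."""
    # Forward pass: net inputs s and activations y for each layer
    s_h = W_h @ x            # net input of hidden units
    y_h = sigmoid(s_h)
    s_o = W_o @ y_h          # net input of output units
    y_o = sigmoid(s_o)
    # Eq (19): delta_o = (d_o - y_o) * F'(s_o) for every output unit
    delta_o = (d - y_o) * y_o * (1.0 - y_o)
    # Eq (21): delta_h = F'(s_h) * sum_o delta_o * w_ho -- the errors are
    # fed back through the same weights that carried the forward signal
    delta_h = y_h * (1.0 - y_h) * (W_o.T @ delta_o)
    return y_o, delta_o, delta_h

# Weight change of equation (15): dw_jk = gamma * delta_k * y_j
rng = np.random.default_rng(0)
x = np.array([1.0, -0.5])                    # input pattern
d = np.array([1.0])                          # desired output
W_h = rng.normal(scale=0.5, size=(3, 2))     # 2 inputs -> 3 hidden units
W_o = rng.normal(scale=0.5, size=(1, 3))     # 3 hidden -> 1 output unit
y_o, delta_o, delta_h = backprop_deltas(x, d, W_h, W_o)
```

The deltas would then multiply the presynaptic activations and the learning rate to give the weight changes, exactly as equation (15) prescribes.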

6.4 Training a multi-layer neural network using back propagation

This section describes the teaching process of a multi-layer neural network employing the back propagation algorithm. To illustrate the process, a three-layer neural network with two inputs and one output, shown in the picture below, is used. Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realises the non-linear function, called the neuron activation function. Signal e is the adder output signal, and y = f(e) is the output signal of the non-linear element. Signal y is also the output signal of the neuron.
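The two units of such a neuron can be sketched in a few lines of Python; the example weights and inputs below are hypothetical, chosen only to show the two stages.

```python
import math

def neuron(weights, inputs, f):
    # First unit: adder, e = sum of weight * input products
    e = sum(w * x for w, x in zip(weights, inputs))
    # Second unit: non-linear activation, y = f(e)
    return f(e)

sigmoid = lambda e: 1.0 / (1.0 + math.exp(-e))
# With these weights and inputs, e = 0.5*1.0 + (-0.25)*2.0 = 0.0
y = neuron([0.5, -0.25], [1.0, 2.0], sigmoid)   # y = sigmoid(0) = 0.5
```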

To teach the neural network we need a training data set. The training data set consists of input signals (x1 and x2) assigned with a corresponding target (desired output) z. The network training is an iterative process. In each iteration the weight coefficients of the nodes are modified using new data from the training data set. The modification is calculated using the algorithm described below. Each teaching step starts with forcing both input signals from the training set. After this stage we can determine the output signal values for each neuron in each network layer. The pictures below illustrate how the signal propagates through the network. Symbols w(xm)n represent the weights of connections between network input xm and neuron n in the input layer. Symbols yn represent the output signal of neuron n.

Propagation of signals through the hidden layer. Symbols wmn represent the weights of connections between the output of neuron m and the input of neuron n in the next layer.

Propagation of signals through the output layer.

In the next algorithm step the output signal of the network, y, is compared with the desired output value (the target), which is found in the training data set. The difference is called the error signal of the output layer neuron. It is impossible to compute the error signal for internal neurons directly, because the output values of these neurons are unknown. For many years an effective method for training multilayer networks was unknown; only in the middle eighties was the back propagation algorithm worked out. The idea is to propagate the error signal (computed in a single teaching step) back to all neurons whose output signals were inputs for the neuron in question.

The weight coefficients wmn used to propagate the errors back are equal to those used during computing the output value. Only the direction of data flow is changed (signals are propagated from outputs to inputs, one layer after the other). This technique is used for all network layers. If the propagated errors come from several neurons, they are added. The illustration is below:
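The rule can also be expressed in code. In this sketch the connection weights and the output-layer error signals are made-up values used only to show that the hidden-layer error is the weighted sum of the errors of all neurons it feeds.

```python
# Hypothetical weights from two hidden neurons (h1, h2) to three
# output neurons (o1, o2, o3) -- the same w_mn as the forward pass
w = {("h1", "o1"): 0.3, ("h1", "o2"): -0.2, ("h1", "o3"): 0.5,
     ("h2", "o1"): 0.1, ("h2", "o2"): 0.4,  ("h2", "o3"): -0.6}
delta = {"o1": 0.05, "o2": -0.02, "o3": 0.10}   # output-layer error signals

def hidden_error(h):
    # Direction of data flow reversed; contributions from several
    # output neurons are simply added.
    return sum(w[(h, o)] * delta[o] for o in delta)

err_h1 = hidden_error("h1")   # 0.3*0.05 + (-0.2)*(-0.02) + 0.5*0.10 = 0.069
err_h2 = hidden_error("h2")
```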


When the error signal for each neuron is computed, the weight coefficients of each neuron input node may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are modified.
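A hedged sketch of this update for a single neuron with a logistic activation; the learning rate, error signal, net input e and inputs below are illustrative values, not from the thesis.

```python
import math

def update_weights(w, x, delta, e, eta):
    # w'_i = w_i + eta * delta * df(e)/de * x_i, with f the logistic sigmoid
    f = 1.0 / (1.0 + math.exp(-e))
    dfde = f * (1.0 - f)                 # derivative of the activation at e
    return [wi + eta * delta * dfde * xi for wi, xi in zip(w, x)]

# At e = 0: f = 0.5, df/de = 0.25, so each weight moves by 0.0125 * x_i
w_new = update_weights([0.2, -0.4], [1.0, 0.5], delta=0.1, e=0.0, eta=0.5)
```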


The learning coefficient affects the network teaching speed. There are a few techniques to select this parameter. The first method is to start the teaching process with a large value of the parameter; while the weight coefficients are being established, the parameter is gradually decreased. The second, more complicated, method starts teaching with a small parameter value. During the teaching process the parameter is increased as the teaching advances and then decreased again in the final stage. Starting the teaching process with a low parameter value makes it possible to determine the signs of the weight coefficients. Two major learning parameters are used to control the training process of a back propagation network. The learning rate is used to specify whether the neural network is going to make major adjustments after each learning trial or only minor adjustments. Momentum is used to control possible oscillations in the weights, which could be caused by alternately signed error signals.
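The first technique, starting with a large parameter value and shrinking it as the weights settle, might be sketched as a simple decay schedule. The hyperbolic form and its constants are an illustrative choice, not taken from the thesis.

```python
def decaying_rate(eta0, decay, step):
    # Start teaching with a large learning parameter eta0 and shrink it
    # gradually as the weight coefficients become established.
    return eta0 / (1.0 + decay * step)

# Rates over the first five teaching steps: 0.5, 0.4545..., 0.4166..., ...
rates = [decaying_rate(0.5, 0.1, t) for t in range(5)]
```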

6.5 Learning rate and momentum:


The learning procedure requires that the change in weight is proportional to ∂Ep/∂w. True gradient descent requires that infinitesimal steps are taken. The constant of proportionality is the learning rate γ. For practical purposes we choose a learning rate that is as large as possible without leading to oscillation. One way to avoid oscillation at large γ is to make the change in weight dependent on the past weight change by adding a momentum term:

Δwjk(t + 1) = γ δk yj + α Δwjk(t) ---------------(29)

where t indexes the presentation number and α is a constant which determines the effect of the previous weight change. The role of the momentum term is shown in Fig 6.1. When no momentum term is used, it takes a long time before the minimum is reached with a low learning rate, whereas for high learning rates the minimum is never reached because of the oscillations. When the momentum term is added, the minimum is reached faster.

Fig 6.1 : The descent in weight space: a) for small learning rate; b) for large learning rate, note the oscillations; and c) with large learning rate and momentum term added.

This corrective procedure is called back propagation (hence the name of the neural network) and it is applied continuously and repetitively for each set of inputs and the corresponding set of outputs produced in response to the inputs. This procedure continues so long as the individual or total errors in the responses exceed a

specified level or until there are no measurable errors. At this point, the neural network has learned the training material and you can stop the training process and use the neural network to produce responses to new input data. The learning rate applies a greater or lesser portion of the respective adjustment to the old weight. If the factor is set to a large value, the neural network may learn more quickly, but if there is a large variability in the input set the network may not learn very well, or at all. As we train the network, the total error, that is, the sum of the errors over all the training sets, becomes smaller and smaller. Once the network reduces the total error to the limit set, training may stop. The back prop algorithm is a version of the steepest descent optimisation method applied to the problem of finding the weights that minimize the error function of the network output. Due to the complexity of the function and the large number of weights being trained as the network learns, there is no assurance that the back prop algorithm will find the optimum weights that minimize error: the procedure can get stuck at a local minimum. It has been found useful to randomise the order of presentation of the cases in a training set between different scans. It is possible to speed up the algorithm by batching, that is, updating the weights for several exemplars in a pass. However, at least the extreme case of using the entire training data set on each update has been found to get stuck frequently at poor local minima. A single scan of all cases in the training data is called an epoch.
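The momentum term of equation (29) can be sketched as follows; the learning rate and momentum constant used here are illustrative, and `grad_term` stands for the product of the error signal and the presynaptic activation at the current presentation.

```python
def momentum_update(grad_term, prev_dw, gamma=0.1, alpha=0.9):
    # Equation (29): dw(t+1) = gamma * delta_k * y_j + alpha * dw(t)
    return gamma * grad_term + alpha * prev_dw

# Three consecutive presentations with the same gradient sign: the
# weight change builds up (0.1, then 0.19, then 0.271) -- the inertia
# that damps oscillation when the gradient alternates in sign.
dw = 0.0
for grad in [1.0, 1.0, 1.0]:
    dw = momentum_update(grad, dw)
```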
Most applications of feed forward networks and back prop require several epochs before errors are reasonably small. A number of modifications have been proposed to reduce the number of epochs needed to train a neural net. One commonly employed idea is to incorporate a momentum term that injects some inertia into the weight adjustment on the backward pass. This is done by adding a term to the expression for the weight adjustment

for a connection that is a fraction of the previous weight adjustment for that connection. This fraction is called the momentum control parameter. Another idea is to vary the adjustment parameter so that it decreases as the number of epochs increases. Intuitively this is useful because it avoids the over fitting that is more likely to occur at later epochs than at earlier ones. The point of minimum validation error is a good indicator of the best number of epochs for training, and the weights at that stage are likely to provide the best error rate on new data.

6.6 MATLAB CODE FOR BACK-PROPAGATION ALGORITHM:

clc;
clear all;
close all;
N = 1000;                        %No of Symbols
M = 5;
u = randint(1,N);                %Training Signal
c = randint(M,1);                %Channel to be equalized
c = c / norm(c);
z = filter(c,1,u);               %Channel output
SNR = 60;                        %Additive noise to the channel output
var_v = var(z) * 10^(-SNR/10);
v = var_v^0.5 * randn(1,N);
x = z + v;                       %Input to the Equalizer
x1 = [zeros(1,9)];
for i = 1:N
    x1(9+i) = x(i);
end

n = 10;        %no of nodes in input layer
h1 = 6;        %no of nodes in first hidden layer
h2 = 3;        %no of nodes in second hidden layer
y = [];
eta = .002;
alp = .00001;
it = 1000;     %no of iterations
th1 = [-.202 .268 -.328 -.107 .295 -.106];  %Weight initialization for bias in first hidden layer
th2 = [.293 .329 -.552];                    %Weight initialization for bias in second hidden layer
tho = [.526];                               %Weight initialization for bias in output layer
a = 1.2;
Eavg = [];
E = 0;
%parameter initialization starts
for j1 = 1:h1
    for i = 1:n
        pwh1(j1,i) = 0.0;
        % wh1(j1,i)=0.0;
    end
end
%----------------------First hidden layer---------------------%
wh1 = [ .04  -.234  .0132  .53  -.77  -.012  .257  .994 -.421  .192;
       -.231  .435 -.93   -.351  .561 -.126  .789  .310 -.409 -.11;
       -.191  .312 -.141   .234 -.249  .335  .125 -.267 -.195  .021;
        .606 -.489  .143  -.245  .198 -.395  .477 -.199 -.356  .11;
       -.102  .426  .129   .410 -.230 -.138 -.104  .128 -.119  .25;
       -.250  .129  .109   .569 -.235  .150 -.334  .956  .185 -.156];
%-----------------------------------------------------------------%
for j2 = 1:h2
    for j1 = 1:h1
        pwh2(j2,j1) = 0.0;
        % wh2(j2,j1)=0.0;
    end

end
%------------------Second hidden layer-----------------------%
wh2 = [-.240  .536 -.628 -.284 -.431  .191;
        .462 -.185  .421 -.104 -.118 -.278;
       -.104 -.143  .028  .78  -.726 -.198];
%------------------------------------------------------------------%
for k = 1:1
    for j2 = 1:h2
        pwo(k,j2) = 0.0;
        % wo(k,j2)=0.0;
    end
end
%----------------Output layer--------------------------------%
wo = [.441 -.256 -.726];
%parameter initialization ends
%---------------------------------------------------------------%
for kk = 1:it
    %Mean Square Error calculation
    E = .0;
    for i = 10:(N+9)
        xx(1)  = x1(i);
        xx(2)  = x1(i-1);
        xx(3)  = x1(i-2);
        xx(4)  = x1(i-3);
        xx(5)  = x1(i-4);
        xx(6)  = x1(i-5);
        xx(7)  = x1(i-6);
        xx(8)  = x1(i-7);
        xx(9)  = x1(i-8);
        xx(10) = x1(i-9);
        %---------------------------------
        for j1 = 1:h1
            vvh1(j1) = 0.0;
            for i1 = 1:n

                vvh1(j1) = vvh1(j1) + wh1(j1,i1)*xx(i1);
            end
        end
        %---------------------------------
        for j1 = 1:h1
            vh1(j1) = vvh1(j1) + th1(j1);
            yh1(j1) = 1/(1+exp(-a*vh1(j1)));
        end
        %---------------------------------
        for j2 = 1:h2
            vvh2(j2) = 0.0;
            for j1 = 1:h1
                vvh2(j2) = vvh2(j2) + wh2(j2,j1)*yh1(j1);
            end
        end
        %---------------------------------
        for j2 = 1:h2
            vh2(j2) = vvh2(j2) + th2(j2);
            yh2(j2) = 1/(1+exp(-a*vh2(j2)));
        end
        %---------------------------------
        vvo = 0.0;
        for j2 = 1:h2
            vvo = vvo + wo(j2)*yh2(j2);
        end
        %---------------------------------
        vo = vvo + tho;
        yo = vo;
        ylast(kk,i-9) = yo;
        d = u(i-9);
        e = d - yo;
        elast(kk,i-9) = e;

        %---------------------------------
        %calculation of error gradient of output layer
        del = e;
        %calculation of error gradient on hidden layer 2
        %---------------------------------
        for j2 = 1:h2
            dw2(j2) = 0;
            dw2(j2) = dw2(j2) + del*wo(k,j2);
        end
        %---------------------------------
        for j2 = 1:h2
            delh2(j2) = 0.5*a*(1-yh2(j2))*yh2(j2)*dw2(j2);
        end
        %---------------------------------
        %calculation of error gradient on hidden layer 1
        %---------------------------------
        for j1 = 1:h1
            dw1(j1) = 0;
            for j2 = 1:h2
                dw1(j1) = dw1(j1) + delh2(j2)*wh2(j2,j1);
            end
        end
        %---------------------------------
        for j1 = 1:h1
            delh1(j1) = 0.5*a*(1-yh1(j1))*yh1(j1)*dw1(j1);
        end
        %---------------------------------
        %updating of weight of output layer
        %---------------------------------
        for j2 = 1:h2
            dwo(j2) = eta*del*yh2(j2);
            dwom(j2) = alp*(wo(j2)-pwo(j2));
            pwo(j2) = wo(j2);

            wo(j2) = wo(j2) + dwo(j2) + dwom(j2);
        end
        %---------------------------------
        %updation of weight of hidden layer 2
        %---------------------------------
        for j2 = 1:h2
            for j1 = 1:h1
                dwh2(j2,j1) = eta*delh2(j2)*yh1(j1);
                dwhh2(j2,j1) = alp*(wh2(j2,j1)-pwh2(j2,j1));
                pwh2(j2,j1) = wh2(j2,j1);
                wh2(j2,j1) = wh2(j2,j1) + dwh2(j2,j1) + dwhh2(j2,j1);
            end
        end
        %---------------------------------
        %updation of weight of hidden layer 1
        %---------------------------------
        for j1 = 1:h1
            for i1 = 1:n
                dwh1(j1,i1) = eta*delh1(j1)*xx(i1);
                dwhh1(j1,i1) = alp*(wh1(j1,i1)-pwh1(j1,i1));
                pwh1(j1,i1) = wh1(j1,i1);
                wh1(j1,i1) = wh1(j1,i1) + dwh1(j1,i1) + dwhh1(j1,i1);
            end
        end
        %---------------------------------
        ersq = e^2;
        E = E + .5*ersq;
        %---------------------------------
    end
    %---------------------------------------
    Eav = E/N;
    Eavg = [Eavg Eav];
end

plot(Eavg);
for i = 1:it
    for j = 1:N
        if ylast(i,j) > .52
            yn(i,j) = 1;
        else
            yn(i,j) = 0;
        end
    end
end
k = 1;
count = [];
c = 0;
for i = 1:kk
    for j = 1:N
        if (u(k) == yn(i,j))
            c = c;
        else
            c = c + 1;
        end
        k = k + 1;
    end
    count = [count c];
    c = 0;
    k = 1;
end
figure, plot(count);

6.6.1 Mean Square Error

6.6.2 Bit Error Rate

CHAPTER 7

COMPARISON

CHAPTER 7: COMPARISON
7.1 Factors Affecting the Performance of the System:
Convergence
Settling time
Mean Square Error
Bit Error Rate

7.1.1 Convergence:
Adaptive filters optimize the filter coefficients to minimize the power of the error signal iteratively. The process of minimizing the power of the error signal is known as convergence. A fast convergence indicates that the adaptive filter takes a short time to calculate the appropriate filter coefficients that minimize the power of the error signal.
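A toy illustration of convergence: the sketch below runs the standard LMS update against an assumed noiseless two-tap channel (step size and channel values are made up), and the error power shrinks from a large initial value toward zero as the coefficients settle.

```python
import numpy as np

rng = np.random.default_rng(1)
h_true = np.array([0.8, -0.3])   # hypothetical channel to identify
w = np.zeros(2)                  # adaptive filter coefficients
mu = 0.05                        # step size (illustrative)
errs = []
x_hist = np.zeros(2)
for n in range(2000):
    x_hist = np.array([rng.standard_normal(), x_hist[0]])  # [x(n), x(n-1)]
    d = h_true @ x_hist                                    # desired response
    e = d - w @ x_hist                                     # error signal
    w = w + mu * e * x_hist                                # LMS update
    errs.append(float(e * e))

early = sum(errs[:100]) / 100    # error power at the start
late = sum(errs[-100:]) / 100    # error power after convergence
```

Fast convergence means `late` falls below `early` after few iterations; here, with no additive noise, the coefficients approach the channel taps themselves.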

7.1.2 Settling time: Settling time is the period that an adaptive filter takes to converge. A smaller settling time means a quicker convergence speed. Convergence speed is also known as the adaptation rate. Steady state is the state in which the adaptive filter has converged and the filter coefficients no longer change significantly. Because signals might include random noise, or because adaptive filters are not optimum, the error signal outputs of adaptive filters are not necessarily zero when the adaptive filter converges. This error is called the steady state error.

7.1.3 Mean Square Error:


The mean square error or MSE quantifies the difference between an estimator and the true value of the quantity being estimated. MSE is a risk function,

corresponding to the expected value of the squared error loss or quadratic loss. MSE measures the average of the square of the "error." The error is the amount by which the estimator differs from the quantity to be estimated. The difference occurs because of randomness or because the estimator doesn't account for information that could produce a more accurate estimate.
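A minimal sketch of the computation, with made-up estimates and targets:

```python
def mse(estimates, targets):
    # Average of the squared deviations between estimator output and truth
    return sum((y - d) ** 2 for y, d in zip(estimates, targets)) / len(targets)

# One miss of size 1 out of four samples: MSE = 1/4 = 0.25
err = mse([1.0, 0.0, 1.0, 1.0], [1.0, 0.0, 0.0, 1.0])
```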

7.1.4 Bit Error Rate:


BER is a performance measurement criterion which describes the reliability of a communication system. In digital transmission, the bit error rate or bit error ratio (BER) is the number of received binary bits that have been altered due to noise and interference, divided by the total number of transferred bits during a studied time interval.

BER = number of errors / total number of bits sent

As an example, assume this transmitted bit sequence: 0 1 1 0 0 0 1 0 1 1, and the following received bit sequence: 0 0 1 0 1 0 1 0 0 1. The BER in this case is 3 incorrect bits (the 2nd, 5th and 9th bits) divided by 10 transferred bits, resulting in a BER of 0.3, or 30%. The bit error probability pe is the expectation value of the BER, and the BER can be considered an approximate estimate of the bit error probability. In a communication system, the receiver-side BER may be affected by transmission channel noise, interference, distortion, bit synchronization problems, attenuation, wireless multipath fading, etc. The BER may be improved by choosing a strong signal strength (unless this causes cross-talk and more bit errors), by choosing a slow and robust modulation

scheme or line coding scheme, and by applying channel coding schemes such as redundant forward error correction codes.
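The worked example above can be checked with a few lines of Python:

```python
def bit_error_rate(sent, received):
    # BER = number of altered bits / total number of transferred bits
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)

sent     = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1]
received = [0, 0, 1, 0, 1, 0, 1, 0, 0, 1]
ber = bit_error_rate(sent, received)   # 3 errors / 10 bits = 0.3
```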

7.2 Comparison between LMS and Back Propagation:

Factor                       LMS       BACK PROPAGATION
No of Symbols                1000      1000
No of iterations required    997       575
Mean Square Error            0.137     0.02808
Bit Error Rate               58        14
Computational time           5 hr      10 min

7.3 Comparison of Mean Square Error Result

7.4 Comparison of Bit Error Rate Result

CONCLUSION

CONCLUSION

In our present work, we have undertaken the problem of inter symbol interference in digital data transmission. To overcome this problem, we have used adaptive channel equalization techniques, namely the LMS algorithm and a multilayered feed forward network, and compared the results. From table 7.2 it is evident that for transmission of 1000 bits through the channel, the back propagation neural network performs better than the ordinary LMS algorithm in terms of number of iterations for convergence, mean square error, computational time and bit error rate. For 1000 bits, the number of iterations is 997 for the LMS algorithm and 575 for the multilayer perceptron model, the mean square errors are 0.137 and 0.02808, and the bit error rates are 58 and 14, respectively. One more important factor is computation time: the LMS algorithm takes 5 hours to compute, while the multilayer perceptron model takes 10 minutes. Hence we can conclude that the multilayer perceptron model outclasses the LMS algorithm. The BER can be made still smaller by a suitable choice of activation function and threshold values in the perceptron learning mechanism, by including more layers, and by certain optimization techniques for faster learning of this model.

Future Enhancements:
We can use certain advanced techniques like Radial basis networks, Elman networks, and binary valued Boolean network for supervised cases with certain optimization techniques. This model can be extended to non-linear channels.

BIBLIOGRAPHY

BIBLIOGRAPHY

1) K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, no. 5, pp. 359-366, 1989.
2) K. Hornik, M. Stinchcombe, and H. White, "Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks," Neural Networks, vol. 3, no. 5, pp. 551-560, 1990.
3) Yu Hen Hu and Jenq-Neng Hwang (eds.), Handbook of Neural Network Signal Processing, CRC Press, 2002.
4) M. T. Hagan, H. B. Demuth, and M. Beale, Neural Network Design, PWS Publishing Company, 1995.
5) Rajkumar Thenua and S. K. Agarwal, "Simulation and performance analysis of adaptive filter in noise cancellation," International Journal of Engineering Science and Technology, vol. 2, no. 9, pp. 4373-4378, 2010.
6) J. G. Proakis and J. H. Miller, "An adaptive receiver for digital signalling through channels with inter symbol interference."
7) D. A. George, R. R. Bowen, and J. R. Storey, "An adaptive decision feedback equalizer."
8) S. Haykin, Adaptive Filter Theory, third edition, Upper Saddle River, NJ: Prentice-Hall, 1996.
9) S. Haykin, Neural Networks, second edition.
10) B. P. Lathi, Modern Digital Communication, third edition.
11) B. Farhang-Boroujeny, Adaptive Filters: Theory and Applications, Chichester, England: John Wiley & Sons, 1998.
12) www.mathworks.com
13) www.ieeeexplore.com
14) Mark Hudson Beale, Martin T. Hagan, and Howard B. Demuth, Neural Network Toolbox User's Guide, The MathWorks, 1992-2011.
15) Brian R. Hunt, Ronald L. Lipsman, and Jonathan M. Rosenberg, with Kevin R. Coombes, John E. Osborn, and Garrett J. Stuck, A Guide to MATLAB: for Beginners and Experienced Users, Cambridge University Press, 2001.