ABSTRACT
Protecting digital data from impairments such as channel noise and inter symbol interference during transmission has become an important aspect of communication system design, and adaptive equalization is recognized as an efficient technique for minimizing such distortions. Bandwidth-efficient data transmission over radio channels is made possible by the use of adaptive equalization to compensate for the time dispersion introduced by the channel. Adaptive equalizers are capable of minimizing the distortions introduced into digital data by noise in the environment; besides this, they also perform echo cancellation and noise cancellation. In this project, we design a neural network based on the general least mean square (LMS) algorithm. A stream of information is applied as input to the network. Using a suitable linear or nonlinear function, the network produces an output, which is compared with the desired signal to generate an error. All the weights are then updated using the back propagation algorithm, and this process continues until the error becomes optimal. We compare the results of a multilayered feed forward network trained with the back propagation algorithm against the original LMS algorithm, in terms of bit error rate and mean square error, to achieve equalization.
CONTENTS
                                                         PAGE NO.
CHAPTER 1: INTRODUCTION
1.1 INTRODUCTION 1
1.2 INTER SYMBOL INTERFERENCE 1
1.3 OUTLINE OF THE THESIS 5

CHAPTER 2: EQUALIZATION
2.1 DEFINITION OF EQUALIZATION 6
2.2 TYPES OF EQUALIZERS 7

5.5 APPLICATIONS OF NEURAL NETWORKS 25
5.5.1 NEURAL NETWORKS IN MEDICINE 25
5.5.2 MODELLING AND DIAGNOSING THE CARDIOVASCULAR SYSTEMS 25
5.5.3 ELECTRONIC NOSES 26
5.5.4 INSTANT PHYSICIAN 26
5.5.5 NEURAL NETWORKS IN BUSINESS 27
5.5.6 NEURAL NETWORKS IN SIGNAL PROCESSING (NNSP) 27
5.5.7 MODEL-BASED NEURAL NETWORKS FOR IMAGE PROCESSING 28
5.6 ARCHITECTURE OF NEURAL NETWORKS 29
5.6.1 NETWORK LAYERS 29
5.6.2 FEED FORWARD NETWORKS 30
5.6.3 FEEDBACK NETWORKS 31
5.7 LEARNING IN NETWORKS 32
5.7.1 SUPERVISED LEARNING 33
5.7.2 UNSUPERVISED LEARNING 33
5.7.3 REINFORCEMENT LEARNING 33
5.8 MULTILAYER PERCEPTRONS 34

CHAPTER 7: COMPARISON
7.1 FACTORS AFFECTING PERFORMANCE OF THE SYSTEM 61
7.1.1 CONVERGENCE 61
7.1.2 SETTLING TIME 61
7.1.3 MEAN SQUARE ERROR 61
7.1.4 BIT ERROR RATE 62
7.2 COMPARISON BETWEEN LMS AND BACK PROPAGATION 63
7.3 COMPARISON OF MEAN SQUARE ERROR RESULTS 63
7.4 COMPARISON OF BIT ERROR RATE RESULTS 64
LIST OF ABBREVIATIONS
PAM     Pulse Amplitude Modulation
QAM     Quadrature Amplitude Modulation
ASK     Amplitude Shift Keying
FSK     Frequency Shift Keying
PSK     Phase Shift Keying
SISO    Single Input Single Output
SIMO    Single Input Multiple Output
MISO    Multiple Input Single Output
MIMO    Multiple Input Multiple Output
STTC    Space-Time Trellis Code
STBC    Space-Time Block Code
OSTBC   Orthogonal Space-Time Block Code
QOSTBC  Quasi-Orthogonal Space-Time Block Code
NSTBC   Non-Orthogonal Space-Time Block Code
LIST OF SYMBOLS
x(n)   Channel input
h(n)   Channel transfer function
n(n)   Noise signal
C(z)   Equalizer output
Ck     Equalizer coefficients
e(k)   Adaptive equalizer error signal
s(n)   Adaptive equalizer channel input
Z⁻¹    Delay function
W(n)   Weight function
u(n)   Filter input vector
μ      Step size of the adaptive filter
CHAPTER 1 INTRODUCTION
Comparison of LMS and neural network algorithms for reducing ISI
Department of ECE, GMRIT. Page 1
1.1 Introduction:
The growth in communication services during the past five decades has been phenomenal. Satellite and fibre optic networks provide high-speed communication services around the world. Currently, most wired line communication systems are being replaced by fibre optic cables, which provide extremely high bandwidth and make possible the transmission of a wide variety of information sources, including voice, data, and video. With the rapid development of Internet technologies, efficient high-speed data transmission techniques over communication channels have become a necessity. As the rate of data transmission increases to fulfil the needs of the users, the channel introduces distortions into the data. One major cause of distortion is Inter Symbol Interference (ISI). In digital communication, the transmitted signals are generally in the form of multilevel rectangular pulses. The absolute bandwidth of multilevel rectangular pulses is infinite. If these pulses pass through a band limited communication channel, they will spread in time, and the pulse for each symbol may be smeared into adjacent time slots and interfere with adjacent symbols. This is referred to as inter symbol interference (ISI).
Fig 1.1: Transmitted signal
Fig 1.2: Received signal
Fig 1.3 shows a data sequence, 101101, which we wish to transmit through a channel. This sequence is in the form of square pulses. Square pulses are convenient as an abstraction, but in practice they are hard to create and also require far too much bandwidth, so we shape them as shown by the dotted line. The shaped version looks essentially like a square pulse, and we can quickly tell, even visually, what was sent. The advantage of (an arbitrary) shaping at this point is that it reduces bandwidth.
Fig 1.3: Sequence 101101 to be sent; the dashed line is the shape that is actually sent.
Fig 1.4 shows each symbol as it is received. We can see how the transmission medium creates a tail of energy that lasts much longer than intended. The energy from symbols 1 and 2 goes all the way into symbol 3; each symbol interferes with one or more of the subsequent symbols. The circled areas show areas of large interference.
Fig 1.4: Inter symbol interference
Fig 1.5 shows the actual signal seen by the receiver. It is the sum of all these distorted symbols. Compared to the dashed line that was the transmitted signal, the received signal looks quite indistinct. The receiver does not actually see this signal; it sees only the little dots, the values of the amplitude at those timing instants. Notice that for symbol 3, this value is approximately half of the transmitted value, which makes this particular symbol more susceptible to noise and incorrect interpretation; this phenomenon is the result of symbol delay and smearing.
Fig 1.5: Received signal vs transmitted signal.
This spreading and smearing of symbols, such that the energy from one symbol affects the following ones, gives the received signal a higher probability of being interpreted incorrectly; this is Inter Symbol Interference, or ISI. Other factors like thermal noise, impulse noise, and cross talk cause further distortions to the received symbols. Signal processing techniques used at the receiver to overcome these interferences, so as to restore the transmitted symbols and recover their information, are referred to as channel equalization or simply equalization. In principle, if the characteristics of the channel are precisely known,
then it is always possible to design a pair of transmitting and receiving filters that can minimize the effect of ISI and the additive noise. However, in general the characteristics of the channel are random, in the sense that the channel is one of an ensemble of possible channels. Therefore, the use of a fixed pair of transmitting and receiving filters designed on the basis of average channel characteristics may not adequately reduce inter symbol interference. To overcome this problem adaptive equalization is widely used, which provides precise control over the time response of the channel. Adaptive equalizers have therefore been playing a crucial role in the design of high-speed communication systems. The data transmitted through a band limited communication channel suffers from linear, nonlinear and additive distortions. In order to reduce the effects of these distortions an equalizer is used at the receiver end. The function of the equalizer is to reconstruct the transmitted symbols by observing the received noisy signal. In the present work a generalized neuron model has been used to develop an adaptive equalizer for digital communication channel equalization. The generalized neuron model overcomes the problems of the common neuron, such as the large number of neurons and layers required for complex function approximation. This reduces the training time of the network and hence improves the performance in terms of speed. Nonlinear equalizers using artificial neural networks, such as a multilayer perceptron trained with the error back propagation method, give sufficiently good performance in terms of bit error rate and mean square error when compared with the original LMS algorithm.
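The smearing mechanism described in this chapter can be illustrated with a short simulation. The sketch below is in Python; the channel taps are illustrative assumptions, not values from this work:

```python
import numpy as np

# Illustrative dispersive channel (assumed taps): the first tap carries
# the symbol, the later taps are the "tail" of energy that smears into
# the following symbol periods.
channel = np.array([1.0, 0.5, 0.2])

symbols = np.array([1, -1, 1, 1, -1, 1])   # bipolar form of 101101
received = np.convolve(symbols, channel)

# The sample taken at symbol instant n mixes symbol n with the two
# preceding symbols: that residue is the inter symbol interference.
for n in range(len(symbols)):
    isi = received[n] - symbols[n] * channel[0]
    print(f"instant {n}: received {received[n]:+.2f}, ISI part {isi:+.2f}")
```

With these assumed taps, each symbol instant carries a nonzero ISI residue from its two predecessors, which is exactly the effect the equalizer must undo.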
CHAPTER 2 EQUALIZATION
y(n) = x(n) * h(n) + n(n) = Σk h(k) x(n−k) + n(n)
Fig 2.1: Output Signal at the Receiver
The purpose of an equalization system is to compensate for transmission channel impairments such as frequency-dependent phase and amplitude distortion. Besides correcting for channel frequency response anomalies, the equalizer can cancel the effect of multipath signal components, which can manifest themselves in the form of voice echoes, video ghosts or Rayleigh fading conditions in mobile communication channels. Equalizers specifically designed for multipath correction are often termed echo cancellers.
2.2 TYPES OF EQUALIZERS:
a) Linear Equalizers: In a linear equalizer, the current and past values of the received signal are linearly weighted by the equalizer coefficients and summed to produce the output, using the relation below.
y(n) = Σk Ck x(n−k)
b) Zero Forcing Equalizer: This type of equalizer removes the ISI completely, without taking into consideration the resulting noise enhancement; using it, there is a substantial increase in the noise power. Its transfer function is the inverse of the channel transfer function:
C(z) = 1 / H(z)
c) Mean-Square Error Equalizer: This type of equalizer attempts to minimize the total error between the slicer input and the transmitted data symbol.
d) Decision Feedback Equalizer: This is a simple nonlinear equalizer which is particularly useful for channels with severe amplitude distortion. It uses decision feedback to cancel the interference from symbols which have already been detected. The basic idea is that if the values of the symbols already detected are known (past decisions are assumed correct), then the ISI contributed by these symbols can be cancelled exactly. In a decision feedback equalizer architecture, the forward and feedback coefficients may be adjusted simultaneously to minimize the mean square error. The main blocks in a decision feedback equalizer are:
Feed Forward Filter (FFF)
Feed Back Filter (FBF)
Decision Device
Adder or Subtractor
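The linear-equalizer relation, y(n) = Σk Ck x(n−k), can be sketched directly. The coefficients and received samples below are hypothetical values chosen only for illustration; an adaptive equalizer would tune the coefficients rather than fix them:

```python
import numpy as np

def linear_equalizer(x, c):
    """Linearly weight the current and past received samples by the
    equalizer coefficients c_k and sum them: y(n) = sum_k c_k x(n-k)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        for k, ck in enumerate(c):
            if n - k >= 0:
                y[n] += ck * x[n - k]
    return y

# Hypothetical coefficients and received samples, for illustration only.
c = [1.2, -0.4, 0.1]
x = np.array([1.0, -1.0, 1.0, 1.0])
print(linear_equalizer(x, c))   # equivalent to np.convolve(x, c)[:len(x)]
```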
Fig 2.2: Decision Feedback Equalizer
e) Adaptive Equalizers: These equalizers adapt their coefficients to minimize the noise and inter symbol interference (depending on the type of equalizer) at the output, typically by minimizing the mean squared error E[ek²].
Fig 2.3: Adaptive Equalizers
There are two modes in which adaptive equalizers work:
a) Decision Directed Mode: The receiver decisions are used to generate the error signal. Decision directed equalizer adjustment is effective in tracking slow variations in the channel response. However, this approach is not effective during initial acquisition.
b) Training Mode: To make the equalizer suitable for the initial acquisition period, a training signal is needed. In this mode of operation, the transmitter generates a data symbol sequence known to the receiver. Once an agreed time has elapsed, the slicer output is used as the training signal and the actual data transmission begins.
channel. To compensate for this distortion, you can apply an adaptive filter to the communication channel. The adaptive filter works as an adaptive channel equalizer which automatically updates its weights so that the output changes according to the environment. The following figure shows a diagram of an adaptive channel equalization system.
Fig 3.1: Adaptive Channel Equalization
In figure 3.1, s(n) is the signal that is transmitted through the communication channel, and x(n) is the distorted output signal. To compensate for the signal distortion, the adaptive channel equalization system operates in the following two modes:
a) Training mode: This mode helps you determine the appropriate coefficients of the adaptive filter. When you transmit the signal s(n) to the communication channel, you also apply a delayed version of the same signal to the adaptive filter. In figure 3.1, Z⁻¹ denotes the delay function and d(n) is the delayed signal. y(n) is the output signal from the adaptive filter and e(n) is the error signal between d(n) and y(n). The adaptive filter iteratively adjusts its coefficients to minimize e(n). After the power of e(n) converges, y(n) is almost identical to d(n), which means that you can use the resulting adaptive filter coefficients to compensate for the signal distortion.
b) Decision-directed mode: After you determine the appropriate coefficients of the adaptive filter, you can switch the adaptive channel equalization system to decision-directed mode. In this mode, the system decodes the signal y(n) and produces a new signal ŝ(n−Δ), which is an estimate of the signal s(n) except for a delay of Δ taps.
In the LMS algorithm, the adaptive filter adjusts the filter coefficients to minimize the cost function, based on the instantaneous values of that cost function:
w(n+1) = w(n) + μ e(n) x(n)
where e(n) is the error signal, measured using the following equation:
e(n) = d(n) − [w0(n), w1(n), …, wN−1(n)] x(n) = d(n) − wᵀ(n) x(n)
Here μ is the learning rate (step size) parameter, w(n) is the weight vector, d(n) is the desired signal and x(n) is the input vector.
An adaptive filter is a computational device that iteratively models the relationship between the input and output signals of a filter. An adaptive filter self-adjusts the filter coefficients according to an adaptive algorithm. Adaptive filters are digital filters capable of self-adjusting or updating their filter coefficients in accordance with their input signals. The adaptive filter requires two inputs:
The input signal x(n)
The reference input d(n)
The new coefficients are sent to the filter from a coefficient generator. The coefficient generator is an adaptive algorithm that modifies the coefficients in response to an incoming signal. Adaptive filters have uses in a number of applications, including noise cancellation, linear prediction, adaptive signal enhancement, and adaptive control.
The following figure shows the diagram of a typical adaptive filter.
Fig 4.1: Typical Adaptive Filter
Here x(n) is the input signal to a linear filter, y(n) is the corresponding output signal, d(n) is an additional input signal to the adaptive filter, and e(n) is the error signal that denotes the difference between d(n) and y(n). During the training phase the filter is trained using the LMS algorithm. A known set of data is given both to the channel to be estimated and to the FIR filter, to calculate the error between the two. Then the weights of the filter are adjusted using the error. Since the LMS algorithm is being used, training is very simple and also fast. In normal channel conditions, training the FIR filter for just 20-30 iterations is sufficient, compared with the minimum of 200 or more iterations of training required by other gradient-based or stochastic algorithms.
The LMS algorithm performs the following steps to update the coefficients of an adaptive FIR filter:
1. Calculate the output signal y(n) from the FIR filter:
y(n) = w(n) · u(n)
where u(n) is the filter input vector, u(n) = [u(n), u(n−1), …, u(n−N+1)], and w(n) is the filter coefficients vector, w(n) = [w0(n), w1(n), …, wN−1(n)].
2. Calculate the error signal e(n) by using the following equation:
e(n) = d(n) − y(n)
3. Update the filter coefficients by using the following equation:
w(n+1) = w(n) + μ e(n) u(n)
where μ is the step size of the adaptive filter, w(n) is the filter coefficients vector and u(n) is the filter input vector.
The LMS algorithm is very simple, tractable and computationally efficient. Besides this it is robust and model independent, which means that small model uncertainty and small disturbances can only result in small estimation errors. The LMS algorithm typically requires a number of iterations equal to about 10 times the dimensionality of the input space to reach a steady-state condition. Operating in an environment with temporally correlated interference, the LMS algorithm gives a low BER. The LMS algorithm executes quickly but converges slowly, and its complexity grows linearly with the number of weights. The biggest limitations of the LMS algorithm are its slow rate of convergence and its sensitivity to variations of the input. The slow rate of convergence becomes particularly serious when the dimensionality of the input space becomes high, which leads to a higher bit error rate and affects the performance of the system.
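The three steps above can be sketched as a small simulation. The channel, training delay, filter length and step size below are illustrative assumptions, not values taken from this work (a Python sketch; the thesis code itself is in MATLAB):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: bipolar symbols through a short dispersive channel,
# equalized by an LMS-adapted FIR filter in training mode.
N = 2000
s = rng.choice([-1.0, 1.0], size=N)      # transmitted symbols
h = np.array([1.0, 0.4, 0.2])            # assumed channel impulse response
x = np.convolve(s, h)[:N]                # received, ISI-corrupted signal

taps, mu = 8, 0.01                       # assumed filter length and step size
delay = 2                                # training reference d(n) = s(n - delay)
w = np.zeros(taps)                       # filter coefficients w(n)

errs = []
for n in range(taps - 1, N):
    u = x[n - taps + 1:n + 1][::-1]      # filter input vector u(n)
    y = w @ u                            # step 1: y(n) = w(n) . u(n)
    e = s[n - delay] - y                 # step 2: e(n) = d(n) - y(n)
    w = w + mu * e * u                   # step 3: w(n+1) = w(n) + mu e(n) u(n)
    errs.append(e * e)

# The squared error shrinks as the equalizer converges.
print(np.mean(errs[:100]), np.mean(errs[-100:]))
```

The step size must be small enough for stability; too large a μ makes the recursion diverge, too small a μ slows the already slow convergence noted above.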
N = 1000;                     % total number of transmitted symbols
M = 5;                        % channel length
% The source block transmits Binary Phase Shift Keying (BPSK) symbols
% with equal probability; the channel block introduces ISI.
% (randint is from the older Communications Toolbox.)
u = randint(1,N);
% Channel to be equalized
c = randint(M,1);
c = c / norm(c);
% Channel output
z = filter(c,1,u);
% Additive noise at the channel output
SNR = 60;
var_v = var(z) * 10^(-SNR/10);
v = var_v^0.5 * randint(1,N);
% Input to the equalizer
x = z + v;
% Add delay by inserting zeroes at the beginning of the data sequence
xn = [zeros(1,9)];
for i = 1:N
    xn(9+i) = x(i);
end
% Initialisation of weights
w = [-.12 .30 -.11 .35 .53 -.18 .01 .28 -.36 -.201];
% w = [zeros(1,10)];
% Initialisation of parameters used for calculation of MSE
epsilon = 10^-3;
E = 0; Eavg = []; yy = []; err2 = [];
kk = 1000;                    % number of training iterations
% Calculate the output and update the weights for every iteration
for j = 1:kk
    for i = 10:N+9
        xx = xn(i:-1:i-9);    % tap-delay-line input vector [xn(i) ... xn(i-9)]
        y = xx * w';
        yy = [yy y];
        err = u(i-9) - y;
        % Weights are updated for every sample (LMS rule)
        for k = 1:10
            % w(k) = w(k) + .001 * xx(k) * err / (xx(k) + epsilon);
            w(k) = w(k) + .0001 * err * xx(k);
        end
        err2 = [err2 err];
        % Mean square error is accumulated using the cost function
        E = E + 0.5 * err^2;
        ylast(j,i-9) = y;
        elast(j,i-9) = err;
    end
    Eav = E / N;
    Eavg = [Eavg Eav];
    E = 0;
end
% Plot the graph for mean square error
plot(Eavg);
% Calculation of bit error rate: outputs greater than the threshold
% value 0.48 are decided as 1, otherwise as 0
for i = 1:kk
    for j = 1:N
        if ylast(i,j) > 0.48
            yn(i,j) = 1;
        else
            yn(i,j) = 0;
        end
    end
end
% count stores the number of bit errors for every iteration
count = []; c = 0; k = 1;
for i = 1:kk
    for j = 1:N
        % The transmitted bits are compared with the decided bits to
        % count the error bits for this iteration
        if u(k) ~= yn(i,j)
            c = c + 1;
        end
        k = k + 1;
    end
    count = [count c];
    c = 0;
    k = 1;
end
% Plot the graph for bit error rate over the required number of iterations
figure, plot(count);
A neural network's knowledge is stored within inter-neuron connection strengths known as synaptic weights. An Artificial Neural Network is an adaptive, most often nonlinear, system that learns to perform a function (an input/output map) from data. Adaptive means that the system parameters are changed during operation, normally called the training phase. After the training phase the Artificial Neural Network parameters are fixed and the system is deployed to solve the problem at hand (the testing phase). The Artificial Neural Network is built with a systematic step-by-step procedure to optimize a performance criterion or to follow some implicit internal constraint, which is commonly referred to as the learning rule. The input/output training data are fundamental in neural network technology, because they convey the necessary information to "discover" the optimal operating point. The nonlinear nature of the neural network processing elements (PEs) provides the system with a great deal of flexibility to achieve practically any desired input/output map, i.e., some Artificial Neural Networks are universal mappers. An input is presented to the neural network and a corresponding desired or target response is set at the output (when this is the case the training is called supervised). An error is computed from the difference between the desired response and the system output. This error information is fed back to the system, which adjusts the system parameters in a systematic fashion (the learning rule). The process is repeated until the performance is acceptable.
The disadvantage of neural networks is that, because the network finds out how to solve the problem by itself, its operation can be unpredictable. Conventional computers, on the other hand, use a cognitive approach to problem solving; the way the problem is to be solved must be known and stated in small unambiguous instructions. These instructions are then converted to a high-level language program and then into machine code that the computer can understand. These machines are totally predictable; if anything goes wrong, it is due to a software or hardware fault. Neural networks and conventional algorithmic computers are not in competition but complement each other. There are tasks that are more suited to an algorithmic approach, like arithmetic operations, and tasks that are more suited to neural networks. Moreover, a large number of tasks require systems that use a combination of the two approaches (normally a conventional computer is used to supervise the neural network) in order to perform at maximum efficiency.
needed. What is needed is a set of examples that are representative of all the variations of the disease. The quantity of examples is not as important as the quality. The examples need to be selected very carefully if the system is to perform reliably and efficiently.
A model of an individual's cardiovascular system must mimic the relationship among physiological variables (i.e., heart rate, systolic and diastolic blood pressures, and breathing rate) at different physical activity levels. If a model is adapted to an individual, then it becomes a model of the physical condition of that individual. The simulator will have to be able to adapt to the features of any individual without the supervision of an expert. This calls for a neural network. Another reason that justifies the use of ANN technology is the ability of ANNs to provide sensor fusion, which is the combining of values from several different sensors. Sensor fusion enables ANNs to learn complex relationships among the individual sensor values, which would otherwise be lost if the values were individually analyzed. In medical modelling and diagnosis, this implies that even though each sensor in a set may be sensitive only to a specific physiological variable, ANNs are capable of detecting complex medical conditions by fusing the data from the individual biomedical sensors.
such as accounting or financial analysis. Almost any neural network application would fit into one business area or financial analysis. There is some potential for using neural networks for business purposes, including resource allocation and scheduling. There is also strong potential for using neural networks for database mining, that is, searching for patterns implicit within the explicitly stored information in databases. Most of the funded work in this area is classified as proprietary; thus, it is not possible to report on the full extent of the work going on. Most work applies neural networks, such as the Hopfield-Tank network, to optimization and scheduling.
c) Aircraft Control:
Neural controllers for aircraft are being developed by several groups, both by extending optimal control ideas to nonlinear systems and through new ideas based on dynamic programming (adaptive critics). The LoFLYTE hypersonic wave rider, a joint project of NASA and the US Air Force, is an experimental aircraft whose control is designed by Accurate Automation Inc.
playing more and more important roles in architecture design for image processing applications. The biological and computational facts lead us to believe that designing appropriate network architectures for a particular task is at least as important as, if not more important than, twiddling with the connection weights through training in a fully connected network. The biological argument is that natural networks have a hierarchically clustered architecture that is locally dense but globally sparse. The computational argument is that fully connected artificial neural networks may not generalize well and are difficult to implement, especially when the problems at hand become large and complicated. MNNs have found broad applications in image processing/analysis/coding and pattern recognition. MNNs have been used in image segmentation, edge detection and enhancement, texture analysis, image regularization and recursive low-level vision modelling, neural-based gradual perception of structure from motion, category and object perception, colour constancy and colour induction, and model-based adaptive transform coding. MNNs have also demonstrated their
set of output values rather than a sequence of values from a given input. A feedback neural network distinguishes itself from feed forward networks in that it has at least one feedback loop. Feedback networks are dynamic systems. In a feedback neural network the outputs of the neurons are recomputed whenever new input patterns are presented to the network.
According to the way learning is performed, we can distinguish two major categories of neural networks:
FIXED NETWORKS, in which the weights cannot be changed, i.e. dW/dt = 0. In such networks, the weights are fixed a priori according to the problem to be solved.
ADAPTIVE NETWORKS, which are able to change their weights, i.e. dW/dt ≠ 0.
All learning methods used for adaptive neural networks can be classified into two major categories:
above two types of learning. Here the learning machine performs some action on the environment and gets a feedback response from the environment. The learning system grades its action as good (rewarding) or bad (punishable) based on the environmental response and accordingly adjusts its parameters. Generally, parameter adjustment is continued until an equilibrium state occurs, following which there will be no more changes in its parameters. Self-organizing neural learning may be categorized under this type of learning.
requires a change in the population of synaptic connections or their weights. The figure shown below is the architectural graph of a multilayer perceptron with two hidden layers and one output layer. This is a fully connected network in
which a neuron in any layer of the network is connected to all the nodes/neurons in the previous layer. Signal flow through the network progresses in a forward direction, from left to right and on a layer-by-layer basis.
Fig 5.4: Multilayer Perceptron
Two kinds of signals are identified in this network.
a) Function signals: A function signal is an input signal (stimulus) that comes in at the input end of the network, propagates forward (neuron by neuron) through the network, and emerges at the output end of the network as an output signal. It is referred to as a function signal for two reasons. First, it is presumed to perform a useful function at the output of the network. Second, at each neuron of the network through which a function signal passes, the signal is calculated as a function of the inputs and associated weights applied to that neuron. The function signal is also referred to as the input signal.
b) Error signals: An error signal originates at an output neuron of the network, and propagates backward (layer by layer) through the network. It is referred to as an error signal because its computation by every neuron of the network involves an error-dependent function in one form or another.
The output neurons (computational nodes) constitute the output layer of the network. The remaining neurons (computational nodes) constitute the hidden layers of the network. Thus the hidden units are not part of the output or input of the network; hence their designation as hidden. The first hidden layer is fed from the input layer made up of sensory units (source nodes); the resulting outputs of the first hidden layer are in turn applied to the next hidden layer; and so on for the rest of the network. Each hidden or output neuron of a multilayer perceptron is designed to perform two computations:
The computation of the function signal appearing at the output of a neuron, which is expressed as a continuous nonlinear function of the input signal and the synaptic weights associated with that neuron.
The computation of an estimate of the gradient vector (i.e., the gradients of the error surface with respect to the weights connected to the inputs of a neuron), which is needed for the backward pass through the network.
given input. It is most useful for feed-forward networks. The term is an abbreviation for "backwards propagation of errors". Back propagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable. Basic back propagation is still the most widely used variant. Its two primary virtues are that it is simple and easy to understand, and it works for a wide range of problems. The back propagation algorithm cycles through two distinct passes, a forward pass followed by a backward pass through the layers of the network. The algorithm alternates between these passes several times as it scans the training data. Typically, the training data has to be scanned several times before the network "learns" to make good classifications.

Forward pass: computation of the outputs of all the neurons in the network. The algorithm starts with the first hidden layer, using as input values the independent variables of a case from the training data set. The neuron outputs are computed for all neurons in the first hidden layer by performing the relevant sum and activation function evaluations. These outputs are the inputs for neurons in the second hidden layer. Again the relevant sum and activation function calculations are performed to compute the outputs of the second-layer neurons. This continues layer by layer until we reach the output layer and compute the outputs for this layer.

Backward pass: propagation of error and adjustment of weights. During the backward pass, on the other hand, the synaptic weights are all adjusted in accordance with an error-correction rule. Specifically, the actual response of the network is subtracted from a desired (target) response to produce an error signal. This error signal is then propagated backward through the network, against the direction of the synaptic connections, hence the name error back-propagation. The synaptic weights are adjusted to make the actual response of the network move closer to the desired response in a statistical sense.
Comparison of LMS and neural network algorithms for reducing ISI
Department of ECE, GMRIT. Page 38
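As a concrete illustration of the two passes, here is a minimal sketch in Python (rather than the thesis's MATLAB) of one forward/backward cycle for a single sigmoid neuron; the input, weight, bias and learning-rate values are hypothetical:

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

x, w, theta = 0.5, 0.8, 0.1    # input, synaptic weight, bias (illustrative)
d = 1.0                        # desired (target) response
eta = 0.5                      # learning rate

# Forward pass: the relevant sum and activation function evaluation
s = w * x + theta
y = sigmoid(s)

# Backward pass: subtract the actual response from the target to get the
# error signal, then adjust the weights by an error-correction rule
e = d - y
delta = e * y * (1.0 - y)      # local gradient (sigmoid derivative y(1-y))
w = w + eta * delta * x
theta = theta + eta * delta
```

After one such cycle the actual response moves slightly closer to the desired response, which is what the repeated passes accomplish over the whole training set.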
6.2 The Delta Rule: For a network of linear units trained on a set of input patterns, the error function is the summed squared error

E = \sum_p E_p, \qquad E_p = \tfrac{1}{2}(d_p - y_p)^2 ------------------(2)

where the index p ranges over the set of input patterns, E_p represents the error on pattern p, d_p is the target output and y_p the actual output. The LMS procedure finds the values of all the weights that minimise the error function by a method called gradient descent. The idea is to make a change in the weight proportional to the negative of the derivative of the error as measured on the current pattern with respect to each weight:

\Delta_p w_j = -\gamma \, \frac{\partial E_p}{\partial w_j} ------------------(3)

where \gamma is the constant of proportionality (the learning rate). By the chain rule,

\frac{\partial E_p}{\partial w_j} = \frac{\partial E_p}{\partial y_p} \cdot \frac{\partial y_p}{\partial w_j} ------------------(4)

Because the units are linear, y_p = \sum_j w_j x_j, so

\frac{\partial y_p}{\partial w_j} = x_j ------------------(5)

And,

\frac{\partial E_p}{\partial y_p} = -(d_p - y_p) ------------------(6)

so that

\Delta_p w_j = \gamma \, \delta_p \, x_j ------------------(7)

where \delta_p = d_p - y_p is the difference between the target output and actual output for pattern p. The delta rule modifies the weight appropriately for target and actual outputs of either polarity and for both continuous and binary input and output units. These characteristics have opened up a wealth of new applications.

6.3 The Generalised Delta Rule: Since we are now using units with nonlinear activation functions, we have to generalise the delta rule, which was presented for linear units, to the set of nonlinear activation functions. The activation is a differentiable function of the total input, given by

y_k^p = F(s_k^p) ------------------(8)

in which

s_k^p = \sum_j w_{jk} \, y_j^p + \theta_k ------------------(9)

To get the correct generalisation of the delta rule, we must set

\Delta_p w_{jk} = -\gamma \, \frac{\partial E_p}{\partial w_{jk}} ------------------(10)

The error measure E_p is defined as the total quadratic error for pattern p at the output units:

E_p = \tfrac{1}{2} \sum_{o=1}^{N_o} (d_o^p - y_o^p)^2 ------------------(11)

where d_o^p is the desired output for unit o when pattern p is clamped. We further set

E = \sum_p E_p ------------------(12)

as the summed squared error. We can write

\frac{\partial E_p}{\partial w_{jk}} = \frac{\partial E_p}{\partial s_k^p} \cdot \frac{\partial s_k^p}{\partial w_{jk}} ------------------(13)

By equation (9), the second factor is \partial s_k^p / \partial w_{jk} = y_j^p. When we define

\delta_k^p = -\frac{\partial E_p}{\partial s_k^p} ------------------(14)

we will get an update rule which is equivalent to the delta rule as described in the previous topic, resulting in a gradient descent on the error surface, if we make the weight changes according to:

\Delta_p w_{jk} = \gamma \, \delta_k^p \, y_j^p ------------------(15)

The trick is to figure out what \delta_k^p should be for each unit k in the network. The interesting result, which we now derive, is that there is a simple recursive computation of these \delta's which can be implemented by propagating error signals backward through the network. To compute \delta_k^p we apply the chain rule to write this partial derivative as the product of two factors, one factor reflecting the change in error as a function of the output of the unit and one reflecting the change in the output as a function of changes in the input. Thus, we have

\delta_k^p = -\frac{\partial E_p}{\partial s_k^p} = -\frac{\partial E_p}{\partial y_k^p} \cdot \frac{\partial y_k^p}{\partial s_k^p} ------------------(16)

By equation (8), the second factor is

\frac{\partial y_k^p}{\partial s_k^p} = F'(s_k^p) ------------------(17)

which is simply the derivative of the squashing function F for the kth unit, evaluated at the net input s_k^p to that unit. To compute the first factor of equation (16), we consider two cases. First, assume that unit k is an output unit k = o of the network. In this case, it follows from the definition of E_p that

\frac{\partial E_p}{\partial y_o^p} = -(d_o^p - y_o^p) ------------------(18)

which is the same result as we obtained with the standard delta rule. Substituting this and equation (17) in equation (16), we get

\delta_o^p = (d_o^p - y_o^p) \, F'(s_o^p) ------------------(19)

for any output unit o. Secondly, if k is not an output unit but a hidden unit k = h, we do not readily know the contribution of the unit to the output error of the network. However, the error measure can be written as a function of the net inputs from hidden to output layer, E_p = E_p(s_1^p, s_2^p, \ldots, s_o^p, \ldots), and we use the chain rule to write

\frac{\partial E_p}{\partial y_h^p} = \sum_{o=1}^{N_o} \frac{\partial E_p}{\partial s_o^p} \cdot \frac{\partial s_o^p}{\partial y_h^p} = -\sum_{o=1}^{N_o} \delta_o^p \, w_{ho} ------------------(20)

Substituting this in equation (16) yields

\delta_h^p = F'(s_h^p) \sum_{o=1}^{N_o} \delta_o^p \, w_{ho} ------------------(21)

Equations (19) and (21) give a recursive procedure for computing the \delta's for all units in the network, which are then used to compute the weight changes according to equation (15). This procedure constitutes the generalised delta rule for a feed-forward network of non-linear units.

The basic back propagation algorithm is implemented in three steps. First, the input pattern is presented to the input layer of the network. These inputs are propagated through the network until they reach the output units. This forward pass produces the actual or predicted output pattern. Second, because back propagation is a supervised learning algorithm, the desired outputs are given as part of the training vector; the actual network outputs are subtracted from the desired outputs to produce an error signal. This error signal is then the basis for the back propagation step, whereby the errors are passed back through the neural network by computing the contribution of each hidden processing unit and deriving the corresponding adjustment needed to produce the correct output. Third, the connection weights are adjusted and the neural network has just learned from an experience. The backward propagation of weight adjustments along these lines continues until we reach the input layer.
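The generalised delta rule can be sketched compactly in Python (rather than MATLAB) for a tiny 2-2-1 network of sigmoid units; the weights, input and learning rate below are arbitrary illustrative values, with the deltas following equations (19) and (21) and the weight changes following equation (15):

```python
import math, random

random.seed(0)

def F(s):                       # squashing function
    return 1.0 / (1.0 + math.exp(-s))

def dF(y):                      # F'(s) written in terms of y = F(s)
    return y * (1.0 - y)

x = [0.3, -0.6]                 # input pattern (illustrative)
d = [1.0]                       # desired output
W1 = [[random.uniform(-1, 1) for _ in x] for _ in range(2)]  # input->hidden
W2 = [[random.uniform(-1, 1) for _ in range(2)]]             # hidden->output
gamma = 0.5                     # learning rate

# Forward pass
yh = [F(sum(w * xi for w, xi in zip(row, x))) for row in W1]
yo = [F(sum(w * hi for w, hi in zip(row, yh))) for row in W2]
E_before = 0.5 * (d[0] - yo[0]) ** 2          # error measure, eq (11)

# Backward pass: deltas for output units, eq (19)
delta_o = [(d[o] - yo[o]) * dF(yo[o]) for o in range(len(yo))]
# Deltas for hidden units, eq (21): propagate the output deltas back
delta_h = [dF(yh[h]) * sum(delta_o[o] * W2[o][h] for o in range(len(yo)))
           for h in range(len(yh))]

# Weight changes, eq (15): Delta_p w_jk = gamma * delta_k * y_j
for o in range(len(yo)):
    for h in range(len(yh)):
        W2[o][h] += gamma * delta_o[o] * yh[h]
for h in range(len(yh)):
    for j in range(len(x)):
        W1[h][j] += gamma * delta_h[h] * x[j]
```

One such step is a gradient-descent move on the error surface, so the error on this pattern decreases after the update.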
To teach the neural network we need a training data set. The training data set consists of input signals (x1 and x2) assigned with a corresponding target (desired output) z. Network training is an iterative process: in each iteration the weight coefficients of the nodes are modified using new data from the training data set, with the modification calculated using the algorithm described below. Each teaching step starts with forcing both input signals from the training set. After this stage we can determine the output signal values for each neuron in each network layer. The pictures below illustrate how the signal propagates through the network. Symbols w(xm)n represent the weights of the connections between network input xm and neuron n in the input layer. Symbols yn represent the output signal of neuron n.
Propagation of signals through the hidden layer. Symbols wmn represent the weights of the connections between the output of neuron m and the input of neuron n in the next layer.

Propagation of signals through the output layer.
In the next algorithm step the output signal of the network y is compared with the desired output value (the target), which is found in the training data set. The difference is called the error signal of the output layer neuron. It is impossible to compute the error signal for internal neurons directly, because the output values of these neurons are unknown. For many years an effective method for training multilayer networks was unknown; only in the mid-eighties was the back propagation algorithm worked out. The idea is to propagate the error signal (computed in a single teaching step) back to all neurons whose output signals were inputs for the neuron in question.
The weight coefficients wmn used to propagate the errors back are equal to those used during computation of the output value. Only the direction of data flow is changed (signals are propagated from outputs to inputs, one layer after the other). This technique is used for all network layers. If the propagated errors come from several neurons, they are added. The illustration is below:
When the error signal for each neuron is computed, the weight coefficients of each neuron's input connections may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are being modified.
The learning-rate coefficient affects the network teaching speed. There are a few techniques to select this parameter. The first method is to start the teaching process with a large value of the parameter; while the weight coefficients are being established, the parameter is gradually decreased. The second, more complicated, method starts teaching with a small parameter value; during the teaching process the parameter is increased as the teaching advances, and then decreased again in the final stage. Starting the teaching process with a low parameter value makes it possible to determine the signs of the weight coefficients. Two major learning parameters are used to control the training process of a back propagation network. The learning rate is used to specify whether the neural network is going to make major adjustments after each learning trial or only minor adjustments. Momentum is used to control possible oscillations in the weights, which could be caused by alternately signed error signals.
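The first schedule described above (start with a large learning rate and decrease it gradually) can be sketched as follows; the decay form and the constants eta0 and tau are assumptions for illustration only:

```python
# Hypothetical decaying learning-rate schedule: large at the start of
# teaching, gradually decreased as the weight coefficients settle.
def learning_rate(t, eta0=0.5, tau=100.0):
    return eta0 / (1.0 + t / tau)

rates = [learning_rate(t) for t in range(0, 500, 100)]
# rates starts at eta0 and decreases monotonically toward zero
```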
The weight adjustment with a momentum term is

\Delta w_{jk}(t+1) = \gamma \, \delta_k^p \, y_j^p + \alpha \, \Delta w_{jk}(t) ---------------(29)

where t indexes the presentation number and \alpha is a constant which determines the effect of the previous weight change. The role of the momentum term is shown in the figure. When no momentum term is used, it takes a long time before the minimum is reached with a low learning rate, whereas for high learning rates the minimum is never reached because of the oscillations. When the momentum term is added, the minimum is reached faster.

Fig 6.1: The descent in weight space. a) For a small learning rate. b) For a large learning rate; note the oscillations. c) With a large learning rate and momentum term added.

This corrective procedure is called back propagation (hence the name of the neural network) and it is applied continuously and repetitively for each set of inputs and the corresponding set of outputs produced in response to the inputs. This procedure continues as long as the individual or total errors in the responses exceed a specified level, or until there are no measurable errors. At this point the neural network has learned the training material, and you can stop the training process and use the neural network to produce responses to new input data. The learning rate applies a greater or lesser portion of the respective adjustment to the old weight. If the factor is set to a large value, then the neural network may learn more quickly, but if there is large variability in the input set then the network may not learn very well or at all. As we train the network, the total error, that is, the sum of the errors over all the training sets, will become smaller and smaller. Once the network reduces the total error to the limit set, training may stop.

The back propagation algorithm is a version of the steepest-descent optimisation method applied to the problem of finding the weights that minimise the error function of the network output. Due to the complexity of the function and the large number of weights that are being trained as the network learns, there is no assurance that the algorithm will find the optimum weights that minimise the error: the procedure can get stuck at a local minimum. It has been found useful to randomise the order of presentation of the cases in a training set between different scans. It is possible to speed up the algorithm by batching, that is, updating the weights for several exemplars in a pass. However, at least in the extreme case of using the entire training data set on each update, this has been found to get stuck frequently at poor local minima. A single scan of all cases in the training data is called an epoch.
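The momentum idea can be sketched in a few lines of Python; the gradient steps and the alpha value are illustrative:

```python
# Each new weight change is the current gradient-descent step plus a
# fraction alpha (the momentum control parameter) of the previous change.
def momentum_step(w, grad_step, prev_dw, alpha=0.9):
    dw = grad_step + alpha * prev_dw
    return w + dw, dw

w, prev = 0.0, 0.0
changes = []
for step in [0.1, 0.1, 0.1]:    # a persistent gradient direction
    w, prev = momentum_step(w, step, prev)
    changes.append(prev)
# With a persistent direction the changes build up inertia
# (0.1, 0.19, 0.271), which is how momentum speeds the descent
# along a valley while damping alternately signed steps.
```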
Most applications of feed-forward networks and back propagation require several epochs before the errors are reasonably small. A number of modifications have been proposed to reduce the number of epochs needed to train a neural net. One commonly employed idea is to incorporate a momentum term that injects some inertia into the weight adjustment on the backward pass. This is done by adding to the expression for the weight adjustment for a connection a term that is a fraction of the previous weight adjustment for that connection. This fraction is called the momentum control parameter. Another idea is to vary the adjustment parameter so that it decreases as the number of epochs increases. Intuitively this is useful because it avoids the overfitting that is more likely to occur at later epochs than at earlier ones. The point of minimum validation error is a good indicator of the best number of epochs for training, and the weights at that stage are likely to provide the best error rate on new data.

6.6 MATLAB CODE FOR BACK-PROPAGATION ALGORITHM:

clc; clear all; close all;
N = 1000;                 % No. of symbols
M = 5;
u = randint(1,N);         % Training signal (randi([0 1],1,N) in newer MATLAB)
c = randint(M,1);         % Channel to be equalized
c = c / norm(c);
z = filter(c,1,u);        % Channel output
SNR = 60;                 % Additive noise to the channel output
var_v = var(z) * 10^(-SNR/10);
v = var_v^0.5 * randn(1,N);   % Gaussian noise (randn, not randint)
x = z + v;                % Input to the equalizer
x1 = [zeros(1,9)];        % Pad with 9 zeros for the 10-tap input window
for i = 1:N
    x1(9+i) = x(i);
end
n = 10;          % no. of nodes in input layer
h1 = 6;          % no. of nodes in first hidden layer
h2 = 3;          % no. of nodes in second hidden layer
eta = .002;      % learning rate
alp = .00001;    % momentum coefficient
it = 1000;       % no. of iterations
th1 = [-.202 .268 -.328 -.107 .295 -.106];   % bias weights, first hidden layer
th2 = [.293 .329 -.552];                     % bias weights, second hidden layer
tho = [.526];                                % bias weight, output layer
a = 1.2;         % slope of the sigmoid activation
Eavg = []; E = 0;
% Parameter initialization starts
for j1 = 1:h1
    for i = 1:n
        pwh1(j1,i) = 0.0;    % previous weights (for the momentum term)
    end
end
%----------------------First hidden layer---------------------%
wh1 = [ .04  -.234  .0132  .53  -.77  -.012  .257  .994 -.421  .192;
       -.231  .435 -.93   -.351  .561 -.126  .789  .310 -.409 -.11;
       -.191  .312 -.141   .234 -.249  .335  .125 -.267 -.195  .021;
        .606 -.489  .143  -.245  .198 -.395  .477 -.199 -.356  .11;
       -.102  .426  .129   .410 -.230 -.138 -.104  .128 -.119  .25;
       -.250  .129  .109   .569 -.235  .150 -.334  .956  .185 -.156];
for j2 = 1:h2
    for j1 = 1:h1
        pwh2(j2,j1) = 0.0;
    end
end
%------------------Second hidden layer-----------------------%
wh2 = [-.240  .536 -.628 -.284 -.431  .191;
        .462 -.185  .421 -.104 -.118 -.278;
       -.104 -.143  .028  .78  -.726 -.198];
for k = 1:1
    for j2 = 1:h2
        pwo(k,j2) = 0.0;
    end
end
%----------------Output layer--------------------------------%
wo = [.441 -.256 -.726];   % (-.256: the printed value -256 is a typo)
% Parameter initialization ends

for kk = 1:it        % training iterations; mean square error calculation
    E = .0;
    for i = 10:(N+9)
        xx(1) = x1(i);   xx(2) = x1(i-1); xx(3) = x1(i-2); xx(4) = x1(i-3);
        xx(5) = x1(i-4); xx(6) = x1(i-5); xx(7) = x1(i-6); xx(8) = x1(i-7);
        xx(9) = x1(i-8); xx(10) = x1(i-9);
        % Forward pass: first hidden layer
        for j1 = 1:h1
            vvh1(j1) = 0.0;
            for i1 = 1:n
                vvh1(j1) = vvh1(j1) + wh1(j1,i1)*xx(i1);
            end
        end
        for j1 = 1:h1
            vh1(j1) = vvh1(j1) + th1(j1);
            yh1(j1) = 1/(1+exp(-a*vh1(j1)));
        end
        % Forward pass: second hidden layer
        for j2 = 1:h2
            vvh2(j2) = 0.0;
            for j1 = 1:h1
                vvh2(j2) = vvh2(j2) + wh2(j2,j1)*yh1(j1);
            end
        end
        for j2 = 1:h2
            vh2(j2) = vvh2(j2) + th2(j2);
            yh2(j2) = 1/(1+exp(-a*vh2(j2)));
        end
        % Forward pass: linear output neuron
        vvo = 0.0;
        for j2 = 1:h2
            vvo = vvo + wo(j2)*yh2(j2);
        end
        vo = vvo + tho;
        yo = vo;
        ylast(kk,i-9) = yo;
        d = u(i-9);
        e = d - yo;
        elast(kk,i-9) = e;
        % Error gradient of output layer
        del = e;
        % Error gradient on hidden layer 2
        for j2 = 1:h2
            dw2(j2) = del*wo(j2);
        end
        for j2 = 1:h2
            delh2(j2) = 0.5*a*(1-yh2(j2))*yh2(j2)*dw2(j2);
        end
        % Error gradient on hidden layer 1
        for j1 = 1:h1
            dw1(j1) = 0;
            for j2 = 1:h2
                dw1(j1) = dw1(j1) + delh2(j2)*wh2(j2,j1);
            end
        end
        for j1 = 1:h1
            delh1(j1) = 0.5*a*(1-yh1(j1))*yh1(j1)*dw1(j1);
        end
        % Update weights of output layer (gradient step + momentum)
        for j2 = 1:h2
            dwo(j2) = eta*del*yh2(j2);
            dwom(j2) = alp*(wo(j2)-pwo(j2));
            pwo(j2) = wo(j2);
            wo(j2) = wo(j2) + dwo(j2) + dwom(j2);
        end
        % Update weights of hidden layer 2
        for j2 = 1:h2
            for j1 = 1:h1
                dwh2(j2,j1) = eta*delh2(j2)*yh1(j1);
                dwhh2(j2,j1) = alp*(wh2(j2,j1)-pwh2(j2,j1));
                pwh2(j2,j1) = wh2(j2,j1);
                wh2(j2,j1) = wh2(j2,j1) + dwh2(j2,j1) + dwhh2(j2,j1);
            end
        end
        % Update weights of hidden layer 1
        for j1 = 1:h1
            for i1 = 1:n
                dwh1(j1,i1) = eta*delh1(j1)*xx(i1);
                dwhh1(j1,i1) = alp*(wh1(j1,i1)-pwh1(j1,i1));
                pwh1(j1,i1) = wh1(j1,i1);
                wh1(j1,i1) = wh1(j1,i1) + dwh1(j1,i1) + dwhh1(j1,i1);
            end
        end
        ersq = e^2;
        E = E + .5*ersq;
    end
    Eav = E/N;
    Eavg = [Eavg Eav];
end
plot(Eavg);
% Hard-decide the equalizer outputs and count bit errors per iteration
for i = 1:it
    for j = 1:N
        if ylast(i,j) > .52
            yn(i,j) = 1;
        else
            yn(i,j) = 0;
        end
    end
end
k = 1; count = []; c = 0;
for i = 1:kk
    for j = 1:N
        if (u(k) ~= yn(i,j))
            c = c + 1;
        end
        k = k + 1;
    end
    count = [count c];
    c = 0;
    k = 1;
end
figure, plot(count);
CHAPTER 7: COMPARISON
7.1 Factors Affecting the Performance of the System:
Convergence
Settling time
Mean Square Error
Bit Error Rate
7.1.1 Convergence:
Adaptive filters iteratively optimize the filter coefficients to minimize the power of the error signal. The process of minimizing the power of the error signal is known as convergence. A fast convergence indicates that the adaptive filter takes a short time to calculate the appropriate filter coefficients that minimize the power of the error signal.
7.1.2 Settling time: Settling time is the time period that adaptive filters take to converge; a smaller settling time means a faster convergence speed. Convergence speed is also known as the adaptation rate. Steady state is the state in which the adaptive filter has converged and the filter coefficients no longer change significantly. Because signals might include random noise, or because adaptive filters are not optimum, the error signal outputs of adaptive filters are not necessarily zero when the adaptive filter converges. This error is called the steady-state error.
7.1.3 Mean Square Error: The mean square error (MSE) of an estimator is the expected value of the squared error loss or quadratic loss; it measures the average of the square of the "error." The error is the amount by which the estimator differs from the quantity to be estimated. The difference occurs because of randomness or because the estimator does not account for information that could produce a more accurate estimate.
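As a small worked illustration (the values are hypothetical), the MSE of a set of estimates against the quantities being estimated is:

```python
# MSE: average of the squared differences between estimates and true values
desired  = [1.0, 0.0, 1.0, 1.0]
estimate = [0.9, 0.2, 0.7, 1.0]
mse = sum((dv - ev) ** 2 for dv, ev in zip(desired, estimate)) / len(desired)
# (0.01 + 0.04 + 0.09 + 0.0) / 4 = 0.035
```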
7.1.4 Bit Error Rate: As an example, assume this transmitted bit sequence: 0 1 1 0 0 0 1 0 1 1, and the following received bit sequence: 0 0 1 0 1 0 1 0 0 1. The BER is in this case 3 incorrect bits (the second, fifth and ninth bits) divided by 10 transferred bits, resulting in a BER of 0.3, or 30%. The bit error probability pe is the expectation value of the BER; the BER can be considered an approximate estimate of the bit error probability. In a communication system, the receiver-side BER may be affected by transmission channel noise, interference, distortion, bit synchronization problems, attenuation, wireless multipath fading, etc. The BER may be improved by choosing a strong signal strength (unless this causes cross-talk and more bit errors), by choosing a slow and robust modulation scheme or line coding scheme, and by applying channel coding schemes such as redundant forward error correction codes.
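The worked example above translates directly into code:

```python
# BER = number of bit errors / total number of transferred bits
tx = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1]   # transmitted sequence from the text
rx = [0, 0, 1, 0, 1, 0, 1, 0, 0, 1]   # received sequence from the text
errors = sum(t != r for t, r in zip(tx, rx))
ber = errors / len(tx)
# errors == 3, so ber == 0.3 (30%)
```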
7.2 Comparison between LMS and Back Propagation:

Factor                       LMS        BACK PROPAGATION
No. of symbols               1000       1000
No. of iterations required   997        575
Mean Square Error            0.137      0.02808
Bit Error Rate               58         14
Computational time           5 hr       10 min

7.3 Comparison of Mean Square Error Result
CONCLUSION
In our present work, we have taken up the problem of inter symbol interference in digital data transmission. To overcome this problem, we have used adaptive channel equalization techniques, namely the LMS algorithm and a multilayered feed forward network, and compared the results. From table 7.2 it is evident that for transmission of 1000 bits through the channel, the back propagation neural network performs better than the ordinary LMS algorithm in terms of the number of iterations needed for convergence, mean square error, computational time and bit error rate. For 1000 transmitted bits, the number of iterations is 997 for the LMS algorithm against 575 for the multilayer perceptron model, the mean square errors are 0.137 and 0.02808, and the bit error rates are 58 and 14, respectively. One more important factor is computation time: the LMS algorithm takes 5 hours to compute, while the multilayer perceptron model takes 10 minutes. Hence we can conclude that the multilayer perceptron model outperforms the LMS algorithm. The BER can be made still smaller by a suitable choice of activation function and threshold values in the perceptron learning mechanism, by including more layers, and by certain optimization techniques for faster learning of this model.
Future Enhancements:
We can use certain advanced techniques like radial basis function networks, Elman networks, and binary-valued Boolean networks for supervised cases, together with certain optimization techniques. This model can also be extended to non-linear channels.
BIBLIOGRAPHY
1) K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Networks, vol. 2, no. 5, pp. 359-366, 1989.
2) K. Hornik, M. Stinchcombe, and H. White, "Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks," Neural Networks, vol. 3, no. 5, pp. 551-560, 1990.
3) Yu Hen Hu and Jenq-Neng Hwang, Handbook of Neural Network Signal Processing, CRC Press, 2002.
4) H. B. Demuth and M. Beale, Neural Network Design, PWS Publishing Company, 1995.
5) Rajkumar Thenua and S. K. Agarwal, "Simulation and performance analysis of adaptive filter in noise cancellation," International Journal of Engineering Science and Technology, vol. 2, no. 9, pp. 4373-4378, 2010.
6) J. G. Proakis and J. H. Miller, "An adaptive receiver for digital signalling through channels with inter symbol interference," fourth edition.
7) D. A. George, R. R. Bowen, and J. R. Storey, "An adaptive decision feedback equalizer," first edition.
8) S. Haykin, Adaptive Filter Theory, third edition, Upper Saddle River, N.J.: Prentice-Hall, 1996.
9) Simon Haykin, Neural Networks, second edition.
10) B. P. Lathi, Modern Digital Communication, third edition.
11) B. Farhang-Boroujeny, Adaptive Filters: Theory and Applications, Chichester, England: John Wiley & Sons, 1998.
12) www.mathworks.com
13) www.ieeeexplore.com
14) Mark Hudson Beale, Martin T. Hagan, and Howard B. Demuth, Neural Network Toolbox User's Guide, The MathWorks, 1992-2011.
15) Brian R. Hunt, Ronald L. Lipsman, and Jonathan M. Rosenberg, with Kevin R. Coombes, John E. Osborn, and Garrett J. Stuck, A Guide to MATLAB: for Beginners and Experienced Users, Cambridge University Press, 2001.