
CHAPTER 1 INTRODUCTION

1.1 INTRODUCTION
Advances in semiconductors, digital signal processing, and communication technologies
have made multimedia applications more flexible and reliable. A good example is the H.264
video standard, also known as MPEG-4 Part 10 Advanced Video Coding, which is widely
regarded as the next-generation video compression standard. Video compression is necessary in a
wide range of applications to reduce the total amount of data required for transmitting or storing
video data. Among the components of a coding system, the motion estimator (ME) is of priority
concern in exploiting the temporal redundancy between successive frames, yet it is also the most
time-consuming aspect of coding. Performing 60% to 90% of the computations encountered in
the entire coding system, the ME is widely regarded as the most computationally intensive part
of a video coding system.
An ME generally consists of PEs with a size of 4 x 4. However, accelerating the
computation speed depends on a large PE array, especially in high-resolution devices with a large
search range such as HDTV. Additionally, the visual quality and peak signal-to-noise ratio
(PSNR) at a given bit rate are degraded if an error occurs in the ME process. A testable design is
thus increasingly important to ensure the reliability of the numerous PEs in an ME. Moreover,
although advances in VLSI technology facilitate the integration of a large number of PEs of an
ME into a chip, the logic-per-pin ratio is subsequently increased, significantly decreasing the
efficiency of logic testing on the chip. For a commercial chip, it is therefore necessary for the
ME to incorporate design for testability (DFT). DFT focuses on increasing the ease of device
testing, thus guaranteeing high reliability of a system. DFT methods rely on reconfiguration of a
circuit under test (CUT) to improve testability. While DFT approaches enhance the testability of
circuits, advances in sub-micron technology, and the resulting increases in the complexity of
electronic circuits and systems, have made built-in self-test (BIST) schemes a necessity in the
digital world. BIST for the ME does not require expensive test equipment, ultimately lowering
test costs. Moreover, BIST can generate test stimuli and analyze test responses without outside
support, streamlining the testing and diagnosis of digital systems. However, the increasing
density of circuitry requires that the built-in testing approach not only detect faults but also
specify their locations for error correction. Thus,


extended schemes of BIST, referred to as built-in self-diagnosis and built-in self-correction, have
been developed recently.
While extended BIST schemes generally focus on memory circuits, testing-related
issues of video coding have seldom been addressed. Thus, exploring the feasibility of an
embedded testing approach to detect errors and recover data in an ME is of worthwhile interest.
Additionally, the reliability of the numerous PEs in an ME can be improved by enhancing the
capabilities of concurrent error detection (CED). The CED approach can detect errors through
conflicting and undesired results generated from operations on the same operands. CED can also
test the circuit at full operating speed without interrupting the system. Thus, based on the CED
concept, this work develops a novel EDDR architecture based on the RQ code to detect errors
and recover data in the PEs of an ME and, in doing so, further guarantee excellent reliability for
video coding testing applications.
1.2 OVERVIEW
Video compression is the field of electrical engineering and computer science that deals
with the representation of video data, for storage and/or transmission, for both analog and digital
video. Although video coding is often considered to apply only to natural video, it can also be
applied to synthetic (computer-generated) video, i.e. graphics. Many representations take
advantage of features of the Human Visual System to achieve an efficient representation. The
biggest challenge is to reduce the size of the video data using video compression; for this reason
the terms video compression and video coding are often used interchangeably.
The search for efficient video compression techniques dominated much of the research activity
in video coding since the early 1980s. A major milestone was H.261, from which JPEG
adopted the idea of using the DCT; since then, many other advancements have been made to
algorithms such as motion estimation. Since approximately 2000 the focus has shifted more to
metadata and video search, resulting in MPEG-7 and MPEG-21.
1.2.1 Video Compression
The main problem with uncompressed (raw) video is that it contains an immense amount of
data, while communication and storage capacities are limited and expensive. For
example, if we consider an HDTV video signal with 1280 x 720 pixels/frame with progressive
scanning at 60 frames/sec, then the transmitter must be able to send

(720 x 1280 pixels/frame) x (60 frames/sec) x (3 colours/pixel) x (8 bits/colour) = 1.3 Gb/sec
But the available HDTV channel bandwidth is around 20 Mb/s, i.e., this signal requires
compression by a factor of about 70. A DVD (Digital Versatile Disk) can only store a few
seconds of raw video at television-quality frame rate and resolution.
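The arithmetic above can be checked with a short script (a sketch using only the figures quoted in the example: 1280 x 720 pixels, 60 frames/sec, 3 colour components, 8 bits each):

```python
# Raw bit rate of an uncompressed 1280x720 progressive video stream,
# using the figures from the example above.
width, height = 1280, 720        # pixels per frame
fps = 60                         # frames per second
components = 3                   # colour components per pixel
bits_per_component = 8

bits_per_second = width * height * fps * components * bits_per_component
print(f"{bits_per_second / 1e9:.2f} Gb/s")   # 1.33 Gb/s

# Compression factor needed to fit a ~20 Mb/s channel:
print(round(bits_per_second / 20e6))         # 66, i.e. roughly a factor of 70
```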
1.2.2 Need for Compression
The following statement (or something similar) has been made many times over the 20-year
history of image and video compression: "Video compression will become redundant very
soon, once transmission and storage capacities have increased to a sufficient level to cope with
uncompressed video." It is true that both storage and transmission capacities continue to
increase. However, an efficient and well-designed video compression system gives very
significant performance advantages for visual communications at both low and high
transmission bandwidths. At low bandwidths, compression enables applications that would not
otherwise be possible, such as basic-quality video telephony over a standard telephone
connection. At high bandwidths, compression can support a much higher visual quality. For
example, a 4.7-Gbyte DVD can store approximately 2 hours of uncompressed QCIF video (at 15
frames per second) or 2 hours of compressed ITU-R 601 video (at 30 frames per second). Most
users would prefer to see television-quality video with smooth motion rather than postage-stamp
video with jerky motion. Video CODECs will therefore remain an important part of the
emerging multimedia industry for the foreseeable future, allowing designers to make the most
efficient use of available transmission or storage capacity. In this chapter we introduce the basic
components of an image or video compression system. We then describe the main functional
blocks of an image encoder/decoder (CODEC) and a video CODEC.
1.2.3 Achieving Compression
Video compression can be achieved by exploiting the redundancies and
irrelevancy that exist in a typical video signal. The redundancy in a video signal is based on two
principles. The first is the spatial redundancy that exists within each frame. The second is the
similarity between corresponding frames, called temporal redundancy. This temporal
redundancy can be eliminated by using a motion estimation and compensation procedure.
Identifying and reducing the redundancy in a video signal is relatively straightforward; reducing
the irrelevancy is harder, because deciding what is perceptually relevant and what is not is
difficult. This operation can be done by using appropriate models of the Human Visual System.
In video, successive frames may contain the same objects (still or moving). In inter-frame
coding, motion estimation and compensation have become powerful techniques to eliminate the
temporal redundancy due to the high correlation between consecutive frames. In video scenes,
motion can be a complex combination of translation and rotation. Such motion is difficult to
estimate and may require a large amount of processing. Translational motion, however, is easily
estimated and has been used successfully for motion-compensated coding.
Different search algorithms are used to estimate motion between frames. When motion
estimation is performed by an MPEG-2 encoder it groups pixels into 16 x 16 macro blocks.
MPEG-4 AVC encoders can divide these macro blocks into smaller partitions, down to 4 x 4, and
even of variable size within the same macro block. Partitions allow for more accuracy in motion
estimation because areas with high motion can be isolated from those with less movement.


CHAPTER 2 LITERATURE REVIEW AND PROBLEM IDENTIFICATION
2.1 LITERATURE REVIEW
Researchers have proposed many algorithms for motion estimation.
Generally, motion estimation approaches are divided into two types:
1) Pixel-based motion estimation.
2) Block-based motion estimation.
The pixel-based motion estimation approach seeks to determine a motion vector for every
pixel in the image. This is also referred to as the optical flow method, which works on the
fundamental assumption of brightness constancy, that is, the intensity of a pixel remains constant
when it is displaced. However, no unique match for a pixel in the reference frame is found in the
direction normal to the intensity gradient. For this reason an additional constraint is
introduced in terms of the smoothness of the displacement vectors in the neighborhood. The
smoothness constraint makes the algorithm iterative and requires excessively large computation
time, making it unsuitable for practical and real-time implementation.
An alternative and faster approach is block-based motion estimation. In this method, the
candidate frame is divided into non-overlapping blocks (of size 16 x 16, 8 x 8 or even 4 x 4
pixels in the recent standards) and for each such candidate block, the best motion vector is
determined in the reference frame. Here, a single motion vector is computed for the entire block,
whereby we make an inherent assumption that the entire block undergoes translational motion.
This assumption is reasonably valid, except at object boundaries; a smaller block size leads
to better motion compensation and motion estimation.
Block-based motion estimation is accepted in all the video coding standards proposed to
date. It is easy to implement in hardware, and real-time motion estimation and prediction is
possible.
Many studies in the literature use different block-matching motion estimation algorithms.
Among these, the full search gives the minimum error compared with all other block-matching
algorithms; it is the basic search, but it has the maximum computational complexity. In
block-matching motion estimation, two things must be kept in mind. The first is the type of
search pattern; this is the most important, because when the object moves, following a suitable
pattern yields the minimum number of search points for block matching. The second is the
matching criterion, the mean absolute difference: when the search pattern is very close to the
object, the error is minimized.
The block size is one of the important parameters in a block-matching algorithm. A
smaller block size achieves better prediction quality, for a number of reasons. A
smaller block size reduces the effect of the accuracy problem; in other words, with a smaller
block size, there is less possibility that the block will contain different objects moving in
different directions.
In the literature review I observed that across these search algorithms there is a trade-off:
reducing the error increases the number of search points, while reducing the number of search
points increases the error. Based on this observation, a fast block-matching algorithm is
proposed that gives the best search accuracy with minimum error. Finally, the performance of
the proposed algorithm is evaluated in terms of completeness and correctness.
2.1.1 Motion Estimation
A video sequence can be considered a discretized three-dimensional projection of
the real four-dimensional continuous space-time. The objects in the real world may move, rotate,
or deform. The movements cannot be observed directly; instead, the light reflected from the
object surfaces is projected onto an image. The light source can be moving, and the reflected
light varies depending on the angle between a surface and the light source. There may be objects
occluding the light rays and casting shadows. The objects may be transparent (so that several
independent motions could be observed at the same location of an image) or there might be fog,
rain or snow blurring the observed image. The discretization introduces noise into the video
sequence, from which the video encoder makes its motion estimates. There may also be noise
in the image capture device (such as a video camera) or in the electrical transmission lines. A
perfect motion model would take all these factors into account and find the motion that has the
maximum likelihood given the observed video sequence. The difference between the current
frame and the reference frame can be observed in Figure 2.1.


Changes between frames are mainly due to the movement of objects. Using a model of
the motion of objects between frames, the encoder estimates the motion that occurred between
the reference frame and the current frame. This process is called motion estimation (ME). The
encoder then uses this motion model and information to move the contents of the reference frame
to provide a better prediction of the current frame. This process is known as motion
compensation (MC), and the prediction so produced is called the motion-compensated prediction
(MCP) or the displaced frame (DF). In this case, the coded prediction error signal is called the
displaced-frame difference (DFD). A block diagram of a motion-compensated coding system is
illustrated in Figure 2.2. This is the most commonly used inter-frame coding method.

Figure 2.1 Motion estimation detector


The reference frame employed for ME can occur temporally before or after the current
frame. The two cases are known as forward prediction and backward prediction, respectively,
as illustrated in Figure 2.3. In bidirectional prediction, two reference frames (one each for
forward and backward prediction) are employed and the two predictions are interpolated (the
resulting predicted frame is called a B-frame). The most commonly used ME method is the
block-matching motion estimation (BMME) algorithm.


Figure 2.2 Motion compensated video coding

Figure 2.3 Predictive sources coding with motion compensation


2.1.2 Motion Estimation Procedure
After motion estimation is completed, the picture residue and motion vectors are
produced. This procedure is executed for each block (16x16, 8x8 or 4x4) in the current frame.

1. For the reference frame, a search area is defined for each block in the current frame. The
search area is typically sized at 2 to 3 times the macro block size (16x16). Using the fact
that the motion between consecutive frames is statistically small, the search range is
confined to this area. After the search process, a best match will be found within the
area. The best match usually means the lowest energy in the residual
formed by subtracting the candidate block in the search region from the current block
in the current frame. The process of finding the best match block by block is called
block-based motion estimation.

2. After finding the best match, the motion vectors and residues between the current block
and reference block are computed. The process of obtaining the residues and motion vectors
is known as motion compensation.
3. The residues and motion vectors of the best match are encoded by the transform unit and
entropy unit and transmitted to the decoder side.
4. At the decoder side, the process is reversed to reconstruct the original picture.
Figure 2.4 shows an illustration of the above procedure. In modern video coding standards, the
reference frame can be a previous frame, a future frame or a combination of two or more
previously coded frames. The number of reference frames needed depends on the required
accuracy: the more reference frames referenced by the current block, the more accurate the
prediction.
2.1.3 Motion Vectors
To find the motion of each block, a motion vector is defined as the relative displacement
between the current candidate block and the best matching block within the search window in the
reference frame. It is a directional pair representing the displacement in the horizontal (x-axis)
and vertical (y-axis) directions. The maximum value of the motion vector is determined by
the search range: the larger the search range, the more bits are needed to code the motion vector.
Designers need to make trade-offs between these two conflicting parameters. The motion vector
is illustrated in Figure 2.4.
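The trade-off between search range and motion-vector coding cost can be made concrete with a small sketch (an illustration, not taken from the source): with a search range of +/-R pixels, each motion-vector component can take 2R + 1 values, so a simple fixed-length code needs ceil(log2(2R + 1)) bits per component.

```python
import math

def mv_bits(search_range: int) -> int:
    """Bits for one motion-vector component under a fixed-length code,
    given a search range of +/- search_range pixels."""
    return math.ceil(math.log2(2 * search_range + 1))

for r in (7, 15, 31):
    print(r, mv_bits(r))   # a larger range needs more bits per component
```

Real CODECs use variable-length and predictive coding of motion vectors, so this fixed-length count is only an upper-bound illustration of the trade-off.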


Figure 2.4 Motion Estimation and Motion Vector


A motion vector is produced for each macro block in the frame; MPEG-1 and MPEG-2 employ
this property. With the introduction of variable-block-size motion estimation in MPEG-4 and
H.264/AVC, one macro block can produce more than one motion vector due to the existence of
different kinds of sub-blocks. In H.264, 41 motion vectors can be produced in one macro
block, and they are passed to rate-distortion optimization to choose the best combination. This is
known as mode selection.
2.1.4 Prediction Of Video CODEC
A video signal consists of a sequence of individual frames. Each frame may be
compressed individually using an image CODEC as described above: this is known as
intra-frame coding, where each frame is intra-coded without any reference to other frames.
However, better compression performance may be achieved by exploiting the temporal
redundancy in a video sequence (the similarities between successive video frames). This may be
achieved by adding a front end to the image CODEC, with two main functions:

These are:
(1) Prediction: a prediction of the current frame is formed based on one or more previously
transmitted frames.
(2) Compensation: the prediction is subtracted from the current frame to produce a residual
frame.
The residual frame is then processed using an image CODEC. The key to this approach
is the prediction function: if the prediction is accurate, the residual frame will contain little data
and will hence be compressed to a very small size by the image CODEC. In order to decode the
frame, the decoder must reverse the compensation process, adding the prediction to the decoded
residual frame (reconstruction). This is inter-frame coding: frames are coded based on some
relationship with other video frames, i.e. coding exploits the interdependencies of video frames.
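The prediction/compensation/reconstruction round trip can be sketched in a few lines (an illustrative sketch with a lossless residual and no real image CODEC; the frame contents are synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(4, 4)).astype(np.int16)
# The current frame is the reference plus a small change.
current = reference + rng.integers(-3, 4, size=(4, 4)).astype(np.int16)

prediction = reference                 # simplest predictor: the previous frame
residual = current - prediction        # compensation: subtract the prediction
# ... the residual would be compressed by the image CODEC here ...
reconstructed = prediction + residual  # decoder: add the prediction back

assert np.array_equal(reconstructed, current)  # lossless in this sketch
```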

Figure 2.5 (a) Current Frame

Figure 2.5(b) Previous Frame


Figure 2.5(c) Residual frame


2.1.5 Frame Differencing
The simplest predictor is just the previous transmitted frame. Figure 2.5 shows the
residual frame produced by subtracting the previous frame from the current frame in a video
sequence. Mid-grey areas of the residual frame contain zero data; light and dark areas indicate
positive and negative residual data respectively. It is clear that much of the residual data is zero:
hence, compression efficiency can be improved by compressing the residual frame rather than the
current frame.
Encoder input    | Encoder prediction | Encoder output        | Decoder input         | Decoder prediction | Decoder output
Original frame 1 | Zero               | Compressed frame 1    | Compressed frame 1    | Zero               | Decoded frame 1
Original frame 2 | Original frame 1   | Compressed residual 2 | Compressed residual 2 | Decoded frame 1    | Decoded frame 2
Original frame 3 | Original frame 2   | Compressed residual 3 | Compressed residual 3 | Decoded frame 2    | Decoded frame 3

Table 2.1 Prediction drift


Figure 2.6 Encoder with Decoding Loop


The decoder faces a potential problem that can be illustrated as follows. Table 2.1
shows the sequence of operations required to encode and decode a series of video frames
using frame differencing. For the first frame the encoder and decoder use no prediction. The
problem starts with frame 2: the encoder uses the original frame 1 as a prediction and encodes
the resulting residual. However, the decoder only has the decoded frame 1 available to form
the prediction. Because the coding process is lossy, there is a difference between the decoded
and original frame 1 which leads to a small error in the prediction of frame 2 at the decoder.
This error will build up with each successive frame and the encoder and decoder predictors
will rapidly drift apart, leading to a significant drop in decoded quality. The solution to this
problem is for the encoder to use a decoded frame to form the prediction. Hence the encoder
in the above example decodes (or reconstructs) frame 1 to form a prediction for frame 2. The
encoder and decoder use the same prediction and drift should be reduced or removed. Figure
2.6 shows the complete encoder which now includes a decoding loop in order to reconstruct
its prediction reference. The reconstructed (or reference) frame is stored in the encoder and
in the decoder to form the prediction for the next coded frame.
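The drift behaviour, and how the decoding loop removes it, can be demonstrated numerically (a minimal sketch in which coarse quantisation stands in for the lossy CODEC; the slowly brightening scene is an assumption for illustration):

```python
import numpy as np

def quantise(x, step=8):
    """Coarse quantisation standing in for a lossy image CODEC."""
    return np.round(x / step) * step

# A slowly brightening 2x2 scene over 50 frames.
frames = [np.full((2, 2), 100.0 + 3 * k) for k in range(50)]

# Naive encoder: predicts from ORIGINAL frames, but the decoder
# only has its own decoded frames to predict from.
decoded = quantise(frames[0])
for k in range(1, len(frames)):
    residual = quantise(frames[k] - frames[k - 1])  # prediction from originals
    decoded = decoded + residual                    # decoder adds its prediction
drift_naive = abs(decoded - frames[-1]).max()

# Correct encoder: keeps a decoding loop (Figure 2.6) and predicts
# from its own reconstructed frames, matching the decoder.
recon = quantise(frames[0])
for k in range(1, len(frames)):
    residual = quantise(frames[k] - recon)          # prediction from reconstruction
    recon = recon + residual
drift_loop = abs(recon - frames[-1]).max()

print(drift_naive, drift_loop)  # drift accumulates in the naive scheme only
```

With the decoding loop, the error never exceeds the quantiser's own rounding error; without it, the per-frame errors accumulate.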
2.1.6 Motion Compensated Prediction
Frame differencing gives better compression performance than intra-frame coding
when successive frames are very similar, but does not perform well when there is a significant
change between the previous and current frames. Such changes are usually due to movement

in the video scene and a significantly better prediction can be achieved by estimating this
movement and compensating for it.
Figure 2.7 below shows a video CODEC that uses motion-compensated
prediction. Two new steps are required in the encoder:
Motion estimation: a region of the current frame (often a rectangular block of
luminance samples) is compared with neighboring regions of the previous
reconstructed frame.

Figure 2.7 Video CODEC with Motion Estimation and Compensation


The motion estimator attempts to find the best match, i.e. the neighboring block in the
reference frame that gives the smallest residual block.
Motion compensation: the matching region or block from the reference frame
(identified by the motion estimator) is subtracted from the current region or block.


Here the decoder carries out the same motion compensation to reconstruct the current
frame. This means the encoder has to transmit the location of the best matching
blocks to the decoder (typically in the form of a set of motion vectors). Figure 2.8 shows a
residual frame produced by subtracting a motion-compensated version of the previous frame from
the current frame. This residual frame clearly contains less data than the frame-differencing
residual. The improvement in compression does not come without a price: motion estimation can
be very computationally intensive. The design of a motion estimation algorithm can have a
dramatic effect on the compression performance and computational complexity of a video CODEC.

Figure 2.8 Residual frame (MAD)


2.1.7 Block Matching Algorithm
Figure 2.9 illustrates the process of a block-matching algorithm. In a typical block-matching
algorithm, each frame is divided into blocks, each of which consists of luminance and
chrominance blocks. Usually, for coding efficiency, motion estimation is performed only on the
luminance block. Each luminance block in the present frame is matched against candidate blocks
in a search area in the reference frame. These candidate blocks are just displaced versions of the
original block. The best candidate block is found and its displacement (motion vector) is
recorded. In a typical inter-frame coder, the input frame is subtracted from the prediction based
on the reference frame. Consequently the motion vector and the resulting error can be transmitted
instead of the original luminance block; thus inter-frame redundancy is removed and data
compression is achieved. At the receiver end, the decoder builds the frame difference signal from
the received data and adds it to the reconstructed reference frame.

Figure 2.9 Illustration of Motion Estimation Process


This algorithm is based on a translational model of the motion of objects between frames.
It also assumes that all pixels within a block undergo the same translational movement. There are
many other ME methods, but BMME is normally preferred due to its simplicity and good
compromise between prediction quality and motion overhead. The translational assumption is
not strictly valid, since we capture 3-D scenes through the camera and objects have more degrees
of freedom than just the translational one. However, the assumption is still reasonable
considering the practical movements of objects over one frame, and it makes the computations
much simpler. There are many other approaches to motion estimation, some using
the frequency or wavelet domains, and designers have scope to invent new methods
since this process does not need to be specified in coding standards. The standards need only
specify how the motion vectors should be interpreted by the decoder. Block matching (BM) is
the most common method of motion estimation. Typically each macro block (16 x 16 pixels) in
the new frame is compared with shifted regions of the same size from the previous decoded
frame, and the shift which results in the minimum error is selected as the best motion vector for
that macro block. The motion-compensated prediction frame is then formed from all the shifted
regions from the previous decoded frame.
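The full-search block matching described above can be sketched as follows (an illustrative sketch using SAD as the matching criterion; the block size and search range are free parameters, not values fixed by the source):

```python
import numpy as np

def full_search(current, reference, top, left, block=16, search=7):
    """Exhaustive block matching: return the motion vector (dy, dx) whose
    displaced block in the reference frame minimises the SAD."""
    cur = current[top:top + block, left:left + block].astype(np.int32)
    h, w = reference.shape
    best, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue  # candidate block falls outside the frame
            cand = reference[y:y + block, x:x + block].astype(np.int32)
            sad = int(np.abs(cur - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad

# A frame shifted by (2, -3) between reference and current is recovered
# exactly for an interior block:
rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
cur = np.roll(ref, shift=(2, -3), axis=(0, 1))
mv, sad = full_search(cur, ref, top=24, left=24)
print(mv, sad)   # (-2, 3) 0 -- the imposed shift is recovered with zero SAD
```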

2.1.8 Backward Motion Estimation


Motion estimation is generally performed as backward motion estimation, since the
current frame is considered the candidate frame and the reference frame on which the motion
vectors are searched is a past frame; that is, the search is backward. Backward motion estimation
leads to forward motion prediction.

Figure 2.10 Backward motion estimation with current frame k and frame (k-1) as the
reference frame
2.1.9 Forward Motion Estimation
It is just the opposite of backward motion estimation. Here, the search for motion vectors
is carried out on a frame that appears later than the candidate frame in temporal order; in
other words, the search is forward. Forward motion estimation leads to backward motion
prediction. It may appear that forward motion estimation is unusual, since one requires future
frames to predict the candidate frame. However, this is not unusual, since the candidate frame
for which the motion vector is being sought is not necessarily the current (that is, the most
recent) frame. It is possible to store more than one frame and use one of the past frames as a
candidate frame, with another frame, appearing later in the temporal order, as the reference.


Figure 2.11 Forward motion estimation with current frame as k and frame (k+1) as the
reference frame
2.1.10 Matching Criteria For Motion Estimation
Inter-frame predictive coding is used to eliminate the large amount of temporal and spatial
redundancy that exists in video sequences and helps in compressing them. In conventional
predictive coding the difference between the current frame and the predicted frame is coded and
transmitted. The better the prediction, the smaller the error and hence the transmission bit rate.
When there is motion in a sequence, a pixel on the same part of the moving object is a better
prediction for the current pixel. There are a number of criteria to evaluate the goodness of a
match.
A popular matching criterion used for block-based motion estimation is the Sum of
Absolute Differences (SAD).
To implement block motion estimation, the candidate video frame is partitioned into a
set of non-overlapping blocks and the motion vector is determined for each such candidate
block with respect to the reference. A square block of size N x N pixels is considered. The
intensity value of the pixel at coordinates (n1, n2) in frame k is given by S(n1, n2, k), where
0 <= n1, n2 <= N - 1. The frame k is referred to as the candidate frame and the block of pixels
so defined is the candidate block.


2.1.10.1 Sum of Absolute Differences (SAD)
The sum of absolute differences (SAD) also makes the error values positive, but instead
of summing up the squared differences, the absolute differences are summed up. The SAD
measure at displacement (i, j) is defined as

SAD(i, j) = sum over n1 = 0..N-1 and n2 = 0..N-1 of | S(n1, n2, k) - S(n1 + i, n2 + j, k - 1) |

The SAD is evaluated using the current block and a reference block selected within the search
window. The search window size differs between CODECs such as H.264, MPEG, etc. Hence,
the SAD for every possible reference block within the search window is calculated. The block
with the minimum SAD is then selected and a motion vector is drawn to denote the motion.
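The SAD calculation itself is a direct translation of its definition (a sketch; `candidate` and `reference` stand for two equally sized pixel blocks):

```python
import numpy as np

def sad(candidate, reference):
    """Sum of absolute differences between two equally sized pixel blocks."""
    c = candidate.astype(np.int32)   # widen so the subtraction cannot wrap
    r = reference.astype(np.int32)
    return int(np.abs(c - r).sum())

a = np.array([[10, 20], [30, 40]], dtype=np.uint8)
b = np.array([[12, 18], [30, 45]], dtype=np.uint8)
print(sad(a, b))   # |10-12| + |20-18| + |30-30| + |40-45| = 9
```

The widening cast matters in practice: subtracting unsigned 8-bit pixel arrays directly would wrap around rather than produce negative differences.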
2.2 PROBLEM IDENTIFICATION
As mentioned in the earlier discussion, the PEs are essential building blocks and are
connected regularly to construct an ME. Generally, PEs are surrounded by sets of adders (ADDs)
and accumulators that determine how data flows through them. PEs can thus be considered part
of the class of circuits called iterative logic arrays (ILAs), whose testing can be easily achieved
by using the cell fault model (CFM). Using the CFM has received considerable interest due to
the accelerated growth in the use of high-level synthesis, as well as the parallel increase in the
complexity and density of integrated circuits (ICs). Using the CFM makes the tests independent
of the adopted synthesis tool and vendor library. Arithmetic modules like ADDs (the primary
element in a PE) are, due to their regularity, designed in an extremely dense configuration.
Moreover, a more comprehensive fault model, i.e. the stuck-at (SA) model, must be
adopted to cover actual failures in the interconnect data bus between PEs. The SA fault is a
well-known structural fault model, which assumes that a fault causes a line in the circuit to
behave as if it were permanently at logic 0 (stuck-at 0, SA0) or logic 1 (stuck-at 1, SA1). An SA
fault in an ME architecture can cause errors in the computed SAD values. The magnitude of the
resulting computational error is assumed here to be equal to SAD' - SAD, where SAD' denotes
the SAD value computed in the presence of SA faults.


CHAPTER 3 METHODOLOGY
Error detection and data recovery (EDDR) is the solution proposed for the
above-mentioned problem. The technique used for error detection and correction is RQ code
generation. The proposed architecture is described below.
3.1 PROPOSED EDDR ARCHITECTURE DESIGN


Fig. 3.1 shows the conceptual view of the proposed EDDR scheme, which comprises two
major circuit designs, i.e. the error detection circuit (EDC) and the data recovery circuit (DRC),
to detect errors and recover the corresponding data in a specific CUT. The test code generator
(TCG) in Fig. 3.1 utilizes the concept of the RQ code to generate the corresponding test codes for
error detection and data recovery. In other words, the test codes from the TCG and the primary
output from the CUT are delivered to the EDC to determine whether the CUT has errors, while
the DRC is in charge of recovering data from the TCG. Additionally, a selector is enabled to
export either error-free data or data-recovery results. Importantly, any array-based computing
structure, such as the ME, discrete cosine transform (DCT), iterative logic array (ILA), or finite
impulse response (FIR) filter, is feasible for the proposed EDDR scheme to detect errors and
recover the corresponding data.

Figure 3.1. Conceptual view of the proposed EDDR architecture.


Figure 3.2. Testing process of a specific PEi in the proposed EDDR architecture.
This work adopts the systolic ME as a CUT to demonstrate the feasibility of the proposed
EDDR architecture. A ME consists of many PEs incorporated in a 1-D or 2-D array for video
encoding applications. A PE generally consists of two adders (ADDs), i.e. an 8-b ADD and a
12-b ADD, and an accumulator (ACC). The 8-b ADD (a pixel has 8-b data) is used to estimate
the addition of the current pixel (Cur_pixel) and the reference pixel (Ref_pixel). Additionally,
the 12-b ADD and the ACC are required to accumulate the results from the 8-b ADD in order to
determine the sum of absolute differences (SAD) value for video encoding applications.
Notably, some registers and latches may exist in the ME to complete the data shift and storage.
Fig. 3.2 shows an example of the proposed EDDR circuit design for a specific PEi of a ME. The
fault model definition, the RQCG-based TCG design, the error detection and data recovery
operations, and the overall test strategy are described in the following.
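The PE data path just described can be sketched behaviorally as follows (an assumption based on the text, not the actual RTL): an 8-bit stage forms the absolute pixel difference and a 12-bit ADD/ACC pair sums the 16 differences of a 4 x 4 macroblock. A 12-bit accumulator suffices because the worst case is 16 * 255 = 4080 < 4096.

```python
def pe_sad(cur_block, ref_block):
    """Behavioral model of one PE: 8-b difference stage + 12-b ADD and ACC."""
    acc = 0
    for c, r in zip(cur_block, ref_block):
        diff = abs(c - r) & 0xFF         # 8-b stage: |Cur_pixel - Ref_pixel|
        acc = (acc + diff) & 0xFFF       # 12-b ADD feeding the accumulator
    return acc

# 16 pixels of a hypothetical 4 x 4 macroblock, each differing by 2
print(pe_sad([100] * 16, [98] * 16))     # SAD = 16 * 2 = 32
```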
3.1.1 RQ Code Generation
Coding approaches such as parity code, Berger code, and residue code have been
considered for design applications to detect circuit errors. A residue code is a separable
arithmetic code formed by estimating a residue for the data and appending it to the data. For
instance, assume that N denotes an integer, N1 and N2 represent data words, and m refers to the
modulus. A separate residue code of interest is one in which N is coded as the pair (N, |N|m),
where |N|m is the residue of N modulo m. Error detection logic for arithmetic operations is
typically derived using a separate residue code, so that the detection logic is simple and easily
implemented. However, only single-bit errors can be detected with the residue code, and an
error cannot be recovered effectively by using residue codes alone. Therefore, this work
presents a quotient code, which is derived from the residue code, to assist the residue code in
detecting multiple errors and in recovering errors. The mathematical model of the RQ code is
described as follows. Assume that the binary data X is expressed as

X = b(n-1) * 2^(n-1) + b(n-2) * 2^(n-2) + ... + b1 * 2 + b0.

The RQ code of X modulo m is then expressed as

R = |X|m and Q = [X/m],

respectively. Notably, [i] denotes the largest integer not exceeding i.


According to the above RQ code expression, the corresponding circuit design of the
RQCG can be realized. In order to reduce the complexity of the circuit design, the
implementation of the modulo operation generally depends on the addition operation.
Additionally, based on the concept of residue code, the following definitions can be applied to
generate the RQ code for circuit design.


To accelerate the circuit design of the RQCG, the binary data X can generally be divided into
two parts:

X = Y1 * 2^k + Y0,

where Y0 denotes the k least significant bits of X and Y1 denotes the remaining bits.
Significantly, the value of k is equal to [n/2], and Y0 and Y1 are interpreted as decimal values.
If the modulus m = 2^k - 1, then, since X = Y1 * m + (Y0 + Y1), the RQ code of X modulo m is
given by

R = |X|m = |Y0 + Y1|m
Q = [X/m] = Y1 + [(Y0 + Y1)/m].

Notably, since the value of Y0 + Y1 is generally greater than the modulus m, the equations must
be simplified further to replace the complex modulo operation with a simple addition operation
by using the partial sums of Y0 and Y1.

Based on the above equations, the corresponding circuit design of the RQCG is easily
realized by using simple adders (ADDs). Namely, the RQ code can be generated with low
complexity and little hardware cost.
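The derivation above can be checked with a short sketch (illustrative Python; variable names follow the text). Since X = Y1 * m + (Y0 + Y1) when m = 2^k - 1, the residue and quotient follow from the half-word sum alone, which is what lets the RQCG use plain adders.

```python
def rq_code(x, k):
    """Return (R, Q) = (|x|_m, [x/m]) for m = 2**k - 1 via the split-and-add trick."""
    m = (1 << k) - 1
    y0, y1 = x & m, x >> k               # split X = Y1*2**k + Y0
    z = y0 + y1                          # adder-based partial sum
    return z % m, y1 + z // m            # R = |Y0+Y1|_m, Q = Y1 + [(Y0+Y1)/m]

# Cross-check against the direct definition R = X mod m, Q = [X/m]
x, k = 183, 4                            # m = 15
assert rq_code(x, k) == (x % 15, x // 15)
print(rq_code(x, k))                     # (3, 12): 183 = 12*15 + 3
```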
3.1.2 Test Code Generation Design
According to Fig. 3.2, TCG is an important component of the proposed EDDR
architecture. Notably, the TCG design is based on the ability of the RQCG circuit to generate
corresponding test codes in order to detect errors and recover data. The specific PEi in Fig. 3.2
estimates the absolute difference between the Cur_pixel of the search area and the Ref_pixel of
the current macroblock. Thus, by utilizing PEs, the SAD in a macroblock of size N x N can be
evaluated as

SAD = sum over i, j of |Xij - Yij|, for i, j = 0, 1, ..., N-1,

where Xij and Yij represent the luminance pixel values of Cur_pixel and Ref_pixel,
respectively. Importantly, (RXij, QXij) and (RYij, QYij) denote the corresponding RQ codes of
Xij and Yij modulo m. Based on the residue code, the definitions shown above can be applied to
facilitate generation of the RQ code (RT and QT) from the TCG. Namely, the circuit design of
the TCG can be easily achieved (see Fig. 3.3) by accumulating the RQ codes of the pixel
differences.

Fig. 3.4 shows the timing chart for a macroblock of size 4 x 4 in a specific PEi to
demonstrate the operations of the TCG circuit. The data Xij and Yij from Cur_pixel and
Ref_pixel must first be sent to a comparator in order to determine the larger luminance pixel
value at the 1st clock. Notably, if Xij >= Yij, then Xij and Yij represent the luminance pixel
values of Cur_pixel and Ref_pixel, respectively; conversely, if Xij < Yij, the operands are
exchanged so that the difference is always non-negative. At the 2nd clock, the differences are
generated, and the corresponding RQ codes can be captured by the RQCG circuits once the 3rd
clock is triggered. The equations above clearly indicate that the codes of the pixel differences
can be obtained by using the circuit of a subtracter (SUB). The 4th clock displays the operating
results, and the modulus value of m is then obtained at the 5th clock. Next, the summations of
the quotient values and of the residue values modulo m proceed from clocks 5 to 21 through the
ACC circuits. Since a 4 x 4 macroblock in a specific PEi contains 16 pixels, the corresponding
RQ code (RT and QT) is exported to the EDC and DRC circuits in order to detect errors and
recover data after 22 clocks.

Figure 3.3. Circuit design of the TCG.

Based on the TCG circuit design shown in Fig. 3.3, the error detection and data recovery
operations of a specific PEi in a ME can be achieved.
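The TCG operation described above can be modeled in software as follows (an assumed sketch of the Fig. 3.3/3.4 behavior, not the actual circuit): a comparator orders each pixel pair, subtracters form the differences of the residue and quotient codes, and accumulators fold the running sums into the final RQ code (RT, QT) of the SAD.

```python
def tcg(cur, ref, k):
    """Accumulate the expected RQ code (R_T, Q_T) of the SAD, modulo m = 2**k - 1."""
    m = (1 << k) - 1
    sum_r = sum_q = 0
    for x, y in zip(cur, ref):
        big, small = (x, y) if x >= y else (y, x)   # comparator stage
        sum_r += big % m - small % m                # SUB on residue codes
        sum_q += big // m - small // m              # SUB on quotient codes
    total = sum_q * m + sum_r                       # equals the true SAD
    return total % m, total // m                    # fold into (R_T, Q_T)

cur = [100, 50, 77, 31]                  # hypothetical pixel values
ref = [90, 60, 70, 40]
r_t, q_t = tcg(cur, ref, k=4)
sad = sum(abs(c - r) for c, r in zip(cur, ref))
assert q_t * 15 + r_t == sad             # RQ code is consistent with the SAD
print(r_t, q_t)                          # 6 2 (SAD = 36 = 2*15 + 6)
```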


Figure 3.4. Timing chart of the TCG.


3.2 EDDR PROCESS
Fig. 3.2 clearly indicates that error detection in a specific PEi is achieved by using the
EDC, which compares the outputs of the TCG and RQCG1 in order to determine whether errors
have occurred. If RPEi != RT and/or QPEi != QT, then the errors in the specific PEi are
detected. The EDC output is then used to generate a 0/1 signal to indicate whether the tested
PEi is error-free or faulty.
This work presents a mathematical statement to verify the error detection operations.
Based on the definition of the fault model, the SAD value is influenced if SA1 and/or SA0
errors occur in a specific PEi. In other words, the SAD value is transformed to SAD' = SAD + e
if an error e occurs, where the error signal e denotes the deviation introduced by the SA faults.


The RPEi and QPEi are then given by

RPEi = |SAD + e|m and QPEi = [(SAD + e)/m].

During data recovery, the DRC circuit plays a significant role in recovering the RQ code from
the TCG. The data can be recovered by implementing the mathematical model

SAD = m * QT + RT = (2^k - 1) * QT + RT = 2^k * QT - QT + RT.

To realize this data recovery operation, a barrel shifter and a corrector circuit are necessary to
achieve the functions of 2^k * QT and -QT + RT, respectively. Notably, the proposed EDDR
design executes the error detection and data recovery operations simultaneously. Additionally,
either the error-free data from the tested PEi or the data recovery result from the DRC is
selected by a multiplexer (MUX) to pass to the next specific PEi for subsequent testing, as
shown in Fig. 3.5.
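Under the same assumptions as the earlier sketches, the EDC comparison and DRC recovery can be modeled together; the recovered value uses SAD = 2^k * QT - QT + RT, i.e. the barrel-shift-and-correct form for m = 2^k - 1. The function and signal names are illustrative, not from the report.

```python
def eddr(sad_pe, r_t, q_t, k):
    """Compare the PE output against TCG's RQ code; recover the SAD on mismatch."""
    m = (1 << k) - 1
    r_pe, q_pe = sad_pe % m, sad_pe // m     # RQCG on the tested PE output
    if r_pe == r_t and q_pe == q_t:
        return sad_pe, False                 # error-free: pass data through
    recovered = (q_t << k) - q_t + r_t       # DRC: barrel shift and correct
    return recovered, True

print(eddr(36, 6, 2, 4))                     # fault-free PE: (36, False)
print(eddr(52, 6, 2, 4))                     # faulty PE (e = 16): (36, True)
```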


Figure 3.5. Proposed EDDR architecture design for a ME.


CHAPTER 4 - IMPLEMENTATION
4.1 INTRODUCTION TO VLSI:
VLSI stands for "Very Large Scale Integrated Circuits", a classification of ICs. A typical
VLSI IC contains millions of active devices. Typical functions of VLSI include memories,
computers, and signal processors, etc. A semiconductor process technology is a
method by which working circuits can be manufactured from designed specifications. There are
many such technologies, each of which creates a different environment or style of design. In
integrated circuit design, the specification consists of polygons of conducting and
semiconducting material that will be layered on top of each other to produce a working chip.
When a chip is custom-designed for a specific use, it is called an application-specific integrated
circuit (ASIC). Printed-circuit (PC) design also results in precise positions of conducting
materials, as they will appear on a circuit board; in addition, PC design aggregates the bulk of the
electronic activity into standard IC packages, the position and interconnection of which are
essential to the final circuit. Printed circuitry may be easier to debug than integrated circuitry is,
but it is slower, less compact, more expensive, and unable to take advantage of specialized silicon
layout structures that make VLSI systems so attractive. The design of these electronic circuits can
be achieved at many different refinement levels from the most detailed layout to the most abstract
architectures. Given the complexity that is demanded at all levels, computers are increasingly
used to aid this design at each step. It is no longer reasonable to use manual design techniques, in
which each layer is hand etched or composed by laying tape on film. Thus the term
computer-aided design (CAD) is a most accurate description of this modern way, and it seems
broader in its scope than the recently popular term computer-aided engineering (CAE).
4.1.1 Application Of VLSI:
PLAs:
Combinational circuit elements are an important part of any digital design. Three
common methods of implementing a combinational block are random logic, read-only memory
(ROM), and programmable logic array (PLA). In random-logic designs, the logic description of
the circuit is directly translated into hardware structures such as AND and OR gates. The PLA
occupies less area on the silicon due to reduced interconnection wire space; however, it may be


slower than purely random logic. A PLA can also be used as a compact finite state machine by
feeding back part of its outputs to the inputs and clocking both sides. For high-speed
applications, the PLA is normally implemented as two NOR arrays; the inputs and outputs are
inverted to preserve the AND-OR structure.
Gate-Arrays:
The gate-array is a popular technique used to design IC chips. Like the PLA, it contains a fixed
mesh of unfinished layout that must be customized to yield the final circuit. Gate-arrays are more
powerful, however, because the contents of the mesh are less structured so the interconnection
options are more flexible. Gate-arrays exist in many forms with many names, e.g. uncommitted
logic arrays and master-slice. The disadvantage of gate-arrays is that they are not optimal for any
task.
Gate Matrices:
The gate matrix is the next step in the evolution of automatically generated layout from
high-level specification. Like the PLA, this layout has no fixed size; a gate matrix grows according to
its complexity. Like all regular forms of layout, this one has its fixed aspects and its customizable
aspects. In gate matrix layout the fixed design consists of vertical columns of polysilicon gating
material. The customizable part is the metal and diffusion wires that run horizontally to
interconnect and form gates with the columns.
4.1.2 Application Areas Of VLSI
Electronic systems now perform a wide variety of tasks in daily life. Electronic systems in
some cases have replaced mechanisms that operated mechanically, hydraulically, or by other
means; electronics are usually smaller, more flexible, and easier to service. In other cases
electronic systems have created totally new applications. Electronic systems perform a variety of
tasks, some of them visible, some more hidden:
1. Personal entertainment systems such as portable MP3 players and DVD players perform
sophisticated algorithms with remarkably little energy.
2. Electronic systems in cars operate stereo systems and displays; they also control fuel
injection systems, adjust suspensions to varying terrain, and perform the control functions
required for anti-lock braking (ABS) systems.

3. Digital electronics compress and decompress video, even at high definition data rates,
on-the-fly in consumer electronics.
4. Low-cost terminals for Web browsing still require sophisticated electronics, despite their
dedicated function.
5. Personal computers and workstations provide word-processing, financial analysis, and
games. Computers include both central processing units (CPUs) and special-purpose
hardware for disk access, faster screen display, etc.
4.1.3 Advantages Of VLSI
While we will concentrate on integrated circuits here, the properties of integrated
circuits (what we can and cannot efficiently put in an integrated circuit) largely determine the
architecture of the entire system. Integrated circuits improve system characteristics in several
critical ways. ICs have three key advantages over digital circuits built from discrete components:
Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer
sizes, compared to the millimeter or centimeter scales of discrete components. Small size leads to
advantages in speed and power consumption, since smaller components have smaller parasitic
resistances, capacitances, and inductances.
Speed. Signals can be switched between logic 0 and logic 1 much quicker within a chip than they
can between chips. Communication within a chip can occur hundreds of times faster than
communication between chips on a printed circuit board. The high speed of circuits on-chip is
due to their small size: smaller components and wires have smaller parasitic capacitances to
slow down the signal.
Power consumption. Logic operations within a chip also take much less power. Once again,
lower power consumption is largely due to the small size of circuits on the chip: smaller
parasitic capacitances and resistances require less power to drive them.
4.2 VLSI AND SYSTEMS


These advantages of integrated circuits translate into advantages at the system level:

Smaller physical size. Smallness is often an advantage in itself; consider portable televisions
or handheld cellular telephones.


Lower power consumption. Replacing a handful of standard parts with a single chip reduces
total power consumption. Reducing power consumption has a ripple effect on the rest of the
system: a smaller, cheaper power supply can be used; since less power consumption means less
heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic shielding
may be feasible, too.
Reduced cost. Reducing the number of components, the power supply requirements, cabinet
costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that
the cost of a system built from custom ICs can be less, even though the individual ICs cost more
than the standard parts they replace. Understanding why integrated circuit technology has such
profound influence on the design of digital systems requires understanding both the technology
of IC manufacturing and the economics of ICs and digital systems.
4.3 INTRODUCTION TO ASICS AND PROGRAMMABLE LOGIC:


The last 15 years have witnessed a decline in the number of cell-based ASIC designs as
a means for developing customized SoCs. Rising NREs, development times and risk have mostly
restricted the use of cell-based ASICs to the highest volume applications; applications that can
restricted the use of cell-based ASICs to the highest volume applications; applications that can
withstand the multi-million dollar development costs associated with 1-2 design re-spins.
Analysts estimate that the number of cell based ASIC design starts per year is now only between
2000-3000 compared to ~10,000 in the late 1990s. The FPGA has emerged as a technology that
fills some of the gap left by cell-based ASICs. Yet even after 20+ years of existence and 40X
more design starts per year than cell-based ASICs, the size of the FPGA market in dollar terms
remains only a fraction that of cell-based ASICs. This suggests that there are many FPGA
designs that never make it into production and that for the most part, the FPGA is still seen by
many as a vehicle for prototyping or college education and has perhaps even succeeded in
actually stifling industry innovation. This paper introduces a new technology, the second
generation Structured ASIC that is tipped to reenergize the path to innovation within the
electronics industry. It brings together some of the key advantages of FPGA technology (i.e. fast
turnaround, no mask charges, no minimum order quantity) and of cell-based ASIC (i.e. low unit
cost and power) to deliver a new platform for SoC design. This document defines requirements
for development of Application Specific Integrated Circuits (ASICs). It is intended to be used as
an appendix to a Statement of Work. The document complements the ESA ASIC Design and
Assurance Requirements (AD1), which is a precursor to a future ESA PSS document on ASIC
design.
Moore's Law
In the 1960s Gordon Moore predicted that the number of transistors that could be
manufactured on a chip would grow exponentially. His prediction, now known as Moore's Law,
was remarkably prescient. Moore's ultimate prediction was that transistor count would double
every two years, an estimate that has held up remarkably well. Today, an industry group
maintains the International Technology Roadmap for Semiconductors (ITRS), which maps out
strategies to maintain the pace of Moore's Law. (The ITRS roadmap can be found at
http://www.itrs.net.)
4.3.1 Changing Landscape
Structured ASICs
A new alternative has recently emerged to address the market void between FPGAs and
cell-based ASICs. Analysts term this the Structured ASIC.
First Generation Structured ASICs
Like the FPGA market, the Structured ASIC market had a flurry of early entrants, many of
whom have departed the market. Examples include respectable semiconductor companies like
NEC and LSI Logic, and EDA vendors such as Synplicity.
First Generation Structured ASICs provided designers with considerable power and cost
improvements over FPGAs but failed to remove many barriers to entry that existed with
traditional cell-based ASICs. First generation Structured ASICs had the following characteristics:
1. Turn-around times were still 2-5 months from tape-out to silicon.
2. NREs were still in the range of $150-$250K or more, making the technology difficult to
access for mainstream users.
3. Minimum order quantities were required, as wafers could not be shared amongst projects
or customers.
4. Development costs and times were also very high and long respectively, as designers were
expected to undergo rigorous verification down to the transistor level.
5. Designers transitioning from prototyping devices like FPGAs to first generation
Structured ASICs were still expected to redesign the product into a completely new
device, revisit timing closure and re-qualify the new device before it was production ready.
While some companies still offer first generation Structured ASICs today, market

acceptance has been severely limited as a result of these barriers to entry. However, these first
generation Structured ASICs paved the way for a new generation that would combine the benefits
of both FPGAs and cell-based ASICs.
Second Generation Structured ASICs
A new generation of Structured ASICs has emerged on the market and is gaining traction.
This generation utilizes a single via mask for configuring the device. In doing so, it removes the
need for the massive amounts of SRAM configuration elements and metal interconnect that
plague today's FPGAs. The benefits to designers are delivered through a device that provides up
to 20X lower device power consumption and up to 80% lower unit cost than FPGAs, depending
on device density (larger FPGAs have more configuration elements and metal interconnect).
This new generation of Structured ASICs, available from eASIC Corporation and named
Nextreme, also removes the barriers of traditional cell-based ASICs and of first generation
Structured ASICs. Nextreme Structured ASIC advantages include:
1. Turn-around times from tape-out to silicon are only 3-4 weeks
2. There are zero mask charges as multiple projects can be shared on a wafer
3. There is no minimum order quantity
4. Development tools costs are low (analogous to FPGA type tools)
5. Development time is short as designers need not perform verification down to the
transistor level or perform exhaustive test coverage
6. Coarse FPGA-like architecture based on cells, which provides manufacturing yield
advantages.
There are device options for both prototyping and mass production. Designers
transitioning from prototyping Nextreme Structured ASICs to mass production Nextreme
Structured ASICs need not revisit timing closure or re-qualify the production device.


4.3.2 Applications For Nextreme Structured ASICs:


Embedded Processing
Nextreme Structured ASICs are ideally suited for embedded processing applications. With the
availability of a firm 150 MHz ARM926EJ-S processor and AMBA peripherals, backed by
industry standard development tools from ARM and its Connected Community partners,
designers have the option to implement control circuits in software. A major benefit of using
Nextreme for implementing embedded systems is that designers are able to make performance,
area and feature tradeoffs using both hardware and software, allowing for highly differentiated
yet cost-optimized systems.
Signal, Video and Image Processing
Having to deal with programmable metal interconnect and its associated carry chain
delays ultimately forced FPGA vendors to develop dedicated DSP blocks and slices to overcome
performance bottlenecks. With Nextreme Structured ASICs, the elimination of massive amounts
of metal interconnect means that these devices are not subject to unacceptable carry chain delays,
and many signal processing structures can be implemented, at speed, using the logic fabric
alone. Another capability within Nextreme that makes these devices particularly suitable for
signal processing is memory: eRAM blocks are particularly suited for distributed applications
such as semi-parallel filters and video processing. As these blocks are located very close
together, they can be connected to form larger blocks of up to 4Kbits per eUnit.
4.3.3 Field Programmable Gate Array (FPGA)
A field-programmable gate array (FPGA) is an integrated circuit designed to be
configured by the customer or designer after manufacturing, hence "field-programmable". The
FPGA configuration is generally specified using a hardware description language (HDL), similar
to that used for an application-specific integrated circuit (ASIC); circuit diagrams were
previously used to specify the configuration, as they were for ASICs, but this is increasingly
rare. FPGAs can be used to implement any logical function that an ASIC could perform. The
ability to update the functionality after shipping, partial re-configuration of a portion of the
design, and the low non-recurring engineering costs relative to an ASIC design (notwithstanding
the generally higher unit cost) offer advantages for many applications.


FPGAs contain programmable logic components called "logic blocks", and a hierarchy of
reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a
one-chip programmable breadboard. Logic blocks can be configured to perform complex
combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the
logic blocks also include memory elements, which may be simple flip-flops or more complete
blocks of memory.
In addition to digital functions, some FPGAs have analog features. The most common
analog feature is programmable slew rate and drive strength on each output pin, allowing the
engineer to set slow rates on lightly loaded pins that would otherwise ring unacceptably, and to
set stronger, faster rates on heavily loaded pins on high-speed channels that would otherwise run
too slowly. Another relatively common analog feature is differential comparators on input pins
designed to be connected to differential signaling channels. A few "mixed signal FPGAs" have
integrated peripheral Analog-to-Digital Converters (ADCs) and Digital-to-Analog Converters
(DACs) with analog signal conditioning blocks, allowing them to operate as a system-on-a-chip
[5]. Such devices blur the line between an FPGA, which carries digital ones and zeros on its
internal programmable interconnect fabric, and a field-programmable analog array (FPAA),
which carries analog values on its internal programmable interconnect fabric.
4.3.3.1 Definitions of Relevant Terminology
The most important terminology used in the following discussion is defined below.
Field-Programmable Device (FPD)
A general term that refers to any type of integrated circuit used for implementing digital
hardware, where the chip can be configured by the end user to realize different designs.
Programming of such a device often involves placing the chip into a special programming unit,
but some chips can also be configured in-system. Another name for FPDs is programmable
logic devices (PLDs); although PLDs encompass the same types of chips as FPDs, we prefer the
term FPD because historically the word PLD has referred to relatively simple types of devices.
Programmable Logic Array (PLA)
A Programmable Logic Array (PLA) is a relatively small FPD that contains two levels of
logic, an AND-plane and an OR-plane, where both levels are programmable (note: although PLA
structures are sometimes embedded into full-custom chips, we refer here only to those PLAs that
are provided as separate integrated circuits and are user-programmable).
Programmable Array Logic (PAL)
A Programmable Array Logic (PAL) is a relatively small FPD that has a programmable
AND-plane followed by a fixed OR-plane.
Simple PLD (SPLD)
Refers to any type of simple FPD, usually either a PLA or a PAL.
Complex PLD (CPLD)
A more complex FPD that consists of an arrangement of multiple SPLD-like blocks on a
single chip. Alternative names (that will not be used in this paper) sometimes adopted for this
style of chip are Enhanced PLD (EPLD), Super PAL, Mega PAL, and others.
Field-Programmable Gate Array (FPGA)
A Field-Programmable Gate Array is an FPD featuring a general structure that allows
very high logic capacity. Whereas CPLDs feature logic resources with a wide number of inputs
(AND planes), FPGAs offer narrower logic resources. FPGAs also offer a higher ratio of
flip-flops to logic resources than do CPLDs.
High-Capacity PLDs (HCPLDs)
A single acronym that refers to both CPLDs and FPGAs. This term has been coined in
trade literature to provide an easy way to refer to both types of devices.
PAL is a trademark of Advanced Micro Devices.
1. Interconnect - the wiring resources in an FPD.
2. Programmable Switch- a user-programmable switch that can connect a logic element to
an interconnect wire, or one interconnect wire to another
3. Logic Block- a relatively small circuit block that is replicated in an array in an FPD.
When a circuit is implemented in an FPD, it is first decomposed into smaller sub-circuits
that can each be mapped into a logic block. The term logic block is mostly used in the
context of FPGAs, but it could also refer to a block of circuitry in a CPLD.


4. Logic Capacity- the amount of digital logic that can be mapped into a single FPD. This is
usually measured in units of equivalent number of gates in a traditional gate array. In
other words, the capacity of an FPD is measured by the size of gate array that it is
comparable to. In simpler terms, logic capacity can be thought of as number of 2-input
NAND gates.
5. Logic Density - the amount of logic per unit area in an FPD.
6. Speed-Performance- measures the maximum operable speed of a circuit when
implemented in an FPD. For combinational circuits, it is set by the longest delay through
any path, and for sequential circuits it is the maximum clock frequency for which the
circuit functions properly. In the remainder of this section, to provide insight into FPD
development, the evolution of FPDs over the past two decades is described. Additional
background information is also included on the semiconductor technologies used in the
manufacture of FPDs.
4.3.4 Evolution Of Programmable Logic Devices
The first type of user-programmable chip that could implement logic circuits was the
Programmable Read-Only Memory (PROM), in which address lines can be used as logic circuit
inputs and data lines as outputs. Logic functions, however, rarely require more than a few product
terms, and a PROM contains a full decoder for its address inputs. PROMs are thus an inefficient
architecture for realizing logic circuits, and so are rarely used in practice for that purpose. The
first device developed later specifically for implementing logic circuits was the
Field-Programmable Logic Array (FPLA), or simply PLA for short. A PLA consists of two levels of
logic gates: a programmable wired AND-plane followed by a programmable wired OR-plane. A
PLA is structured so that any of its inputs (or their complements) can be ANDed
together in the AND-plane; each AND-plane output can thus correspond to any product term of
the inputs. Similarly, each OR plane output can be configured to produce the logical sum of any
of the AND-plane outputs. With this structure, PLAs are well-suited for implementing logic
functions in sum-of-products form. They are also quite versatile, since both the AND terms and
OR terms can have many inputs (this feature is often referred to as wide AND and OR gates).
When PLAs were introduced in the early 1970s, by Philips, their main drawbacks were that they
were expensive to manufacture and offered somewhat poor speed-performance.
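The sum-of-products structure described above can be illustrated with a small software model of a PLA personality (the encoding is invented for illustration): each AND-plane row lists the literals of one product term, and the OR-plane selects which terms feed each output.

```python
def pla(inputs, and_plane, or_plane):
    """inputs: var -> bit; and_plane: product terms as lists of (var, polarity);
    or_plane: for each output, the indices of the product terms it ORs."""
    terms = [all(inputs[v] == p for v, p in term) for term in and_plane]
    return [int(any(terms[i] for i in out)) for out in or_plane]

# f = a*b + a'*c: two product terms feeding one output
and_plane = [[('a', 1), ('b', 1)], [('a', 0), ('c', 1)]]
or_plane = [[0, 1]]
print(pla({'a': 1, 'b': 1, 'c': 0}, and_plane, or_plane))   # [1] via term a*b
print(pla({'a': 1, 'b': 0, 'c': 1}, and_plane, or_plane))   # [0]: no term fires
```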


Both disadvantages were due to the two levels of configurable logic, because
programmable logic planes were difficult to manufacture and introduced significant propagation
delays. To overcome these weaknesses, Programmable Array Logic (PAL) devices were
developed. PALs feature only a single level of programmability, consisting of a programmable
wired AND-plane that feeds fixed OR-gates. To compensate for the lack of generality incurred
because the OR-plane is fixed, several variants of PALs are produced, with different
numbers of inputs and outputs, and various sizes of OR-gates. PALs usually contain flip-flops
connected to the OR-gate outputs so that sequential circuits can be realized.
PAL devices are important because when introduced they had a profound effect on digital
hardware design, and also they are the basis for some of the newer, more sophisticated
architectures that will be described shortly. Variants of the basic PAL architecture are featured in
several other products known by different acronyms. All small PLDs, including PLAs, PALs, and
PAL-like devices are grouped into a single category called Simple PLDs (SPLDs), whose most
important characteristics are low cost and very high pin-to-pin speed-performance. As technology
has advanced, it has become possible to produce devices with higher capacity than SPLDs. The
difficulty with increasing capacity of a strict SPLD architecture is that the structure of the
programmable logic-planes grows too quickly in size as the number of inputs is increased.
The only feasible way to provide large capacity devices based on SPLD architectures is
then to integrate multiple SPLDs onto a single chip and provide interconnect to programmably
connect the SPLD blocks together. Many commercial FPD products exist on the market today
with this basic structure, and are collectively referred to as Complex PLDs (CPLDs). CPLDs
were pioneered by Altera, first in their family of chips called Classic EPLDs, and then in three
additional series, called MAX 5000, MAX 7000, and MAX 9000. Because of the rapidly growing market for large FPDs, other manufacturers developed devices in the CPLD category, and there are now many choices available. CPLDs provide logic capacity up to the equivalent of about 50 typical SPLD devices, but it is somewhat difficult to extend these architectures to higher densities. To
build FPDs with very high logic capacity, a different approach is needed. The highest capacity
general purpose logic chips available today are the traditional gate arrays sometimes referred to
as Mask-Programmable Gate Arrays (MPGAs).


MPGAs consist of an array of pre-fabricated transistors that can be customized into the user's logic circuit by connecting the transistors with custom wires. Customization is performed during chip fabrication by specifying the metal interconnect, which means that for a user to employ an MPGA, a large setup cost is involved and manufacturing time is long. Although
MPGAs are clearly not FPDs, they are mentioned here because they motivated the design of the
user-programmable equivalent: Field- Programmable Gate Arrays (FPGAs). Like MPGAs,
FPGAs comprise an array of uncommitted circuit elements, called logic blocks, and interconnect
resources, but FPGA configuration is performed through programming by the end user. An
illustration of a typical FPGA architecture appears in the figure.

4.4 SOFTWARE REQUIREMENT

Verification Tool: ModelSim 6.5e
Synthesis Tool: Xilinx ISE 14.4

4.4.1 MODELSIM
ModelSim SE - High Performance Simulation and Debug
ModelSim SE is a UNIX, Linux, and Windows-based simulation and debug environment, combining high performance with the most powerful and intuitive GUI in the industry.
What's New in ModelSim SE?
1. Improved FSM debug options, including control of basic information, transition tables, and warning messages. Added support for FSM multi-state transition coverage (i.e., coverage for all possible FSM state sequences).
2. Improved debugging with hyperlinked navigation between objects and their declarations, and between visited source files.
3. The dataflow window can now compute and display all paths from one net to another.
4. Enhanced code coverage data management with fine-grain control of information in the source window.
5. Toggle coverage has been enhanced to support SystemVerilog types: structures, packed unions, fixed-size multi-dimensional arrays, and real.
6. Some IEEE VHDL 2008 features are supported, including source code encryption. Added support for new VPI types, including packed arrays of struct nets and variables.

ModelSim SE Features:
1. Multi-language, high-performance simulation engine
2. Verilog, VHDL, and SystemVerilog design
3. Code coverage
4. SystemVerilog for design
5. Integrated debug
6. JobSpy regression monitor
7. Mixed HDL simulation option
8. SystemC option
9. Tcl/Tk
10. Solaris and Linux, 32- and 64-bit
11. Windows 32-bit

ModelSim SE Benefits:
1. High-performance HDL simulation solution for FPGA and ASIC design teams
2. The best mixed-language environment and performance in the industry
3. Intuitive GUI for efficient interactive or post-simulation debug of RTL and gate-level designs
4. Merging, ranking, and reporting of code coverage for tracking verification progress
5. Sign-off support for popular ASIC libraries
6. All ModelSim products are 100% standards based, so your investment is protected, risk is lowered, reuse is enabled, and productivity is enhanced
7. Award-winning technical support

High-Performance, Scalable Simulation Environment:
ModelSim provides seamless, scalable performance and capabilities. Through the use of a single compiler and library system for all ModelSim configurations, employing the right ModelSim configuration for project needs is as simple as pointing your environment to the appropriate installation directory.
ModelSim also supports very fast time-to-next-simulation turnarounds while maintaining high performance with its black box use model, known as bbox. With bbox, non-changing elements can be compiled and optimized once and reused when running a modified version of the test bench. Bbox delivers dramatic throughput improvements of up to 3X when running a large suite of test cases.
Easy-to-Use Simulation Environment:
An intelligently engineered graphical user interface (GUI) efficiently displays design data
for analysis and debug. The default configuration of windows and information is designed to
meet the needs of most users. However, the flexibility of the ModelSim SE GUI allows users to
easily customize it to their preferences. The result is a feature-rich GUI that is easy to use and
quickly mastered.
A message viewer enables simulation messages to be logged to the ModelSim results file in addition to the standard transcript file. The GUI's organizational and filtering capabilities allow design and simulation information to be quickly reduced to focus on areas of interest, such as possible causes of design bugs.
ModelSim SE allows many debug and analysis capabilities to be employed post-simulation on saved results, as well as during live simulation runs. For example, the coverage viewer analyzes and annotates source code with code coverage results, including FSM state and transition, statement, expression, branch, and toggle coverage. Signal values can be annotated in the source window and viewed in the waveform viewer. Race conditions, delta, and event activity can be analyzed in the list and wave windows. User-defined enumeration values can be easily defined for quicker understanding of simulation results. For improved debug productivity, ModelSim also has graphical and textual dataflow capabilities. The memory window identifies memories in the design and accommodates flexible viewing and modification of the memory contents. Powerful search, fill, load, and save functionalities are supported. The memory window allows memories to be pre-loaded with specific or randomly generated values, saving the time-consuming step of initializing sections of the simulation merely to load memories. All functions are available via the command line, so they can be used in scripting.

Advanced Code Coverage:
The ModelSim advanced code coverage capabilities deliver high performance with ease of use. Most simulation optimizations remain enabled with code coverage. Code coverage metrics can be reported by instance or by design unit, providing flexibility in managing coverage data. All coverage information is stored in the Unified Coverage Database (UCDB), which collects and manages all coverage information in one highly efficient database. Coverage utilities that analyze code coverage data, such as merging and test ranking, are available.
The coverage types supported include:
1. Statement coverage: number of statements executed during a run
2. Branch coverage: expressions and case statements that affect the control flow of the HDL execution
3. Condition coverage: breaks down the condition on a branch into elements that make the result true or false
4. Expression coverage: the same as condition coverage, but covers concurrent signal assignments instead of branch decisions
5. Focused expression coverage: presents expression coverage data in a manner that accounts for each independent input to the expression in determining coverage results
6. Enhanced toggle coverage: in default mode, counts low-to-high and high-to-low transitions; in extended mode, counts transitions to and from X
7. Finite state machine coverage: state and state-transition coverage

4.4.2 SYNTHESIS TOOL: XILINX ISE

4.4.2.1 Introduction
For two-and-a-half decades, Xilinx has been at the forefront of the programmable logic revolution, with the invention and continued migration of FPGA platform technology. During that time, the role of the FPGA has evolved from a vehicle for prototyping and glue logic to a highly flexible alternative to ASICs and ASSPs for a host of applications and markets. Today, Xilinx FPGAs have become strategically essential to world-class system companies that are hoping to survive and compete in these times of extreme global economic instability, turning what was once the programmable revolution into the programmable imperative for both Xilinx and its customers.
Programmable Imperative
When viewed from the customer's perspective, the programmable imperative is the
necessity to do more with less, to remove risk wherever possible, and to differentiate in order to
survive. In essence, it is the quest to simultaneously satisfy the conflicting demands created by
ever-evolving product requirements (i.e., cost, power, performance, and density) and mounting
business challenges (i.e., shrinking market windows, fickle market demands, capped engineering
budgets, escalating ASIC and ASSP non-recurring engineering costs, spiraling complexity, and
increased risk). To Xilinx, the programmable imperative represents a two-fold commitment. The
first is to continue developing programmable silicon innovations at every process node that
deliver industry-leading value for every key figure of merit against which FPGAs are measured:
price, power, performance, density, features, and programmability. The second commitment is to
provide customers with simpler, smarter, and more strategically viable design platforms for the
creation of world-class FPGA-based solutions in a wide variety of industries, what Xilinx calls targeted design platforms.
Base Platform
The base platform is both the delivery vehicle for all new silicon offerings from Xilinx
and the foundation upon which all Xilinx targeted design platforms are built. As such, it is the
most fundamental platform used to develop and run customer-specific software applications and
hardware designs as production system solutions. Released at launch, the base platform
comprises a robust set of well-integrated, tested, and targeted elements that enable customers to
immediately start a design. These elements include:
1. FPGA silicon
2. ISE Design Suite design environment
3. Third-party synthesis, simulation, and signal integrity tools
4. Reference designs common to many applications, such as memory interface and
configuration designs.
5. Development boards that run the reference designs
6. A host of widely used IP, such as Gigabit Ethernet, memory controllers, and PCIe.

Domain-Specific Platform
The next layer in the targeted design platform hierarchy is the domain-specific platform. Released three to six months after the base platform, each domain-specific platform targets one of the three primary Xilinx FPGA user profiles (domains): the embedded processing developer, the digital signal processing (DSP) developer, or the logic/connectivity developer.
This is where the real power and intent of the targeted design platform begins to emerge.
Domain-specific platforms augment the base platform with a predictable, reliable, and
intelligently targeted set of integrated technologies, including:
1. Higher-level design methodologies and tools
2. Domain-specific embedded, DSP, and connectivity IP
3. Domain-specific development hardware and daughter cards
4. Reference designs optimized for embedded processing, connectivity, and DSP
5. Operating systems (required for embedded processing) and software
Every element in these platforms is tested, targeted, and supported by Xilinx and/or our
ecosystem partners. Starting a design with the appropriate domain-specific platform can cut
weeks, if not months, off of the user's development time.
Market-Specific Platform
A market-specific platform is an integrated combination of technologies that enables software or hardware developers to quickly build and then run their specific application or solution. Built for use in specific markets such as Automotive, Consumer, Mil/Aero, Communications, AVB, or ISM, market-specific platforms integrate both the base and domain-specific platforms and provide higher-level elements that can be leveraged by customer-specific software and hardware designs. The market-specific platform can rely more heavily on third-party targeted IP than the base or domain-specific platforms. The market-specific platform includes the base and domain-specific platforms, reference designs, and boards (or daughter cards) to run reference designs that are optimized for a particular market (e.g., lane departure early-warning systems, analytics, and display processing). Xilinx will begin releasing market-specific platforms three to six months after the domain-specific platforms, augmenting the domain-specific platforms with reference designs, IP, and software aimed at key growth markets. Initially, Xilinx will target markets such as Communications, Automotive, Video, and Displays with platform elements that abstract away the more mundane portions of the design, thereby further reducing the customer's development effort so they can focus their attention on creating differentiated value in their end solution. This systematic platform development and release strategy provides the framework for the consistent and efficient fulfillment of the programmable imperative, both by Xilinx and by its customers.
Platform Enablers
Xilinx has instituted a number of changes and enhancements that have contributed substantially to the feasibility and viability of the targeted design platform. These platform-enabling changes cover six primary areas:
1. Design environment enhancements
2. Socketable IP creation
3. New targeted reference designs
4. Scalable unified board and kit strategy
5. Ecosystem expansion
6. Design services supporting the targeted design platform approach

Design Environment Enhancements
With the breadth of advances and capabilities that the Virtex-6 and Spartan-6 programmable devices deliver, coupled with the access provided by the associated targeted design platforms, it is no longer feasible for one design flow or environment to fit every designer's needs. System designers, algorithm designers, software coders, and logic designers each represent a different user profile, with unique requirements for a design methodology and associated design environment. Instead of addressing the problem in terms of individual fixed tools, Xilinx targets the required or preferred methodology for each user, to address their specific needs with the appropriate design flow. At this level, the design language changes from HDL (VHDL/Verilog) to C, C++, MATLAB, and other higher-level languages that are more widely used by these designers, and the design abstraction moves up from the block or component to the system level. The result is a methodology and complete design flow tailored to each user profile that provides design creation, design implementation, and design verification. Indicative of the complexity of the problem, to fully understand the user profile of a logic designer, one must consider the various levels of expertise represented by this demographic. The most basic category in this profile is the push-button user, who wants to complete a design with minimum work or knowledge.
The push-button user just needs good-enough results. Contrastingly, more advanced users want some level of interactive capability to squeeze more value into their designs, and the power user (the expert) wants full control over a vast array of variables. Add the traditional ASIC designers, tasked with migrating their designs to an FPGA (a growing trend, given the intolerable costs and risks posed by ASIC development these days), and clearly the imperative facing Xilinx is to offer targeted flows and tools that support each user's requirements and capabilities, on their terms. The most recent release of the ISE Design Suite includes numerous changes that fulfill requirements specifically pertinent to the targeted design platform. The new release features a complete tool chain for each top-level user profile (the domain-specific personas: the embedded, DSP, and logic/connectivity designers), including specific accommodations for everyone from the push-button user to the ASIC designer.
The tighter integration of embedded and DSP flows enables more seamless integration of
designs that contain embedded, DSP, IP, and user blocks in one system. To further enhance
productivity and help customers better manage the complexity of their designs, the new ISE
Design Suite enables designers to target area, performance, or power by simply selecting a design
goal in the setup. The tools then apply specific optimizations to help meet the design goal. In
addition, the ISE Design Suite boasts substantially faster place-and-route and simulation run
times, providing users with 2X faster compile times. Finally, Xilinx has adopted the FLEXnet
Licensing strategy that provides a floating license to track and monitor usage.
4.4.2.2 XILINX ISE Design Tools
Xilinx ISE is the design tool provided by Xilinx; comparable tools from other vendors would be virtually identical for our purposes.
There are four fundamental steps in all digital logic design. These consist of:
1. Design: the schematic or code that describes the circuit.
2. Synthesis: the intermediate conversion of a human-readable circuit description to FPGA code (EDIF) format. It involves syntax checking and combining all the separate design files into a single file.
3. Place & Route: where the layout of the circuit is finalized. This is the translation of the EDIF into logic gates on the FPGA.
4. Program: the FPGA is updated to reflect the design through the use of programming (.bit) files.
Test bench simulation belongs to the second step. As its name implies, it is used for testing the design by simulating the result of driving the inputs and observing the outputs to verify the design. ISE supports a variety of design methodologies, including schematic capture, finite state machines, and hardware description languages (VHDL or Verilog).
4.5 SIMULATION IMPLEMENTATION

4.5.1 General
Verilog HDL is a Hardware Description Language (HDL). A Hardware Description
Language is a language used to describe a digital system, for example, a computer or a
component of a computer. One may describe a digital system at several levels. For example, an
HDL might describe the layout of the wires, resistors and transistors on an Integrated Circuit (IC)
chip, i.e., the switch level. Or, it might describe the logical gates and flip-flops in a digital system, i.e., the gate level. An even higher level describes the registers and the transfers of vectors of
information between registers. This is called the Register Transfer Level (RTL). Verilog supports
all of these levels. However, this handout focuses on only the portions of Verilog which support
the RTL level.
4.5.2 Verilog
Verilog is one of the two major Hardware Description Languages (HDLs) used by hardware designers in industry and academia; VHDL is the other. The industry is currently split on which is better. Many feel that Verilog is easier to learn and use than VHDL. As one hardware designer puts it, "I hope the competition uses VHDL." VHDL was made an IEEE standard in 1987, and Verilog in 1995. Verilog is very C-like and liked by electrical and computer engineers, as most learn the C language in college. VHDL is very Ada-like, a language with which most engineers have no experience. Verilog was introduced in 1985 by Gateway Design System Corporation, now a part of Cadence Design Systems, Inc.'s Systems Division. Until May 1990, with the formation of Open Verilog International (OVI), Verilog HDL was a proprietary language of Cadence. Cadence was motivated to open the language to the public domain with the expectation that the market for Verilog HDL-related software products would grow more rapidly with broader acceptance of the language. Cadence realized that Verilog HDL users wanted other software and service companies to embrace the language and develop Verilog-supported design tools.


CHAPTER 5 RESULT ANALYSIS

5.1 EXPERIMENTAL RESULTS

5.1.1 Output of Separate Modules

Output of the processing element (PE) module

Figure 5.1 PE module simulation output


Fig 5.1 shows the simulation results of the PE module. The PE module finds the absolute difference of the inputs given, i.e., a and b.
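The PE behavior can be sketched in a few lines of Python (a behavioral model only; the function mirrors the subtractor task in the appendix Verilog, and the SAD-accumulation helper is our illustration of how a ME uses these differences):

```python
def subtractor(a, b):
    # Mirrors the Verilog `subtractor` task: absolute difference |a - b|.
    return a - b if a > b else b - a

def accumulate_sad(pixel_pairs):
    # A ME sums the absolute differences over a block to form the SAD.
    sad = 0
    for a, b in pixel_pairs:
        sad += subtractor(a, b)
    return sad
```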
Output of the RQ generator module

Figure 5.2 RQ generator module simulation output


Figure 5.2 shows the simulation results of the RQ generator. The RQ generator takes k as input and generates R (residue) and Q (quotient) as outputs.
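A behavioral sketch of the RQ code generator, mirroring the rqcg task in the appendix Verilog (which hard-codes the modulus m = 63):

```python
def rqcg(k, m=63):
    # Residue R and quotient Q of k with respect to modulus m,
    # as in the Verilog task: r = k % m, q = k / m.
    return k % m, k // m

r, q = rqcg(130)   # 130 = 2*63 + 4, so r = 4, q = 2
```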


Output of the ERROR DETECTION CIRCUIT

Figure 5.3 EDC module simulation result


Figure 5.3 illustrates the simulation result of the ERROR DETECTION CIRCUIT module. The EDC detects any errors that occur during the determination of the sum of absolute differences. It compares the rpe and qpe values with the rt and qt values, respectively, which are given as inputs to it. If the values are the same, there are no errors, and the line sl is set to bit 0. Otherwise it is set to bit 1.
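In Python, the comparison reduces to the following (a behavioral model mirroring the edc task in the appendix Verilog, where sl = 0 means no error):

```python
def edc(rpe, qpe, rt, qt):
    # sl = 0 when the PE-generated codes match the test codes (no error);
    # sl = 1 on any mismatch, flagging an error.
    return 0 if (rpe == rt and qpe == qt) else 1
```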


Output of the Multiplexer

Figure 5.4 multiplexer module simulation output


Fig 5.4 shows the simulation window of the multiplexer module. When the sl bit is zero, the multiplexer output line is connected to spe; otherwise it is connected to src.
Output of the DATA RECOVERY CIRCUIT

Figure 5.5 DRC simulation output


Fig 5.5 shows the simulation result of the DATA RECOVERY CIRCUIT. The DRC takes the rt and qt values as input and recovers the original data, src.
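The recovery step simply inverts the RQ encoding (a behavioral model mirroring the drc task in the appendix Verilog, with the same hard-coded modulus m = 63):

```python
def drc(rt, qt, m=63):
    # Reassemble the original value from its quotient and residue.
    return qt * m + rt
```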


5.1.2 Final Output

Figure 5.6(a) final output


Figure 5.6(b) final output


Fig 5.6(a) and 5.6(b) are the simulation results of the error detection and data recovery circuit that has been designed. Whenever there is an error, the circuit detects it and recovers the data; error-free data is obtained from the circuit.
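Putting the pieces together, the detect-and-recover path can be sketched end to end (a behavioral model under the same m = 63 assumption as the appendix Verilog; spe is the PE's SAD output and (rt, qt) are the test codes from the TCG):

```python
def eddr(spe, rt, qt, m=63):
    rpe, qpe = spe % m, spe // m                  # RQCG on the PE output
    sl = 0 if (rpe == rt and qpe == qt) else 1    # EDC: 0 = no error
    src = qt * m + rt                             # DRC: recovered value
    return spe if sl == 0 else src                # output multiplexer

# A fault-free PE output passes through unchanged; a corrupted one is
# replaced by the value recovered from the test codes.
```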


5.2 SYNTHESIS REPORT

The synthesis report consists of information about the number of registers, LUTs, etc., utilized by the proposed EDDR architecture.


CONCLUSION
This work presents an EDDR architecture for detecting errors and recovering the data of PEs in a ME. Based on the RQ code, an RQCG-based TCG design is developed to generate the corresponding test codes to detect errors and recover data. The proposed EDDR architecture is implemented in Verilog and synthesized with Xilinx ISE 14.4. Experimental results indicate that the proposed EDDR architecture can effectively detect errors and recover data in PEs of a ME with reasonable area overhead and only a slight time penalty.


FUTURE WORK
This error detection and data recovery circuit using the RQ code generator is limited by the static value of the modulus m. The modulus is derived from the sum-of-absolute-differences value, which makes dynamic modulus determination difficult. In the future, a dynamic modulus implementation using a prediction technique would make the circuit more reliable and efficient. In addition, alongside the EDDR circuit, a dynamic minimum-SAD detection circuit could be introduced into the motion estimation technique to make the process faster. A dynamic SAD detection circuit eliminates all SAD values except the minimum. This would reduce the number of clock cycles needed to find the minimum SAD.
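As a sketch of what such a minimum-SAD search computes, consider a plain full search in Python (our illustration, not the proposed circuit; a dynamic detection circuit would discard non-minimal candidates on the fly instead of storing every SAD):

```python
def sad(block_a, block_b):
    # Sum of absolute differences between two equal-length pixel blocks.
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def full_search(current, candidates):
    # Return (index, SAD) of the candidate block with minimum SAD.
    best_idx, best_sad = 0, sad(current, candidates[0])
    for i, cand in enumerate(candidates[1:], start=1):
        s = sad(current, cand)
        if s < best_sad:
            best_idx, best_sad = i, s
    return best_idx, best_sad
```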


APPENDIX
VERILOG CODE
Code for the proposed EDDR architecture is given below.
module main1(a,b,rst,clk,regx);
/////////////PE MODULE
task subtractor;
input [15:0] a;
input [15:0] b;
output [15:0] sad;
reg [15:0] sad;
begin
if(a>b)
sad = a-b;
else
sad = b-a;
end
endtask
////////////2*1 MULTIPLEXER
task mux21;
input [15:0] spe;
input [15:0] src;
input sl;
output [15:0] sad;
reg [15:0] sad;
begin
if(sl==1'b0)
sad = spe;
else
sad = src;
end
endtask
////////////ERROR DATA CHECKER
task edc;
input [15:0] rpe;
input [15:0] qpe;
input [15:0] rt;
input [15:0] qt;
output sl;
reg sl;
begin
if(rpe==rt && qpe==qt)
sl=1'b0;
else
sl=1'b1;
end
endtask
/////////R-Q CODE GENERATOR
task rqcg;
input [15:0] k;
input [15:0] m;
output [15:0] r;
output [15:0] q;
reg [15:0] r;
reg [15:0] q;
begin
r=k%m;
q=k/m;
end
endtask
///////////DATA RECOVERY CIRCUIT
task drc;
input [15:0] rt;
input [15:0] qt;
input [15:0] m;
output [15:0] src;
reg [15:0] src;
begin
src = (qt*m)+rt;
end
endtask
/////////////MAIN MODULE CODE
input [15:0] a;
input [15:0] b;
input rst;
input clk;
output [15:0] regx;
reg [15:0] regx;
reg [15:0] regy;
reg [15:0] dif;
reg [15:0] spe;
reg [15:0] m;
reg [15:0] rpe;
reg [15:0] qpe;

integer rt;
integer qt;
reg sl;
reg [15:0] src;
////////////TCG VARIABLES
integer reg1;
integer reg2;
integer reg3;
reg [15:0] rxij;
reg [15:0] qxij;
reg [15:0] qyij;
reg [15:0] ryij;
integer rij;
reg [15:0] qij;
integer rk;
integer qk;
reg [15:0] qt1;
reg [15:0] rl;
integer ql;
//////////////////////LOGIC
always @(posedge clk )
begin
/////////////////////CODE FOR Rpe AND Qpe
m=16'b0000000000111111;
if(rst) begin
reg1=16'b0000000000000000;
reg2=16'b0000000000000000;
reg3=16'b0000000000000000;
regx=16'b0000000000000000;
end
else begin
subtractor(a,b,dif);
spe = regx + dif;
rqcg(spe,m,rpe,qpe);
///////////////////TEST CODE GENERATOR CODE
rqcg(a,m,rxij,qxij);
rqcg(b,m,ryij,qyij);
rij=rxij-ryij;
if(rij<0)begin
rij=(-1)*rij;
rqcg(rij,m,rk,qk);
rij=(-1)*rij;
rk=(-1)*rk;
end
else begin
rqcg(rij,m,rk,qk);
end
reg1=reg1+rk;
if(reg1<0) begin
reg1=(-1)*reg1;
rqcg(reg1,m,rt,qt1);
reg1=(-1)*reg1;
rt=(-1)*rt;
end
else begin
rqcg(reg1,m,rt,qt1);
end
qij = qxij -qyij;
reg3 = reg3+qij;
reg2 = reg2+rij;
if(reg2<0)begin
reg2=(-1)*reg2;
rqcg(reg2,m,rl,ql);
reg2=(-1)*reg2;
ql=(-1)*ql;
end
else begin
rqcg(reg2,m,rl,ql);
end
qt = ql + reg3;
/////////////////////CODE FOR Rt AND Qt
drc(rt,qt,m,src);
if(rt<0) begin
qt=src/m;
rt=src%m;
end
edc(rpe,qpe,rt,qt,sl);
mux21(spe,src,sl,regy);
regx = regy;
end
end
endmodule


