
CHAPTER-1
1.1 INTRODUCTION
In the production of integrated circuits, testing is done to identify defective chips. This is
very important for shipping high quality products. Testing is also done to diagnose the reason for
a chip failure in order to improve the manufacturing process. In system maintenance, testing is
done to identify parts that need to be replaced in order to repair a system. Testing a digital circuit
involves applying an appropriate set of input patterns to the circuit and checking for the correct
outputs. The conventional approach is to use an external tester to perform the test. However,
built-in self-test (BIST) techniques have been developed in which some of the tester functions
are incorporated on the chip, enabling the chip to test itself. BIST provides a number of well-known advantages. It eliminates the need for expensive testers. It provides fast location of failed
units in a system because the chips can test themselves concurrently.
It also allows at-speed testing, in which the chip is tested at its normal operating clock rate, which is very important for detecting timing faults. Despite all of these advantages, BIST
has seen limited use in industry because of its area and performance overhead, increased design
time, and lack of BIST design tools. These are problems that this dissertation addresses. The
research described in this dissertation is timely because the interest in BIST is growing rapidly.
The increasing pin count, operating speed, and complexity of ICs are outstripping the capabilities
of external testers. BIST provides solutions to these problems.
1.2 Pseudo-Random BIST
Figure 1.1 is a block diagram showing the architecture for BIST. The circuit that is being
tested is called the circuit-under-test (CUT). There is a test pattern generator which applies test
patterns to the CUT and an output response analyzer which checks the outputs. The test pattern
generator must generate a set of test patterns that provides a high fault coverage in order to
thoroughly test the CUT. Pseudo-random testing is an attractive approach for BIST. A linear
feedback shift register (LFSR) can be used to apply pseudo-random patterns to the CUT. An
LFSR has a simple structure requiring small area overhead. Moreover, an LFSR can also be used

as an output response analyzer thereby serving a dual purpose. BIST techniques such as circular
BIST [Stroud 88], [Krasniewski 89], and BILBO registers [Koenemann 79] make use of this
advantage to reduce overhead.

Fig. 1.1 Block diagram for BIST.


There are limits on the test length, which is the number of pseudo-random patterns that
can be applied during BIST. One limit is simply the amount of time that is required to apply the
patterns. Another limit is the fault simulation time required to determine the fault coverage. A
third limit is heat dissipation for an unpackaged die. Thus, in order for pseudo-random pattern
testing to be effective, a high fault coverage must be obtained for an acceptable test length.
What is considered acceptable depends on the particular test environment.
The probability of detecting a fault with a single random pattern is defined as the
detection probability for the fault and is given by the number of patterns that detect the fault
divided by the total number of input patterns, 2^n, where n is the number of inputs of the circuit.
Unfortunately, many circuits contain faults with very low detection probabilities. Such faults are
said to be random-pattern-resistant (r.p.r.) [Eichelberger 83] because they are hard to detect with
random patterns and therefore limit the fault coverage for pseudo-random testing. A circuit is
said to be random pattern testable if it does not contain any r.p.r. faults.
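In symbols (a restatement of the definition above, where T_f denotes the set of input patterns that detect fault f):

\[
p_f = \frac{|T_f|}{2^n}, \qquad E[\text{patterns until } f \text{ is detected}] = \frac{1}{p_f}
\]

For example, a fault in a 20-input circuit that only a single pattern detects has p_f = 2^{-20}, so on the order of a million random patterns are needed on average; such a fault is r.p.r.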

If the fault coverage for pseudo-random BIST is insufficient, then there are two solutions.
One is to modify the circuit-under-test to make it random pattern testable, and the other is to
modify the test pattern generator so that it generates patterns that detect the r.p.r. faults.
Innovative techniques for both of these approaches are described in this dissertation. These
techniques enable automated design of pseudo-random BIST implementations that satisfy fault
coverage requirements while minimizing area and performance overhead. These techniques have
been incorporated in the TOPS (Totally Optimized Synthesis-for-test) tool being developed at the
Center for Reliable Computing.
1.3 Weighted Pattern Testing
Weighted pattern testing is performed by weighting the signal probability (probability
that the signal is a '1') for each input to the circuit-under-test. Two issues in weighted pattern
testing are what set of weights to use and how to generate the weighted signals. Many techniques
have been proposed for computing weight sets [Bardell 87]. It has been shown that for most
circuits, multiple weight sets are required to achieve sufficient fault coverage [Wunderlich 88].
For BIST, the weight sets must be stored on-chip and control logic is needed to switch between
them, which can result in significant overhead.
In order to reduce the BIST overhead for weighted pattern testing, researchers have
looked for efficient methods for on-chip generation of weighted patterns. Wunderlich proposed a
Generator of Unequiprobable Random Tests (GURT) in [Wunderlich 87] that requires very little
hardware overhead but is limited to only one weight set. Hartmann and Kemnitz proposed a
method in [Hartmann93] that uses a modified GURT structure and described test pattern
generators for the C2670 and C7552 benchmark circuits [Brglez85] that require very little
overhead.
However, neither of these methods is general: they use only a single
weight set and therefore will not provide sufficient fault coverage for many circuits. Methods
that use multiple weight sets with 3 different weight values (0, .5, and 1) were described in
[Pomeranz93] and [AlShaibi94]. These methods essentially fix the value of certain inputs
while random patterns are being applied.
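As a concrete illustration of the three-weight idea, the following minimal Verilog sketch fixes a CUT input to 0 or 1, or passes a pseudo-random LFSR bit through (weight 0.5). The module and signal names are illustrative only and are not taken from [Pomeranz93] or [AlShaibi94].

// Hypothetical three-weight (0, 0.5, 1) input cell.
module weight3_cell (
    input  wire lfsr_bit, // pseudo-random bit, signal probability 0.5
    input  wire fix_en,   // 1: fix this CUT input; 0: pass the random bit
    input  wire fix_val,  // value the input is fixed to (weight 0 or 1)
    output wire cut_in    // drives one primary input of the CUT
);
    assign cut_in = fix_en ? fix_val : lfsr_bit;
endmodule

One such cell per primary input, plus a small store for the (fix_en, fix_val) pair of each weight set, is all this style of scheme requires.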

Fig. 2 Essential functions of BIST.


The two essential functions include the test pattern generator (TPG) and output response
analyzer (ORA). While the TPG produces a sequence of patterns for testing the CUT, the ORA
compacts the output responses of the CUT into some type of Pass/Fail indication. The other two
functions needed for system-level use of the BIST include the test controller (or BIST controller)
and the input isolation circuitry. Aside from the normal system I/O pins, the incorporation of
BIST may also require additional I/O pins for activating the BIST sequence (the BIST Start
control signal), reporting the results of the BIST (the Pass/Fail indication), and an optional
indication (BIST Done) that the BIST sequence is complete and that the BIST results are valid
and can be read to determine the fault-free/faulty status of the CUT [6].

1.4 BIST PROCESS


Each PCB has multiple chips. The system Test Controller can activate self-test simultaneously on all PCBs. The Test Controller on each PCB can then activate self-test on all the chips on that board.
The chip Test Controller runs the self-test on the chip and transmits the result out to the
board Test Controller. The board Test Controller accumulates test results from all chips on the
PCB and sends the results to the system Test Controller. The system Test Controller uses all of
these results to determine if the chips and board are faulty.

1.4.1 BIST Implementation


Figure 2 shows the BIST hardware architecture in more detail. In this project, the BIST
module in the IOP is developed based on this architecture. Basically, a design with embedded BIST architecture consists of a test controller, a hardware pattern generator, an input multiplexer, the circuit under test (CUT), which in this project is the IOP, and an output response compactor. Optionally, a design with BIST capability may also include a comparator and a read-only memory (ROM). As shown in Figure 2, the test controller controls test pattern generation during BIST mode. The hardware pattern generator generates the input patterns for the CUT.

1.5 DESIGNS FOR TESTABILITY


The mechanics of testing, as illustrated in Figure 3, are similar at all levels of testing, including design verification. A set of input stimuli is applied to a circuit, and the output response of that circuit is compared to the known good, or expected, output response.

Fig. 3 Basic testing flow.

The main functional blocks of BIST include:
1. Pseudo-random pattern generator (PRPG)
2. BIST response compaction
3. Output response analyzer (ORA)
1.6 PSEUDO RANDOM PATTERN GENERATOR:
To test a circuit, a pseudo-random pattern generator most often uses a linear feedback shift register (LFSR) to generate the input test vectors.
1.6.1 Standard LFSR
The standard LFSR has been used as the test pattern generator for the BIST. An LFSR is a shift register whose input is a linear function of two or more of its bits (taps), as shown in Figure 4. It consists of D flip-flops and linear exclusive-OR (XOR) gates. It is considered an external exclusive-OR LFSR because the feedback network of XOR gates feeds externally from X0 to Xn-1. One of the two main parts of an LFSR is the shift register. A shift register shifts its contents into adjacent positions within the register or, in the case of the position on the end, out of the register.

Fig. 4 By convention, the output bit of an LFSR that is n bits long is the nth bit; the input bit of an LFSR is bit 1.
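The following minimal Verilog sketch shows such an external-XOR (Fibonacci) LFSR. The 8-bit width, the primitive polynomial x^8 + x^6 + x^5 + x^4 + 1, and the reset seed are illustrative choices, not values specified in this text.

// 8-bit external-XOR LFSR: taps are XORed outside the register and
// fed back into the shift input.
module lfsr8 (
    input  wire       clk,
    input  wire       rst_n, // active-low reset loads a non-zero seed
    output reg  [7:0] q      // pseudo-random pattern applied to the CUT
);
    wire feedback = q[7] ^ q[5] ^ q[4] ^ q[3]; // taps 8, 6, 5, 4

    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            q <= 8'h01;              // any non-zero seed avoids the all-0 lock-up state
        else
            q <= {q[6:0], feedback}; // shift; feedback enters at bit 1
    end
endmodule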

CHAPTER-2
PROJECT DESCRIPTION
2.1 AIM
A digital system is tested and diagnosed on numerous occasions during its lifetime. It is critical that such testing be quick and provide very high fault coverage. One common approach, widely used in the semiconductor industry for IC chip testing, is to specify test as one of the system functions, so that the system tests itself. A system designed without an integrated test strategy covering all levels, from the entire system down to individual components, has been described as chip-wise and system-foolish. A properly designed Built-In Self-Test (BIST) is able to offset the cost of the added test hardware while ensuring reliability and testability and reducing maintenance cost [6, 17].
The basic idea of BIST, in its most simple form, is to design a circuit so that the circuit
can test itself and determine whether it is good or bad (fault-free or faulty, respectively).
This typically requires additional circuitry whose functionality must be capable of generating test
patterns as well as providing a mechanism to determine if the output responses of the circuit
under test (CUT) to the test patterns correspond to those of a fault-free circuit. In all cases, both test-per-clock and test-per-scan schemes are required.

2.3 LITERATURE REVIEW:


Pseudorandom built-in self-test (BIST) generators have been widely utilized to test integrated circuits and systems. The arsenal of pseudorandom generators includes, among others, the structures reviewed below.

2.3.1 EXISTING SYSTEMS:


Pseudorandom Built-In Self Test (BIST) generators have long been
successfully utilized for the testing of integrated circuits and systems. The arsenal
of pseudorandom generators includes Linear Feedback Shift Registers (LFSRs)
[1], Cellular Automata [2] and Accumulators accumulating a constant value [3].
However, some circuits contain faults for which a large number of random patterns have to be generated before high fault coverage is achieved. Therefore, weighted
pseudorandom techniques have been proposed; in weighted pseudorandom
techniques inputs are biased by changing the probability of a 0 or a 1 on a given
input from 0.5 (for pure pseudorandom tests) to some other value.
Weighted random pattern methods that rely on a single weight assignment usually fail to achieve complete fault coverage using a reasonable number of test
patterns, since, although the weights are computed to be suitable for most faults,
some faults may require long test sequences to be detected with weight
assignments that do not match their activation and propagation requirements.
Therefore, multiple weight assignments have been suggested in cases where
different faults require biases of the input combinations applied to the circuit, to
ensure that a relatively small number of patterns detect all faults [4].
Approaches to derive weight assignments based on given deterministic test
sets are especially attractive since they have the potential to allow complete
coverage with a small number of random patterns. Furthermore, single weight
assignment schemes suffer from the limitation that, in order to produce weights
which are different from 0.5, several cells of the random pattern generator are
connected to a gate whose output is used to drive an input of the circuit under test


(CUT). For example, to produce a weight of 0.25, two cells of the generator are
connected to an AND gate, whose output drives the input of the CUT.
When weights are assigned arbitrary values, arbitrary numbers of shift
register cells are used to produce the required input values. Register cells are
generally not allowed to be shared between gates that feed different inputs, to
avoid correlation between the inputs of the CUT. Pomeranz and Reddy in [5] achieved a major breakthrough in the weighted pattern generation paradigm, proposing a method that dynamically walks through the range of possible test generation approaches, starting from pure pseudorandom tests to detect easy-to-detect faults at low hardware cost, then reducing the number of inputs which are
allowed to be specified randomly, fixing an increasing number of the inputs to 0 or
1 according to a given deterministic test set, to detect faults that have more
stringent requirements on input values and cannot be detected by a purely random
sequence of reasonable length.
Thus, the method proposed in [5] can be viewed as a weighted random test
generation method that uses three weights: 0, 0.5 and 1. A weight of 0 corresponds
to fixing an input to 0; a weight of 1 corresponds to fixing an input to 1; and a
weight of 0.5 indicates pure random values. Consequently, every weight is
generated using a single LFSR cell per primary input, and a small number of logic
gates to account for the weights 0 and 1. Current VLSI circuits, e.g. data path
architectures, or Digital Signal Processing chips [6] commonly contain arithmetic
modules (accumulators or ALUs).
Utilizing accumulators for Built-In Testing (compression of the CUT
responses, or generation of test patterns) has been shown to result in low hardware


overhead and low impact on the circuit normal operating speed. In [7] it was
proved that the test vectors generated by an accumulator whose inputs are driven
by a constant pattern can have quite good pseudorandom characteristics, if the
input pattern is properly selected. To the best of our knowledge, no accumulator-based weighted pattern generator has been proposed in the open literature to date.
In this paper, an Accumulator-based Weighted Pattern Generator (AWPG) is
presented, based on an accumulator whose inputs are driven by a small number of
constant patterns. The accumulator is modified in such a way that the weight of each output can be 0, 1, or 0.5, depending on the value of a certain input of the accumulator and a control input.
2.3.2 WEIGHTED PATTERN GENERATION:

All weighted random and related methods suffer from the following
limitation. To produce weights which are different from 0.5, several cells of an
LFSR or a shift register are connected to a gate whose output is used to drive the
corresponding primary input of the circuit under test; e.g., to produce a weight of
0.25, two cells of the LFSR are connected to an AND gate, whose output drives a
primary input of the circuit. When weights are allowed to assume arbitrary values,
arbitrary numbers of shift-register cells have to be used to produce the required
input values. Register cells are generally not allowed to be shared between circuits
that generate weights for different primary inputs, to avoid correlation between the
values different primary inputs assume.
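A minimal Verilog sketch of this weighting style follows. Note that each gate taps its own pair of LFSR cells, reflecting the no-sharing rule just described; the names and the choice of weights are illustrative.

// Weighting by combining independent LFSR cells.
module weight_taps (
    input  wire [3:0] lfsr_q, // four distinct cells of the pattern generator
    output wire       w025,   // drives a CUT input with weight 0.25
    output wire       w075    // drives a CUT input with weight 0.75
);
    assign w025 = lfsr_q[0] & lfsr_q[1]; // P(1) = 0.5 * 0.5 = 0.25
    assign w075 = lfsr_q[2] | lfsr_q[3]; // P(1) = 1 - 0.25 = 0.75
endmodule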
In [14], experimental evidence was presented to demonstrate that in some
cases, such correlations may not affect the fault coverage achieved. However, some


faults remain undetected in many of the circuits considered. A different approach to


efficient generation of tests is reported in [17], where special hardware is used to
produce a subset of a deterministic test set, and detection of all faults is based on
generating the selected subset of deterministic tests as part of a larger set of input
combinations which can be regarded as random patterns for faults not detected by
the target subset of tests. The hardware consists of a counter and XOR gates. In
some cases, the hardware cost to achieve complete fault coverage may become
high, or the number of test patterns may become extremely large. Work on efficient
generation of tests thus varies from pure random tests, which have low hardware
requirements but may require large test sequences (or, alternatively, may leave
some faults undetected), to application of a deterministic test set, which is short but
may have a high hardware cost and may not provide adequate coverage of unmodeled faults.
Other approaches that were investigated, and which mix deterministic tests
and random tests, include the use of several seeds, or initial LFSR states, to
generate random patterns [18], instead of a single seed conventionally used, and
the use of transformations, other than the one performed by the LFSR, on the states
of the LFSR [19]. Random pattern generation for sequential circuits was
investigated in [20]-[23].
In this paper, we propose a TPG scheme that can convert a single-input-change (SIC) vector into unique low-transition vectors for multiple scan chains. First, the SIC vector is decompressed into its multiple codewords. Meanwhile, the generated codewords are bit-XORed with the same seed vector in turn. Hence, a test pattern with similar test vectors will be applied to all scan chains. The proposed MSIC-TPG consists of an SIC generator, a seed generator, an XOR gate network, and a clock and control block.


A minimal number of weight assignments is searched for, to keep hardware


requirements low, and at the same time, the number of tests generated for every
weight assignment is limited, to limit the total test length. In contrast to other
weighted random methods (e.g., [10]), a fixed number of random tests (sufficient to
detect all target faults) is generated for every weight assignment to avoid the
hardware overhead related to the use of different numbers of tests.
The method proposed thus combines the advantages of the reconfigurable
Johnson counter and the scalable SIC counter to achieve the following aims: test-per-clock and test-per-scan schemes with considerable power reduction.
2.4 COMPUTING WEIGHT ASSIGNMENTS
Under the approach proposed here, tests are generated using several
assignments of weights. In this section, the selection of the weights is described. To
derive weight assignments to be used in weighted pseudo-random test pattern
generation, we use a known deterministic test set that provides the desired (in our
case 100%) coverage of detectable faults. As in earlier work, a weight of a, 0 ≤ a ≤ 1, assigned to an input x of the circuit under test implies that the probability of
input x being assigned the value 1 is a. The following example illustrates the basic
idea behind our method for deriving the weight assignments to be used for
weighted pseudorandom test pattern generation.
Example: Let the primary inputs to a circuit be A, B, C, D, and let T = {1101, 1001, 0011, 0000} be a set of tests that detects all faults in the circuit. Let us pick
the weight assignment (1, 0.5, 0, 1); i.e., inputs A and D are fixed to 1, input C is
fixed to 0, and the value of input B is randomly generated a predetermined number


of times. As soon as both 0 and 1 are generated for B, the first two elements of T are
produced and applied to the circuit.

The weight assignment (1, 0.5, 0, 1) can be derived from T by intersecting


the first two elements of T, to obtain 1-01; i.e., where the two tests are equal, the common value is assigned to the intersection, while a - is assigned to the intersection where the two tests differ (intersection is formally defined later). - is then interpreted as a 0.5 weight. To allow the remaining two tests of T to be generated, we derive the weight assignment (0, 0, 0.5, 0.5) by intersecting the last two tests of T. The two weight assignments, (1, 0.5, 0, 1) and (0, 0, 0.5, 0.5), can be used to generate T as part of a larger test sequence. The example above illustrates how a weighted random pattern generator restricted to weights 0, 0.5, and 1 can be used to generate a given test set as part of a larger weighted random sequence of test patterns.
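The bitwise intersection used above can be stated compactly as follows, with notation consistent with the example:

\[
(t_1 \cap t_2)_i =
\begin{cases}
t_{1,i} & \text{if } t_{1,i} = t_{2,i} \\
\text{-} & \text{if } t_{1,i} \neq t_{2,i}
\end{cases}
\]

so that 1101 ∩ 1001 = 1-01, with each - position interpreted as a 0.5 weight.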
The parameters that influence the hardware complexity of such a pattern
generator are: the number of weight assignments, the number of random patterns
generated for each weight assignment, and whether the number of random patterns
generated with each weight assignment is fixed or variable. In our method, we
chose to use a fixed number of random patterns with each weight assignment. In
the example above, we implicitly required that the complete deterministic test set
be generated using weighted random patterns.
A more realistic objective is to achieve the same coverage as the given
deterministic test set, not necessarily using the same tests. The deterministic test set
is, therefore, used in the method presented here to guide the selection of weight


assignments, with the built-in capability to regenerate some or all of the tests in the
given deterministic test set only if the required fault coverage cannot be otherwise
achieved. The following provides the details of the method proposed.
Under the proposed method, test generation always starts with a set of pure
random patterns, to detect easy to- detect faults at low hardware cost. Using the
tests out of the deterministic test set that detect faults which are yet undetected,
weight assignments are computed, that have increasingly large numbers of inputs
fixed to 0 or 1, to allow the harder to detect faults (faults with more stringent
requirement on primary input values) to be detected. The number of 0.5 weights in
a weight assignment, denoted by K, is decreased when a sequence of pseudorandom patterns generated for the weight assignment with K 0.5-weights fails to
detect any additional faults.
At that point, all remaining faults require more input values to be specified
for an additional fault to be detected, and the value of K must be decreased. The
magnitude of the decrement in K and the number of faults detected that bring about
a decrease in K can be varied. The general form of the algorithm for computing the
weight assignments, when detection of no additional faults causes a decrease by 1
in K, is as follows.
Procedure 1: The general form of weight assignment computation
1) Set F to be the set of all target faults (a set of collapsed, detectable faults). Set K
to equal the number of primary inputs.
2) If F is empty, stop.
3) Find a weight assignment t suitable for the faults in F, that has K 0.5-weights
(the selection of t is described later).


4) Generate N random patterns by fixing the inputs that have a weight 0 (1) under t to logic value 0 (1), and randomly specifying the other inputs (N is a predetermined constant). For each random pattern generated, perform fault simulation for every fault f ∈ F. If f is detected, remove f from F.
5) If no fault was detected by the previously applied N tests, set K = K − 1, and return to step 2.
Initially, K is set equal to the number of primary inputs, and sets of N
patterns are generated, until either all faults are detected and F is empty, or no
additional fault is detected by all N patterns in the last set. In the former case, test
generation is complete. In the latter case, K is reduced by 1, a new weight
assignment is computed, and test generation proceeds.
In the worst case, K is reduced to zero, and the deterministic test set (or
parts of it) is reproduced (this point will be clarified when the selection of weight
assignments is explained). The method is thus complete in the sense that fault
coverage can be guaranteed to equal deterministic fault coverage. Note that weight
assignment computation is performed by Procedure 1 dynamically; i.e., fault
simulation is performed for every random pattern generated, and detected faults are
dropped. As a result, the weight assignments computed are matched to a specific
random pattern generation method.
When the random pattern generation method is changed, a fault previously
detected may be left undetected, and complete fault coverage may not be achieved.
To solve this problem, in all the experiments we performed, an LFSR was
simulated, and pseudo-random patterns were generated by the LFSR. The same
LFSR can then be implemented to generate the random patterns in the actual


circuit, and the same fault coverage is achieved. To complete Procedure 1, a method for computing a weight assignment t that is suitable for the faults in F (step 3) is still required.


CHAPTER-3
WORKING OF PROJECT

3.1 INTRODUCTION
Pseudorandom built-in self test (BIST) generators have been widely
utilized to test integrated circuits and systems. The arsenal of pseudorandom
generators includes, among others, linear feedback shift registers (LFSRs) [1],
cellular automata [2], and accumulators driven by a constant value [3]. For circuits
with hard-to-detect faults, a large number of random patterns have to be generated
before high fault coverage is achieved.
In many cases, a clock gating technique is used in which two non-overlapping clocks control the odd and even scan cells of the scan chain, so that the shift power dissipation is reduced by a factor of two. The ring generator [10] can generate a single-input change (SIC) sequence, which can effectively reduce test power. A third approach aims to reduce the dynamic power dissipation during scan shift through gating of the outputs of a portion of the scan cells. Several low-power approaches have also been proposed for scan-based BIST. One modifies scan-path structures and lets the CUT inputs remain unchanged during a shift operation. Using multiple scan chains with many scan enable (SE) inputs to activate one scan chain at a time, the TPG proposed in [18] can reduce average power consumption during scan-based tests as well as the peak power in the CUT.


In [19], a pseudorandom BIST scheme was proposed to reduce switching


activities in scan chains. However, modules containing hard-to-detect faults still
require extra test hardware either by inserting test points into the mission logic or
by storing additional deterministic test patterns [24], [25]. In order to overcome
this problem, an accumulator-based weighted pattern generation scheme was proposed in [11]. The scheme generates test patterns having one of three weights, namely 0, 1, and 0.5; therefore, it can be utilized to drastically reduce the test application time in accumulator-based test pattern generation. However, the scheme proposed in [11] possesses three major drawbacks, which the scheme proposed here avoids. More precisely, the proposed scheme: 1) does not impose any requirements on the design of the adder (i.e., it can be implemented using any adder design); 2) does not require any modification of the adder; and hence, 3) does not affect the operating speed of the adder. Furthermore, the proposed scheme compares favorably to the schemes proposed in [11] and [22] in terms of the required hardware overhead.

3.2 SINGLE-INPUT CHANGE (SIC)


The ring generator can generate a single-input change (SIC) sequence which
can effectively reduce test power. Generated SIC sequences have low transition counts for each scan chain. This can decrease the switching activity in scan
cells during scan-in shifting. The advantages of the proposed sequence can be
summarized as follows.
1) Minimum transitions: In the proposed pattern, each generated vector applied to
each scan chain is an SIC vector, which can minimize the input transition and
reduce test power.


2) Uniqueness of patterns: The proposed sequence does not contain any repeated
patterns, and the number of distinct patterns in a sequence can meet the
requirement of the target fault coverage for the CUT.
3) Uniform distribution of patterns: The conventional algorithms of modifying the
test vectors generated by the LFSR use extra hardware to get more correlated test
vectors with a low number of transitions. However, they may reduce the
randomness in the patterns, which may result in lower fault coverage and higher
test time [3]. It is proved in this paper that our multiple SIC (MSIC) sequence is
nearly uniformly distributed.
4) Low hardware overhead consumed by extra TPGs: The linear relations are
selected with consecutive vectors or within a pattern, which has the benefit of
generating a sequence with a sequential decompressor. Hence, the proposed TPG
can be easily implemented in hardware.

3.3 MSIC-TPG SCHEME


Test pattern generation method: Assume there are m primary inputs (PIs) and M scan chains in a full-scan design, and each scan chain has l scan cells. Fig. 3.1 shows the symbolic simulation for one generated pattern. The vector generated by an m-bit LFSR with a primitive polynomial can be expressed as S(t) = S0(t) S1(t) S2(t) . . . Sm−1(t) (hereinafter referred to as the seed), and the vector generated by an l-bit Johnson counter can be expressed as J(t) = J0(t) J1(t) J2(t) . . . Jl−1(t). In the first clock cycle, J = J0 J1 J2 . . . Jl−1 is bit-XORed with S = S0 S1 S2 . . . SM−1, and the results X1, Xl+1, X2l+1, . . . , X(M−1)l+1 are shifted into the M scan chains, respectively. In the second clock cycle, J = J0 J1 J2 . . . Jl−1 is circularly shifted to J = Jl−1 J0 J1 . . . Jl−2, which is again bit-XORed with the seed S = S0 S1 S2 . . . SM−1. The resulting X2, Xl+2, X2l+2, . . . , X(M−1)l+2 are shifted into the M scan chains, respectively. After l clocks, each scan chain is fully loaded with a unique Johnson codeword, and the seed S0 S1 S2 . . . Sm−1 is applied to the m PIs.

Fig 3.1 Symbolic representation of an MSIC pattern.


3.3.1 Reconfigurable Johnson Counter
For a short scan length, we develop a reconfigurable Johnson counter to
generate an SIC sequence in the time domain. As shown in Fig. 3.2, it can operate in three modes.
1) Initialization: When RJ_Mode is set to 1 and Init is set to logic 0, the
reconfigurable Johnson counter will be initialized to all zero states by clocking
CLK2 more than l times.
2) Circular shift register mode: When RJ_Mode and Init are set to logic 1, each
stage of the Johnson counter will output a Johnson codeword by clocking CLK2 l
times.
3) Normal mode: When RJ_Mode is set to logic 0, the reconfigurable Johnson
counter will generate 2l unique SIC vectors by clocking CLK2 2l times.
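A minimal Verilog sketch of a counter with these three modes is given below. The exact gating of the published design may differ; treat this as an illustration of the mode behavior only.

// Reconfigurable Johnson counter (behavioral sketch).
module rjohnson #(parameter L = 4) (
    input  wire          clk2,    // test clock CLK2
    input  wire          rj_mode, // RJ_Mode in the text
    input  wire          init,    // Init in the text
    output reg  [L-1:0]  j        // Johnson vector / codeword
);
    always @(posedge clk2) begin
        if (rj_mode && !init)
            j <= {L{1'b0}};           // 1) initialization: clear to all zeros
        else if (rj_mode && init)
            j <= {j[L-2:0], j[L-1]};  // 2) circular shift register mode
        else
            j <= {j[L-2:0], ~j[L-1]}; // 3) normal mode: 2L unique SIC vectors
    end
endmodule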


Fig 3.2 Reconfigurable Johnson counter.

3.4 SCALABLE SIC COUNTER


When the maximal scan chain length l is much larger than the scan chain number M, we develop an SIC counter named the scalable SIC counter. As shown in Fig. 3.3, it contains a k-bit adder clocked by the SE signal, a k-bit subtractor clocked by test clock CLK2, an M-bit shift register clocked by test clock CLK2, and k multiplexers. The value of k is the integer part of log2(l − M). On each SE edge, the k-bit adder generates a new count that is the number of 1s (0s) to fill into the shift register. As shown in Fig. 3.3, the counter operates in three modes:


1) If SE = 0, the count from the adder is stored in the k-bit subtractor. During SE = 1, the contents of the k-bit subtractor are gradually decreased from the stored count to all zeros.
2) If SE = 1 and the contents of the k-bit subtractor are not all zeros, M-Johnson
will be kept at logic 1 (0).
3) Otherwise, it will be kept at logic 0 (1). Thus, the needed 1s (0s) will be shifted
into the M-bit shift register by clocking CLK2 l times, and unique Johnson
codewords will be applied into different scan chains.
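The count-down behavior of modes 1)–3) can be sketched in Verilog roughly as follows. This is a simplified illustration under the stated assumptions (the k-bit adder and the M-bit shift register are omitted), not the verified published design.

// Control core of the scalable SIC counter: models the k-bit subtractor
// and the serial M-Johnson output.
module ssic_ctrl #(parameter K = 4) (
    input  wire         clk2,      // test clock CLK2
    input  wire         se,        // scan enable
    input  wire [K-1:0] count_in,  // count produced by the k-bit adder
    output wire         m_johnson  // serial bit shifted into the M-bit register
);
    reg [K-1:0] cnt; // the k-bit subtractor's contents

    always @(posedge clk2) begin
        if (!se)
            cnt <= count_in;   // mode 1: store the adder's count while SE = 0
        else if (cnt != 0)
            cnt <= cnt - 1'b1; // decrease gradually toward all zeros during SE = 1
    end

    // modes 2 and 3: hold 1 (0) while the count is nonzero, else 0 (1)
    assign m_johnson = (cnt != 0);
endmodule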

Fig 3.3 Scalable SIC counter.


3.4.1 Test-Per-Clock Schemes


The MSIC-TPG for test-per-clock schemes is illustrated in Fig. 3.4. The CUT's PIs X1 . . . Xmn are arranged as an n × m SRAM-like grid structure. Each grid cell has a two-input XOR gate whose inputs are tapped from a seed output and an output of the Johnson counter. The outputs of the XOR gates are applied to the CUT's PIs. The seed generator is an m-stage conventional LFSR operating at the low frequency CLK1. The test procedure is as follows.
1) The seed generator generates a new seed by clocking CLK1 one time.
2) The Johnson counter generates a new vector by clocking CLK2 one time.
3) Repeat step 2 until 2l Johnson vectors are generated.
4) Repeat steps 1–3 until the expected fault coverage or test length is achieved.
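The XOR grid at the heart of this scheme can be sketched in Verilog as follows; the parameter names and the flat output indexing are illustrative assumptions.

// n x m grid of two-input XOR gates: each CUT input is the XOR of one
// seed bit and one Johnson-counter bit.
module msic_grid #(parameter M_SEED = 8, N_J = 4) (
    input  wire [M_SEED-1:0]     s, // seed from the m-stage LFSR
    input  wire [N_J-1:0]        j, // vector from the Johnson counter
    output wire [N_J*M_SEED-1:0] x  // drives the CUT's primary inputs
);
    genvar r, c;
    generate
        for (r = 0; r < N_J; r = r + 1) begin : row
            for (c = 0; c < M_SEED; c = c + 1) begin : col
                assign x[r*M_SEED + c] = s[c] ^ j[r]; // one XOR per grid cell
            end
        end
    endgenerate
endmodule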

Fig 3.4 Test-per-scan scheme.


3.4.2 Test-Per-Scan Schemes

In the MSIC-TPG for test-per-scan schemes, the inputs of the XOR gates come from the seed generator and the SIC counter, and their outputs are applied to the M scan chains, respectively. The outputs of the seed generator and the XOR gates are applied to the CUT's PIs, respectively. The test procedure is as follows.
1) The seed circuit generates a new seed by clocking CLK1 one time.
2) RJ_Mode is set to 0. The reconfigurable Johnson counter will operate in the
Johnson counter mode and generate a Johnson vector by clocking CLK2 one time.
3) After a new Johnson vector is generated, RJ_Mode and Init are set to 1. The
reconfigurable Johnson counter operates as a circular shift register, and generates l
codewords by clocking CLK2 l times. Then, a capture operation is inserted.
4) Repeat steps 2–3 until 2l Johnson vectors are generated.
5) Repeat steps 1–4 until the expected fault coverage or test length is achieved.


CHAPTER-4
HARDWARE AND SOFTWARE USED
3.11 SIMPLE ASIC DESIGN FLOW

FIG 3.3: SIMPLE ASIC DESIGN FLOW


For any design to work at a specific speed, timing analysis has to be performed. We need
to check whether the design is meeting the speed requirement mentioned in the specification.
This is done by a static timing analysis tool, for example PrimeTime.
Register Transfer Logic


RTL is expressed in Verilog or VHDL. This document will cover the basics of Verilog. Verilog is a Hardware Description Language (HDL). A hardware description language is a language used to describe a digital system, for example latches, flip-flops, and combinational and sequential elements. Basically, you can use Verilog to describe any kind of digital system. One can design a digital system in Verilog using any level of abstraction. The most important levels are:
Behavior Level
This level describes a system by concurrent algorithms (behavioral description). Each algorithm itself is sequential, which means it consists of a set of instructions that are executed one after the other. There is no regard for the structural realization of the design.
Register Transfer Level (RTL)
Designs at the register-transfer level specify the characteristics of a circuit by the transfer of data between registers, together with the functionality performed on it; for example, finite state machines. An explicit clock is used. RTL designs contain exact timing information: data transfers are scheduled to occur at certain times.
Gate level
The system is described in terms of gates (AND, OR, NOT, NAND, etc.). The signals can have only four logic states (0, 1, X, Z). Gate-level design is normally not done by hand because the output of logic synthesis is the gate-level netlist.
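To make the contrast between levels concrete, here is the same 2-to-1 multiplexer written at the RTL and at the gate level (an illustrative sketch):

// RTL: the data transfer is described behaviorally.
module mux2_rtl (input wire a, b, sel, output reg y);
    always @(*) y = sel ? b : a;
endmodule

// Gate level: the same function built from explicit primitive gates.
module mux2_gate (input wire a, b, sel, output wire y);
    wire nsel, ya, yb;
    not g0 (nsel, sel);
    and g1 (ya, a, nsel);
    and g2 (yb, b, sel);
    or  g3 (y, ya, yb);
endmodule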

Optimization
The circuit at the gate level in terms of the gates and flip-flops can be redundant in
nature. The same can be minimized with the help of minimization tools. The step is not shown
separately in the figure. The minimized logical design is converted to a circuit in terms of the
switch level cells from standard libraries provided by the foundries. The cell based design
generated by the tool is the last step in the logical design process; it forms the input to the first
level of physical design.


3.12 FPGA DESIGN FLOW


The standard FPGA design flow starts with design entry using schematics or a hardware description language (HDL), such as Verilog HDL or VHDL. In this step, you create the digital circuit that is implemented inside the FPGA. The flow then proceeds through compilation, simulation, programming, and verification in the FPGA hardware. Below, we first define the relevant terminology in the field and then describe the recent evolution of FPDs. The three main categories of FPDs are delineated: Simple PLDs (SPLDs), Complex PLDs (CPLDs), and Field-Programmable Gate Arrays (FPGAs).
Introduction to High-Capacity FPDs
Prompted by the development of new types of sophisticated field-programmable devices,
the process of designing digital hardware has changed dramatically over the past few years.
Unlike previous generations of technology, in which board-level designs included large numbers
of SSI chips containing basic gates, virtually every digital design produced today consists mostly
of high-density devices. This applies not only to custom devices like processors and memory, but
also to logic circuits such as state machine controllers, counters, registers, and decoders. When
such circuits are destined for high-volume systems they have been integrated into high-density
gate arrays. However, gate array NRE costs often are too expensive and gate arrays take too long
to manufacture to be viable for prototyping or other low-volume scenarios. For these reasons,
most prototypes, and also many production designs are now built using field-programmable
devices. The most compelling advantages of field-programmable devices are instant
manufacturing turnaround, low start-up costs, low financial risk and (since programming is done
by the end user) ease of design changes.
The market for FPDs has grown dramatically over the past decade to the point where there is
now a wide assortment of devices to choose from. A designer today faces a daunting task to
research the different types of chips, understand what they can best be used for, choose a
particular manufacturer's product, learn the intricacies of vendor-specific software, and then
design the hardware. Confusion for designers is exacerbated by not only the sheer number of
field-programmable devices available, but also by the complexity of the more sophisticated
devices. The purpose of this paper is to provide an overview of the architecture of the various


types of field-programmable devices. The emphasis is on devices with relatively high logic
capacity; all of the most important commercial products are discussed. Before proceeding, we
provide definitions of the terminology in this field. This is necessary because the technical jargon
has become somewhat inconsistent over the past few years as companies have attempted to
compare and contrast their products in literature.
Definitions of Relevant Terminology
The most important terminology used in this paper is defined below.
Field-Programmable Device (FPD): A general term that refers to any type of integrated circuit
used for implementing digital hardware, where the chip can be configured by the end user to
realize different designs. Programming of such a device often involves placing the chip into a
special programming unit, but some chips can also be configured in-system. Another name for
FPDs is programmable logic devices (PLDs); although PLDs encompass the same types of chips
as FPDs, we prefer the term FPD because historically the word PLD has referred to relatively
simple types of devices.
PLA : Programmable Logic Array (PLA) is a relatively small FPD that contains two levels of
logic, an AND-plane and an OR-plane, where both levels are programmable (note: although PLA
structures are sometimes embedded into full-custom chips, we refer here only to those PLAs that
are provided as separate integrated circuits and are user-programmable).
PAL: Programmable Array Logic (PAL) is a relatively small FPD that has a programmable
AND-plane followed by a fixed OR-plane.
SPLD: It refers to any type of Simple PLD, usually either a PLA or PAL.
CPLD: A more Complex PLD that consists of an arrangement of multiple SPLD-like blocks on
a single chip. Alternative names (that will not be used in this paper) sometimes adopted for this
style of chip are Enhanced PLD (EPLD), Super PAL, Mega PAL, and others.
FPGA: Field-Programmable Gate Array is an FPD featuring a general structure that allows
Very high logic capacity. Whereas CPLDs feature logic resources with a wide number of inputs
(AND planes), FPGAs offer more narrow logic resources. FPGAs also offer a higher ratio of
flip-flops to logic resources than do CPLDs.


Interconnect: The wiring resources in an FPD.


Programmable Switch: A user-programmable switch that can connect a logic element to an
interconnect wire, or one interconnect wire to another.
Logic Block: A relatively small circuit block that is replicated in an array in an FPD. When a
circuit is implemented in an FPD, it is first decomposed into smaller sub-circuits that can each be
mapped into a logic block. The term logic block is mostly used in the context of FPGAs, but it
could also refer to a block of circuitry in a CPLD.
Logic Capacity: The amount of digital logic that can be mapped into a single FPD. This is
usually measured in units of equivalent number of gates in a traditional gate array. In other
words, the capacity of an FPD is measured by the size of gate array that it is comparable to. In
simpler terms, logic capacity can be thought of as the number of 2-input NAND gates.
Logic Density: The amount of logic per unit area in an FPD.
Speed-Performance measures the maximum operable speed of a circuit when implemented
in an FPD. For combinational circuits, it is set by the longest delay through any path, and for
sequential circuits it is the maximum clock frequency for which the circuit functions properly.
3.13 FPGA PERFORMANCE
While the headline performance increase offered by FPGAs is often very large (>100
times for some algorithms) it is important to consider a number of factors when assessing their
usefulness for accelerating a particular application. Firstly, is it practical to implement the whole
application on an FPGA? The answer to this is likely to be no, particularly for floating-point
intensive applications which tend to swallow up a large amount of logic. If it is either impractical
or impossible to implement the whole application on an FPGA, the next best option is to
implement those kernels within the application that are responsible for the majority of the runtime, which may be determined by profiling. Next, the real speedup of the whole application
must be estimated once the kernel has been implemented in an FPGA. Even if that kernel was originally responsible for 90% of the runtime, the total speed-up that you can achieve for your application cannot exceed 10 times (even if you achieve a 1000 times speed-up for the kernel), an example of Amdahl's law, that long-time irritant of the HPC software engineer. Once such an
estimate has been made, one must decide if the potential gain is worthwhile given the complexity


of instantiating the algorithm on an FPGA. Then, and only then, one should consider whether the
kernel in question is suited to implementation on an FPGA.
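Amdahl's law, invoked above, can be written as follows: if a fraction p of the original runtime is accelerated by a factor s, the overall speedup is

\[
S = \frac{1}{(1-p) + p/s}, \qquad \lim_{s \to \infty} S = \frac{1}{1-p}
\]

so for a kernel responsible for p = 0.9 of the runtime, the application speedup is bounded by 1/(1 − 0.9) = 10, no matter how fast the FPGA implementation of that kernel is.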
In general terms FPGAs are best at tasks that use short word length integer or fixed point data,
and exhibit a high degree of parallelism, but they are not so good at high precision floating-point
arithmetic (although they can still outperform conventional processors in many cases). The
implications of shipping data to the FPGA from the CPU and vice versa must also come under
consideration, for if that outweighs any improvement in the kernel then implementing the
algorithm in an FPGA may be an exercise in futility.
FPGAs are best suited to integer arithmetic. Unfortunately, the vast majority of scientific codes
rely heavily on 64 bit IEEE floating point arithmetic (often referred to as double precision
floating point arithmetic). It is not unreasonable to suggest that in order to get the most out of
FPGAs computational scientists must perform a thorough numerical analysis of their code, and
ideally re-implement it using fixed-point arithmetic or lower precision floating-point arithmetic.
Scientists who have been used to continual performance increases provided by each new
generation of processor are not easily convinced that the large amount of effort required for such
an exercise will be sufficiently rewarded. That said, the recent development of efficient floating
point cores has gone some way towards encouraging scientists to use FPGAs. If the performance
of such cores can be demonstrated by accelerating a number of real-world applications then the
wider acceptance of FPGAs will move a step closer. At present there is very little performance
data available for 64-bit floating-point intensive algorithms on FPGAs. To give an indication of
expected performance we have therefore used data taken from the Xilinx floating-point cores
(v3) datasheet.
It is important to measure the area, performance, and power consumption gap between field-programmable gate arrays (FPGAs) and standard-cell application-specific integrated circuits (ASICs) for the following reasons:
1. In the early stages of system design, when system architects choose their implementation
medium, they often choose between FPGAs and ASICs. Such decisions are based on the


differences in cost (which is related to area), performance and power consumption between these
implementation media but to date there have been few attempts to quantify these differences. A
system architect can use these measurements to assess whether implementation in an FPGA is
feasible.
These measurements can also be useful for those building ASICs that contain programmable
logic, by quantifying the impact of leaving part of a design to be implemented in the
programmable fabric.
2. FPGA makers seeking to improve FPGAs can gain insight by quantitative measurements of
these metrics, particularly when it comes to understanding the benefit of less programmable (but
more efficient) hard heterogeneous blocks, such as block memory, multipliers/accumulators, and multiplexers, that modern FPGAs often employ.

4.1 FPGA
A field-programmable gate array (FPGA) is an integrated circuit designed to be configured by the customer or designer after manufacturing, hence "field-programmable".
The FPGA configuration is generally specified using a hardware
description language (HDL), similar to that used for an application-specific
integrated circuit (ASIC) (circuit diagrams were previously used to specify the
configuration, as they were for ASICs, but this is increasingly rare). FPGAs can be
used to implement any logical function that an ASIC could perform.

FPGAs contain programmable logic components called "logic blocks", and


a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard.


Logic blocks can be configured to perform complex combinational


functions, or merely simple logic gates like AND and XOR. In most FPGAs, the
logic blocks also include memory elements, which may be simple flip-flops or
more complete blocks of memory.

4.1.1 Introduction
The area of field programmable gate array (FPGA) design is evolving at
a rapid pace. The increase in the complexity of the FPGA's architecture means that
it can now be used in far more applications than before. The newer FPGAs are
steering away from the plain vanilla type "logic only" architecture to one with
embedded dedicated blocks for specialized applications.

4.1.2 The FPGA Landscape


In the semiconductor industry, the programmable logic segment is the best
indicator of the progress of technology. No other segment has such varied offerings
as field programmable gate arrays. It is no wonder that FPGAs were among the
first semiconductor products to move to 0.13 µm technology, and again recently to 90 nm technology.


Fig. 12 Structure of an FPGA

The players in the current programmable logic market are Altera, Atmel, Actel, Cypress, Lattice, QuickLogic, and Xilinx. Some of the larger and more popular device families are Stratix from Altera, Axcelerator from Actel, ispXPGA from Lattice, and Virtex from Xilinx.


Between these FPGA devices, many major electronics applications such as


communications, video, image and digital signal processing, storage area networks
and aerospace are covered.
4.1.3 FPGA synthesis: the vendor-independent approach
Dedicated memory blocks offer data storage and can be configured as basic single-port RAMs, ROMs (read-only memory), FIFOs (first-in first-out), or CAMs (content-addressable memory). The data processing or logic fabric of these FPGAs varies widely in size, with the biggest Xilinx Virtex-II Pro offering up to 100K LUT4s. The ability to interface the FPGA with backplanes, high-speed buses, and memories is made possible by support for various single-ended and differential I/O standards.
In a similar manner, for programmable systems applications requiring
embedded processors, the Virtex-II Pro with its 32-bit RISC processor
(PowerPC 405) would be an ideal choice.

Table 4.1 Features Offered in FPGAs

Feature                | Xilinx Virtex-II Pro                          | Altera Stratix                      | Actel Axcelerator                  | Lattice ispXPGA
Clock management       | DCM (up to 8)                                 | PLL (up to 12)                      | PLL (up to 12)                     | sysCLOCK PLL (up to 8)
Embedded memory blocks | Block RAM (up to 10 Mbit)                     | TriMatrix memory (up to 10 Mbit)    | Embedded RAM blocks (up to 338K)   | sysMEM blocks (up to 414K)
Data processing        | CLBs and 18-bit x 18-bit embedded multipliers | LEs and DSP blocks with multipliers | Logic modules (C-cell & R-cell)    | PFU-based
I/O support            | Programmable SelectIO                         | Advanced I/O support                | Advanced I/O support               | sysIO
Special features       | Embedded PowerPC 405 cores                    | -                                   | Per-pin FIFOs for bus applications | sysHSI for high-speed serial interface

4.1.4 Applications of FPGAs


A list of typical applications includes: random logic, integrating multiple
SPLDs, device controllers, communication encoding and filtering, small to
medium sized systems with SRAM blocks, and many more.

4.2 Altera DE0 Board


The DE0 board has many features that allow the user to implement a wide range of designs, from simple circuits to various multimedia projects. The following hardware is provided on the DE0 board:
Altera Cyclone III 3C16 FPGA device
Altera Serial Configuration device EPCS4
USB Blaster (on board) for programming and user API control; both JTAG and Active Serial (AS) programming modes are supported
8-Mbyte SDRAM
4-Mbyte Flash memory
SD Card socket
3 pushbutton switches
10 toggle switches
10 green user LEDs
50-MHz oscillator for clock sources
VGA DAC (4-bit resistor network) with VGA-out connector
RS-232 transceiver
PS/2 mouse/keyboard connector
Two 40-pin Expansion Headers


Figure 13. The DE0 board.


4.2.1 Block Diagram of the DE0 Board:
Figure 14 gives the block diagram of the DE0 board. To provide maximum flexibility for the user, all connections are made through the Cyclone III FPGA device. Thus, the user can configure the FPGA to implement any system design.


Figure 14. Block diagram of the DE0 board.


4.2.2 Cyclone III 3C16 FPGA

15,408 LEs
56 M9K Embedded Memory Blocks
504K total RAM bits
56 embedded multipliers
4 PLLs
346 user I/O pins

Built-in USB Blaster circuit

On-board USB Blaster for programming and user API (Application

programming interface) control


Using the Altera EPM240 CPLD
SDRAM
One 8-Mbyte Single Data Rate Synchronous Dynamic RAM memory chip
Supports 16-bits data bus


Flash memory
4-Mbyte NOR Flash memory
Supports byte (8-bit)/word (16-bit) modes
SD card socket
Provides both SPI and SD 1-bit mode SD Card access
Pushbutton switches
3 pushbutton switches
Normally high; generates one active-low pulse when the switch is pressed
Slide switches
10 Slide switches
A switch causes logic 0 when in the DOWN position and logic 1 when in
the UP position
General User Interfaces
10 Green color LEDs (Active high)
4 seven-segment displays (Active low)
16x2 LCD interface (LCD module not included)
Clock inputs
50-MHz oscillator
VGA output
Uses a 4-bit resistor-network DAC
With 15-pin high-density D-sub connector
Supports up to 1280x1024 at 60-Hz refresh rate
Serial ports
One RS-232 port (Without DB-9 serial connector)
One PS/2 port (Can be used through a PS/2 Y Cable to allow you to connect
a keyboard and mouse to one port)
Two 40-pin expansion headers


72 Cyclone III I/O pins, as well as 8 power and ground lines, are brought out to two 40-pin expansion connectors. Each 40-pin header is designed to accept a standard 40-pin ribbon cable used for IDE hard drives.

4.3 SOFTWARES USED


4.3.2 ModelSim

ModelSim provides high-performance, high-capacity mixed-HDL simulation.


Mentor Graphics was the first to combine single kernel simulator (SKS)
technology with a unified debug environment for Verilog, VHDL, and SystemC.
The combination of industry-leading, native SKS performance with the best integrated debug and analysis environment makes ModelSim the simulator of
choice for both ASIC and FPGA design. The best standards and platform support in
the industry make it easy to adopt in the majority of process and tool flows.

ModelSim-Altera Edition

Recommended for simulating all FPGA designs (Cyclone, Arria, and


Stratix series FPGA designs)


33 percent faster simulation performance than ModelSim-Altera Starter


Edition.

No line limitations

Priced at $945.

ModelSim-Altera Starter Edition

Support for simulating small FPGA designs

Limited to 10,000 executable lines.

4.3.3 Quartus II
Quartus II is a software tool produced by Altera for analysis and synthesis of
HDL designs, which enables the developer to compile their designs, perform
timing analysis, examine RTL diagrams, simulate a design's reaction to different
stimuli, and configure the target device with the programmer.

Quartus II Web Edition

The Web Edition is a free version of Quartus II that can be downloaded or


delivered by mail. This edition provides compilation and programming
for a limited number of Altera devices.


The low-cost Cyclone family of FPGAs is fully supported by this edition, as


well as the MAX family of CPLDs, meaning small developers and educational
institutions have no overheads from the cost of development software.
License registration is required to use the Web Edition of Quartus II, which is
free and can be renewed an unlimited number of times.

Quartus II Subscription Edition

Quartus II Subscription Edition is also available for free download, but a license
must be paid for to use the full functionality in the software. The free Web
Edition license can be used on this software, restricting the devices that can
be used.


CHAPTER-6
RESULTS AND DISCUSSION
MODELSIM OUTPUT:

Figure 15. Simulated output.


AREA UTILIZATION REPORT:

Figure 16. Flow summary report.


PERFORMANCE REPORT:


Figure 17. Fmax summary report for slow corner.


PERFORMANCE REPORT:

Figure 18. Fmax summary report for fast corner.


SYNTHESIS REPORT:

Figure 19. RTL schematic report.


CONCLUSION


REFERENCES
[1] Y. Zorian, "A distributed BIST control scheme for complex VLSI devices," in Proc. 11th Annu. IEEE VLSI Test Symp., Apr. 1993, pp. 4-9.
[2] P. Girard, "Survey of low-power testing of VLSI circuits," IEEE Design Test Comput., vol. 19, no. 3, pp. 80-90, May-Jun. 2002.
[3] A. Abu-Issa and S. Quigley, "Bit-swapping LFSR and scan-chain ordering: A novel technique for peak- and average-power reduction in scan-based BIST," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 28, no. 5, pp. 755-759, May 2009.
[4] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, J. Figueras, S. Manich, P. Teixeira, and M. Santos, "Low-energy BIST design: Impact of the LFSR TPG parameters on the weighted switching activity," in Proc. IEEE Int. Symp. Circuits Syst., vol. 1, Jul. 1999, pp. 110-113.
[5] S. Wang and S. Gupta, "DS-LFSR: A BIST TPG for low switching activity," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 21, no. 7, pp. 842-851, Jul. 2002.
[6] F. Corno, M. Rebaudengo, M. Reorda, G. Squillero, and M. Violante, "Low power BIST via non-linear hybrid cellular automata," in Proc. 18th IEEE VLSI Test Symp., Apr.-May 2000, pp. 29-34.
[7] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, and H. Wunderlich, "A modified clock scheme for a low power BIST test pattern generator," in Proc. 19th IEEE VLSI Test Symp., Mar.-Apr. 2001, pp. 306-311.
[8] D. Gizopoulos, N. Krantitis, A. Paschalis, M. Psarakis, and Y. Zorian, "Low power/energy BIST scheme for datapaths," in Proc. 18th IEEE VLSI Test Symp., Apr.-May 2000, pp. 23-28.
[9] Y. Bonhomme, P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch, "A gated clock scheme for low power scan testing of logic ICs or embedded cores," in Proc. 10th Asian Test Symp., Nov. 2001, pp. 253-258.
[10] C. Laoudias and D. Nikolos, "A new test pattern generator for high defect coverage in a BIST environment," in Proc. 14th ACM Great Lakes Symp. VLSI, Apr. 2004, pp. 417-420.
[11] S. Bhunia, H. Mahmoodi, D. Ghosh, S. Mukhopadhyay, and K. Roy, "Low-power scan design using first-level supply gating," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 3, pp. 384-395, Mar. 2005.
[12] X. Kavousianos, D. Bakalis, and D. Nikolos, "Efficient partial scan cell gating for low-power scan-based testing," ACM Trans. Design Autom. Electron. Syst., vol. 14, no. 2, pp. 28:1-28:15, Mar. 2009.
[13] P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch, "A test vector inhibiting technique for low energy BIST design," in Proc. 17th IEEE VLSI Test Symp., Apr. 1999, pp. 407-412.
[14] S. Manich, A. Gabarro, M. Lopez, J. Figueras, P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, P. Teixeira, and M. Santos, "Low power BIST by filtering non-detecting vectors," J. Electron. Test.: Theory Appl., vol. 16, no. 3, pp. 193-202, Jun. 2000.
[15] F. Corno, M. Rebaudengo, M. Reorda, and M. Violante, "A new BIST architecture for low power circuits," in Proc. Eur. Test Workshop, May 1999, pp. 160-164.
[16] S. Gerstendorfer and H.-J. Wunderlich, "Minimized power consumption for scan-based BIST," in Proc. Int. Test Conf., Sep. 1999, pp. 77-84.
[17] A. Hertwing and H. Wunderlich, "Low power serial built-in self-test," in Proc. Eur. Test Workshop, 1998, pp. 49-53.
