
Abstract

This thesis discusses the design and implementation of a VHDL generator for Wallace
trees built from (3:2) counter modules and (2:2) counter modules, in order to solve
the fast addition problem.
The basic research has been carried out in the MATLAB programming environment, and
the VHDL file is generated automatically from the result of the MATLAB program.
MODELSIM has been used for compilation and simulation of the VHDL file.
VHDL Implementation of Fast adder trees

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
1.1 MOTIVATION
1.2 THESIS TARGET
1.3 READING GUIDE
CHAPTER 2 ADDER STRUCTURES
2.1 ADDER STRUCTURES
2.1.1 Two's Complement Representation
2.1.2 Fixed Time Type
2.1.3 Variable Time Type
2.1.4 Carry-Propagate Adder
2.1.5 Redundant Adders
2.1.6 Multi-operand Addition
CHAPTER 3 DESIGN FLOW
3.1 SYSTEM SPECIFICATION
3.2 RELATED MATLAB
3.2.1 Basic MATLAB program language
3.2.2 Multidimensional cell array
3.3 DESIGN FLOW STRUCTURE
3.4 DESCRIPTION OF CELL ARRAYS
3.4.1 Counter Block
3.4.2 Input block
3.4.3 Output block
CHAPTER 4 VHDL GENERATOR AND TOP LEVEL SIMULATION
4.1 MATLAB PROGRAM TO GENERATE VHDL CODE
4.2 VHDL CODE DESCRIPTION
4.2.1 Related MODELSIM and VHDL language
4.2.2 VHDL dataflow description
4.2.3 VHDL structural RTL description
4.2.4 VHDL code of each level
4.2.5 VHDL code for top level
4.3 SIMULATION RESULT
CHAPTER 5 CONCLUSION AND FUTURE WORK
5.1 CONCLUSION
5.2 FUTURE WORK
REFERENCES
APPENDICES
APPENDIX 1 MATLAB PROGRAM FOR EACH LEVEL (EACH LEVEL. M)
APPENDIX 2 MATLAB PROGRAM FOR TOP LEVEL (TOP LEVEL. M)

INDEX OF FIGURES


FIGURE 1 RIPPLE-CARRY ADDER
FIGURE 2 CARRY OUT OF CARRY LOOKAHEAD ADDER
FIGURE 3 SUM OF CARRY LOOKAHEAD ADDER
FIGURE 4 FUNCTION OF CARRY-SAVE ADDER
FIGURE 5 CSA USED IN n BIT NUMBERS
FIGURE 6 CSA COMPUTATION
FIGURE 7 FIRST STEP OF SIGNED-DIGIT ADDITION
FIGURE 8 SECOND STEP OF SIGNED-DIGIT ADDITION
FIGURE 9 SIGNED-DIGIT ADDITION
FIGURE 10 A [p:2] ADDER
FIGURE 11 REDUCTION BY ROWS
FIGURE 12 FA AND HA AS (3:2) COUNTER AND (2:2) COUNTER
FIGURE 13 EXAMPLE OF REDUCTION BY COLUMNS
FIGURE 14 SYSTEM REQUIREMENT
FIGURE 15 COMPONENT FIGURE
FIGURE 16 EXPLAIN OF REPRESENTATION
FIGURE 17 EXAMPLE OF FIRST LEVEL'S STRUCTURE OF WALLACE TREE
FIGURE 18 DESIGN FLOW OF PROGRAM
FIGURE 19 COUNTER BLOCK
FIGURE 20 FA POSITION IN ADDER TREE
FIGURE 21 HA AND BP POSITION IN ADDER TREE
FIGURE 22 INPUT BLOCK
FIGURE 23 FA INPUTS STATE
FIGURE 24 HA AND BP INPUTS STATE
FIGURE 25 OUTPUT BLOCK
FIGURE 26 FA'S SUM STATE
FIGURE 27 HA'S SUM STATE
FIGURE 28 BP'S OUTPUT STATE
FIGURE 29 FA'S CARRY OUT STATE
FIGURE 30 HA'S CARRY OUT STATE
FIGURE 31 MATLAB CODE TO VHDL CODE
FIGURE 32 MODELSIM OPERATION
FIGURE 33 VHDL CODE DESCRIPTION FOR EACH LEVEL
FIGURE 34 VHDL CODE DESCRIPTION FOR TOP LEVEL
FIGURE 35 ADDER TREE STRUCTURE DESCRIBED BY MODELSIM
FIGURE 36 COMPUTATION RESULT


INDEX OF TABLES

TABLE 1 CSA COMPUTATION
TABLE 2 COMPUTATION PROCEDURE
Chapter 1
Introduction

1.1 Motivation
Computation operations such as fast parallel multiplication using adder trees are
present in many parts of a digital system or digital computer, especially in signal
processing, high-speed circuits, graphics and scientific computation. Examples are
graphics processors, digital signal processors, communication and code compression.
Speeding up addition is therefore very important for computation.
There are many tree structures, such as the Wallace adder tree [1], the CSA tree and
the overturned-stairs tree [2]; some other kinds of adder trees are mentioned in
[3]-[7]. Here the Wallace tree is used as the tree structure because it is suitable
for implementation.


1.2 Thesis Target
MATLAB is used to write the programs. The first part of the program is formed by
blocks, where each block contains some cell arrays. The second part of the program
generates a VHDL file; all the information it needs is stored in the cell arrays.
MODELSIM is then used to compile and simulate the VHDL file created by MATLAB.

1.3 Reading guide
This thesis is organized in five chapters.

Chapter 2 mainly discusses different adders, multi-operand addition and fast addition
trees.


Chapter 3 mainly discusses basic knowledge of MATLAB programming, the design flow of
the program, how the program describes the structure of each level of the adder tree,
and the method that was used to solve this problem. It also describes how to
generate VHDL code automatically with this program.

Chapter 4 focuses on the top-level simulation, using MODELSIM to compile and
simulate.

Chapter 5 gives the conclusion of this work and the future work that still has to be
done.

The appendices show the MATLAB code that generates the VHDL code.
Chapter 2
Adder Structures

2.1 Adder Structures
Adders are used in many applications [11], [12]. It is generally recognized that most
of the time required by adders is due to carry propagation, so reducing the
propagation time is the focus of today's techniques. Different binary adder schemes
have their own characteristics, such as area and energy dissipation. No adder scheme
is the best for every condition, so it is important to choose one according to the
specific requirements and constraints of the context. Because this thesis does not
focus on analysing the delay of different adders, only the function of some commonly
used adders is given here.

2.1.1 Two's Complement Representation
Two's complement representation uses the most significant bit as a sign bit, making
it easy to test whether an integer is positive or negative. The range of an $n$-bit
two's complement representation is from $-2^{n-1}$ to $2^{n-1}-1$. Consider an
$n$-bit integer $A$ in two's complement representation. If $A$ is positive, then the
sign bit $a_{n-1}$ is zero. The remaining bits represent the magnitude of the number,
in the same fashion as for sign magnitude:

$$A = \sum_{i=0}^{n-2} 2^{i} a_i \quad \text{for } A \ge 0$$

The number zero is identified as positive and therefore has a 0 sign bit and a
magnitude of all 0s. The range of positive integers that may be represented is
therefore from 0 to $2^{n-1}-1$. Any larger number would require more bits.
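
The formula above covers only $A \ge 0$. As a minimal Python sketch (not part of the thesis program), the function below also handles negative numbers by giving the sign bit the weight $-2^{n-1}$, which is the usual two's complement definition; the function name and the MSB-first bit order are assumptions made for this illustration.

```python
def twos_complement_value(bits):
    """Decode a two's complement number given as a list of 0/1, sign bit first."""
    n = len(bits)
    sign = bits[0]
    # Magnitude part: sum of 2^i * a_i for i = 0 .. n-2 (the bits after the sign bit).
    magnitude = sum(b << (n - 2 - i) for i, b in enumerate(bits[1:]))
    # The sign bit carries the weight -2^(n-1).
    return -sign * (1 << (n - 1)) + magnitude


print(twos_complement_value([0, 1, 1, 1]))  # largest 4-bit value
print(twos_complement_value([1, 0, 0, 0]))  # smallest 4-bit value
```

For $n = 4$ the two calls print the end points of the range, $2^{3}-1$ and $-2^{3}$.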
2.1.2 Fixed Time Type
The fixed time type is the most commonly implemented adder scheme. Its characteristic
is that no signal indicates when the addition is completed. Therefore the worst-case
delay must be considered.
2.1.3 Variable Time Type
In contrast to the fixed time type, variable time type adders have a completion
signal, so that the result of the addition can be used as soon as the completion
signal is asserted.

2.1.4 Carry-Propagate Adder
Carry-propagate adders (CPA) produce the result in a conventional number system, also
called a fixed-radix system. The property of a fixed-radix system is that every
number has a unique representation, so that no two digit sequences have the same
numerical value. The digit set is from 0 to $r-1$, where $r$ is the radix.

2.1.4.1 Ripple-Carry Adder
An $n$-bit adder used to add two $n$-bit binary numbers can be built by connecting
$n$ full adders in series. Each full adder represents a bit position $i$ (from 0 to
$n-1$). The carry out of the full adder at position $i$ is connected to the carry in
of the full adder at the higher position $i+1$.

The sum output of the full adder at position $i$, as shown in Figure 1, is given by:

$$S_i = X_i \oplus Y_i \oplus C_i$$

The carry output of each FA, as shown in Figure 1, is given by:

$$C_{i+1} = X_i Y_i + X_i C_i + Y_i C_i$$

Figure 1 Ripple-carry adder

In the expression of the sum, $C_i$ must be generated by the full adder at the lower
position $i-1$. $t_c$ is the delay from the inputs of a full adder to its carry
output and $t_s$ is the delay from the inputs to the sum output. The worst-case delay
is given by

$$T_{CRA} = (n-1)\,t_c + \max(t_c, t_s)$$

This adder is slow for large $n$. Its main advantage is the simplicity of its cells
and of the connections among them.
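
The sum and carry equations can be checked with a short bit-level model. The following is a Python sketch for illustration only (the thesis itself works in MATLAB and VHDL); the function names and the LSB-first list convention are assumptions.

```python
def full_adder(x, y, c):
    """One full adder: S = X xor Y xor C, C_out = XY + XC + YC."""
    s = x ^ y ^ c
    c_out = (x & y) | (x & c) | (y & c)
    return s, c_out


def ripple_carry_add(x_bits, y_bits, c_in=0):
    """Add two equal-length bit lists given least significant bit first."""
    carry = c_in
    sum_bits = []
    for x, y in zip(x_bits, y_bits):
        # The carry of position i feeds the full adder at position i + 1.
        s, carry = full_adder(x, y, carry)
        sum_bits.append(s)
    return sum_bits, carry


# 6 + 7 with 4 bits, LSB first: 6 = [0, 1, 1, 0], 7 = [1, 1, 1, 0]
print(ripple_carry_add([0, 1, 1, 0], [1, 1, 1, 0]))
```

The loop makes the serial dependency visible: each iteration needs the carry of the previous one, which is why the delay grows linearly with $n$.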



2.1.4.2 Carry-Lookahead Adder
The basic idea of the carry-lookahead adder is to compute the carries simultaneously,
i.e. in this type of adder all the carries in the same group are computed at the same
time. The carry-lookahead adder has two functions: first all the carries are
computed, then the operation $S_i = X_i \oplus Y_i \oplus C_i$ is implemented by a
simple 3-input XOR gate. The design of the lookahead carry generator involves two
Boolean functions named Generate and Propagate. For each pair of input bits these
functions are defined as:

$$G_i = X_i \cdot Y_i$$

$$P_i = X_i \oplus Y_i$$

The carry bit $C_{i+1}$ generated when adding two bits $X_i$ and $Y_i$ is '1' when
the function $G_i$ is '1', or when $C_i$ is '1' and the function $P_i$ is '1'
simultaneously. In the first case, the carry bit is activated by the local conditions
(the values of $X_i$ and $Y_i$). In the second, the carry bit is received from the
less significant elementary addition and is propagated further to the more
significant elementary addition depending on the function $P_i$. Therefore, the
carry-out bit corresponding to a pair of bits $X_i$ and $Y_i$ is computed according
to the equation:

$$C_{i+1} = G_i + P_i C_i$$

Hence, the carry signal can be computed from the carry in, Generate and Propagate
signals.

For example, consider a four-bit adder:

$$C_1 = G_0 + P_0 C_{in}$$

$$C_2 = G_1 + P_1 G_0 + P_1 P_0 C_{in}$$

$$C_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{in}$$

$$C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{in}$$
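
To illustrate the four-bit example, the Python sketch below evaluates every carry directly from the Generate and Propagate signals and the carry in, without using any other carry. It is an illustration under stated assumptions, not the thesis implementation: the function name and the LSB-first bit lists are hypothetical, and Propagate is taken as the XOR of the input bits.

```python
def lookahead_carries(x_bits, y_bits, c_in=0):
    """Return [C_1, ..., C_n] for LSB-first bit lists x_bits and y_bits."""
    g = [x & y for x, y in zip(x_bits, y_bits)]  # Generate
    p = [x ^ y for x, y in zip(x_bits, y_bits)]  # Propagate (assumed XOR form)
    carries = []
    for i in range(len(g)):
        # C_{i+1} = G_i + P_i G_{i-1} + ... + P_i..P_1 G_0 + P_i..P_0 C_in
        c = 0
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c |= term
        term = c_in
        for k in range(i + 1):
            term &= p[k]
        carries.append(c | term)
    return carries


# 6 + 7 with 4 bits, LSB first
print(lookahead_carries([0, 1, 1, 0], [1, 1, 1, 0]))
```

In software the nested loops still run sequentially; in hardware each sum of products is a separate two-level circuit, so all carries of a group are available at the same time.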
Figure 2 illustrates how the carry-out signals are computed.


Figure 2 Carry out of Carry Lookahead Adder

The sum output of each column is given in Figure 3.

$$\mathit{sum\_out}_i = X_i \oplus Y_i \oplus \mathit{carry\_in}_i$$

Figure 3 Sum of Carry Lookahead Adder
If the input vector of $n$ bits is divided into groups of $m$ bits and the groups are
connected like a ripple-carry adder, the worst-case delay is:

$$T_{CLA} = \frac{n}{m}\, t_{groups} + t_s$$

This worst-case delay is less than that of the ripple-carry adder because
$t_{groups}$ is smaller than $m\,t_c$. Hence the carry-lookahead adder is faster than
the ripple-carry adder.

2.1.5 Redundant Adders
The characteristic of redundant adders is that no carry propagation is required. In
other words, the addition time is independent of the number of bits of the adder. The
operands are represented using a redundant digit set. The main purpose of the
redundant adder is to reduce the addition time. However, this kind of adder has some
disadvantages. The first is the increase in the number of bits needed to represent a
number, which depends on the degree of redundancy. Another disadvantage is that some
operations, such as magnitude comparison or sign detection, cannot be performed
directly on redundant numbers.



2.1.5.1 Carry-Save Adder
A carry-save adder (CSA) has the same circuit as the full adder, as shown in Figure 4.


Figure 4 Function of Carry-save adder

The carry in signal is considered as an input of the CSA, and the carry out signal is
considered as an output of the CSA. Figure 5 shows how $n$ carry-save adders are
arranged to add three $n$-bit numbers $x$, $y$ and $z$ into two numbers $c$ and $s$.


Figure 5 CSA used in n bit numbers


In Figure 5, note that all full adders are independent.

Figure 6 shows the CSA computation flow and Table 1 shows how the CSA works
(based on binary numbers).



Figure 6 CSA computation



Table 1 CSA Computation

The computation can be divided into two steps: first we compute S and C using a CSA,
then we use a CPA to compute the total sum. From this example, we can see that the
carry signal and the sum signal can be computed independently, giving just two
$n$-bit numbers. A CPA is used for the last step, and carry propagation exists only
in that step.
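
The two steps can be sketched in Python on ordinary integers, where the bitwise operators act on all bit positions at once, just as the independent full adders of a CSA do. The function name is an assumption made for this illustration.

```python
def carry_save_add(x, y, z):
    """Reduce three operands to a sum word s and a carry word c with x + y + z == s + c."""
    s = x ^ y ^ z                            # per-bit sum, no carry propagation
    c = ((x & y) | (x & z) | (y & z)) << 1   # per-bit carry, weighted one position higher
    return s, c


s, c = carry_save_add(5, 6, 7)
total = s + c   # this final addition plays the role of the CPA
print(s, c, total)
```

Only the last line involves carry propagation; the carry-save step itself treats every bit position independently.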



2.1.5.2 Signed-Digit Adder (SDA)
Signed-digit (SD) number representation systems have been defined for any radix $r$
with digit values ranging over the set $\{-\alpha, \ldots, -1, 0, 1, \ldots, \alpha\}$,
where $\alpha$ is an arbitrary integer in the range

$$\frac{r-1}{2} \le \alpha \le r-1$$

Such number representation systems possess sufficient redundancy to break carry or
borrow chains and hence result in fast propagation-free addition and subtraction. The
result of the addition uses signed-digit representation. A signed-digit number uses a
fixed-radix representation with digit values from a signed-integer set:

$$x = \sum_{i=0}^{n-1} x_i r^{i}$$

with the digit set $\{-\alpha, \ldots, -1, 0, 1, \ldots, \alpha\}$.

The addition algorithm is not described in detail here.
The objective of the SDA is to eliminate carry propagation. A signed-digit addition
is performed in two steps.

Step 1: compute the sum ($w$) and the transfer ($t$). The transfer plays a role
similar to the carry in a CPA.

$$x + y = w + t$$

At the digit level this corresponds to

$$x_i + y_i = w_i + r\,t_{i+1}$$

Figure 7 shows the addition of the first two digits of $n$-digit numbers.


Figure 7 First step of Signed-digit Addition

Step 2: compute $s = w + t$. At the digit level

$$s_i = w_i + t_i$$

We can compute $s_i$ without producing a carry, as shown in Figure 8.





Figure 8 Second step of Signed-digit Addition

Finally, the overall SDA structure is shown in Figure 9.


Figure 9 Signed-digit Addition

A. Avizienis [13] proposed a redundant binary number (a radix-2 signed-digit
number). With this type of number, the propagation of carries is absorbed into the
redundancy, so the addition time is unrelated to the number of digits and the
addition can be executed in only two steps. More detail on computing $t_i$ and on the
representation of operands is given in [14].



2.1.6 Multi-operand Addition
A common structure for adding several operands is an adder tree, such as the Wallace
tree, the Dadda tree, the carry save adder tree and so on. In this thesis, the carry
save adder tree structure and the Wallace tree are used. The primitive operation
performed on the input bit-array is reduction, which produces an output bit-array
with a smaller number of bits. Two methods are used: reduction by rows and reduction
by columns. The carry save adder tree belongs to the first method and the Wallace
tree belongs to the second. Modules that reduce rows are called adders and modules
that reduce columns are called counters.

2.1.6.1 Carry Save Adder Tree
The carry save adder tree can be used to add three operands in two's complement
representation and produces a result as the sum of two vectors. A 3-to-2 reduction is
called a [3:2] adder, and using this tree we can build a [p:2] adder that reduces p
bit-vectors to 2 bit-vectors using CSAs.


Figure 10 A [ p :2] adder

In Figure 10, each column has k bits and there are p levels. We can use [3:2] adders
to reduce the rows and obtain 2 bit-vectors. No carry propagation is required except
on the last two rows, which speeds up the computation.

Figure 11 Reduction by rows

In Figure 11, the number of input vectors is reduced row by row. Finally, the number
of levels of the CSA tree can be estimated as

$$\text{level} \approx \frac{\log(k/2)}{\log(3/2)}$$

where k is the number of input operands.
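
As a rough check of this estimate, the Python sketch below compares it with a direct count in which every complete group of three rows is replaced by two rows at each level. Both function names are hypothetical, and the counting rule is an assumption about how the [3:2] adders are applied.

```python
import math


def estimated_levels(k):
    """Estimate from the formula: log(k/2) / log(3/2)."""
    return math.log(k / 2) / math.log(3 / 2)


def counted_levels(k):
    """Count levels when each complete group of three rows becomes two rows."""
    levels = 0
    while k > 2:
        k -= k // 3
        levels += 1
    return levels


for k in (3, 4, 6, 9):
    print(k, round(estimated_levels(k), 2), counted_levels(k))
```

The estimate is a real number, while the counted value is an integer, so the two agree exactly only for particular values of k such as k = 3.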

2.1.6.2 Wallace tree
Wallace tree structures are widely used in additions with several operands. Reduction
by columns is similar to reduction by rows if the number of bits in each column of
the array is the same. However, this is not always the case. For example, in the
partial products of a multiplier, the least-significant column cannot receive bits
from other columns. So reduction by columns is introduced.

The basic concept is to reduce the number of bits in each column at each level. Full
adders and half adders are used as (3:2) counters and (2:2) counters.

Figure 12 FA and HA as (3:2) counter and (2:2) counter
In Figure 12, the three nodes inside the box represent the FA's three inputs and the
two nodes outside represent the FA's carry out and sum. The half adder has two
inputs, one sum and one carry out. Here is the example used in this thesis.

Example: a = [2 3] means 2 bits with weight $2^1$ and 3 bits with weight $2^0$. We
can use a Wallace tree as shown in Figure 13 to achieve fast addition. The basic
modules in the Wallace tree are the (3:2) counter and the (2:2) counter.


Figure 13 Example of Reduction by Columns
The vector changes from a = [2 3] to a = [1 2 1]. The maximum column height changes
from 3 bits to 2 bits. Carry propagation delay is eliminated except for the last row.
The last step uses a CPA, such as a carry-lookahead adder, to compute the sum, and
fast addition is achieved.

In my view, the Dadda tree is a special case of the Wallace tree in which all bits
are collected and the tree is built with the minimum number of counters and the
shortest critical path. The Wallace tree was chosen as the basic algorithm of the
program for this thesis.
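
The change from a = [2 3] to a = [1 2 1] can be reproduced with a small Python sketch of one reduction level. It assumes that each column uses as many (3:2) counters as possible, then a (2:2) counter if two bits remain, and passes a single remaining bit through unchanged; this rule is inferred from the examples in this thesis, and the function name is hypothetical.

```python
def reduce_level(a):
    """Reduce one level; a lists the bits per column, most significant column first."""
    nxt = [0] * (len(a) + 1)
    for j, n in enumerate(a):
        full = n // 3          # number of (3:2) counters in this column
        rest = n % 3           # 2 -> one (2:2) counter, 1 -> bit passed through
        sums = full + (1 if rest else 0)
        carries = full + (1 if rest == 2 else 0)
        nxt[j + 1] += sums     # sums keep the weight of their column
        nxt[j] += carries      # carries move to the next higher weight
    # Drop the new leading column if no carry reached it.
    return nxt[1:] if nxt[0] == 0 else nxt


print(reduce_level([2, 3]))
```

Repeating the call until no column holds more than two bits gives the rows that are passed to the final CPA.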




Chapter 3
Design Flow

3.1 System specification

This program uses the Wallace tree structure with (3:2) counter modules and (2:2)
counter modules in the adder tree to solve the fast addition problem. The program is
written in MATLAB and generates VHDL code.

Figure 14 System requirement

As shown in Figure 14, the input of the system is an integer vector (in MATLAB an
integer vector can represent a bit array) that gives the number of bits in each
column, and the output of the program is VHDL code for the adder tree.
3.2 Related MATLAB
MATLAB [8] is a high-level technical computing language and interactive
environment for algorithm development, data visualization, data analysis, and
numerical computation. Using MATLAB, you can solve technical computing
problems faster than with traditional programming languages, such as C, C++, and
Fortran.

3.2.1 Basic MATLAB program language
Here the key syntax used in my program is introduced. First, if a is a vector,
length(a) gives the vector length. Second, matrix addition and subtraction are
written as in the C language. Third is control flow, such as if ... end and for
loops: in my program the number of tree levels depends on the input vector, so
control flow is needed to determine the levels and the detailed information of each
level.


3.2.2 Multidimensional cell array
In the program, the inputs, outputs and component names of each level are stored.
For example, as shown in Figure 15, the component name full_1_1_1 means the first
full adder in column one in the first level and in_data_1_1_1(0) means input data in
column one in the first level. All these names are variable character strings,
because the level, column and bit number are all variables. This information must be
recorded for the next level, so an efficient way of storing it is needed. MATLAB
provides a good structure for this problem: the cell array. A multidimensional cell
array can store the variable character strings and all the information needed for
each level. Two- and three-dimensional cell arrays [9] are used in my program.

Figure 15 Component figure


Figure 16 Explain of representation

Consider Figure 16(a), a full adder's name: Full_1_1_1
First create a three-dimensional cell array named cell(n, m, p). Then we assign the
information to the cell array: p identifies the level within the total system, m
identifies the column in that level, and n identifies which full adder is used in
that column.

Consider Figure 16(b), an input data name: in_data_1_1(0)
First create a three-dimensional cell array named cell(n, m, p). Then we assign the
information to the cell array: p identifies the level within the total system, m
identifies the column in that level, and n identifies which bit in that column.
For all the three-dimensional cell arrays in my program, p and m are defined as
level and column, which makes the information easy to retrieve.
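
The same indexing idea can be mimicked outside MATLAB. The Python sketch below uses a dictionary keyed by (n, m, p) in place of a three-dimensional cell array; the helper name and the order of the indices inside the generated strings are assumptions made for this illustration, not taken from the thesis program.

```python
# Hypothetical stand-in for a 3-D cell array: a dict keyed by (n, m, p),
# where n = counter (or bit) index, m = column, p = level.
def store_name(cells, prefix, n, m, p):
    # Assumed name layout: <prefix>_<n>_<m>_<p>
    cells[(n, m, p)] = f"{prefix}_{n}_{m}_{p}"
    return cells[(n, m, p)]


full_adders = {}
store_name(full_adders, "Full", 1, 1, 1)
store_name(full_adders, "Full", 2, 1, 1)

# Look up every full adder stored for column 1 of level 1.
in_col1_level1 = sorted(name for (n, m, p), name in full_adders.items()
                        if m == 1 and p == 1)
print(in_col1_level1)
```

Keeping the column and level as explicit keys is what makes the later lookup by column and level straightforward.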

3.3 Design Flow Structure
The design flow of the program is shown in Figure 18. The first step is to compute
the total number of levels in the adder tree. The second step is to compute each
level's integer vector through the Wallace tree. The third step is to compute how
many columns each level has, based on the second step. The fourth step is to compute
the total number of counters and bypasses in each column. There are three cases: the
column contains only full adders; the column contains full adders and a half adder;
or the column contains full adders and a bypass. After the fourth step we know how
many counters and bypasses there are in each column at each level. The fifth step
uses the numbers of full adders, half adders and bypasses in each column at each
level to derive the input and output positions of the counters and bypasses. The
sixth step is to store these positions in cell arrays, which are used to describe
the hardware connections of the Wallace tree.

Here is an example that illustrates the procedure of the program.

Example: Assume the input integer vector is a =[6 4 5 6].



Figure 17 Example of first level's structure of Wallace tree
Figure 17 shows the first level of the tree structure. As mentioned above, we discuss
this example starting from the third step.

The third step confirms that this level has 4 columns.

The fourth step computes that column 1 has two full adders, column 2 has one full
adder and a bypass, column 3 has a half adder and a full adder, and column 4 has two
full adders. This step therefore fixes the positions of the full adders and half
adders. FA_4_1 means column 4 and level 1, and so on. FA_4_1 is then stored in the
full adder position cell array, and so on.

The fifth step uses the full adder, half adder and bypass positions to get the input
and output positions. For example, if we know FA_4_1, its inputs are In_data_4(0),
In_data_4(1) and In_data_4(2), and its outputs are Out_data_4(0), Out_data_3(0), and
so on.

The sixth step is:
Store FA_4_1 in the full adder's cell array; In_data_4(0), In_data_4(1) and
In_data_4(2) in the full adder's input cell array; Out_data_4(0) in the full adder's
sum output cell array; and Out_data_3(0) in the full adder's carry out cell array.

Store HA_3 in the half adder's cell array; In_data_3(3) and In_data_3(4) (which
belong to the half adder inputs in Figure 18) in the half adder's input cell array;
Out_data_3(3) in the half adder's sum output cell array; and Out_data_2(1) in the
half adder's carry out cell array.

Assume the bypass is a component like the full adder. Then store In_data_2(3) in the
bypass's input cell array and Out_data_2(3) in the bypass's output cell array.

So there are 4 cell arrays for the full adder, 4 cell arrays for the half adder and
3 cell arrays for the bypass, because the bypass does not have a carry out.

All the information of this level is now stored in different cell arrays. When we
want to use it, we can find it in the cell arrays according to the column and level.
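
The signal names quoted in this example follow a regular pattern, which the Python sketch below tries to reproduce for the first level of a = [6 4 5 6]. It is a hedged reconstruction, not the thesis program: the function names and the record layout are hypothetical, the ordering of output indices (carries from the less significant neighbour first, then the column's own sums) is inferred from the names above, and Out_data_0 is only a placeholder for the carries leaving column 1.

```python
def carries_of(n):
    """Carries produced by a column of n bits: one per FA, plus one for an HA."""
    return n // 3 + (1 if n % 3 == 2 else 0)


def connect_level(a):
    """a[0] is column 1; the last entry is the least significant column."""
    conns = []
    for idx, n in enumerate(a):
        col = idx + 1
        # Carries arriving from the less significant neighbour take the
        # first output indices of this column (assumed ordering).
        out_same = carries_of(a[idx + 1]) if idx + 1 < len(a) else 0
        out_up = 0   # next free index in the more significant column
        bit = 0      # next unused input bit of this column
        for _ in range(n // 3):
            conns.append({"kind": "FA", "col": col,
                          "inputs": [f"In_data_{col}({bit + k})" for k in range(3)],
                          "sum": f"Out_data_{col}({out_same})",
                          "carry": f"Out_data_{col - 1}({out_up})"})
            bit, out_same, out_up = bit + 3, out_same + 1, out_up + 1
        if n % 3 == 2:
            conns.append({"kind": "HA", "col": col,
                          "inputs": [f"In_data_{col}({bit})", f"In_data_{col}({bit + 1})"],
                          "sum": f"Out_data_{col}({out_same})",
                          "carry": f"Out_data_{col - 1}({out_up})"})
        elif n % 3 == 1:
            conns.append({"kind": "BP", "col": col,
                          "inputs": [f"In_data_{col}({bit})"],
                          "sum": f"Out_data_{col}({out_same})",
                          "carry": None})   # a bypass has no carry out
    return conns


level1 = connect_level([6, 4, 5, 6])
for c in level1:
    print(c["kind"], c["col"], c["inputs"], c["sum"], c["carry"])
```

Each record holds what the separate position, input, sum and carry out cell arrays hold for one component; a generator could turn every record into one component instantiation.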





Figure 18 shows the design flow of this program.

Figure 18 Design Flow of program
3.4 Description of cell arrays
In Figure 18, three blocks are defined: counter block, input block, and output block.

Counter block means all the cell arrays where the information related to position of
full adders, half adders and bypasses (assume it is a component) in this block are
stored. This block contains full adders position cell array, half adders position cell
array, and bypasss position cell array.

Input block means all the cell arrays storing the inputs information will be in this
block, so this block contains full adders input cell array, half adders input cell array
and bypass input cell array.

The output block comprises the cell arrays that store output information. It contains
the full adder sum output cell array, the full adder carry-out cell array, the half
adder sum output cell array, the half adder carry-out cell array and the bypass
output cell array.
3.4.1 Counter Block
The counter block is divided into three cell arrays: the (3:2) counter (FA) cell
array, the (2:2) counter (HA) cell array and the bypass cell array. The FA cell array
stores the position of each FA in each column of each level. The HA cell array stores
the position of the HA in each column of each level, and the bypass cell array does
the same for the bypass. Figure 19 shows the counter block.



Figure 19 Counter Block

First we discuss how the FAs are stored in the cell array.

The current level and column are already known from the level cell array and the
column cell array. We can therefore get the number of FAs in the current column and
create a three-dimensional cell array FA(max, column, level), where max is the
maximum number of FAs in a column. Through this cell array we can easily find the
position of the FA we want to use.

Example: If the input bit array is a = [6 4 5 6], the FA cell array shows, as in
Figure 20, that there are three levels and that column one of the first level needs
two FAs.


Figure 20 FA position in adder tree


Column two needs one FA, column three needs one FA and column four needs two FAs, and
so on for the other two levels. When we want to use the FA information, we simply
look it up in the cell array.
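
The number of levels in this example can be reproduced with a short sketch. It is
a Python transcription of the column-height rule used by the MATLAB program in
Appendix 1 (every three bits of a column feed one FA, two leftover bits feed an HA,
one leftover bit is bypassed), given here only as an illustration; the function
names next_heights and all_levels are assumptions.

```python
def next_heights(heights):
    """Column heights after one reduction level (heights[0] is the leftmost column)."""
    result = [0] * len(heights)
    carry_in = 0
    for k in range(len(heights) - 1, -1, -1):      # rightmost column first
        full, rest = divmod(heights[k], 3)         # number of FAs, leftover bits
        sum_bits = full + (1 if rest else 0)       # FA sums plus HA sum or bypassed bit
        result[k] = sum_bits + carry_in
        carry_in = full + (1 if rest == 2 else 0)  # carries passed to the next column
    # A carry out of the leftmost column creates a new column.
    return ([carry_in] if carry_in else []) + result


def all_levels(heights):
    """Reduce until no column holds more than two bits; return the heights per level."""
    levels = []
    while max(heights) > 2:
        heights = next_heights(heights)
        levels.append(heights)
    return levels


for number, heights in enumerate(all_levels([6, 4, 5, 6]), start=1):
    print("level", number, heights)
```

For a = [6 4 5 6] the loop runs three times, which agrees with the three levels
stated above.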

The next steps are the HA and Bypass cell arrays.

The principle is the same as for the FA cell array; the only difference is the size
of the first dimension. Each column has at most one HA or one Bypass, so the cell
arrays are HA(1, column, level) and Bypass(1, column, level). The Bypass (BP) is
treated as a component because this makes it easy to follow its signal flow.



Example: for the same input bit array a = [6 4 5 6], the HA and Bypass cell arrays
give the positions shown in Figure 21.


Figure 21 HA and BP position in adder tree

From Figure 21 we know that in level one, column two has a Bypass and column three
has an HA. The same applies to the last two levels. Because the last two dimensions
(column, level) of these cell arrays are the same as in the FA cell array, it is easy
to get all the component position information when we generate the VHDL
code.
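
The contents of the counter block for one level can be sketched as below. This is
an illustrative Python version of the rule in Appendix 1, not the original MATLAB
code, and the function name counters_per_column is an assumption.

```python
def counters_per_column(heights):
    """Return the number of FAs, HAs and BPs for each column of one level."""
    fa = [h // 3 for h in heights]                  # one FA per three bits
    ha = [1 if h % 3 == 2 else 0 for h in heights]  # two leftover bits: one HA
    bp = [1 if h % 3 == 1 else 0 for h in heights]  # one leftover bit: bypass
    return fa, ha, bp


fa, ha, bp = counters_per_column([6, 4, 5, 6])
print("FA per column:", fa)
print("HA per column:", ha)
print("BP per column:", bp)
```

For the first level of a = [6 4 5 6] this gives two, one, one and two FAs, a
Bypass in column two and an HA in column three, as in Figures 20 and 21.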
3.4.2 Input block
Once the positions of the counters are defined, input signals must be assigned to
each FA, HA and BP.
First consider the inputs of the FAs. Create a cell array FAinput(max, column,
level). Column and level are the same as mentioned above. Here max is the maximum
number of FAs in a column multiplied by three, because each FA has three inputs.
Figure 22 shows the input block.


Figure 22 Input Block

Example: for the same input bit array a = [6 4 5 6], the input state is shown in Figure 23.


Figure 23 FA inputs state


We already know that column one in the first level has two FAs, as mentioned above,
so column one has six inputs. The same applies to the other columns and levels.
Next we consider the inputs of the HA and BP. The principle is the same as for the FA
inputs; the difference is that an HA has two inputs and a BP has only one. So the
cell arrays can be created as HAinput(2, column, level) and BPinput(1, column, level).

Example: for the same input bit array a = [6 4 5 6], the input state obtained from
the HAinput and BPinput cell arrays is shown in Figure 24.



Figure 24 HA and BP inputs state
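
The input names held in the input block can be sketched as follows. The signal
naming in_data_<level>_<column>(<bit>) is taken from the MATLAB program in
Appendix 1; the Python function input_names itself is a hypothetical illustration.

```python
def input_names(level, column, height):
    """Input signal names of the FAs, the HA and the BP of one column."""
    def bit(q):
        return f"in_data_{level}_{column}({q})"

    n_fa, rest = divmod(height, 3)
    # The m-th FA (counting from 0) takes bits 3m, 3m+1 and 3m+2 of the column.
    fa_inputs = [[bit(3 * m), bit(3 * m + 1), bit(3 * m + 2)] for m in range(n_fa)]
    # Leftover bits follow the FA inputs: two bits for an HA, one bit for a BP.
    ha_inputs = [bit(3 * n_fa), bit(3 * n_fa + 1)] if rest == 2 else []
    bp_input = [bit(3 * n_fa)] if rest == 1 else []
    return fa_inputs, ha_inputs, bp_input


print(input_names(1, 3, 5))   # column three of level one: one FA and one HA
print(input_names(1, 2, 4))   # column two of level one: one FA and one BP
```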


3.4.3 Output block
The program has already stored the counter and input information; the last step is
to store the output state. The output block contains the FA sum cell array, the FA
carry-out cell array, the HA sum cell array, the HA carry-out cell array and the BP
output cell array.


Figure 25 Output Block

Five cell arrays are used in the output block, as shown in Figure 25. We first
introduce the FA sum cell array, the HA sum cell array and the BP output cell array.
These three are discussed together because their outputs all stay in the original
column, unlike a carry out, which moves to the next column. The cell arrays have the
form FAsum(max, column, level), HAsum(1, column, level) and BPout(1, column, level).
Because the number of FAs varies from column to column within a level, the maximum
number of FAs in that level is used as the first dimension of the FA cell array. A
column has at most one HA output and one BP output, so the first dimension of those
cell arrays is one.

Example: for the same input bit array a = [6 4 5 6], the FA sums, HA sums and BP
outputs are shown in Figures 26, 27 and 28.


Figure 26 FA's sum state

The cell array is named fos, which means full adder output sum. We already know that
column one has two FAs in level one, so there are two corresponding outputs. The same
holds for the other columns and levels.


Figure 27 HA's sum state


The cell array hos, shown in Figure 27, holds the HA sum outputs. From the counter
block we know that the first level has only one HA, in column three, and this output
corresponds to that HA position. The same holds for the other columns and levels.


Figure 28 BP's output state

The cell array bpos, shown in Figure 28, holds the BP outputs. From the counter
block we know that the first level has only one BP, in column two, so the output
corresponds to that BP position. The same holds for the other columns and levels.

Carry-in bits affect the positions of the sum outputs in the next column, so I create
a cell array to store the number of carries passed to the next column. When storing
the current sum output, this cell array is used to get the correct position for each
bit.

Next we discuss the carry-out cell arrays of the FA and HA. The method of storing the
FA and HA carry-out bits is the same as for their sum values; the only difference is
the column representation.


Example: for the same input bit array a = [6 4 5 6], the FA carry-out state is shown in Figure 29.


Figure 29 FA's carry out state

From Figure 29, for example, the carry out of column two in the first level is
out_data_1_1(0). Although this bit belongs to column one because it is a carry out,
it is produced by the FA in column two, so it is stored under column two. This makes
it easy to get all the information about a counter by column and level. The same
holds for the HA carry-out cell array shown in Figure 30.


Figure 30 HA's carry out state
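
The output naming described in this section can be sketched as below. The sketch
follows the MATLAB program in Appendix 1 (carries from the column to the right
occupy the lowest bit positions of a column, followed by the FA sums and then the
HA sum or BP output), but it is a Python illustration only; the function name
output_names and the dictionary keys are assumptions.

```python
def output_names(level, heights):
    """Output signal names per column (1 = leftmost) for one level."""
    def bit(col, q):
        return f"out_data_{level}_{col}({q})"

    names = {}
    carries_in = 0                        # carries arriving from the column to the right
    for col in range(len(heights), 0, -1):
        n_fa, rest = divmod(heights[col - 1], 3)
        entry = {
            # Sums stay in their own column, placed after the incoming carries.
            "fa_sum": [bit(col, carries_in + q) for q in range(n_fa)],
            # Carries go to the next column on the left but are stored under this column.
            "fa_carry": [bit(col - 1, q) for q in range(n_fa)],
            "ha_sum": None, "ha_carry": None, "bp_out": None,
        }
        if rest == 2:
            entry["ha_sum"] = bit(col, carries_in + n_fa)
            entry["ha_carry"] = bit(col - 1, n_fa)
        elif rest == 1:
            entry["bp_out"] = bit(col, carries_in + n_fa)
        names[col] = entry
        carries_in = n_fa + (1 if rest == 2 else 0)
    return names


level_one = output_names(1, [6, 4, 5, 6])
print(level_one[2]["fa_carry"])   # carry produced by the FA in column two
```

The printed name is the carry of column two in the first level, which lands in
column one as described above.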

Together, these cell arrays describe all the hardware connections of the adder tree,
so that the fast addition problem can be solved with a Wallace tree.
Chapter 4
VHDL generator and Top level simulation


4.1 MATLAB program to generate VHDL code

An RTL description of the adder tree is suitable for describing the component
structure. A MATLAB program is used to generate the RTL VHDL code.
The VHDL code is generated mainly with the file input and output functions of the
MATLAB language. Figure 31 shows how to create a VHDL file and the generation
procedure.



Figure 31 MATLAB code to VHDL code

The first step in Figure 31 is to create a VHDL file with fopen; the 'w' argument
opens the file for writing, and the returned identifier fid is used to refer to the
file in later calls. Here eachlevel.vhdl is used as the filename. The next step is
writing to the file: the MATLAB call fprintf(fid, 'format', 'cmd') writes the string
using the format specified by format. The format is a C language conversion
specification, which involves the % character and one of the conversion characters
d, i, o, u, x, X, f, e, E, g, G, c, and s. In this thesis, %s is used because all
information is stored in the cell arrays as character strings.
The second statement in Figure 31 declares the library, and the third statement makes
the std_logic_1164 package of that library visible. The generated code must conform
to VHDL syntax. The information stored in the cell arrays is used to describe the
adder tree structure.
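
The same write-a-line-at-a-time approach can be sketched in Python, whose %
formatting uses the same C-style conversion characters. This is an illustration
only: the helper names write_header and write_entity_open are hypothetical, and an
in-memory buffer stands in for the file opened with fopen in the MATLAB program.

```python
import io


def write_header(out):
    """Write the library and use clauses that start every generated file."""
    out.write("library ieee;\n")
    out.write("use ieee.std_logic_1164.all;\n\n")


def write_entity_open(out, level):
    """Open the entity declaration of one level; %d inserts the level number."""
    out.write("entity tree_level_%d is\n" % level)
    out.write("port(\n")


buf = io.StringIO()          # stands in for open("eachlevel.vhdl", "w")
write_header(buf)
write_entity_open(buf, 1)
vhdl_text = buf.getvalue()
print(vhdl_text)
```

Replacing the buffer with a real file object writes the same text to disk.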

4.2 VHDL code Description
Because structural VHDL is used, the Wallace adder tree representation is divided
into individual levels and a top level. Each level records its current state, such as
the number of counters and their positions, as well as the input and output state of
each counter. The top level integrates all these levels to produce the final result
of the fast adder tree.

4.2.1 Related MODELSIM and VHDL language
MODELSIM is a quick and handy VHDL/Verilog simulator. The VHDL code must
be compiled into a VHDL library before it is simulated, because the simulator itself
cannot read VHDL source code. The procedure is shown in Figure 32:


Figure 32 MODELSIM operation

4.2.2 VHDL dataflow description
In the data flow approach, circuits are described by indicating how the inputs and
outputs of built in primitive components (for example AND gates) are connected
together. In other words we describe how signals (data) flow through the circuit.

4.2.3 VHDL structural RTL description
A structural description [10] of a piece of hardware is a description of what its
subcomponents are and how the subcomponents are connected to each other.
Structural description is more concrete than behavioral description; that is the
correspondence between a given portion of a structural description and a portion of
the hardware is easier to see than for a behavioral description.


4.2.3.1 Building Blocks
If we want to make the design more understandable and maintainable, a system
design should be decomposed into several blocks. These blocks are connected
together to form a complete design. Every part of a VHDL design is considered as a
block. A VHDL design may be completely described in a single block, or it may be
decomposed into several blocks. Each block in VHDL is analogous to an
off-the-shelf part and is called an entity. The entity describes the interface to that
block and a separate part associated with the entity describes how that block operates.
The interface description is like a pin description in a data book, specifying the
inputs and outputs of the block. The description of the operation of the block is like a
schematic.




4.2.3.2 Connect Block
Once we have defined the basic building blocks of our design using entities and their
associated architectures, we can combine them to form the system. In my work, the
top level is formed by connecting the levels. Each level is a basic block, and the
connect block integrates those blocks to form the top level structure.

4.2.4 VHDL code of each level
The number of levels depends on the input bit array. Once the number of levels is
known, a structural VHDL description is used to describe the structure of the adder
tree. The VHDL code is generated automatically by the MATLAB program. Each level is
described in structural VHDL, in which the ports and their connections are the
central matter. The port information of each level is obtained from the counter,
input and output cell array blocks.


Example: a VHDL file generated by the MATLAB code for the bit array a = [2 3].
Figure 33 shows the detailed information.

Figure 33 VHDL code description for each level
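
How the port list of one level follows from the column heights can be sketched as
below. The port naming and numbering follow the MATLAB program in Appendix 1
(outputs are numbered from 0 when the level creates a new leftmost column), but
the Python function port_lines is a hypothetical illustration, and the output
heights [1 2 1] for a = [2 3] are derived here from the reduction rule rather than
read from Figure 33.

```python
def port_lines(level, in_heights, out_heights):
    """Port declarations of one level, leftmost column last, as VHDL text lines."""
    lines = []
    n_in = len(in_heights)
    for col in range(n_in, 0, -1):
        width = in_heights[col - 1]
        lines.append(
            f"in_data_{level}_{col} : in std_logic_vector({width - 1} downto 0);")
    # If the level produced an extra leftmost column, outputs are numbered from 0.
    first = 1 if len(out_heights) == n_in else 0
    last = first + len(out_heights) - 1
    for col in range(last, first - 1, -1):
        width = out_heights[col - first]
        end = "" if col == first else ";"     # no semicolon after the last port
        lines.append(
            f"out_data_{level}_{col} : out std_logic_vector({width - 1} downto 0){end}")
    return lines


# a = [2 3]: column heights after one level are [1 2 1] (assumed from the rule).
for line in port_lines(1, [2, 3], [1, 2, 1]):
    print(line)
```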
4.2.5 VHDL code for top level

All the levels are integrated using the connect block of structural VHDL. The inputs
of the first level are the inputs of the top level, and the outputs of the last level
are the outputs of the top level; the connections between the levels in between are
internal signals. The outputs of one level are used as the inputs of the next level.

Example: Bit array a =[6 4 5 6]. The VHDL code is shown in Figure 34.


Figure 34 VHDL code description for top level
4.3 Simulation result

We now have four types of VHDL files: the structural VHDL of each level, the
structural VHDL of the top level, and the dataflow VHDL files of the (3:2) counter
and the (2:2) counter. MODELSIM is used to compile and simulate the files, which
gives the final result of the fast adder tree.

In MODELSIM, the hardware connections of the adder tree can be seen more clearly, as
shown in Figure 35.


Figure 35 Adder tree structure described by MODELSIM

Example: Assume the input bit array is a = [12 11 14 9 5]. The computation result is
shown in Figure 36.


Figure 36 Computation result
From Figure 35, this Wallace adder tree has five levels. Now consider the simulation
result: the weighted sum of the bits at the output must equal the weighted sum of the
bits at the input.





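Input bit array a = [12 11 14 9 5]

Weight   Bit vector       Width     Number of ones
2^4      111111110011     12 bits   10
2^3      11110011111      11 bits    9
2^2      11010110101111   14 bits   10
2^1      111110000         9 bits    5
2^0      11010             5 bits    3

Sum = 10*2^4 + 9*2^3 + 10*2^2 + 5*2^1 + 3*2^0 = 285

Output bit array

Weight   Bit vector   Width    Number of ones
2^7      11           2 bits   2
2^6      00           2 bits   0
2^5      00           2 bits   0
2^4      10           2 bits   1
2^3      1            1 bit    1
2^2      1            1 bit    1
2^1      0            1 bit    0
2^0      1            1 bit    1

Sum = 2*2^7 + 0*2^6 + 0*2^5 + 1*2^4 + 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0 = 285

Table 2 Computation procedure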

Table 2 shows that the input and output sums are the same (285), and that no column
of the output bit array holds more than two bits, so carry propagation is required
only at the output.
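
The equality of the two sums can be checked with a few lines of Python. The bit
vectors are those listed in Table 2; the function name weighted_sum is an
assumption made for this sketch.

```python
def weighted_sum(columns):
    """Sum of all bits, where columns[0] is the most significant column.

    Each column is a string of bits that all share the same weight 2^k.
    """
    total = 0
    for weight, bits in enumerate(reversed(columns)):
        total += bits.count("1") << weight   # number of ones times 2^weight
    return total


inputs = ["111111110011", "11110011111", "11010110101111", "111110000", "11010"]
outputs = ["11", "00", "00", "10", "1", "1", "0", "1"]

print(weighted_sum(inputs), weighted_sum(outputs))
```

Both calls return the same value, so the tree preserves the sum while reducing
every column to at most two bits.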
Chapter 5
Conclusion and Future Work

5.1 Conclusion

In this work the implementation of fast adder trees in VHDL was considered. For
inputs of large word length, a Wallace tree was used to solve the fast addition
problem, and VHDL files describing the hardware connections of the Wallace adder
tree were generated.

5.2 Future work

Three programs were written in the MATLAB language: one stores the current state of
each level, and the other two use that program to generate the VHDL files for each
level and for the top level automatically.

Future work is to add pipelining to each level and to consider the delay of each
level. Furthermore, the Wallace adder tree may be replaced by another structure
because of its irregular routing and large wiring area.


References
[1] C. S. Wallace, "A suggestion for a fast multiplier," IEEE Trans. Electron.
Comput., pp. 14-17, Feb. 1964.

[2] Z.-J. Mou, "'Overturned-Stairs' Adder Trees and Multiplier Design," F. IEEE
Computer Society, http://csdl.computer.org/comp/trans/tc/1992/08/t0940abs.htm

[3] A. Weinberger and J. L. Smith, "A logic for high-speed addition," National
Bureau of Standards Circular 591, pp. 3-12, 1958.

[4] T. Lynch and E. E. Swartzlander, Jr., "The redundant cell adder," in Proc. 10th
Symp. Comput. Arithmetic, 1991, pp. 165-170.

[5] V. Kantabutra, "Designing one-level carry-skip adders," IEEE Trans. Comput.,
vol. 42, no. 6, pp. 759-764, June 1993.

[6] A. Weinberger and J. L. Smith, "A logic for high-speed addition," National
Bureau of Standards Circular 591, pp. 3-12, 1958.

[7] Y. Harata et al., "A high-speed multiplier using a redundant binary adder tree,"
IEEE J. Solid-State Circuits, vol. SC-22, pp. 28-34, Feb. 1987.

[8] http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_prog/

[9] E. Herniter, Programming in MATLAB, Northern Arizona University, 2001.

[10] R. Lipsatt, VHDL: Hardware Description and Design, Intermetrics Inc., 1993.

[11] Peter Kornerup, Southern Danish University, IEEE Computer Society,
http://csdl.computer.org/comp/proceedings/asap/2002/1712/00/17120218abs.htm

[12] F. Ancona, S. Rovetta, R. Zumino, "High performance in tree-based parallel
architectures," Genoa Univ., IEEE Computer Society, HUNGARY, p. 474.

[13] A. Avizienis, "Signed-digit number representations for fast parallel arithmetic,"
IRE Transactions on Electronic Computers, vol. EC-10, pp. 389-400, Sep. 1961.

[14] I. Koren, Computer Arithmetic Algorithms, Englewood Cliffs, N.J.: Prentice
Hall, 1993.
Appendices
Appendix 1 MATLAB program for each level
clear
a=[6 4 5 6]; % input bit array (example value from Chapter 3; replace as needed)
% Full adder outputs are split into two cell arrays: one stores the sum
% signal names (fos), the other stores the carry-out signal names (foc).
% Half adder outputs are split the same way: sum names (hos) and
% carry-out names (hoc).
% Another cell array stores the BP output signal names (bpos).
% carrynum is a cell array holding the number of carries passed to the
% next column in each level.
% First pass: determine the total number of levels.
final=a;
model=a;
result=a;
page=0;
while max(result)>2
a=result;
result =zeros(1, length(a));
carry_out =0;
for k =length(a):-1:1 % k columnes in vector
b=a(k);% get the number of each column
rem (b,3);% get rem of b/3
c=fix(b./3);% #of full adder
d=b-3.*fix(b./3);% remainder
carry_in=carry_out;

if d==0
carry_out_HA=0;
sum=0;
elseif d==1
carry_out_HA=0;
sum=1;
else
carry_out_HA=1;
sum=1;
end
carry_out =carry_out_HA +c;

if k==length(a)
totalsum=sum+c;
else
totalsum=c+carry_in+sum;
end

result(k)=totalsum;
end


if carry_out~=0
result =[ carry_out, result ];
end

page=page+1;
disp(result)
a1=result;

end

level=page;
rowcon=cell(1,level+1);
a=model; % define a vector
result1 =zeros(page, length( result))
result =a;
page =1;
while max(result)>2
a=result;
result =zeros(1, length(a));
carry_out =0;
for k =length(a):-1:1 % k columnes in vector
b=a(k);% get the number of each column
rem (b,3);
c=fix(b./3);
d=b-3.*fix(b./3);
carry_in=carry_out;
if d==0
carry_out_HA=0;
sum=0;%
elseif d==1
carry_out_HA=0;
sum=1;
else
carry_out_HA=1;
sum=1;
end
carry_out =carry_out_HA +c;

if k==length(a)
totalsum=sum+c;
else
totalsum=c+carry_in+sum;
end

result(k)=totalsum;
% result=[result, totalsum]
end
if carry_out~=0
result =[ carry_out, result ];
end
for intI =length(result):-1:1
result1 ( page, intI ) =result(intI)
end
page =page +1;
end
Fresult=result1;

page1=level;
page1=page1+1;
lcolumn=cell(1,page1);
for i=1:1:page1
if i==1
a=model;
vcolumn=length(a);
else
a=Fresult(i-1,:);
matrix=a;

for k=length(a):-1:1
b=a(k);
if b==0
a(:,k)=[];
else
matrix=a;
end
end
vcolumn=length(a);

end
lcolumn(1,i)={vcolumn};
end
page=level;
page=page+1;
tcolumns=0;

carfull=cell(1,tcolumns, page);
carhalf=cell(1,tcolumns, page);
carbp =cell(1,tcolumns, page);
for i=1:1:page

if i==1

a=model;
tcolumns=lcolumn{1,1};

% carfull=cell(1,tcolumns, 1);
tnFA=0;
tnHA=0;
tnBP=0;

for k1=tcolumns:-1:1
b=a(k1);% get the number of each column
rem (b,3);% get rem of b/3
c=fix(b./3);% #of full adder
d=b-3.*fix(b./3);% remainder
if d==0
nFA=c;
nHA=0;
nBP=0;
temptnHA=nHA;
temptnFA=nFA;
temptnBP=nBP;

elseif d==1
nFA=c;
nHA=0;
nBP=1;

temptnHA=nHA;
temptnFA=nFA;
temptnBP=nBP;

else
nFA=c;
nHA=1;
nBP=0;
temptnHA=nHA;
temptnFA=nFA;
temptnBP=nBP;
end
carfull(1,k1,1)={temptnFA};
carhalf(1,k1,1)={temptnHA};
carbp(1,k1,1)={temptnBP};
end
else
a=Fresult(i-1,:);
tcolumns=lcolumn{1,i};
tnFA=0;
tnHA=0;
tnBP=0;
for k1=1:1:tcolumns
b=a(k1);% get the number of each column
rem (b,3);% get rem of b/3
c=fix(b./3);% #of full adder
d=b-3.*fix(b./3);% remainder
if d==0
nFA=c;
nHA=0;
nBP=0;
temptnHA=nHA;
temptnFA=nFA;
temptnBP=nBP;

elseif d==1
nFA=c;
nHA=0;
nBP=1;

temptnHA=nHA;
temptnFA=nFA;
temptnBP=nBP;

else
nFA=c;
nHA=1;
nBP=0;
temptnHA=nHA;
temptnFA=nFA;
temptnBP=nBP;
end
carfull(1,k1,i)={temptnFA};
carhalf(1,k1,i)={temptnHA};
carbp(1,k1,i)={temptnBP};

end
end
end
% define maxf for show full_adder 's name
page=level;
b=[];
maxf=cell(1,page);
for i =1:1:page
column=lcolumn{1,i};
for k=1:1:column

b=carfull{1,k,i};
comp(1,k)=b;

maxn=max(comp);
maxf(1,i)={maxn};
end
end

page=level;
b=[];
maxg=cell(1,page);
for i =1:1:page
column=lcolumn{1,i};
for k=1:1:column
b=carfull{1,k,i};
comp(1,k)=b;
maxn=max(comp);
maxg(1,i)={3*maxn};
end
end
% show all full adders name
page=level;
max=0;
tcolumnf=0;
c=1;
funame=cell(max,tcolumnf,page);

for i =1:1:page
tcolumnf=lcolumn{1,i};
max=maxf{1,i};
for k1=1:1:tcolumnf
fn=carfull{1,k1,i};
c=1;
for q=1:1:fn
for p=c:1:fn
funame(p,k1,i)={ strcat( 'full_adder_', num2str(i), '_', ...
num2str(k1),'_',num2str(q) ) };
end
c=c+1;
end
end
end
% show all half adder's name
page=level;
tcolumnh=0;
haname=cell(1,tcolumnh,page);

for i=1:1:page

tcolumnh=lcolumn{1,i};

for k=1:1:tcolumnh
hn=carhalf{1,k,i};
if hn==1;

haname(1,k,i)={ strcat( 'half_adder_', num2str(i), '_', num2str(k) ) };
else
haname(1,k,i)={[] };
end
end
end

% show BP 's name, although BP have no component, benefit for seek it's
% input and output
page=level;
tcolumnbp=0;
bpname=cell(1,tcolumnbp,page);

for i=1:1:page

tcolumnbp=lcolumn{1,i};

for k=1:1:tcolumnbp
bpn=carbp{1,k,i};
if bpn==1;

bpname(1,k,i)={ strcat( 'bp_', num2str(i), '_', num2str(k) ) };
else
bpname(1,k,i)={[] };
end
end
end
%store the full adder's input names of each level
page=level;
tcolumnfd=0;
maxz=0;
c=1;
fuinput=cell(maxz, tcolumnfd, page);
for i=1:1:page
tcolumnfd=lcolumn{1,i};
maxz=maxg{1,i};
for k1=1:1:tcolumnfd
fn=carfull{1,k1,i};

c=1;

for q=0:1:3*fn-1
for p=c:1:3*fn
fuinput(p,k1,i)={ strcat( 'in_data_', num2str(i), '_', ...
num2str(k1),'(',num2str(q), ')') };
end
c=c+1;
end
end
end

%store the half adder's input names of each level

page=level;

tcolumnhd=0;

hainput=cell(2, tcolumnhd, page);

for i=1:1:page

tcolumnhd=lcolumn{1,i};

for k=1:1:tcolumnhd

fn=carfull{1,k,i};
hn=carhalf{1,k,i};
if fn==0
if hn~=0


hainput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', ...
num2str(k),'(',num2str(0), ')') };
hainput(2,k,i)={ strcat( 'in_data_', num2str(i), '_', ...
num2str(k),'(',num2str(1), ')') };
else
hainput(1,k,i)={[]};
hainput(2,k,i)={[]};

end

else
if hn~=0
b=3*fn;
c=3*fn+1;

hainput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', ...
num2str(k),'(',num2str(b), ')') };
hainput(2,k,i)={ strcat( 'in_data_', num2str(i), '_', ...
num2str(k),'(',num2str(c), ')') };
else
hainput(1,k,i)={[]};
hainput(2,k,i)={[]};

end

end
end
end
% show BP's input's names in each level although BP is not a component, we
% look it as a component
page=level;
tcolumnbd=0;
bpinput=cell(1, tcolumnbd, page)

for i=1:1:page

tcolumnbd=lcolumn{1,i};

for k=1:1:tcolumnbd

fn=carfull{1,k,i};
bpn=carbp{1,k,i};
if fn==0
if bpn~=0


bpinput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', ...
num2str(k),'(',num2str(0), ')') };

else
bpinput(1,k,i)={[]};


end

else
if bpn~=0
b=3*fn;


bpinput(1,k,i)={ strcat( 'in_data_', num2str(i), '_', ...
num2str(k),'(',num2str(b), ')') };

else
bpinput(1,k,i)={[]};


end

end
end
end
page=level;
max=0;

foscolumn=0;
hoscolumn=0;
bposcolumn=0;

foccolumn=0;
hoccolumn=0;

cncolumn=0;

fos=cell(max, foscolumn ,page );
hos=cell(1,hoscolumn,page);
bpos=cell(1,bposcolumn, page);

foc=cell(max,foccolumn,page);
hoc=cell(1,hoccolumn,page);

carrynum=cell(1,cncolumn,page);

for i=1:1:page

foscolumn=lcolumn{1,i};
hoscolumn=lcolumn{1,i};
bposcolumn=lcolumn{1,i};
foccolumn=lcolumn{1,i};
hoccolumn=lcolumn{1,i};
cncolumn=lcolumn{1,i};
scolumn=foscolumn;

max=maxf{1,i};


for k=scolumn:-1:1
m=scolumn;
if k==m
fn=carfull{1,k,i};
hn=carhalf{1,k,i};
bpn=carbp{1,k,i};

if (fn==0) & (hn==0) & (bpn~=0)

bpos(1,k,i)= { strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(0), ')') };

carrynum(1,k,i)={0};


elseif (fn==0) & (hn~=0) & (bpn==0)

hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(0), ')') };


hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(0), ')') };

carrynum(1,k,i)={1};

elseif (fn~=0) & (hn==0) & (bpn~=0)

c=1; % show fos(1,k,i) and foc(1,k,i)

for q=0:1:fn-1
for p=c:1:fn
fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(q), ')') };
foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(q), ')') };
end
c=c+1;
end

bpos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(fn), ')') };
carrynum(1,k,i)={fn};

elseif (fn~=0) & (hn==0) & (bpn==0)

c=1; % show fos(1,k,i) and foc(1,k,i)

for q=0:1:fn-1
for p=c:1:fn
fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(q), ')') };
foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(q), ')') };
end
c=c+1;
end

carrynum(1,k,i)={fn};

else % remaining case: (fn~=0) & (hn~=0) & (bpn==0)

c=1; % show fos(1,k,i) and foc(1,k,i)

for q=0:1:fn-1
for p=c:1:fn
fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(q), ')') };
foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(q), ')') };
end
c=c+1;
end


hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(fn), ')') };
hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(fn), ')') };
carrynum(1,k,i)={fn+1};
end



else
fn=carfull{1,k,i};
hn=carhalf{1,k,i};
bpn=carbp{1,k,i};

if (fn==0) & (hn~=0)
hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(0), ')') };
carrynum(1,k,i)={1};

elseif (fn~=0) & (hn==0)

c=1; % show fos(1,k,i) and foc(1,k,i)

for q=0:1:fn-1
for p=c:1:fn

foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(q), ')') };
end
c=c+1;
end
carrynum(1,k,i)={fn};



elseif (fn==0)& (hn==0)

carrynum(1,k,i)={0};



else % remaining case: (fn~=0) & (hn~=0)

c=1; % show fos(1,k,i) and foc(1,k,i)

for q=0:1:fn-1
for p=c:1:fn

foc(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(q), ')') };
end
c=c+1;
end

hoc(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k-1),'(',num2str(fn), ')') };
carrynum(1,k,i)={fn+1};
end


% second show fos and hos and bpos k=scolumn-1:-1:1
fn=carfull{1,k,i};
hn=carhalf{1,k,i};

bpn=carbp{1,k,i};

cn=carrynum{1,k+1,i};
if (fn==0) & (hn==0) & (bpn~=0)




bpos(1,k,i)= { strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(cn), ')') };


elseif (fn==0) & (hn~=0) & (bpn==0)



hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(cn), ')') };



elseif (fn~=0) & (hn==0) & (bpn~=0)





c=1;

for q=0:1:fn-1
for p=c:1:fn
fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(q+cn), ')') };

end
c=c+1;
end

bpos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(fn+cn), ')') };



elseif (fn~=0) & (hn==0) & (bpn==0)

c=1;

for q=0:1:fn-1
for p=c:1:fn
fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(q+cn), ')') };

end
c=c+1;
end


else % remaining case: (fn~=0) & (hn~=0) & (bpn==0)

c=1;

for q=0:1:fn-1
for p=c:1:fn
fos(p,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(q+cn), ')') };

end
c=c+1;
end


hos(1,k,i)={ strcat( 'out_data_', num2str(i), '_', ...
num2str(k),'(',num2str(fn+cn), ')') };


end



end
end
end

disp(funame)
disp(haname)
disp(bpname)
disp(fuinput)
disp(hainput)
disp(bpinput)
disp(fos);
disp(hos);
disp(bpos);
disp(foc);
disp(hoc);



% In total 11 cell arrays store the component, input and output
% information.
% Full adder names are stored in the cell array funame.
% A name like full_adder_4_3_2 means level 4, column 3, second full adder
% in this column.
% Half adder names are stored in the cell array haname.
% A name like half_adder_4_3 means level 4, column 3.
% A bypass is not a component, but for indexing it is treated as one and
% stored in the cell array bpname; bp_1_3 means level 1, column 3.
% Full adder input data: in_data_1_1(1) means level 1, column 1, second
% input; stored in fuinput.
% Half adder input data: same naming; stored in hainput.
% Bypass input data: same naming; stored in bpinput.

% Output data is divided into five cell arrays:
% full adder sum names in fos, full adder carry-out names in foc,
% half adder sum names in hos, half adder carry-out names in hoc,
% bypass output names in bpos.



page=level;
a=final;
fid =fopen('eachlevel.vhdl', 'w');
for i=1:1:page
if i==1

fprintf(fid,'library ieee; \n');
fprintf(fid,' \n');
fprintf(fid,'use ieee.std_logic_1164.all; \n');
fprintf(fid,' \n');

fprintf(fid, 'entity tree_level_%d is \n',i);
fprintf(fid, 'port( \n');


for k=length(a):-1:1
% level 1 input
fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k, a(k)-1);

end


result =zeros(1, length(a));
carry_out =0;

for k =length(a):-1:1 % k columnes in vector
b=a(k);% get the number of each column
rem (b,3);% get rem of b/3
c=fix(b./3);% #of full adder
d=b-3.*fix(b./3);% remainder
carry_in=carry_out;
if d==0
carry_out_HA=0;
sum=0;
elseif d==1
carry_out_HA=0;
sum=1;
else
carry_out_HA=1;
sum=1;
end
carry_out =carry_out_HA +c;

if k==length(a)
totalsum=sum+c;
else
totalsum=c+carry_in+sum;
end

result(k)=totalsum;
% result=[result, totalsum]
end

if carry_out~=0
result =[ carry_out, result ];
end
c=lcolumn{1,1};
d=lcolumn{1,2};
% level 1 output
if d==c


for m=length(result):-1:1
if m==1
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
result(m)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
result(m)-1);
end
end

else

for m=length(result)-1:-1:0
% level output
if m==0
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
result(m+1)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
result(m+1)-1);
end
end
end
fprintf(fid,'); \n');
fprintf(fid,'end tree_level_%d; \n',i);
fprintf(fid,' \n');
fprintf(fid,'architecture tree_level_%d of tree_level_%d is \n',i,i);
fprintf(fid,' \n');
fprintf(fid,'component full_adder \n');
fprintf(fid,'port( \n');
fprintf(fid,'ain, bin, cin : in std_logic; \n');
fprintf(fid,'sout,cout : out std_logic ); \n');
fprintf(fid,'end component; \n');
fprintf(fid,' \n');
fprintf(fid,'component half_adder \n');
fprintf(fid,'port( \n');
fprintf(fid,'ain, bin : in std_logic; \n');
fprintf(fid,'sout,cout: out std_logic ); \n');
fprintf(fid,'end component; \n');
fprintf(fid,' \n');
fprintf(fid,'begin \n');



% define the relation between adder and it's in out put
column=lcolumn{1,i}; % how many columns in level 1
for k=column:-1:1
fn=carfull{1,k,i};
hn=carhalf{1,k,i};
bpn=carbp{1,k,i};

if fn~=0
for m=1:1:fn

fprintf(fid,'%s : full_adder port map ( %s,%s,%s,%s,%s); \n', funame{m,k,i}, ...
fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i},fos{m,k,i},foc{m,k,i} );

end

else
fprintf(fid,' \n');
end

if hn~=0;
fprintf(fid,'%s : half_adder port map ( %s,%s,%s,%s); \n', haname{1,k,i}, ...
hainput{1,k,i},hainput{2,k,i},hos{1,k,i},hoc{1,k,i} );
else
fprintf(fid,' \n');
end

if bpn~=0;
fprintf(fid,'%s <=%s ; \n', bpos{1,k,i},bpinput{1,k,i});

else

fprintf(fid,' \n');
end
end
fprintf(fid,'end; \n');


else

fprintf(fid,'library ieee; \n');
fprintf(fid,'use ieee.std_logic_1164.all; \n');
fprintf(fid,'\n');

fprintf(fid, 'entity tree_level_%d is \n',i);
fprintf(fid, 'port( \n');



a=Fresult(i-1,:);
matrix=a;

for k=length(matrix):-1:1
b=matrix(k);
if b==0
matrix(:,k)=[];
a=matrix;


else
a=matrix;

end


end
for k1=length(a):-1:1

fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k1, ...
a(k1)-1);
end
a=Fresult(i,:);
matrix=a;

for k=length(matrix):-1:1
b=matrix(k);
if b==0
matrix(:,k)=[];
a=matrix;


else
a=matrix;

end

end

e=i+1;

d=lcolumn{1,e};
c=lcolumn{1,e-1};

if d==c


for m=length(a):-1:1
% level output
if m==1
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
a(m)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
a(m)-1);
end
end

else

for m=length(a)-1:-1:0
% level output
if m==0
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
a(m+1)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
a(m+1)-1);
end
end
end

fprintf(fid,'); \n');
fprintf(fid,'end tree_level_%d; \n',i);
fprintf(fid,'\n');

%% show architecture

fprintf(fid,'architecture tree_level_%d of tree_level_%d is \n',i,i);
fprintf(fid,'\n');
fprintf(fid,'component full_adder \n');
fprintf(fid,'port( \n');
fprintf(fid,'ain, bin, cin : in std_logic; \n');
fprintf(fid,'sout, cout : out std_logic ); \n');
fprintf(fid,'end component; \n');
fprintf(fid,'\n');
fprintf(fid,'component half_adder \n');
fprintf(fid,'port( \n');
fprintf(fid,'ain, bin : in std_logic; \n');
fprintf(fid,'sout, cout : out std_logic ); \n');
fprintf(fid,'end component; \n');
fprintf(fid,'\n');
fprintf(fid,'begin \n'); % set the relation
column=lcolumn{1,i}; % how many columns in level i
for k=column:-1:1
fn=carfull{1,k,i};
hn=carhalf{1,k,i};
bpn=carbp{1,k,i};
if fn~=0
for m=1:1:fn
fprintf(fid,'%s : full_adder port map ( %s,%s,%s,%s,%s); \n', funame{m,k,i}, ...
fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i},fos{m,k,i},foc{m,k,i} );
end
else
fprintf(fid,' \n');
end
if hn~=0;
fprintf(fid,'%s : half_adder port map ( %s,%s,%s,%s); \n', haname{1,k,i}, ...
hainput{1,k,i},hainput{2,k,i},hos{1,k,i},hoc{1,k,i} );
else
fprintf(fid,' \n');
end

if bpn~=0;
fprintf(fid,'%s <=%s ; \n', bpos{1,k,i},bpinput{1,k,i});

else
fprintf(fid,' \n');
end
end
fprintf(fid,'end; \n');
end
end
fclose(fid);
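
The zero-stripping loop that recurs throughout this listing (copy a row of Fresult into matrix, then delete every zero entry) can be written more compactly with logical indexing. The following is a minimal sketch rather than the thesis code: the name strip_zeros, the variable lvl and the example row are assumptions introduced purely for illustration.

```matlab
% Hypothetical helper, not part of the original program:
% keep the non-zero entries of a row vector in their original order.
strip_zeros = @(row) row(row ~= 0);

row = [0 0 3 2 1];     % made-up bit counts; zeros mark unused columns
a = strip_zeros(row);  % same result as the backward delete loop

% Print one input port per remaining column, in the style of the listing.
lvl = 2;               % example level index (assumption)
for k1 = length(a):-1:1
    fprintf('in_data_%d_%d : in std_logic_vector(%d downto 0);\n', ...
        lvl, k1, a(k1)-1);
end
```

Because logical indexing preserves element order, a(k1) refers to the same column as it does after the original loop.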
Appendix 2 MATLAB program for top level

page=level;
fid =fopen('toplevel.vhdl', 'w');
fprintf(fid,'library ieee; \n');
fprintf(fid,' \n');
fprintf(fid,'use ieee.std_logic_1164.all; \n');
fprintf(fid,' \n');
fprintf(fid, 'entity top_level is \n');
fprintf(fid, 'port( \n');

for k=length(a):-1:1 % level 1 input
fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',1, k, ...
a(k)-1);
end


% last level output
i=page;
a=Fresult(i,:);
matrix=a;

for k=length(matrix):-1:1
b=matrix(k);
if b==0
matrix(:,k)=[];
a=matrix;


else
a=matrix;

end

end

e=i+1;

d=lcolumn{1,e};
c=lcolumn{1,e-1};

if d==c
for m=length(a):-1:1
% level output
if m==1
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
a(m)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
a(m)-1);
end
end

else

for m=length(a)-1:-1:0
% level output
if m==0
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
a(m+1)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
a(m+1)-1);
end
end
end

fprintf(fid,'); \n');
fprintf(fid,'end top_level; \n');
fprintf(fid,' \n');
fprintf(fid,'architecture top_level of top_level is \n');
fprintf(fid,' \n');
fprintf(fid,'component full_adder \n');
fprintf(fid,'port( \n');
fprintf(fid,'ain, bin, cin : in std_logic; \n');
fprintf(fid,'cout, sout : out std_logic ); \n');
fprintf(fid,'end component; \n');
fprintf(fid,' \n');
fprintf(fid,'component half_adder \n');
fprintf(fid,'port( \n');
fprintf(fid,'ain, bin : in std_logic; \n');
fprintf(fid,'cout, sout: out std_logic ); \n');
fprintf(fid,'end component; \n');
fprintf(fid,' \n');


for i=1:1:page

fprintf(fid,'component tree_level_%d \n',i);
fprintf(fid, 'port( \n');
if i==1;
a=final;
for k =length(a):-1:1
% level 1 input
fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k, a(k)-1);

end
result =zeros(1, length(a));
carry_out =0;

for k =length(a):-1:1 % k columnes in vector
b=a(k);% get the number of each column
rem (b,3);% get rem of b/3
c=fix(b./3);% #of full adder
d=b-3.*fix(b./3);% remainder
carry_in=carry_out;
if d==0
carry_out_HA=0;
sum=0;

elseif d==1
carry_out_HA=0;
sum=1;
else
carry_out_HA=1;
sum=1;
end
carry_out =carry_out_HA +c;

if k==length(a)
totalsum=sum+c;
else
totalsum=c+carry_in+sum;
end

result(k)=totalsum;

end

if carry_out~=0
result =[ carry_out, result ];
end




c=lcolumn{1,1};
d=lcolumn{1,2};
% level 1 output
if d==c


for m=length(result):-1:1
if m==1
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
result(m)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
result(m)-1);
end
end

else

for m=length(result)-1:-1:0
% level output
if m==0
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
result(m+1)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
result(m+1)-1);
end
end
end

fprintf(fid,' \n');
else
a=Fresult(i-1,:);
matrix=a;

for k=length(matrix):-1:1
b=matrix(k);
if b==0
matrix(:,k)=[];
a=matrix;


else
a=matrix;

end


end
for k1=length(a):-1:1

fprintf(fid, 'in_data_%d_%d : in std_logic_vector(%d downto 0);\n',i, k1, ...
a(k1)-1);
end

a=Fresult(i,:); % show outputs
matrix=a;

for k=length(matrix):-1:1
b=matrix(k);
if b==0
matrix(:,k)=[];
a=matrix;


else
a=matrix;

end

end
e=i+1;
d=lcolumn{1,e};
c=lcolumn{1,e-1};

if d==c


for m=length(a):-1:1
% level output
if m==1
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
a(m)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
a(m)-1);
end
end

else

for m=length(a)-1:-1:0
% level output
if m==0
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0) \n',i, m, ...
a(m+1)-1);
else
fprintf(fid, 'out_data_%d_%d : out std_logic_vector(%d downto 0);\n',i, m, ...
a(m+1)-1);
end
end
end
end


fprintf(fid,'); \n');
fprintf(fid,'end component; \n');
fprintf(fid,' \n');
end
fprintf(fid,'begin \n');
page=level;
for i=1:1:page

fprintf(fid,'adder_stage_%d : tree_level_%d port map (',i,i);
column=lcolumn{1,i};
for k=column:-1:2
fn=carfull{1,k,i};
hn=carhalf{1,k,i};
bpn=carbp{1,k,i};
if (fn~=0) & (hn~=0) & (bpn==0)
for m=1:1:fn

fprintf(fid,'%s,%s,%s,',fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i});
fprintf(fid,'%s,%s,',fos{m,k,i},foc{m,k,i});
end

fprintf(fid,'%s,%s,',hainput{1,k,i},hainput{2,k,i});
fprintf(fid,'%s,%s,', hos{1,k,i},hoc{1,k,i});


elseif (fn~=0) & (hn==0) & (bpn~=0)
for m=1:1:fn

fprintf(fid,'%s,%s,%s,',fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i});
fprintf(fid,'%s,%s,',fos{m,k,i},foc{m,k,i});
end
fprintf(fid,'%s,',bpinput{1,k,i});
fprintf(fid,'%s,',bpos{1,k,i});

elseif (fn~=0) & (hn==0) & (bpn==0)
for m=1:1:fn

fprintf(fid,'%s,%s,%s,',fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i});
fprintf(fid,'%s,%s,',fos{m,k,i},foc{m,k,i});
end


elseif (fn==0) & (hn~=0) & (bpn==0)
fprintf(fid,'%s,',hainput{1,k,i});
fprintf(fid,'%s,',hainput{2,k,i});
fprintf(fid,'%s,%s,', hos{1,k,i},hoc{1,k,i});


else % remaining case: (fn==0) & (hn==0) & (bpn~=0)
fprintf(fid,'%s,',bpinput{1,k,i});
fprintf(fid,'%s,',bpos{1,k,i});

end

end

k=1;
fn=carfull{1,k,i};
hn=carhalf{1,k,i};
bpn=carbp{1,k,i};
if (fn~=0) & (hn~=0) & (bpn==0)
for m=1:1:fn

fprintf(fid,'%s,%s,%s,',fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i});
fprintf(fid,'%s,%s,',fos{m,k,i},foc{m,k,i});
end
fprintf(fid,'%s,',hainput{1,k,i});
fprintf(fid,'%s,',hainput{2,k,i});
fprintf(fid,'%s,%s); \n', hos{1,k,i},hoc{1,k,i});

elseif (fn~=0) & (hn==0) & (bpn~=0)
for m=1:1:fn

fprintf(fid,'%s,%s,%s,',fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i});
fprintf(fid,'%s,%s,',fos{m,k,i},foc{m,k,i});
end
fprintf(fid,'%s,',bpinput{1,k,i});
fprintf(fid,'%s); \n',bpos{1,k,i});

elseif (fn~=0) & (hn==0) & (bpn==0)
for m=1:1:fn

fprintf(fid,'%s,%s,%s,',fuinput{3*m-2,k,i},fuinput{3*m-1,k,i},fuinput{3*m,k,i});

end
for m=1:1:fn-1
fprintf(fid,'%s,%s,',fos{m,k,i},foc{m,k,i});
end
m=fn;
fprintf(fid,'%s,%s); \n',fos{m,k,i},foc{m,k,i});

elseif (fn==0) & (hn~=0) & (bpn==0)
fprintf(fid,'%s,',hainput{1,k,i});
fprintf(fid,'%s,',hainput{2,k,i});
fprintf(fid,'%s,%s); \n', hos{1,k,i},hoc{1,k,i});

else % remaining case: (fn==0) & (hn==0) & (bpn~=0)
fprintf(fid,'%s,',bpinput{1,k,i});
fprintf(fid,'%s); \n',bpos{1,k,i});
end
end
fprintf(fid,'end; \n');
fclose(fid);
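
The level 1 bit-count update used in the component declaration loop can be checked on a small example: full adders take groups of three bits, a half adder takes a leftover pair, and every carry moves one column to the left. The sketch below restates that loop in vectorised form. The input vector and the names fa, r, ha, sums and carries are illustrative assumptions, not identifiers from the thesis.

```matlab
% Illustrative input: three 3-bit operands give 3 bits in every column.
% Columns are ordered as in the listing, most significant column first.
a = [3 3 3];
n = length(a);

fa = fix(a ./ 3);         % full adders per column
r  = a - 3 .* fa;         % bits left after the full adders (0, 1 or 2)
ha = double(r == 2);      % a leftover pair goes to one half adder

sums    = fa + double(r > 0);  % bits that stay in their own column
carries = fa + ha;             % bits passed to the next column on the left

% Column k receives the carries produced by column k+1.
result = sums + [carries(2:n), 0];

% A carry out of the leftmost column opens a new column.
if carries(1) ~= 0
    result = [carries(1), result];
end

disp(result)
```

For this input each column holds exactly one full adder, so nine input bits are reduced to six and the result has one extra column on the left.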
