
L8/9: Arithmetic Structures

Acknowledgements:

Materials in this lecture are courtesy of the following sources and are used with permission.
Rex Min
Kevin Atkinson
Prof. Randy Katz (Unified Microelectronics Corporation Distinguished Professor in Electrical
Engineering and Computer Science at the University of California, Berkeley) and Prof. Gaetano
Borriello (University of Washington Department of Computer Science & Engineering) from
Chapter 2 of R. Katz, G. Borriello. Contemporary Logic Design. 2nd ed. Prentice-Hall/Pearson
Education, 2005.
J. Rabaey, A. Chandrakasan, B. Nikolic, Digital Integrated Circuits: A Design Perspective
Prentice Hall/Pearson, 2003.
L8/9: 6.111 Spring 2006

Introductory Digital Systems Laboratory

Number Systems Basics


How to represent negative numbers?

Three common schemes: sign-magnitude, ones complement, twos complement

Sign-magnitude: MSB = 0 for positive, 1 for negative
  Range (N-bit word): $-(2^{N-1}-1)$ to $+(2^{N-1}-1)$
  Two representations for zero: 0000 & 1000
  Simple multiplication but complicated addition/subtraction

Ones complement: if N is positive, then its negative is $\overline{N}$ (the bitwise complement of N)
  Example: 0111 = 7, 1000 = -7
  Range (N-bit word): $-(2^{N-1}-1)$ to $+(2^{N-1}-1)$
  Two representations for zero: 0000 & 1111
  Subtraction implemented as addition and negation
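As a rough illustration of the three schemes, the sketch below encodes a small integer as a bit string. The function name `encode` and the scheme labels are made up for this example, and no range checking is done.

```python
def encode(value, n_bits, scheme):
    """Return value as an n_bits-wide bit string under the given scheme.

    Illustrative only: no range checking, so callers must stay within
    the range each scheme can represent.
    """
    mask = (1 << n_bits) - 1
    if scheme == "sign-magnitude":
        sign = 1 if value < 0 else 0
        bits = (sign << (n_bits - 1)) | abs(value)
    elif scheme == "ones":
        # A negative number is the bitwise complement of its magnitude.
        bits = value if value >= 0 else ~(-value) & mask
    elif scheme == "twos":
        # Masking a negative Python int yields its twos complement pattern.
        bits = value & mask
    else:
        raise ValueError("unknown scheme: " + scheme)
    return format(bits, "0{}b".format(n_bits))


for scheme in ("sign-magnitude", "ones", "twos"):
    print(scheme, encode(7, 4, scheme), encode(-7, 4, scheme))
```

All three schemes agree on +7 (0111) and differ only in how they write -7.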


Twos Complement Representation


Twos complement = bitwise complement + 1
  0111 → 1000 + 1 = 1001 = -7
  1001 → 0110 + 1 = 0111 = 7
Asymmetric range: $-2^{N-1}$ to $+2^{N-1}-1$
Only one representation for zero
Simple addition and subtraction
Most common representation

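Addition examples (4-bit; any carry out of the MSB is discarded):

      4    0100          -4     1100
   +  3    0011       + (-3)    1101
   ----    ----       ------   -----
      7    0111          -7    11001

      4    0100          -4     1100
   -  3    1101       +  3      0011
   ----   -----       ------    ----
      1   10001          -1     1111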

[Katz05]
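The negation rule and the four 4-bit additions above can be checked with a short sketch. The helper names (`negate`, `to_signed`, `add`) are illustrative, not part of the lecture.

```python
N = 4
MASK = (1 << N) - 1


def negate(bits):
    """Twos complement negation: bitwise complement, then add 1."""
    return ((bits ^ MASK) + 1) & MASK


def to_signed(bits):
    """Interpret an N-bit pattern as a signed twos complement value."""
    return bits - (1 << N) if bits & (1 << (N - 1)) else bits


def add(a, b):
    """N-bit addition; the carry out of the MSB is dropped."""
    return (a + b) & MASK


for a, b in ((0b0100, 0b0011), (0b1100, 0b1101),
             (0b0100, 0b1101), (0b1100, 0b0011)):
    print(to_signed(a), "+", to_signed(b), "=", to_signed(add(a, b)))
```

Subtraction needs no separate hardware in this view: 4 - 3 is computed as 4 + negate(3).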

Overflow Conditions
Overflow occurs when adding two positive numbers gives a negative result, or when
adding two negative numbers gives a positive result.
[Figure: two 4-bit twos complement number wheels (0000 = +0 through 0111 = +7, 1000 = -8 through 1111 = -1), illustrating 5 + 3 = -8 and -7 - 2 = +7]

Worked examples (carries are listed as C4 C3 C2 C1; C3 is the carry into the sign bit, C4 the carry out):

   Operation      Operands        Result        Carries   Overflow?
    5 + 3         0101 + 0011      1000 (-8)    0111      yes
   -7 + (-2)      1001 + 1110     10111 (+7)    1000      yes
    5 + 2         0101 + 0010      0111 (+7)    0000      no
   -3 + (-5)      1101 + 1011     11000 (-8)    1111      no

If the carry into the sign bit equals the carry out of the sign bit, the carry out can be ignored; otherwise, overflow has occurred.
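The carry rule can be sketched directly: compute the carry into the sign bit from the lower bits, compute the carry out of the full sum, and compare. The function name and the default 4-bit width are assumptions made for this example.

```python
def add_with_overflow(a, b, n_bits=4):
    """Add two n_bits twos complement patterns.

    Returns (result_bits, overflow), where overflow is true when the
    carry into the sign bit differs from the carry out of the sign bit.
    """
    mask = (1 << n_bits) - 1
    low_mask = mask >> 1                      # every bit below the sign bit
    carry_into_sign = ((a & low_mask) + (b & low_mask)) >> (n_bits - 1)
    total = (a & mask) + (b & mask)
    carry_out = total >> n_bits
    return total & mask, carry_into_sign != carry_out


print(add_with_overflow(0b0101, 0b0011))   # 5 + 3
print(add_with_overflow(0b1001, 0b1110))   # -7 + (-2)
print(add_with_overflow(0b0101, 0b0010))   # 5 + 2
print(add_with_overflow(0b1101, 0b1011))   # -3 + (-5)
```

In hardware, the same comparison is a single XOR of the two carries.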

Binary Full Adder


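[Figure: full adder block with inputs A, B, Ci and outputs S, Co]

$S = A \oplus B \oplus C_i = \overline{A}\,\overline{B}\,C_i + \overline{A}\,B\,\overline{C_i} + A\,\overline{B}\,\overline{C_i} + A\,B\,C_i$

$C_o = AB + C_i(A + B)$

   A  B  CI | S  CO
   0  0  0  | 0  0
   0  0  1  | 1  0
   0  1  0  | 1  0
   0  1  1  | 0  1
   1  0  0  | 1  0
   1  0  1  | 0  1
   1  1  0  | 0  1
   1  1  1  | 1  1

[Figure: Karnaugh maps for S and CO, with AB across the columns and CI down the rows]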
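The two full adder equations can be checked against ordinary integer addition for all eight input combinations. This is a minimal sketch; `full_adder` is a name chosen for the example.

```python
from itertools import product


def full_adder(a, b, ci):
    """One-bit full adder: returns (S, Co) from the equations above."""
    s = a ^ b ^ ci
    co = (a & b) | (ci & (a | b))
    return s, co


print(" A B CI | S CO")
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    # The pair (Co, S) is the 2-bit binary count of ones among the inputs.
    assert 2 * co + s == a + b + ci
    print(" {} {} {}  | {} {}".format(a, b, ci, s, co))
```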


Ripple Carry Adder Structure

[Figure: 4-bit ripple-carry adder. Four full adders in a chain: stage i takes Ai, Bi and the carry out of stage i-1 (Ci,0 for stage 0) and produces Si and Co,i.]

Worst-case propagation delay is linear in the number of bits:

$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
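A behavioural sketch of the ripple-carry structure follows: each stage consumes the carry produced by the stage before it, which is why the delay grows linearly. The function names and the LSB-first list convention are choices made for this example.

```python
def full_adder(a, b, ci):
    """One-bit full adder: returns (S, Co)."""
    return a ^ b ^ ci, (a & b) | (ci & (a | b))


def ripple_carry_add(a_bits, b_bits, ci=0):
    """Add two equal-length bit lists (LSB first); returns (sum_bits, carry_out)."""
    sum_bits = []
    carry = ci
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # carry ripples to the next stage
        sum_bits.append(s)
    return sum_bits, carry


def ripple_delay(n_bits, t_carry, t_sum):
    """Worst-case delay: (N - 1) carry delays plus one sum delay."""
    return (n_bits - 1) * t_carry + t_sum


# 5 + 3 as unsigned 4-bit values, LSB first.
print(ripple_carry_add([1, 0, 1, 0], [1, 1, 0, 0]))
print(ripple_delay(16, t_carry=1, t_sum=1))
```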


Extension to Subtraction
Under twos complement, subtracting B is the same as adding the bitwise complement of B
and then adding 1.

Combination addition/subtraction system: a mux selects B for addition and $\overline{B}$ for subtraction.


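[Figure: 4-bit adder/subtractor. Each full adder (FA) takes Ai and the output of a 2:1 mux that selects Bi or its complement. The Add/Subtract control drives the mux selects and the carry in of the least significant stage.]

Add 1 for subtraction using the carry in.

Overflow occurs if the carry into the sign bit differs from the final carry out.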
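The adder/subtractor can be modelled in a few lines: one control bit both selects the complemented B and supplies the extra 1 through the carry in. The name `add_sub` and its argument order are assumptions for this sketch.

```python
def add_sub(a, b, subtract, n_bits=4):
    """Compute a + b or a - b on n_bits twos complement patterns.

    Returns (result_bits, overflow).
    """
    mask = (1 << n_bits) - 1
    b_sel = (b ^ mask) & mask if subtract else b & mask   # mux: B or its complement
    carry_in = 1 if subtract else 0                       # the "+1" for subtraction
    low_mask = mask >> 1
    carry_into_sign = ((a & low_mask) + (b_sel & low_mask) + carry_in) >> (n_bits - 1)
    total = (a & mask) + b_sel + carry_in
    carry_out = total >> n_bits
    return total & mask, carry_into_sign != carry_out


print(add_sub(0b0100, 0b0011, subtract=True))    # 4 - 3
print(add_sub(0b1001, 0b0010, subtract=True))    # -7 - 2, overflows
print(add_sub(0b0101, 0b0011, subtract=False))   # 5 + 3, overflows
```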

Comparator (one approach)


[Figure: the adder/subtractor from the previous slide, computing A - B. The sum outputs S3..S0 drive two flags, N and Z.]

N: true if the result is negative
Z: true if the result is zero

A < B:  N
A = B:  Z
A ≤ B:  Z + N
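A small sketch of this comparator approach: subtract, then read the N and Z flags. It assumes signed 4-bit operands and, like the slide's equations, does not correct for overflow, so the `lt` and `le` results are only meaningful when A - B fits in the word. The function name and result keys are made up for the example.

```python
def compare(a, b, n_bits=4):
    """Compare two n_bits twos complement patterns via A - B.

    Assumes A - B does not overflow; the flags are otherwise unreliable.
    """
    mask = (1 << n_bits) - 1
    diff = (a + ((b ^ mask) & mask) + 1) & mask   # A + complement(B) + 1
    n = (diff >> (n_bits - 1)) & 1                # sign bit of the result
    z = int(diff == 0)                            # NOR of all result bits
    return {"lt": n, "eq": z, "le": z | n}


print(compare(0b0010, 0b0101))   # 2 vs 5
print(compare(0b0011, 0b0011))   # 3 vs 3
print(compare(0b0101, 0b0010))   # 5 vs 2
```

For example, comparing -8 with 1 overflows (the difference -9 is not representable in 4 bits), so N alone gives the wrong answer there.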

Alternate Adder Logic Formulation


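How to speed up the critical (carry) path? (How to build a fast adder?)

[Figure: full adder block with inputs A, B, Cin and outputs S, Co]

Generate (G) = $AB$
Propagate (P) = $A \oplus B$

$C_o = G + P\,C_{in}$
$S = P \oplus C_{in}$

Note: can also use $P = A + B$ for $C_o$ (but not for S)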
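The note about the alternative propagate definition can be verified exhaustively. The sketch below assumes the usual carry expression Co = G + P·Cin and shows that both definitions of P give the same carry, while the sum bit still needs the XOR form. Function names are illustrative.

```python
from itertools import product


def carry_gp(a, b, c_in, p_is_or=False):
    """Carry out from generate/propagate, with P = A xor B or P = A or B."""
    g = a & b
    p = (a | b) if p_is_or else (a ^ b)
    return g | (p & c_in)


def sum_gp(a, b, c_in):
    """Sum bit; this requires the XOR form of propagate."""
    return (a ^ b) ^ c_in


for a, b, c in product((0, 1), repeat=3):
    # Both propagate definitions agree on the carry for every input.
    assert carry_gp(a, b, c) == carry_gp(a, b, c, p_is_or=True) == (a + b + c) >> 1
    assert sum_gp(a, b, c) == (a + b + c) & 1
print("both propagate definitions give the same carry out")
```

The two forms differ only when A = B = 1, and in that case G = 1 already forces the carry.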



Carry Bypass Adder


[Figure: 4-bit adder block, shown twice. Top: per-bit P,G logic feeding a chain of four full adders from Ci,0 to Co,3. Bottom: the same chain with a 2:1 mux on the carry output that selects between the rippled carry and Ci,0, controlled by BP.]

Can compute P, G in parallel for all bits.

$BP = P_0 P_1 P_2 P_3$

Key idea: if $P_0 P_1 P_2 P_3 = 1$, then $C_{o,3} = C_{i,0}$.
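The key idea can be checked by modelling one 4-bit block with a bypass mux and confirming that it always matches plain addition. This is a functional sketch only (it says nothing about timing), and `bypass_block` is a name invented here.

```python
def bypass_block(a, b, c_in):
    """4-bit carry bypass block; returns (sum_bits, carry_out, bp)."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    carry, total = c_in, 0
    for i in range(4):
        total |= (p[i] ^ carry) << i
        carry = g[i] | (p[i] & carry)        # ripple through this bit
    bp = p[0] & p[1] & p[2] & p[3]           # block propagate
    c_out = c_in if bp else carry            # 2:1 mux: bypass or rippled carry
    return total, c_out, bp


# The mux never changes the answer; it only offers a shorter path.
for a in range(16):
    for b in range(16):
        for c_in in (0, 1):
            total, c_out, bp = bypass_block(a, b, c_in)
            assert (c_out << 4) | total == a + b + c_in
print("bypass block matches a + b + c_in for all inputs")
```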



16-bit Carry Bypass Adder

[Figure: 16-bit carry bypass adder built from four 4-bit blocks (bits 0-3, 4-7, 8-11, 12-15). Each block has per-bit P,G logic, four full adders, and a 2:1 mux on its carry out controlled by that block's bypass signal: BP = P0P1P2P3, P4P5P6P7, P8P9P10P11, P12P13P14P15.]

Assume the following delay for each element:

P, G from A, B: 1 delay unit
P, G, Ci to Co or Sum for a FA: 1 delay unit
2:1 mux delay: 1 delay unit

What is the worst-case propagation delay for the 16-bit adder?
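One common textbook estimate for a carry bypass adder takes the path that sets up P and G, ripples through the first block, passes through the bypass muxes of the middle blocks, and then ripples through the last block to its top sum bit. The sketch below evaluates that estimate under the unit delays listed above; it is an assumption about which path is critical, not necessarily the answer intended by the lecture, and it ignores the false-path subtleties discussed on the next slide.

```python
def bypass_adder_delay(n_bits=16, block=4, t_setup=1, t_carry=1, t_sum=1, t_mux=1):
    """Estimated worst-case delay of a carry bypass adder.

    Assumed path: P,G setup, ripple through the first block, one mux per
    block boundary that must be crossed, ripple through the last block
    up to its most significant bit, then that bit's sum.
    """
    n_blocks = n_bits // block
    return (t_setup
            + block * t_carry              # first block ripples fully
            + (n_blocks - 1) * t_mux       # bypass muxes along the way
            + (block - 1) * t_carry        # ripple inside the last block
            + t_sum)


def ripple_adder_delay(n_bits=16, t_setup=1, t_carry=1, t_sum=1):
    """Same unit delays applied to a plain ripple-carry adder."""
    return t_setup + (n_bits - 1) * t_carry + t_sum


print("bypass estimate:", bypass_adder_delay())
print("ripple estimate:", ripple_adder_delay())
```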


Critical Path Analysis

[Figure: the same 16-bit carry bypass adder, with the block bypass signals labeled BP = P0P1P2P3, BP2 = P4P5P6P7, BP3 = P8P9P10P11, BP4 = P12P13P14P15.]

For the second stage, is the critical path BP2 = 0 or BP2 = 1?

Message: timing analysis is very tricky. Data dependencies must be considered carefully
to identify false paths.

Carry Lookahead Adder


Re-express the carry logic as follows:

$C_1 = G_0 + P_0 C_0$
$C_2 = G_1 + P_1 C_1 = G_1 + P_1 G_0 + P_1 P_0 C_0$
$C_3 = G_2 + P_2 C_2 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0$
$C_4 = G_3 + P_3 C_3 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0$

Each of the carry equations can be implemented in a two-level logic network.
Variables are the adder inputs and the carry in to stage 0.

Ripple effect has been eliminated!
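The expanded equations can be checked against ordinary addition. In the sketch below every carry is written directly in terms of G, P and C0, with no carry depending on another; the function name is illustrative.

```python
def lookahead_carries(a, b, c0):
    """Return [C1, C2, C3, C4] for 4-bit a, b using the expanded equations."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]


# C_k must equal the carry out of adding the low k bits.
for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            for k, c in enumerate(lookahead_carries(a, b, c0), start=1):
                low = (1 << k) - 1
                assert c == ((a & low) + (b & low) + c0) >> k
print("lookahead carries match for all 4-bit inputs")
```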



Carry Lookahead Logic


[Figure: adder with propagate and generate outputs. Inputs Ai, Bi, Ci; outputs Pi, Gi, Si.]

[Figure: two-level gate networks computing C1, C2, C3, and C4 from C0 and the Pi, Gi signals.]

Later stages have increasingly complex logic.

Block Generate and Propagate


$G_{j:i}$ and $P_{j:i}$ denote the Generate and Propagate functions, respectively, for a group of bits
from positions i to j. We call them Block Generate and Block Propagate. $G_{j:i}$ equals 1 if
the group generates a carry independent of the incoming carry. $P_{j:i}$ equals 1 if an
incoming carry propagates through the entire group. For example, $G_{3:2}$ is equal to 1 if a
carry is generated at bit position 3, or if a carry out is generated at bit position 2 and
propagates through position 3: $G_{3:2} = G_3 + P_3 G_2$. $P_{3:2}$ is true if an incoming carry
propagates through both bit positions 2 and 3: $P_{3:2} = P_3 P_2$.

$C_2 = (G_1 + P_1 G_0) + (P_1 P_0) C_0 = G_{1:0} + P_{1:0} C_0$

$C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0$
$\quad = (G_3 + P_3 G_2) + (P_3 P_2) C_2 = G_{3:2} + P_{3:2} C_2$
$\quad = G_{3:2} + P_{3:2}(G_{1:0} + P_{1:0} C_0) = G_{3:0} + P_{3:0} C_0$

The carry out of a 4-bit block can thus be computed using only the block generate and propagate
signals for each 2-bit section, plus the carry in to bit 0. The same formulation will be used to generate
the carry out signals for a 16-bit adder using the block generate and propagate from 4-bit sections.
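The derivation above can be followed step by step in code: build the 2-bit block signals, merge them into the 4-bit block signals, and form C4. The helper names (`combine`, `c4_from_blocks`) are assumptions for this sketch.

```python
def combine(g_hi, p_hi, g_lo, p_lo):
    """Merge two adjacent groups into one: returns (G, P) of the combined group."""
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def c4_from_blocks(a, b, c0):
    """Carry out of a 4-bit add, computed from 2-bit block G and P."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]
    g10, p10 = combine(g[1], p[1], g[0], p[0])     # G1:0, P1:0
    g32, p32 = combine(g[3], p[3], g[2], p[2])     # G3:2, P3:2
    g30, p30 = combine(g32, p32, g10, p10)         # G3:0, P3:0
    return g30 | (p30 & c0)                        # C4 = G3:0 + P3:0 C0


for a in range(16):
    for b in range(16):
        for c0 in (0, 1):
            assert c4_from_blocks(a, b, c0) == (a + b + c0) >> 4
print("C4 from block G/P matches for all 4-bit inputs")
```

The same `combine` step applied to four 4-bit blocks would give the 16-bit carry out.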

More Definitions
$(g, p) \bullet (g', p') = (g + p g',\; p p')$

The above dot operator obeys the associative property, but it is not commutative.

$(G_{3:2}, P_{3:2}) = (G_3, P_3) \bullet (G_2, P_2)$

$(G_{3:0}, P_{3:0}) = [(G_3, P_3) \bullet (G_2, P_2)] \bullet [(G_1, P_1) \bullet (G_0, P_0)] = (G_{3:2}, P_{3:2}) \bullet (G_{1:0}, P_{1:0})$

$(C_{o,3}, 0) = ((G_3, P_3) \bullet (G_2, P_2) \bullet (G_1, P_1) \bullet (G_0, P_0)) \bullet (C_{i,0}, 0)$

$(C_{o,k}, 0) = ((G_k, P_k) \bullet (G_{k-1}, P_{k-1}) \bullet \cdots \bullet (G_0, P_0)) \bullet (C_{i,0}, 0)$
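The two claimed properties of the dot operator are easy to test by brute force, since each operand is just a pair of bits. The sketch below also folds the operator over a list of per-bit (G, P) pairs to get a carry out; `dot` and `carry_out` are names chosen for the example.

```python
from functools import reduce
from itertools import product


def dot(hi, lo):
    """(g, p) dot (g', p') = (g + p g', p p'); hi is the more significant group."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


pairs = list(product((0, 1), repeat=2))

# Associative: grouping does not matter, so the operator can be applied as a tree.
for x, y, z in product(pairs, repeat=3):
    assert dot(dot(x, y), z) == dot(x, dot(y, z))

# Not commutative: the order of the operands matters.
assert dot((1, 0), (0, 0)) != dot((0, 0), (1, 0))


def carry_out(gp_msb_first, c_in):
    """Fold per-bit (G, P) pairs, MSB first, then apply the carry in."""
    block = reduce(dot, gp_msb_first)
    return dot(block, (c_in, 0))[0]


print(carry_out([(0, 1), (0, 1), (0, 1), (0, 1)], 1))
```

Associativity is what allows the chain to be regrouped into a logarithmic-depth tree.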

Logarithmic Look-Ahead Adder


[Figure: two ways to combine inputs A0..A7 into an output F. A linear chain has delay $t_p = O(N)$; a balanced tree has delay $t_p = O(\log_2 N)$.]

16-bit Kogge-Stone Tree Adder

[Figure: 16-bit Kogge-Stone tree adder. Inputs (A0, B0) through (A15, B15) enter the propagate/generate logic, pass through the prefix tree, and the sum logic produces S0 through S15.]
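A behavioural sketch of a Kogge-Stone style prefix adder follows. It models only the logic of the prefix tree (doubling the combining distance at each level), not the wiring or fan-out of the figure, and folding the carry in into bit 0's generate is a simplification chosen for this example.

```python
def kogge_stone_add(a, b, n_bits=16, c_in=0):
    """Parallel-prefix addition; returns (sum_bits, carry_out, levels)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n_bits)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n_bits)]
    half_sum = p[:]                      # per-bit XOR, kept for the sum logic
    g[0] |= p[0] & c_in                  # fold the carry in into bit 0
    dist, levels = 1, 0
    while dist < n_bits:
        new_g, new_p = g[:], p[:]
        for i in range(dist, n_bits):
            # Dot operator with the group ending dist positions below.
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2
        levels += 1
    # g[i] is now the carry out of bit i, so the carry into bit i is g[i - 1].
    carries = [c_in] + g[:-1]
    total = 0
    for i in range(n_bits):
        total |= (half_sum[i] ^ carries[i]) << i
    return total, g[-1], levels


print(kogge_stone_add(12345, 23456))
```

For 16 bits the loop runs four times, matching the log2(N) depth of the tree.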


Adder Performance

[Plot: delay vs. number of bits for ripple, bypass, select, and lookahead adders]



Addition of M, N-bit Numbers


[Figure: adding M N-bit numbers IN0 through INM-1 with rows of adders. The first row adds IN0 and IN1 bit by bit (bits N-1 down to 0); each later row adds the next input (IN2, IN3, ..., INM-1) to the running sum. Each row has Cin = 0.]

16-bit Carry Lookahead Schematic


181 configured for A+B: M = 0, S3-0 = 1001

[Figure: four 181 ALUs handle bits 3:0, 7:4, 11:8, and 15:12. Each takes A, B, and Cn and produces its sum bits (S3:0, S7:4, S11:8, S15:12) plus block P and G. A 182 lookahead unit takes Cn (Cin) and P0 G0 through P3 G3 and produces Cn+x, Cn+y, Cn+z and the overall G and P.]

182 computes Cin for later stages, using block G & P from earlier stages

Binary Multiplication

Partial product computation is simple (a single AND gate per bit).

[Figure: 4x4 array multiplier. Rows of half adders (HA) and full adders (FA) sum the partial products of x3..x0 with y0, y1, y2, and y3 to produce z7..z0.]
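The array multiplier's arithmetic can be sketched as a sum of shifted partial products, each bit of which is one AND of an x bit and a y bit. This models the arithmetic only, not the HA/FA array itself; the function name is an assumption.

```python
def multiply_unsigned(x, y, n_bits=4):
    """Unsigned multiply by summing AND-gated partial products."""
    product = 0
    for j in range(n_bits):
        y_j = (y >> j) & 1
        partial = 0
        for i in range(n_bits):
            x_i = (x >> i) & 1
            partial |= (x_i & y_j) << i       # one AND gate per partial product bit
        product += partial << j               # row j has weight 2**j
    return product


for x in range(16):
    for y in range(16):
        assert multiply_unsigned(x, y) == x * y
print(multiply_unsigned(13, 11))
```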


A Serial (Magnitude) Multiplier


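[Figure: serial multiplier datapath. Under Shift/LD, x (x3..x0) is loaded into an 8-bit shift register and y (Y3..Y0) into the 4-bit yReg. Bit Y0 selects either the shifted x or 0 onto xBus[7:0]; an adder forms add_out from xBus and acc_out; acc_out and the product XY are held in D registers clocked by CLK.]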


Timing Diagram

[Timing diagram: signals CLK, Shift, xreg, yreg, Acc_out, X*Y. Successive register values:]

   xreg                    yreg           Acc_out
   0 0 0 0 x3 x2 x1 x0     y0 y1 y2 y3    00000000
   0 0 0 x3 x2 x1 x0 0     y1 y2 y3 X     Accum_1
   0 0 x3 x2 x1 x0 0 0     y2 y3 X X      Accum_2
   0 x3 x2 x1 x0 0 0 0     y3 X X X       Accum_3
   0 0 0 0 x3 x2 x1 x0     y0 y1 y2 y3    00000000

   X*Y: PRODUCT


Verilog of Serial Multiplier


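module serialmult(shift, clk, x, y, xy);
  input        shift, clk;   // shift = 0: load operands; shift = 1: shift and accumulate
  input  [3:0] x, y;
  output [7:0] xy;

  reg  [7:0] xReg;           // multiplicand, shifted left each cycle
  reg  [3:0] yReg;           // multiplier, shifted right each cycle
  reg  [7:0] xBus, acc_out, xy_int;
  wire [7:0] add_out;

  assign add_out = xBus + acc_out;
  assign xy = xy_int;

  // Partial product: pass xReg only when the current multiplier bit is 1.
  always @ (*)
  begin
    if (yReg[0] == 1'b0) xBus = 8'b0;
    else                 xBus = xReg;
  end

  always @ (posedge clk)
  begin
    if (shift == 1'b0)
    begin                              // load operands, capture the finished product
      xReg    <= {4'b0, x};
      yReg    <= y;
      acc_out <= 8'b0;
      xy_int  <= add_out;
    end
    else
    begin                              // shift and accumulate
      xReg    <= {xReg[6:0], 1'b0};
      yReg    <= {y[3], yReg[3:1]};    // the bit shifted in is a don't-care
      acc_out <= add_out;
      xy_int  <= xy_int;               // hold the previous product
    end // if shift
  end // always
endmodule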
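To see how the registers evolve, the Verilog above can be mirrored by a cycle-by-cycle Python model. This is a behavioural sketch written for illustration; the class name `SerialMult` and the driver `multiply` are not part of the lecture, and the model assumes one load cycle, three shift cycles, and a second load cycle that captures the product.

```python
class SerialMult:
    """Cycle-level model of the serialmult registers."""

    def __init__(self):
        self.x_reg = 0      # 8-bit shifted multiplicand
        self.y_reg = 0      # 4-bit shifted multiplier
        self.acc_out = 0    # 8-bit accumulator
        self.xy = 0         # 8-bit product register

    def clock(self, shift, x, y):
        """Apply one rising clock edge; returns the product register."""
        # Combinational logic sees the register values from before the edge.
        x_bus = self.x_reg if (self.y_reg & 1) else 0
        add_out = (x_bus + self.acc_out) & 0xFF
        if shift == 0:
            self.xy = add_out                 # capture the finished product
            self.x_reg = x & 0xF
            self.y_reg = y & 0xF
            self.acc_out = 0
        else:
            self.x_reg = (self.x_reg << 1) & 0xFF
            self.y_reg = (((y >> 3) & 1) << 3) | (self.y_reg >> 1)
            self.acc_out = add_out
        return self.xy


def multiply(x, y):
    """Load, shift three times, then load again to capture the product."""
    m = SerialMult()
    m.clock(0, x, y)
    for _ in range(3):
        m.clock(1, x, y)
    return m.clock(0, x, y)


print(multiply(13, 11))
```

The fourth partial product is added combinationally during the final load cycle, which is why only three shift cycles are needed.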


Simulation


Twos Complement Multiplication

[Figure: 4x4 array multiplier for twos complement operands. Rows of half adders (HA) and full adders (FA) sum the partial products of x3..x0 with y0, y1, y2, and y3 to produce z7..z0.]


Summary

Performance of arithmetic blocks dictates the performance of a digital system

Architectural and logic transformations can enable significant speed-up (e.g., adder delay from O(N) to O(log2(N)))

Similar concepts and formulation can be applied at the system level

Timing analysis is tricky: watch out for false paths!

Area-delay trade-offs (serial vs. parallel implementations)
