
Chapter 5

Digital Design and Computer Architecture: ARM® Edition


Sarah L. Harris and David Money Harris

Digital Design and Computer Architecture: ARM® Edition © 2015 Chapter 5 <1>
Chapter 5 :: Topics
• Introduction
• Arithmetic Circuits
• Number Systems
• Sequential Building Blocks
• Memory Arrays
• Logic Arrays

Introduction
• Digital building blocks:
– Gates, multiplexers, decoders, registers,
arithmetic circuits, counters, memory arrays,
logic arrays
• Building blocks demonstrate hierarchy,
modularity, and regularity:
– Hierarchy of simpler components
– Well-defined interfaces and functions
– Regular structure easily extends to different sizes
• We will use these building blocks in Chapter
7 to build a microprocessor

ARITHMETIC CIRCUITS
Arithmetic circuits are the central building blocks of
computers.
Computers and digital logic perform many arithmetic
functions: addition, subtraction, comparisons, shifts,
multiplication, and division.
This section describes hardware implementations of these operations.

Addition
Addition is one of the most common
operations in digital systems.
We first discuss how to add two 1-bit binary
numbers.
We then extend to N-bit binary numbers.

The half adder has two inputs, A and B, and two outputs, S and Cout.
S is the sum of A and B.
If A and B are both 1, S is 2, which cannot be represented with a single binary
digit.
Instead, it is indicated with a carry out, Cout, in the next column.
The half adder can be built from an XOR gate and an AND gate.
In a multi-bit adder, Cout is added, or carried in, to the next most significant bit.

Full Adder

1-Bit Adders

Half adder (inputs A, B; outputs Cout, S)
A B | Cout S
0 0 |  0   0
0 1 |  0   1
1 0 |  0   1
1 1 |  1   0
S = A ⊕ B
Cout = AB

Full adder (inputs Cin, A, B; outputs Cout, S)
Cin A B | Cout S
 0  0 0 |  0   0
 0  0 1 |  0   1
 0  1 0 |  0   1
 0  1 1 |  1   0
 1  0 0 |  0   1
 1  0 1 |  1   0
 1  1 0 |  1   0
 1  1 1 |  1   1
S = A ⊕ B ⊕ Cin
Cout = AB + ACin + BCin
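
The truth tables and equations above can be checked with a small software model. This is only a sketch in Python, not hardware; the function names half_adder and full_adder are illustrative choices, not names from the slides.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (cout, s) for two 1-bit inputs."""
    s = a ^ b       # S = A XOR B
    cout = a & b    # Cout = AB
    return cout, s


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Return (cout, s) for two 1-bit inputs plus a carry in."""
    s = a ^ b ^ cin                         # S = A XOR B XOR Cin
    cout = (a & b) | (a & cin) | (b & cin)  # Cout = AB + ACin + BCin
    return cout, s
```

In both cases the pair (cout, s) read as a 2-bit number equals the arithmetic sum of the inputs, which is a quick way to confirm every row of the tables.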

Multibit Adders (CPAs)
• Types of carry propagate adders (CPAs):
– Ripple-carry (slow)
– Carry-lookahead (fast)
– Prefix (faster)
• Carry-lookahead and prefix adders are faster for large
adders but require more hardware
[Symbol: N-bit adder with N-bit inputs A and B, carry in Cin, N-bit output S, and carry out Cout]

Ripple-Carry Adder
• Chain 1-bit adders together
• Carry ripples through entire chain
• Disadvantage: slow

[Figure: 32-bit ripple-carry adder. Thirty-two 1-bit full adders take inputs A31:0 and B31:0 and produce S31:0; Cin enters bit 0, the carries C0, C1, …, C30 link each adder to the next, and Cout leaves bit 31.]

Ripple-Carry Adder Delay

t_ripple = N × t_FA
where t_FA is the delay of a 1-bit full adder
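
As a rough illustration of why the delay grows linearly with N, the following Python sketch adds two numbers one column at a time, so each column must wait for the carry from the column below it. The function names and the 32-bit default width are assumptions made for this example.

```python
def ripple_carry_add(a: int, b: int, cin: int = 0, n: int = 32) -> tuple[int, int]:
    """Add two n-bit values one column at a time; return (sum, cout)."""
    carry = cin
    total = 0
    for i in range(n):  # bit 0 first: the carry ripples upward
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        si = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        total |= si << i
    return total, carry


def ripple_delay(n: int, t_fa: float) -> float:
    """Delay of an n-bit ripple-carry adder: n full-adder delays."""
    return n * t_fa
```

For example, ripple_delay(32, 30) gives 960, matching 32 full adders at 30 ps each.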

Carry-Lookahead Adder

The fundamental reason that large ripple-carry adders are slow is that the carry
signals must propagate through every bit in the adder.
A carry-lookahead adder (CLA) is another type of carry propagate adder that
solves this problem by dividing the adder into blocks and providing circuitry to
quickly determine the carry out of a block as soon as the carry in is known.
Thus it is said to look ahead across the blocks rather than waiting to ripple
through all the full adders inside a block.
For example, a 32-bit adder may be divided into eight 4-bit blocks.
CLAs use generate (G) and propagate (P) signals that describe how a column or
block determines the carry out.

The ith column of an adder is said to generate a carry if it
produces a carry out independent of the carry in.
The ith column is guaranteed to generate a carry, Ci, if Ai and Bi are both 1.
Hence Gi, the generate signal for column i, is calculated as Gi = AiBi.
The column is said to propagate a carry if it produces a
carry out whenever there is a carry in.
The ith column will propagate a carry in, Ci−1, if either Ai
or Bi is 1. Thus, Pi = Ai + Bi.

Carry-Lookahead Adder
Compute Cout for k-bit blocks using generate and propagate signals
Some definitions:
– Column i produces a carry out by either generating a carry out or
propagating a carry in to the carry out
– Generate (Gi) and propagate (Pi) signals for each column:
• Generate: Column i will generate a carry out if Ai and Bi are both 1.
Gi = AiBi
• Propagate: Column i will propagate a carry in to the carry out if Ai or Bi is 1.
Pi = Ai + Bi
• Carry out: The carry out of column i (Ci) is:
Ci = AiBi + (Ai + Bi)Ci-1 = Gi + PiCi-1
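
The column signals can be tried out in software. The sketch below, in Python, computes Gi and Pi for every column and then applies Ci = Gi + PiCi-1 column by column; the helper names column_gp and carries are made up for this illustration, and lists are ordered with column 0 first.

```python
def column_gp(a: int, b: int, n: int) -> tuple[list[int], list[int]]:
    """Return per-column generate and propagate lists, column 0 first."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]  # Gi = Ai Bi
    p = [((a >> i) | (b >> i)) & 1 for i in range(n)]  # Pi = Ai + Bi
    return g, p


def carries(g: list[int], p: list[int], cin: int = 0) -> list[int]:
    """Return the carry out Ci of every column, column 0 first."""
    result = []
    prev = cin  # carry in to column 0
    for gi, pi in zip(g, p):
        prev = gi | (pi & prev)  # Ci = Gi + Pi C(i-1)
        result.append(prev)
    return result
```

With A = 1011 and B = 0110, only column 1 generates a carry, but every column propagates, so that carry travels all the way to the carry out of column 3.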

Block Propagate and Generate
Now use column Propagate and Generate signals to
compute Block Propagate and Generate signals for
k-bit blocks, i.e.:
• Compute if a k-bit group will propagate a carry in (to the
block) to the carry out (of the block)
• Compute if a k-bit group will generate a carry out (of the
block)

Block Propagate and Generate Signals
• Example: Block propagate and generate
signals for 4-bit blocks (P3:0 and G3:0):
P3:0 = P3P2P1P0
G3:0 = G3 + G2P3 + G1P2P3 + G0P1P2P3
= G3 + P3(G2 + P2(G1 + P1G0))

Block Propagate and Generate Signals
• In general, for a block spanning bits i through j:
Pi:j = PiPi-1Pi-2 … Pj
Gi:j = Gi + Pi(Gi-1 + Pi-1(Gi-2 + Pi-2(… Gj)))
Ci = Gi:j + Pi:jCj-1
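
A behavioral sketch of these block equations in Python is shown here. It is a software model only: the names block_gp and cla_add are hypothetical, the 32-bit width and 4-bit block size are defaults chosen to match the example in the slides, and the sum bits inside each block are still computed by rippling from the block's carry in.

```python
def block_gp(g: list[int], p: list[int]) -> tuple[int, int]:
    """Block generate/propagate for a run of columns listed lsb first."""
    G, P = 0, 1
    for gi, pi in zip(g, p):  # fold from the least significant column up
        G = gi | (pi & G)     # Gi:j = Gi + Pi G(i-1):j
        P = P & pi            # Pi:j = Pi P(i-1):j
    return G, P


def cla_add(a: int, b: int, cin: int = 0, n: int = 32, k: int = 4) -> tuple[int, int]:
    """Add two n-bit values using k-bit lookahead blocks; return (sum, cout)."""
    total, carry = 0, cin
    for lo in range(0, n, k):
        g = [((a >> i) & (b >> i)) & 1 for i in range(lo, lo + k)]
        p = [((a >> i) | (b >> i)) & 1 for i in range(lo, lo + k)]
        c = carry
        for offset in range(k):  # sum bits inside the block
            i = lo + offset
            total |= (((a >> i) ^ (b >> i) ^ c) & 1) << i
            c = g[offset] | (p[offset] & c)
        G, P = block_gp(g, p)
        carry = G | (P & carry)  # Ci = Gi:j + Pi:j C(j-1)
    return total, carry
```

In hardware the block carry is available after one AND/OR stage per block, without waiting for the full adders inside the block; the model reproduces the result but not that timing.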

32-bit CLA with 4-bit Blocks
[Figure: 32-bit CLA built from eight 4-bit CLA blocks (inputs A31:28/B31:28 down to A3:0/B3:0, outputs S31:28 down to S3:0), linked by the block carries C3, C7, …, C23, C27 between Cin and Cout. Each 4-bit block contains four full adders producing S3:0 plus the logic that forms G3:0 and P3:0 from G3…G0 and P3…P0 and combines them with Cin to produce Cout.]

Carry-Lookahead Addition
• Step 1: Compute Gi and Pi for all columns
• Step 2: Compute G and P for k-bit blocks
• Step 3: Cin propagates through each k-bit
propagate/generate logic (meanwhile
computing sums)
• Step 4: Compute sum for most significant k-bit block

Carry-Lookahead Addition
• Step 1: Compute Gi and Pi for all columns
Gi = AiBi
Pi = Ai + Bi
• Step 2: Compute G and P for k-bit blocks
P3:0 = P3P2P1P0
G3:0 = G3 + P3(G2 + P2(G1 + P1G0))

Carry-Lookahead Adder Delay
For N-bit CLA with k-bit blocks:
t_CLA = t_pg + t_pg_block + (N/k – 1)t_AND_OR + k × t_FA
– t_pg: delay to generate all Pi, Gi
– t_pg_block: delay to generate all Pi:j, Gi:j
– t_AND_OR: delay from Cin to Cout of final AND/OR gate in k-bit CLA block

An N-bit carry-lookahead adder is generally much faster than a
ripple-carry adder for N > 16

Prefix adders
Prefix adders extend the generate and propagate
logic of the carry-lookahead adder to perform
addition even faster.
They first compute G and P for pairs of columns,
then for blocks of 4, then for blocks of 8, then 16,
and so forth until the generate signal for every
column is known.
The sums are computed from these generate
signals.

The strategy of a prefix adder is to compute the carry in
Ci−1 for each column i as quickly as possible, then to
compute the sum, using Si = Ai ⊕ Bi ⊕ Ci−1

Prefix Adder
• Computes carry in (Ci-1) for each column, then
computes sum:
Si = (Ai ^ Bi) ^ Ci-1
• Computes G and P for 1-, 2-, 4-, 8-bit blocks, etc.
until all Gi (carry in) known
• log2(N) stages

Prefix Adder
• Carry in either generated in a column or propagated from a
previous column.
• Column -1 holds Cin, so
G-1 = Cin
• Carry in to column i = carry out of column i-1:
Ci-1 = Gi-1:-1
Gi-1:-1: generate signal spanning columns i-1 to -1
• Sum equation:
Si = (Ai ^ Bi) ^ Gi-1:-1
• Goal: Quickly compute G0:-1, G1:-1, G2:-1, G3:-1, G4:-1, G5:-1, …
(called prefixes) (= C0, C1, C2, C3, C4, C5, …)

Prefix Adder
• Generate and propagate signals for a block spanning bits i:j
Gi:j = Gi:k + Pi:k Gk-1:j
Pi:j = Pi:kPk-1:j
• In words:
– Generate: block i:j will generate a carry if:
• upper part (i:k) generates a carry or
• upper part (i:k) propagates a carry generated in
lower part (k-1:j)
– Propagate: block i:j will propagate a carry if both the
upper and lower parts propagate the carry
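
The combining rule above can be exercised with a short Python model. This is a sketch under stated assumptions: it wires the stages in a Kogge-Stone-like pattern (every column combines with the column d positions below it, with d doubling each stage), which is one possible arrangement and not necessarily the one drawn in the slides' schematic. Column -1 is modeled with G = Cin and P = 0, and the function name prefix_add is illustrative.

```python
def prefix_add(a: int, b: int, cin: int = 0, n: int = 16) -> int:
    """Return the n-bit sum of a and b using a parallel-prefix carry network."""
    # List index 0 holds column -1, so list index = column + 1.
    g = [cin] + [((a >> i) & (b >> i)) & 1 for i in range(n - 1)]  # G(-1) = Cin
    p = [0] + [((a >> i) | (b >> i)) & 1 for i in range(n - 1)]    # P(-1) = 0 (assumed)
    d = 1
    while d < n:  # log2(n) stages when n is a power of two
        new_g, new_p = g[:], p[:]
        for idx in range(d, n):
            # Gi:j = Gi:k + Pi:k G(k-1):j   and   Pi:j = Pi:k P(k-1):j
            new_g[idx] = g[idx] | (p[idx] & g[idx - d])
            new_p[idx] = p[idx] & p[idx - d]
        g, p = new_g, new_p
        d *= 2
    total = 0
    for i in range(n):
        carry_in = g[i]  # list index i holds G(i-1):-1, the carry in to column i
        sum_bit = (((a >> i) ^ (b >> i)) & 1) ^ carry_in
        total |= sum_bit << i
    return total
```

For n = 16 the loop runs four stages (d = 1, 2, 4, 8), which corresponds to the log2 N stage count.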

16-Bit Prefix Adder Schematic
[Figure: 16-bit prefix adder. Top row: columns i = 15 … -1 form Pi:i and Gi:i from Ai and Bi. Prefix cells combine (Pi:k, Gi:k) with (Pk-1:j, Gk-1:j) to give (Pi:j, Gi:j) in four levels:
 level 1: 14:13, 12:11, 10:9, 8:7, 6:5, 4:3, 2:1, 0:-1
 level 2: 14:11, 13:11, 10:7, 9:7, 6:3, 5:3, 2:-1, 1:-1
 level 3: 14:7, 13:7, 12:7, 11:7, 6:-1, 5:-1, 4:-1, 3:-1
 level 4: 14:-1, 13:-1, 12:-1, 11:-1, 10:-1, 9:-1, 8:-1, 7:-1
Bottom row: each sum Si (i = 15 … 0) is formed from Ai, Bi, and Gi-1:-1.]

Prefix Adder Delay
t_PA = t_pg + log2(N) × t_pg_prefix + t_XOR
– t_pg: delay to produce Pi, Gi (AND or OR gate)
– t_pg_prefix: delay of black prefix cell (AND-OR gate)

Adder Delay Comparisons
Compare delay of 32-bit ripple-carry, CLA, and prefix adders
• CLA has 4-bit blocks
• 2-input gate delay = 10 ps; full adder delay = 30 ps
t_ripple = N × t_FA = 32(30 ps)
= 960 ps
t_CLA = t_pg + t_pg_block + (N/k – 1)t_AND_OR + k × t_FA
= [10 + 60 + (7)20 + 4(30)] ps
= 330 ps
t_PA = t_pg + log2(N) × t_pg_prefix + t_XOR
= [10 + log2(32)(20) + 10] ps
= 120 ps
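
The three delay formulas are easy to evaluate for other widths and block sizes. A small Python sketch, with hypothetical function and parameter names and all delays in picoseconds:

```python
import math


def ripple_delay(n: int, t_fa: float) -> float:
    """n-bit ripple-carry adder: n full-adder delays."""
    return n * t_fa


def cla_delay(n: int, k: int, t_pg: float, t_pg_block: float,
              t_and_or: float, t_fa: float) -> float:
    """n-bit carry-lookahead adder with k-bit blocks."""
    return t_pg + t_pg_block + (n // k - 1) * t_and_or + k * t_fa


def prefix_delay(n: int, t_pg: float, t_pg_prefix: float, t_xor: float) -> float:
    """n-bit prefix adder with log2(n) prefix stages."""
    return t_pg + math.log2(n) * t_pg_prefix + t_xor


print(ripple_delay(32, 30))                  # 960
print(cla_delay(32, 4, 10, 60, 20, 30))      # 330
print(prefix_delay(32, 10, 20, 10))          # 120.0
```

The inputs reuse the numbers from the comparison above: 10 ps per 2-input gate, 20 ps for an AND-OR stage, 60 ps for the block generate/propagate logic, and 30 ps per full adder.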

Subtracter

[Figure: subtracter symbol (N-bit inputs A and B, N-bit output Y) and its implementation built around an N-bit adder]

Comparator: Equality

[Figure: 4-bit equality comparator symbol (4-bit inputs A and B, output Equal) and its implementation, which compares the bit pairs (A3, B3), (A2, B2), (A1, B1), (A0, B0) and combines the results into Equal]

Comparator: Less Than

[Figure: N-bit less-than comparator. A subtracter computes A - B, and bit [N-1] of the N-bit result is output as A<B.]

ALU: Arithmetic Logic Unit
ALU should perform:
• Addition
• Subtraction
• AND
• OR

ALU: Arithmetic Logic Unit
ALUControl1:0 Function
00 Add
01 Subtract
10 AND
11 OR

Example: Perform A + B
ALUControl = 00
Result = A + B

ALU: Arithmetic Logic Unit
ALUControl1:0 Function
00 Add
01 Subtract
10 AND
11 OR
Example: Perform A OR B
ALUControl1:0 = 11
Mux selects output of OR gate as Result, so
Result = A OR B

ALU: Arithmetic Logic Unit
ALUControl1:0 Function
00 Add
01 Subtract
10 AND
11 OR
Example: Perform A + B
ALUControl1:0 = 00
ALUControl0 = 0, so:
Cin to adder = 0
2nd input to adder is B
Mux selects Sum as Result, so
Result = A + B

ALU with Status Flags

Flag Description
N Result is Negative
Z Result is Zero
C Adder produces Carry out
V Adder oVerflowed

ALU with Status Flags: Negative
N = 1 if:
Result is negative
So, N is connected to
most significant bit of
Result

ALU with Status Flags: Zero
Z = 1 if:
all of the bits of Result
are 0

ALU with Status Flags: Carry
C = 1 if:
Cout of Adder is 1
AND
ALU is adding or
subtracting (ALUControl
is 00 or 01)

ALU with Status Flags: oVerflow
V = 1 if:
The addition of 2 same-signed numbers
produces a result with the opposite sign

ALU with Status Flags: oVerflow
V = 1 if:
ALU is performing addition or subtraction
(ALUControl1 = 0)
AND
A and Sum have opposite signs
AND
A and B have same signs upon addition
(ALUControl0 = 0) OR
A and B have different signs upon subtraction
(ALUControl0 = 1)
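
The ALU function table and the four flag rules can be collected into one behavioral model. The Python sketch below is an illustration, not the slides' circuit: the function name alu, the dictionary of flags, and the 32-bit default width are assumptions. It also assumes the usual two's complement arrangement for subtraction, in which ALUControl0 both inverts B and supplies the adder's carry in.

```python
def alu(a: int, b: int, alu_control: int, n: int = 32) -> tuple[int, dict]:
    """Return (result, flags) for ALUControl 00 add, 01 subtract, 10 AND, 11 OR."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    ctrl0 = alu_control & 1
    ctrl1 = (alu_control >> 1) & 1

    # Assumed datapath: ALUControl0 selects B or NOT B and is also the carry in.
    b_in = (~b & mask) if ctrl0 else b
    raw = a + b_in + ctrl0
    total = raw & mask
    cout = raw >> n

    if alu_control == 0b10:
        result = a & b
    elif alu_control == 0b11:
        result = a | b
    else:
        result = total

    msb = n - 1
    sign_a, sign_b, sign_sum = a >> msb, b >> msb, total >> msb
    signs_allow_overflow = (sign_a == sign_b) if ctrl0 == 0 else (sign_a != sign_b)
    flags = {
        "N": result >> msb,                     # msb of Result
        "Z": int(result == 0),                  # all bits of Result are 0
        "C": int(ctrl1 == 0 and cout == 1),     # carry out, add/subtract only
        "V": int(ctrl1 == 0 and sign_a != sign_sum and signs_allow_overflow),
    }
    return result, flags
```

For instance, adding 1 to 0x7FFFFFFF produces 0x80000000 with N = 1 and V = 1, because two positive operands gave a negative sum.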

Shifters
Logical shifter: shifts value to left or right and fills empty spaces with 0’s
– Ex: 11001 >> 2 = 00110
– Ex: 11001 << 2 = 00100

Arithmetic shifter: same as logical shifter, but on right shift, fills empty
spaces with the old most significant bit (msb)
– Ex: 11001 >>> 2 = 11110
– Ex: 11001 <<< 2 = 00100

Rotator: rotates bits in a circle, such that bits shifted off one end are shifted
into the other end
– Ex: 11001 ROR 2 = 01110
– Ex: 11001 ROL 2 = 00111
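
The six examples above can be reproduced with a few lines of Python. This is a minimal sketch on fixed-width values; the function names and the 5-bit default width are chosen only to match the examples.

```python
def lsr(x: int, n: int, width: int = 5) -> int:
    """Logical shift right: fill with 0s."""
    mask = (1 << width) - 1
    return (x & mask) >> n


def lsl(x: int, n: int, width: int = 5) -> int:
    """Logical (and arithmetic) shift left: fill with 0s, drop bits shifted off."""
    mask = (1 << width) - 1
    return (x << n) & mask


def asr(x: int, n: int, width: int = 5) -> int:
    """Arithmetic shift right: fill with copies of the old msb."""
    mask = (1 << width) - 1
    x &= mask
    n = min(n, width)
    shifted = x >> n
    if x >> (width - 1):                      # old msb was 1
        shifted |= (mask << (width - n)) & mask
    return shifted


def ror(x: int, n: int, width: int = 5) -> int:
    """Rotate right: bits shifted off the right re-enter on the left."""
    mask = (1 << width) - 1
    x &= mask
    n %= width
    return ((x >> n) | (x << (width - n))) & mask


def rol(x: int, n: int, width: int = 5) -> int:
    """Rotate left: bits shifted off the left re-enter on the right."""
    mask = (1 << width) - 1
    x &= mask
    n %= width
    return ((x << n) | (x >> (width - n))) & mask
```

Calling format(asr(0b11001, 2), "05b") returns "11110", the arithmetic right shift from the example list.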

Shifter Design
[Figure: 4-bit shifter (A3:0 >> shamt1:0 → Y3:0) built from four 4:1 multiplexers, one per output bit Y3, Y2, Y1, Y0, each with select inputs S1:0 driven by shamt1:0 and data inputs 00, 01, 10, 11 taken from A3 … A0]

Shifters as Multipliers, Dividers

• A << N = A × 2^N
– Example: 00001 << 2 = 00100 (1 × 2^2 = 4)
– Example: 11101 << 2 = 10100 (-3 × 2^2 = -12)
• A >>> N = A ÷ 2^N
– Example: 01000 >>> 2 = 00010 (8 ÷ 2^2 = 2)
– Example: 10000 >>> 2 = 11100 (-16 ÷ 2^2 = -4)

Multipliers
• Partial products formed by multiplying a single
digit of the multiplier with multiplicand
• Shifted partial products summed to form result
Decimal
     230    multiplicand
   x  42    multiplier
     460    partial
  + 920     products
    9660    result

Binary
      0101  multiplicand
    x 0111  multiplier
      0101  partial
     0101   products
    0101
 + 0000
   0100011  result

230 x 42 = 9660          5 x 7 = 35
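
In binary, each partial product is either the multiplicand or zero, so multiplication reduces to AND, shift, and add. A short Python sketch of that idea follows; the function name and the 4-bit default are assumptions for illustration.

```python
def multiply(a: int, b: int, n: int = 4) -> int:
    """Multiply two n-bit unsigned values by summing shifted partial products."""
    product = 0
    for i in range(n):
        bit = (b >> i) & 1
        partial = a if bit else 0   # every bit of A ANDed with Bi
        product += partial << i     # partial product i is shifted left by i
    return product
```

multiply(0b0101, 0b0111) returns 35, the binary example above, as a value that fits in 2n = 8 bits.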

4 x 4 Multiplier
[Symbol: 4 x 4 multiplier with 4-bit inputs A and B and 8-bit output P]

                      A3   A2   A1   A0
x                     B3   B2   B1   B0
                      A3B0 A2B0 A1B0 A0B0
                 A3B1 A2B1 A1B1 A0B1
            A3B2 A2B2 A1B2 A0B2
+      A3B3 A2B3 A1B3 A0B3
  P7   P6   P5   P4   P3   P2   P1   P0

[Figure: array implementation. Each row ANDs A3:0 with one bit of B (B0, B1, B2, B3) and adds the result, shifted one column left per row, to the running sum, producing P7:0.]

Dividers
A/B = Q + R/B
Decimal Example: 2584/15 = 172 R4
Binary Example: 1101/0010 = 0110 R1

Divider Algorithm
A/B = Q + R/B
Binary: 1101/10 = 0110 R1
R’ = 0
for i = N-1 to 0
    R = {R’ << 1, Ai}
    D = R - B
    if D < 0, Qi = 0; R’ = R
    else Qi = 1; R’ = D
R = R’
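
The algorithm above translates almost line for line into Python. In this sketch the function name divide, the tuple return value, and the explicit check for B = 0 are additions for illustration; {R' << 1, Ai} is modeled as a shift followed by an OR with bit i of A.

```python
def divide(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Divide n-bit unsigned a by b; return (quotient, remainder)."""
    if b == 0:
        raise ZeroDivisionError("divisor B must be nonzero")
    r_prime = 0
    q = 0
    for i in range(n - 1, -1, -1):
        r = (r_prime << 1) | ((a >> i) & 1)  # R = {R' << 1, Ai}
        d = r - b                            # D = R - B
        if d < 0:
            r_prime = r                      # Qi = 0, keep R
        else:
            q |= 1 << i                      # Qi = 1
            r_prime = d                      # keep the difference
    return q, r_prime
```

divide(0b1101, 0b0010) returns (6, 1), that is, quotient 0110 and remainder 1.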

4 x 4 Divider
[Figure: 4 x 4 array divider. Legend: each cell takes R and B with Cin and Cout, computes D with an adder, and uses the sign signal N to select between R and D as R'.]
Each row computes one iteration of the division algorithm.

Number Systems
Numbers we can represent using binary
representations
– Positive numbers
• Unsigned binary
– Negative numbers
• Two’s complement
• Sign/magnitude numbers

What about fractions?

Numbers with Fractions
Two common notations:
• Fixed-point: binary point fixed
• Floating-point: binary point floats to the right of
the most significant 1

Fixed-Point Numbers
• 6.75 using 4 integer bits and 4 fraction bits:
01101100
0110.1100
2^2 + 2^1 + 2^-1 + 2^-2 = 6.75
• Binary point is implied
• The number of integer and fraction bits must be
agreed upon beforehand

Fixed-Point Number Example
• Represent 7.5₁₀ using 4 integer bits and 4
fraction bits.

01111000

Signed Fixed-Point Numbers
• Representations:
– Sign/magnitude
– Two’s complement
• Example: Represent -7.5₁₀ using 4 integer and 4 fraction
bits
– Sign/magnitude:
11111000
– Two’s complement:
1. +7.5:           01111000
2. Invert bits:    10000111
3. Add 1 to lsb: +        1
                   10001000
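
Fixed-point encoding amounts to scaling the value by 2 to the number of fraction bits and storing the resulting integer. The following Python sketch reproduces the examples; the function names are hypothetical and the 4.4 format is the default only because the slides use it.

```python
def to_twos_complement(value: float, int_bits: int = 4, frac_bits: int = 4) -> str:
    """Encode value as a two's complement fixed-point bit string."""
    width = int_bits + frac_bits
    scaled = round(value * (1 << frac_bits))        # move the binary point
    return format(scaled & ((1 << width) - 1), f"0{width}b")


def to_sign_magnitude(value: float, int_bits: int = 4, frac_bits: int = 4) -> str:
    """Encode value as a sign/magnitude fixed-point bit string."""
    width = int_bits + frac_bits
    magnitude = round(abs(value) * (1 << frac_bits))
    sign = 1 if value < 0 else 0
    return format((sign << (width - 1)) | magnitude, f"0{width}b")


print(to_twos_complement(6.75))    # 01101100
print(to_twos_complement(-7.5))    # 10001000
print(to_sign_magnitude(-7.5))     # 11111000
```

The sketch does not check that the value fits in the chosen number of integer bits.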

Floating-Point Numbers
• Binary point floats to the right of the most significant 1
• Similar to decimal scientific notation

• For example, write 273₁₀ in scientific notation:
273 = 2.73 × 10^2
• In general, a number is written in scientific notation as:
± M × B^E
– M = mantissa
– B = base
– E = exponent
– In the example, M = 2.73, B = 10, and E = 2

Floating-Point Numbers

1 bit 8 bits 23 bits

Sign Exponent Mantissa

• Example: represent the value 228₁₀ using a 32-bit floating-point
representation

We show three versions; the final version is called the IEEE 754
floating-point standard

Floating-Point Representation 1
1. Convert decimal to binary
228₁₀ = 11100100₂
2. Write the number in “binary scientific notation”:
11100100₂ = 1.11001₂ × 2^7
3. Fill in each field of the 32-bit floating point number:
– The sign bit is positive (0)
– The 8 exponent bits represent the value 7
– The remaining 23 bits are the mantissa
Sign (1 bit) | Exponent (8 bits) | Mantissa (23 bits)
0            | 00000111          | 111 0010 0000 0000 0000 0000

Floating-Point Representation 2
• First bit of the mantissa is always 1:
– 228₁₀ = 11100100₂ = 1.11001₂ × 2^7
• So, no need to store it: implicit leading 1
• Store just fraction bits in 23-bit field
Sign (1 bit) | Exponent (8 bits) | Fraction (23 bits)
0            | 00000111          | 110 0100 0000 0000 0000 0000

Floating-Point Representation 3
• Biased exponent: bias = 127 (01111111₂)
– Biased exponent = bias + exponent
– Exponent of 7 is stored as:
127 + 7 = 134 = 10000110₂
• The IEEE 754 32-bit floating-point representation of 228₁₀
Sign (1 bit) | Biased exponent (8 bits) | Fraction (23 bits)
0            | 10000110                 | 110 0100 0000 0000 0000 0000
in hexadecimal: 0x43640000

Floating-Point Example
Write -58.25₁₀ in floating point (IEEE 754)
1. Convert decimal to binary:
58.25₁₀ = 111010.01₂
2. Write in binary scientific notation:
1.1101001₂ × 2^5
3. Fill in fields:
Sign bit: 1 (negative)
8 exponent bits: (127 + 5) = 132 = 10000100₂
23 fraction bits: 110 1001 0000 0000 0000 0000
Sign (1 bit) | Exponent (8 bits) | Fraction (23 bits)
1            | 10000100          | 110 1001 0000 0000 0000 0000
in hexadecimal: 0xC2690000
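
Hand encodings like this one can be cross-checked against a machine's own single-precision format. The Python sketch below uses the standard struct module to obtain the 32-bit pattern and then splits it into the three fields; the function name float_fields is an illustrative choice.

```python
import struct


def float_fields(value: float) -> tuple[int, int, int, int]:
    """Return (bits, sign, biased_exponent, fraction) of value as IEEE 754 single precision."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return bits, sign, biased_exponent, fraction


bits, sign, exponent, fraction = float_fields(-58.25)
print(hex(bits), sign, exponent, format(fraction, "023b"))
# 0xc2690000 1 132 11010010000000000000000
```

Subtracting the bias of 127 from the stored exponent recovers the true exponent, 5 in this example.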

Floating-Point: Special Cases

Number Sign Exponent Fraction


0 X 00000000 00000000000000000000000
∞ 0 11111111 00000000000000000000000
-∞ 1 11111111 00000000000000000000000
NaN X 11111111 non-zero

Floating-Point Precision
• Single-Precision:
– 32-bit
– 1 sign bit, 8 exponent bits, 23 fraction bits
– bias = 127

• Double-Precision:
– 64-bit
– 1 sign bit, 11 exponent bits, 52 fraction bits
– bias = 1023

Floating-Point: Rounding
• Overflow: number too large to be represented
• Underflow: number too small to be represented
• Rounding modes:
– Down
– Up
– Toward zero
– To nearest
• Example: round 1.100101 (1.578125) to only 3 fraction bits
– Down: 1.100
– Up: 1.101
– Toward zero: 1.100
– To nearest: 1.101 (1.625 is closer to 1.578125 than 1.5 is)
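
The four rounding modes can be compared numerically by scaling the value so that the kept fraction bits become an integer. This Python sketch uses hypothetical names; note that Python's built-in round breaks exact ties toward the even neighbor, which does not matter for this example because 1.578125 is not a tie.

```python
import math


def round_modes(x: float, frac_bits: int) -> dict:
    """Round x to frac_bits binary fraction bits in each of the four modes."""
    scale = 1 << frac_bits
    scaled = x * scale
    return {
        "down": math.floor(scaled) / scale,
        "up": math.ceil(scaled) / scale,
        "toward_zero": math.trunc(scaled) / scale,
        "nearest": round(scaled) / scale,
    }


print(round_modes(1.578125, 3))
# {'down': 1.5, 'up': 1.625, 'toward_zero': 1.5, 'nearest': 1.625}
```

Here 1.5 is 1.100 and 1.625 is 1.101 in binary, matching the example above.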

Floating-Point Addition
1. Extract exponent and fraction bits
2. Prepend leading 1 to form mantissa
3. Compare exponents
4. Shift smaller mantissa if necessary
5. Add mantissas
6. Normalize mantissa and adjust exponent if necessary
7. Round result
8. Assemble exponent and fraction back into floating-point
format

Floating-Point Addition Example
Add the following floating-point numbers:
0x3FC00000
0x40500000

Floating-Point Addition Example
1. Extract exponent and fraction bits
Sign (1 bit) | Exponent (8 bits) | Fraction (23 bits)
0            | 01111111          | 100 0000 0000 0000 0000 0000
0            | 10000000          | 101 0000 0000 0000 0000 0000
For first number (N1): S = 0, E = 127, F = .1
For second number (N2): S = 0, E = 128, F = .101
2. Prepend leading 1 to form mantissa
N1: 1.1
N2: 1.101

Floating-Point Addition Example
3. Compare exponents
127 – 128 = -1, so shift N1 right by 1 bit
4. Shift smaller mantissa if necessary
shift N1’s mantissa: 1.1 >> 1 = 0.11 (× 2^1)
5. Add mantissas
   0.11  × 2^1
+  1.101 × 2^1
  10.011 × 2^1

Floating-Point Addition Example
6. Normalize mantissa and adjust exponent if necessary
10.011 × 2^1 = 1.0011 × 2^2
7. Round result
No need (fits in 23 bits)
8. Assemble exponent and fraction back into floating-point format
S = 0, E = 2 + 127 = 129 = 10000001₂, F = 001100...
Sign (1 bit) | Exponent (8 bits) | Fraction (23 bits)
0            | 10000001          | 001 1000 0000 0000 0000 0000
in hexadecimal: 0x40980000
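
The eight steps can be followed in code on the raw bit patterns. The Python sketch below is deliberately limited: it assumes both inputs are positive, normalized single-precision numbers, it truncates instead of rounding in step 7, and it ignores exponent overflow and the special cases. The function name fp_add is illustrative.

```python
def fp_add(x_bits: int, y_bits: int) -> int:
    """Add two positive, normalized single-precision values given as bit patterns."""
    # 1. Extract exponent and fraction bits
    ex, fx = (x_bits >> 23) & 0xFF, x_bits & 0x7FFFFF
    ey, fy = (y_bits >> 23) & 0xFF, y_bits & 0x7FFFFF
    # 2. Prepend leading 1 to form 24-bit mantissas
    mx, my = fx | (1 << 23), fy | (1 << 23)
    # 3-4. Compare exponents and shift the mantissa of the smaller number
    if ex < ey:
        mx >>= ey - ex
        e = ey
    else:
        my >>= ex - ey
        e = ex
    # 5. Add mantissas
    m = mx + my
    # 6. Normalize: a carry into bit 24 means the sum is 1x.xxx, so shift right
    if m >> 24:
        m >>= 1
        e += 1
    # 7. Round: this sketch simply truncates the bits shifted out
    # 8. Assemble exponent and fraction (sign is 0 by assumption)
    return (e << 23) | (m & 0x7FFFFF)


print(hex(fp_add(0x3FC00000, 0x40500000)))  # 0x40980000
```

The inputs are 1.5 and 3.25, and the result 0x40980000 encodes 4.75.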

Counters
• Increments on each clock edge
• Used to cycle through numbers. For example,
– 000, 001, 010, 011, 100, 101, 110, 111, 000, 001…
• Example uses:
– Digital clock displays
– Program counter: keeps track of current instruction executing
[Figure: counter symbol (inputs CLK and Reset, N-bit output Q) and implementation: an N-bit resettable register clocked by CLK whose output Q feeds an adder that adds 1 and drives the register input]

Counter Verilog (FSM style)
// N is not declared on the slide; it is made a parameter here (default 8 is arbitrary)
module counter #(parameter N = 8)
               (input  logic         clk, reset,
                output logic [N-1:0] q);
  logic [N-1:0] nextq;

  // register
  always_ff @(posedge clk, posedge reset)
    if (reset) q <= 0;
    else       q <= nextq;

  // next state
  assign nextq = q + 1;
endmodule

Counter Verilog (better idiom)
// N is not declared on the slide; it is made a parameter here (default 8 is arbitrary)
module counter #(parameter N = 8)
               (input  logic         clk, reset,
                output logic [N-1:0] q);
  // register and incrementer in one block
  always_ff @(posedge clk, posedge reset)
    if (reset) q <= 0;
    else       q <= q + 1;
endmodule

Divide-by-2^N Counter
• Most significant bit of an N-bit counter completes one full period every 2^N
clock cycles, so it divides the clock frequency by 2^N.
• Useful for slowing a clock. Ex: blink an LED
• Example: 50 MHz clock, 24-bit counter
• 50 MHz / 2^24 = 2.98 Hz

Digitally Controlled Oscillator
• N-bit counter
• Add p on each cycle, instead of 1
• Most significant bit toggles at fout = fclk × p / 2^N

• Example: fclk = 50 MHz clock
• How to generate a fout = 200 Hz signal?
• p / 2^N = 200 Hz / 50 MHz
• Try N = 24, p = 67 → fout = 199.676 Hz
• Or N = 32, p = 17179 → fout = 199.990 Hz
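
The frequency relationships for the divide-by-2^N counter and the digitally controlled oscillator can be evaluated directly. In this Python sketch the function names are hypothetical, and dco_increment truncates p to an integer, which is an assumption that happens to reproduce the p values used in the example.

```python
def divided_frequency(f_clk: float, n_bits: int) -> float:
    """Frequency of the msb of an n-bit counter that increments by 1."""
    return f_clk / 2 ** n_bits


def dco_frequency(f_clk: float, p: int, n_bits: int) -> float:
    """Frequency of the msb of an n-bit counter that adds p each cycle."""
    return f_clk * p / 2 ** n_bits


def dco_increment(f_clk: float, f_out: float, n_bits: int) -> int:
    """Integer increment p for a target output frequency (truncated)."""
    return int(f_out * 2 ** n_bits / f_clk)


for n_bits in (24, 32):
    p = dco_increment(50e6, 200, n_bits)
    print(n_bits, p, round(dco_frequency(50e6, p, n_bits), 3))
```

A wider counter allows a finer choice of p, which is why the 32-bit version lands closer to 200 Hz than the 24-bit one.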

Shift Registers
• Shift a new bit in on each clock edge
• Shift a bit out on each clock edge
• Serial-to-parallel converter: converts serial input (Sin) to
parallel output (Q0:N-1)

[Figure: shift register symbol (serial input Sin, serial output Sout, N-bit parallel output Q) and implementation: a chain of flip-flops clocked by CLK from Sin to Sout, with outputs Q0, Q1, Q2, …, QN-1]

Shift Register with Parallel Load
• When Load = 1, acts as a normal N-bit register
• When Load = 0, acts as a shift register
• Now can act as a serial-to-parallel converter (Sin to Q0:N-1) or
a parallel-to-serial converter (D0:N-1 to Sout)

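[Figure: shift register with parallel load. Each flip-flop input comes from a 2:1 multiplexer controlled by Load: input 0 takes the previous stage (Sin for the first stage) and input 1 takes D0, D1, D2, …, DN-1. Outputs are Q0, Q1, Q2, …, QN-1, and the last stage drives Sout.]
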
Shift Register Verilog Idiom
// N is not declared on the slide; it is made a parameter here (default 8 is arbitrary)
module shiftreg #(parameter N = 8)
                (input  logic         clk,
                 input  logic         load, sin,
                 input  logic [N-1:0] d,
                 output logic [N-1:0] q,
                 output logic         sout);
  // load in parallel, otherwise shift sin into the low end
  always_ff @(posedge clk)
    if (load) q <= d;
    else      q <= {q[N-2:0], sin};
  assign sout = q[N-1];
endmodule

Memory Arrays
• Efficiently store large amounts of data
• 3 common types:
– Dynamic random access memory (DRAM)
– Static random access memory (SRAM)
– Read only memory (ROM)
• M-bit data value read/written at each
unique N-bit address
[Symbol: memory array with an N-bit Address input and a Data port]

Memory Arrays
• 2-dimensional array of bit cells
• Each bit cell stores one bit
• N address bits and M data bits:
– 2^N rows and M columns
– Depth: number of rows (number of words)
– Width: number of columns (size of word)
– Array size: depth × width = 2^N × M
[Figure: array symbol with an N-bit Address and M-bit Data, next to an example array with a 2-bit address (depth 4) and 3-bit data (width 3)]

Memory Array Example
• 2^2 × 3-bit array
• Number of words: 4
• Word size: 3-bits
• For example, the 3-bit word stored at address 10 is 100

Address  Data
11       0 1 0
10       1 0 0
01       1 1 0
00       0 1 1

Memory Arrays

[Symbol: 1024-word × 32-bit array with a 10-bit Address input and a 32-bit Data port]

Memory Array Bit Cells
[Figure: bit cell connected to a wordline and a bitline]
(a) wordline = 1
    stored bit = 0 → bitline = 0
    stored bit = 1 → bitline = 1
(b) wordline = 0
    stored bit = 0 → bitline = Z
    stored bit = 1 → bitline = Z

Memory Array
• Wordline:
– like an enable
– single row in memory array read/written
– corresponds to unique address
– only one wordline HIGH at once
[Figure: 4-word × 3-bit memory array. A 2:4 decoder converts the 2-bit Address into wordline3 … wordline0; bitline2, bitline1, bitline0 carry Data2, Data1, Data0.]
Wordline   Address  Stored bits (bitline2 bitline1 bitline0)
wordline3  11       0 1 0
wordline2  10       1 0 0
wordline1  01       1 1 0
wordline0  00       0 1 1
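
The decoder and wordline behavior can be mimicked with a small Python model. This sketch is behavioral only; the class name MemoryArray and the function name decoder are illustrative, and each row is stored as one integer word.

```python
def decoder(address: int, n: int) -> list[int]:
    """One-hot wordlines: entry r is 1 only when r equals the address."""
    if not 0 <= address < 2 ** n:
        raise ValueError("address out of range")
    return [int(row == address) for row in range(2 ** n)]


class MemoryArray:
    """2^n rows of m-bit words, selected by a one-hot wordline."""

    def __init__(self, n: int, m: int):
        self.n, self.m = n, m
        self.rows = [0] * (2 ** n)

    def write(self, address: int, data: int) -> None:
        for row, wordline in enumerate(decoder(address, self.n)):
            if wordline:  # only the selected row is written
                self.rows[row] = data & ((1 << self.m) - 1)

    def read(self, address: int) -> int:
        for row, wordline in enumerate(decoder(address, self.n)):
            if wordline:  # only the selected row drives the bitlines
                return self.rows[row]
        raise AssertionError("decoder asserted no wordline")


mem = MemoryArray(n=2, m=3)
for address, word in {0b11: 0b010, 0b10: 0b100, 0b01: 0b110, 0b00: 0b011}.items():
    mem.write(address, word)
print(format(mem.read(0b10), "03b"))  # 100
```

The contents loaded here are the ones shown in the 4-word × 3-bit array above.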


Types of Memory
• Random access memory (RAM): volatile
• Read only memory (ROM): nonvolatile

RAM: Random Access Memory
• Volatile: loses its data when power off
• Read and written quickly
• Main memory in your computer is RAM
(DRAM)

Historically called random access memory because any data
word is accessed as easily as any other (in contrast to
sequential access memories such as a tape recorder)

ROM: Read Only Memory
• Nonvolatile: retains data when power off
• Read quickly, but writing is impossible or
slow
• Flash memories in thumb drives and digital
cameras are all ROMs
Historically called read only memory because ROMs were
written at manufacturing time or by burning fuses. Once
ROM was configured, it could not be written again. This is
no longer the case for Flash memory and other types of
ROMs.

Types of RAM
• DRAM (Dynamic random access memory)
• SRAM (Static random access memory)
• Differ in how they store data:
– DRAM uses a capacitor
– SRAM uses cross-coupled inverters

Robert Dennard, 1932 -
• Invented DRAM in 1966 at IBM
• Others were skeptical that the idea would work
• By the mid-1970s, DRAM was in virtually all computers

DRAM
• Data bits stored on capacitor
• Dynamic because the value needs to be refreshed
(rewritten) periodically and after read:
– Charge leakage from the capacitor degrades the value
– Reading destroys the stored value

(Figure: DRAM bit cell with a wordline, a bitline, and the stored bit held on a capacitor)

DRAM

(Figure: two DRAM bit cells, each with a wordline and a bitline; the capacitor is shown charged when the stored bit = 1 and uncharged when the stored bit = 0)

SRAM
(Figure: SRAM bit cell; the stored bit is held by cross-coupled inverters, and the wordline connects the cell to a pair of bitlines)

Memory Arrays Review

(Figure: a 2:4 decoder takes the 2-bit Address and asserts one of wordline3 to wordline0; each row holds three bit cells that connect to bitline2, bitline1, and bitline0, which produce Data2, Data1, and Data0)

Address  Wordline   bitline2  bitline1  bitline0
11       wordline3  0         1         0
10       wordline2  1         0         0
01       wordline1  1         1         0
00       wordline0  0         1         1

DRAM bit cell: one bitline, one wordline
SRAM bit cell: two bitlines, one wordline

ROM: Dot Notation

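(Figure: ROM drawn in dot notation. A 2:4 decoder driven by the 2-bit Address selects one of four wordlines (11, 10, 01, 00), which cross the bitlines Data2, Data1, and Data0. A dot at a wordline/bitline intersection marks a bit cell containing 1; an intersection without a dot is a bit cell containing 0.)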

Types of ROMs
Type    Name                                    Description
ROM     Read Only Memory                        Chip is hardwired with presence or absence of transistors. Changing requires building a new chip.
PROM    Programmable ROM                        Fuses in series with each transistor are blown to program bits. Can’t be changed after programming.
EPROM   Erasable Programmable ROM               Charge is stored on a floating gate to activate or deactivate transistor. Erasing requires exposure to UV light.
EEPROM  Electrically Erasable Programmable ROM  Like EPROM, but erasing can be done electrically.
Flash   Flash Memory                            Like EEPROM, but erasing is done on large blocks to amortize cost of erase circuit. Low cost per bit, dominates nonvolatile storage today.

Fujio Masuoka, 1944 -
• Developed memories and high speed circuits at Toshiba, 1971-1994
• Invented Flash memory as an unauthorized project pursued during nights and weekends in the late 1970s
• The process of erasing the memory reminded him of the flash of a camera
• Toshiba was slow to commercialize the idea; Intel was first to market in 1988
• Flash has grown into a $25 billion per year market

ROM Storage

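(Figure: ROM in dot notation beside its contents. A 2:4 decoder driven by the 2-bit Address selects one of four wordlines; the bitlines produce Data2, Data1, and Data0. Depth is 4 words, width is 3 bits.)

Address  Data
11       0 1 0
10       1 0 0
01       1 1 0
00       0 1 1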

ROM Logic

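(Figure: ROM in dot notation. A 2:4 decoder driven by the 2-bit Address $A_1A_0$ selects one of the wordlines 11, 10, 01, 00; the bitlines produce Data2, Data1, and Data0.)

$Data_2 = A_1 \oplus A_0$
$Data_1 = \overline{A_1} + A_0$
$Data_0 = \overline{A_1}\,\overline{A_0}$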
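
Reading each data column of the stored words (010, 100, 110, 011 at addresses 11, 10, 01, 00) as a truth table gives Data2 = A1 XOR A0, Data1 = (NOT A1) OR A0, and Data0 = (NOT A1) AND (NOT A0). The sketch below writes those three functions as gates; the module name `romlogic` and its ports are assumptions for illustration. For every address it produces the same word the ROM stores.

```systemverilog
// Gate-level equivalent of the ROM contents (names are hypothetical).
// a = {A1, A0}; data = {Data2, Data1, Data0}.
module romlogic(input  logic [1:0] a,
                output logic [2:0] data);

  assign data[2] =  a[1] ^  a[0];   // Data2 = A1 XOR A0
  assign data[1] = ~a[1] |  a[0];   // Data1 = (NOT A1) OR A0
  assign data[0] = ~a[1] & ~a[0];   // Data0 = (NOT A1) AND (NOT A0)
endmodule
```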

Example: Logic with ROMs
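Implement the following logic functions using a 2² × 3-bit ROM:
– $X = AB$
– $Y = A + B$
– $Z = A\overline{B}$

(Figure: ROM in dot notation. A 2:4 decoder with inputs A, B selects one of the wordlines 11, 10, 01, 00; the three bitlines produce X, Y, and Z.)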

Logic with Any Memory Array

(Figure: a 2:4 decoder takes the 2-bit Address $A_1A_0$ and asserts one of wordline3 to wordline0; each row holds three bit cells that connect to bitline2, bitline1, and bitline0, which produce Data2, Data1, and Data0)

Address  Wordline   bitline2  bitline1  bitline0
11       wordline3  0         1         0
10       wordline2  1         0         0
01       wordline1  1         1         0
00       wordline0  0         1         1

$Data_2 = A_1 \oplus A_0$
$Data_1 = \overline{A_1} + A_0$
$Data_0 = \overline{A_1}\,\overline{A_0}$

Logic with Memory Arrays
Implement the following logic functions using a 2² × 3-bit memory array:
– $X = AB$
– $Y = A + B$
– $Z = A\overline{B}$

(Figure: a 2:4 decoder with inputs A, B asserts one of wordline3 to wordline0; bitline2, bitline1, and bitline0 produce X, Y, and Z)

A B  Wordline   X (bitline2)  Y (bitline1)  Z (bitline0)
1 1  wordline3  1             1             0
1 0  wordline2  0             1             1
0 1  wordline1  0             1             0
0 0  wordline0  0             0             0
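
The stored bits in this solution can be checked with a short sketch that uses A and B as the address and reads X, Y, Z from the selected word. The stored words (110, 011, 010, 000 at addresses 11, 10, 01, 00) correspond to X = A AND B, Y = A OR B, and Z = A AND (NOT B). The module name `xyz_lut` and the parameter name `WORDS` are hypothetical.

```systemverilog
// Sketch: three logic functions read out of a 4-word x 3-bit array.
// Names are hypothetical; the stored words follow the solution figure.
module xyz_lut(input  logic a, b,
               output logic x, y, z);

  // The word stored at address {a, b} is {X, Y, Z}
  localparam logic [2:0] WORDS [0:3] = '{3'b000,   // A B = 00
                                         3'b010,   // A B = 01
                                         3'b011,   // A B = 10
                                         3'b110};  // A B = 11

  assign {x, y, z} = WORDS[{a, b}];
endmodule
```

No gates are designed here: each output column is simply the truth table of its function.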

Logic with Memory Arrays
Called lookup tables (LUTs): look up output at each input combination (address)

4-word × 1-bit Array

(Figure: inputs A and B drive address bits A1 and A0 of a 2:4 decoder; the selected wordline reads one stored bit onto the bitline, which produces Y)

A  B  Y  Address (A1 A0)  Stored bit
0  0  0  00               0
0  1  0  01               0
1  0  0  10               0
1  1  1  11               1
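
A lookup table of this size can be sketched as a 4-bit constant indexed by the inputs. The module name `lut2` and the `INIT` parameter are assumptions for illustration; the default value stores the truth table shown (Y = 1 only at address 11).

```systemverilog
// Sketch of a 4-word x 1-bit lookup table (names are hypothetical).
// INIT[{a, b}] is the bit stored at address {A1, A0} = {a, b}.
module lut2 #(parameter logic [3:0] INIT = 4'b1000)
             (input  logic a, b,
              output logic y);

  assign y = INIT[{a, b}];
endmodule
```

Changing only the stored bits changes the function: for example, `INIT = 4'b1110` would give a 2-input OR.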
Multi-ported Memories
• Port: address/data pair
• 3-ported memory
– 2 read ports (A1/RD1, A2/RD2)
– 1 write port (A3/WD3, WE3 enables writing)
• Register file: small multi-ported memory
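
(Figure: three-ported array with clock input CLK. Read ports: N-bit addresses A1 and A2 with M-bit read data RD1 and RD2. Write port: N-bit address A3, M-bit write data WD3, and write enable WE3.)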

SystemVerilog Memory Arrays
// 256 x 64 memory module with one read/write port
module dmem(input  logic        clk, we,
            input  logic [7:0]  a,
            input  logic [63:0] wd,
            output logic [63:0] rd);

  logic [63:0] RAM[255:0];

  always_ff @(posedge clk) begin
    rd <= RAM[a];     // synchronous read (old word if we is also asserted)
    if (we)
      RAM[a] <= wd;   // synchronous write
  end
endmodule

SystemVerilog Register File
// 16 x 32 register file with two read ports and one write port
module rf(input  logic        clk, we3,
          input  logic [3:0]  a1, a2, a3,
          input  logic [31:0] wd3,
          output logic [31:0] rd1, rd2);

  logic [31:0] RAM[15:0];

  always_ff @(posedge clk)   // synchronous write
    if (we3)
      RAM[a3] <= wd3;

  assign rd1 = RAM[a1];      // asynchronous read
  assign rd2 = RAM[a2];
endmodule

Logic Arrays
• PLAs (Programmable logic arrays)
– AND array followed by OR array
– Combinational logic only
– Fixed internal connections
• FPGAs (Field programmable gate arrays)
– Array of Logic Elements (LEs)
– Combinational and sequential logic
– Programmable internal connections

PLAs
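• $X = \overline{A}\,\overline{B}C + AB\overline{C}$
• $Y = A\overline{B}$

(Figure: block diagram in which M inputs feed an AND array that produces N implicants, which an OR array combines into P outputs)

(Figure: inputs A, B, C feed the AND array, which forms the implicants $\overline{A}\,\overline{B}C$, $AB\overline{C}$, and $A\overline{B}$; the OR array combines the first two into X and passes the third to Y)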
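
The two-level structure of this PLA can be sketched as an AND stage followed by an OR stage. The complement bars are not visible in the extracted slide text, so the implicants below are taken from the LUT truth tables in the LE Configuration Example later in the chapter (X = 1 for ABC = 001 and 110; Y = 1 for AB = 10). The module name `pla` and the signal name `implicant` are hypothetical.

```systemverilog
// Sketch of the PLA as an AND array followed by an OR array.
// Names are hypothetical.
module pla(input  logic a, b, c,
           output logic x, y);

  logic [2:0] implicant;

  // AND array: one product term per row
  assign implicant[0] = ~a & ~b &  c;
  assign implicant[1] =  a &  b & ~c;
  assign implicant[2] =  a & ~b;

  // OR array: each output sums the implicants it is connected to
  assign x = implicant[0] | implicant[1];
  assign y = implicant[2];
endmodule
```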

PLAs: Dot Notation
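(Figure: block diagram in which M inputs feed an AND array that produces N implicants, which an OR array combines into P outputs)

(Figure: the PLA in dot notation. Inputs A, B, C run through the AND array, where dots select the literals of the implicants $\overline{A}\,\overline{B}C$, $AB\overline{C}$, and $A\overline{B}$. In the OR array, dots connect the first two implicants to X and the third to Y.)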

FPGA: Field Programmable Gate Array
• Composed of:
– LEs (Logic elements): perform logic
– IOEs (Input/output elements): interface with outside world
– Programmable interconnection: connect LEs and IOEs
– Some FPGAs include other building blocks such as multipliers and RAMs

General FPGA Layout

LE: Logic Element
• Composed of:
– LUTs (lookup tables): perform combinational logic
– Flip-flops: perform sequential logic
– Multiplexers: connect LUTs and flip-flops

Altera Cyclone IV LE
• The Altera Cyclone IV LE has:
– 1 four-input LUT
– 1 registered output
– 1 combinational output

LE Configuration Example
Show how to configure a Cyclone IV LE to perform the following functions:
– $X = \overline{A}\,\overline{B}C + AB\overline{C}$
– $Y = A\overline{B}$

In the tables, X in an input column means don't care.

LE 1 (A to data 1, B to data 2, C to data 3, 0 to data 4; LUT output is X):

data 1 (A)  data 2 (B)  data 3 (C)  data 4  LUT output (X)
0           0           0           X       0
0           0           1           X       1
0           1           0           X       0
0           1           1           X       0
1           0           0           X       0
1           0           1           X       0
1           1           0           X       1
1           1           1           X       0

LE 2 (A to data 1, B to data 2, 0 to data 3, 0 to data 4; LUT output is Y):

data 1 (A)  data 2 (B)  data 3  data 4  LUT output (Y)
0           0           X       X       0
0           1           X       X       0
1           0           X       X       1
1           1           X       X       0
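
The two LUT configurations can be sketched with a generic 4-input LUT whose contents are a 16-bit constant. The `lut4` module, its `INIT` parameter, and the bit ordering `INIT[{data1, data2, data3, data4}]` are assumptions chosen for illustration; they are not the configuration format of a real Cyclone IV device.

```systemverilog
// Hypothetical 4-input LUT: the output is INIT[{data1, data2, data3, data4}].
// This bit ordering is an illustrative assumption, not a vendor format.
module lut4 #(parameter logic [15:0] INIT = 16'h0000)
             (input  logic data1, data2, data3, data4,
              output logic y);

  assign y = INIT[{data1, data2, data3, data4}];
endmodule

module le_example(input  logic a, b, c,
                  output logic x, y);

  // LE 1: X = 1 for ABC = 001 and 110; data4 is a don't care (bits 2, 3, 12, 13)
  lut4 #(.INIT(16'h300C)) le1(.data1(a), .data2(b), .data3(c),
                              .data4(1'b0), .y(x));

  // LE 2: Y = 1 for AB = 10; data3 and data4 are don't cares (bits 8 to 11)
  lut4 #(.INIT(16'h0F00)) le2(.data1(a), .data2(b), .data3(1'b0),
                              .data4(1'b0), .y(y));
endmodule
```

Setting the same output bit for both values of an unused input is what makes that input a don't care.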

LE Example: AND5
How many LEs are required to build a 5-input AND gate?

Solution: 2. The first LE performs AND4 (a function of 4 variables).
The second LE performs AND2 of the first result and the 5th input.
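
The two-LE solution can be sketched by cascading two 4-input LUTs. As a standalone illustration, this block defines its own hypothetical `lut4` module; the `INIT` bit ordering is an assumption, not a vendor format.

```systemverilog
// Hypothetical 4-input LUT: the output is INIT[{data1, data2, data3, data4}].
module lut4 #(parameter logic [15:0] INIT = 16'h0000)
             (input  logic data1, data2, data3, data4,
              output logic y);

  assign y = INIT[{data1, data2, data3, data4}];
endmodule

module and5(input  logic a, b, c, d, e,
            output logic y);

  logic and4;  // output of the first LE

  // LE 1: AND4, so only entry 1111 (bit 15) stores a 1
  lut4 #(.INIT(16'h8000)) le1(.data1(a), .data2(b), .data3(c),
                              .data4(d), .y(and4));

  // LE 2: AND2 of the first result and e; the unused inputs are tied to 0,
  // so only entry 1100 (bit 12) is ever selected with both inputs HIGH
  lut4 #(.INIT(16'h1000)) le2(.data1(and4), .data2(e), .data3(1'b0),
                              .data4(1'b0), .y(y));
endmodule
```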

LE Example: 3-bit counter
How many LEs are required to build a 3-bit counter?

Solution: 3. The counter has 3 flip-flops, so it requires at least 3
LEs. The add logic for each bit is a function of fewer than 4
variables, so it can fit in the LUT before the flop. Hence, 3 LEs are
sufficient.
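
The per-bit add logic can be written out to show why each bit fits in one LUT. In this sketch each next-state bit depends only on the current counter bits, so it is a function of at most three variables. The module name `counter3` and the synchronous `reset` input are assumptions added so the sketch can be simulated; the slide does not specify a reset.

```systemverilog
// Sketch of a 3-bit counter as three LUT-plus-flip-flop pairs.
// The module name and the reset input are hypothetical.
module counter3(input  logic       clk, reset,
                output logic [2:0] q);

  logic [2:0] d;  // next state, one LUT output per bit

  // Add-one logic: each bit is a function of at most three variables
  assign d[0] = ~q[0];
  assign d[1] =  q[1] ^ q[0];
  assign d[2] =  q[2] ^ (q[1] & q[0]);

  // One flip-flop per bit
  always_ff @(posedge clk)
    if (reset) q <= 3'b000;
    else       q <= d;
endmodule
```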

FPGA Design Flow
Using a CAD tool (such as Altera’s Quartus II)
• Enter the design with an HDL
• Simulate the design
• Synthesize design and map it onto FPGA
• Download the configuration onto the FPGA
• Test the design

This is an iterative process!
