
FPGA Interconnect Planning

Amit Singh
Xilinx Inc.
San Jose, CA
amit.singh@xilinx.com

Malgorzata Marek-Sadowska
University of California, Santa Barbara
Santa Barbara, CA
mms@ece.ucsb.edu

Outline
Introduction
Previous Work
Architecture Model
Architecture & Design Rent's exponent

Clustering
Spatial regularity

Fanout distribution
Area minimization
Area-delay minimization
Performance

Conclusions

Introduction
Clustered FPGAs
System-on-Chip (SOC)

Performance, Power, Area


Matching design and architecture complexities
Circuit packing/clustering

Fanout distribution (Zarkesh-Ha et al. 2000)


Rent's rule

Segment length planning in FPGAs


Area reduction
Area-Delay product minimization

Previous Work
Rent's rule
Landman, Russo (1971), Donath (1979)

Interconnect distribution model


Davis et al. (1998), Zarkesh-Ha et al. (2000)
Local, semi-global, global wiring requirement
Homogeneous, heterogeneous systems

FPGA segment length distribution


Betz et al. (1999)
Type of routing switches
Impact of segment length distribution on area-delay product
Best area-delay product: mix of length-4 and length-8 segments

Clustered FPGAs
Clusters of LEs, connection boxes and switch-boxes
Regular 2-D mesh array
Example: Xilinx Virtex series, Altera APEX & Stratix

[Figure: a logic element, a cluster of LEs, and the routing model (clusters, C-Box, S-Box, segment length, W)]
Routing Model

Buffered routing switches


Buffer chain delay

Pass-transistor chain delay

How much interconnect?


~80% of FPGA area = interconnects
Routing resource utilization (RRU) is low
100% logic utilization

Depopulating logic clusters


Regularity

Interconnect complexity guided fanout distribution: Rent's rule


Average fanout
Segment lengths (shorter or longer segments?)
Switch type (tri-state buffers or pass transistors?)

Rent's Rule
Rent's rule: Landman and Russo, 1971.
Relates the average number of terminals T to the number of blocks B per module in a partitioned design:

$T = t B^{p}$

t = average number of terminals per block
p = Rent exponent, a measure of the complexity of the interconnection topology

(simple) $0 \le p \le 1$ (complex)
Typical values: $0.5 \le p \le 0.75$

[Figure: log-log plot with the average Rent's rule line]
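As a minimal sketch of how the Rent exponent can be read off a log-log plot such as the one on this slide, the snippet below fits t and p by least squares on log T versus log B. The function name `fit_rent` and the sample points are illustrative assumptions, not data from the talk.

```python
import math

def fit_rent(blocks, terminals):
    """Least-squares fit of log T = log t + p * log B; returns (t, p)."""
    xs = [math.log(b) for b in blocks]
    ys = [math.log(term) for term in terminals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                       # slope of the line = Rent exponent
    t = math.exp(mean_y - p * mean_x)   # intercept = terminals per block
    return t, p

# Synthetic points generated from T = 4 * B^0.6 (hypothetical values).
blocks = [1, 10, 100, 1000]
terminals = [4.0 * b ** 0.6 for b in blocks]
t, p = fit_rent(blocks, terminals)
print(f"t = {t:.3f}, p = {p:.3f}")
```

On measured partitioning data the points scatter around the line, so the fitted p is only an average over the range of block counts used.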

Rent's Rule
Definitions
Pd: Rent's parameter for the design
Pa: Rent's parameter for the architecture

[Figure: routing resource utilization versus Rent's parameter; Pa = 0.64]

Logic Clustering
Example

Separation: sum of all terminals of nets incident to the LE
Degree (d): number of nets incident to the LE

$c = \frac{\text{separation}}{d^{2}}$

Net weight: $w(x) = \frac{2}{r}$

G ( X ) = [2nw( x) (1 + )] k

$IO \le (k + 1)\, n^{P_a}$

Clustering: Seed selection

degree = 4; separation = 18, c = 1.25

Nets absorbed = 1

degree = 4; separation = 8, c = 0.5

Nets absorbed = 4
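The degree, separation and c definitions can be sketched in a few lines. The code below is a hedged illustration: the netlist, the helper names, and the rule "prefer the LE with the smallest c" are assumptions (the rule is suggested by the two examples above, where the lower c absorbs more nets). The second example on the slide is reproduced by the formula: 8 / 4^2 = 0.5.

```python
def connectivity_factor(separation, degree):
    """c = separation / degree^2, following the clustering slide."""
    return separation / degree ** 2

def le_stats(le, nets):
    """Return (degree, separation) of a logic element.

    nets maps a net name to the list of LE names on that net.
    """
    incident = [terms for terms in nets.values() if le in terms]
    degree = len(incident)                            # nets touching the LE
    separation = sum(len(terms) for terms in incident)  # terminals on those nets
    return degree, separation

def pick_seed(les, nets):
    """Assumed rule: choose the LE with the smallest c (ties broken by name)."""
    scored = []
    for le in les:
        degree, separation = le_stats(le, nets)
        if degree > 0:
            scored.append((connectivity_factor(separation, degree), le))
    return min(scored)[1]

# Hypothetical netlist: LE "a" sits on four two-terminal nets.
nets = {
    "n1": ["a", "b"],
    "n2": ["a", "c"],
    "n3": ["a", "d"],
    "n4": ["a", "e"],
    "n5": ["b", "c", "d", "e", "f"],
}
les = ["a", "b", "c", "d", "e", "f"]
print(le_stats("a", nets), pick_seed(les, nets))
```

An LE whose nets are all short (low separation for its degree) is a good seed because packing its neighbours into the same cluster absorbs those nets entirely.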

Rent's Rule: Depopulation


Case 1: Pd <= Pa
Achieve spatial uniformity.
Case 2: Pd > Pa
Need more routing resources.

Solution: Depopulate clusters
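One way to make the two cases concrete is sketched below. It assumes, purely for illustration, that a cluster of m LEs from a design with exponent Pd demands t * m^Pd pins while a cluster of n LE slots supplies t * n^Pa pins, so the common factor t cancels. This criterion and the name `max_les_per_cluster` are assumptions, not a rule stated in the talk.

```python
import math

def max_les_per_cluster(n, pd, pa):
    """Largest m <= n with m**pd <= n**pa (assumed depopulation criterion)."""
    if pd <= pa:
        return n  # Case 1: a fully populated cluster fits the architecture
    # Case 2: solve m**pd = n**pa for m and round down.
    m = int(math.floor(n ** (pa / pd) + 1e-9))
    return max(1, min(n, m))

# Hypothetical cluster of 8 LE slots on an architecture with Pa = 0.64.
print(max_les_per_cluster(8, 0.60, 0.64))  # Case 1
print(max_les_per_cluster(8, 0.75, 0.64))  # Case 2
```

Under these assumptions a design with Pd = 0.75 would fill only part of each 8-LE cluster, trading logic utilization for routability.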


Regularity
Better clustering for increased routability

Avg. fan-out 2.7
Routing succeeded with a channel width factor of 21

Ours: Avg. fan-out 3.7
Routing succeeded with a channel width factor of 12

Rent's Rule: Depopulation

Cluster Pin Utilization
[Figure: number of clusters versus number of cluster pins utilized; series: Ours, T-VPack]

Segment length Planning


Typical segment distribution

What is the best mix of segments for:


i) Area minimization ii) Area-delay minimization?

Fanout Distribution
Inter-cluster wire requirement
Equivalent Rent's parameters of the system, with $N = \sum_{i=1}^{n} N_i$:

$k_{eq} = \sqrt[N]{\prod_{i=1}^{n} k_i^{N_i}}$

$p_{eq} = \frac{\sum_{i=1}^{n} p_i N_i}{N}$

Netlist profile
Low-fanout nets: shorter segments
Global nets: longer segments (long lines)

Global segments: buffered


Shorter segments: not buffered (pass-transistor switches)
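The equivalent Rent's parameters on this slide can be evaluated as sketched below, assuming they take the usual heterogeneous form: a block-count-weighted geometric mean for k and a block-count-weighted average for p. The function name and the two-region example values are hypothetical.

```python
import math

def equivalent_rent(regions):
    """regions: list of (N_i, k_i, p_i) tuples. Returns (k_eq, p_eq)."""
    n_total = sum(n for n, _, _ in regions)
    # Geometric mean of k_i weighted by N_i, computed in the log domain
    # to avoid overflow from the product of large powers.
    log_k = sum(n * math.log(k) for n, k, _ in regions) / n_total
    # Arithmetic mean of p_i weighted by N_i.
    p_eq = sum(n * p for n, _, p in regions) / n_total
    return math.exp(log_k), p_eq

# Hypothetical system: 100 blocks with p = 0.6 and 300 blocks with p = 0.8.
k_eq, p_eq = equivalent_rent([(100, 4.0, 0.6), (300, 4.0, 0.8)])
print(f"k_eq = {k_eq:.3f}, p_eq = {p_eq:.3f}")
```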

Fanout distribution
Array of clusters

$Net(m) = \frac{kN\left[(m-1)^{p-1} - m^{p-1}\right]}{m}$

$Fo_{avg} = \frac{1 - (Fo_{Max}+1)^{p-1}}{1 - (Fo_{Max}+1)^{p-2} - \phi(p, Fo_{Max})} - 1$

$\phi(p, Fo_{Max}) = \sum_{n=1}^{Fo_{Max}} \frac{n^{p}}{n^{2}(n+1)}$

[Figure: number of nets versus pins per net; series: Actual, Predicted]
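As a sanity check on the fanout expressions, the sketch below evaluates Net(m) for m pins per net and compares the closed-form average fanout with a direct weighted sum over m = 2 .. FoMax + 1 (fanout taken as m - 1). The parameter values are hypothetical, and the name `phi` for the summation term is an assumption.

```python
def net_count(m, k, n_blocks, p):
    """Net(m): predicted number of nets with m pins (m >= 2)."""
    return k * n_blocks * ((m - 1) ** (p - 1) - m ** (p - 1)) / m

def phi(p, fo_max):
    """Summation term: sum over n = 1..FoMax of n^p / (n^2 (n + 1))."""
    return sum(n ** p / (n ** 2 * (n + 1)) for n in range(1, fo_max + 1))

def avg_fanout_closed(p, fo_max):
    """Closed-form average fanout."""
    num = 1 - (fo_max + 1) ** (p - 1)
    den = 1 - (fo_max + 1) ** (p - 2) - phi(p, fo_max)
    return num / den - 1

def avg_fanout_direct(k, n_blocks, p, fo_max):
    """Average of (m - 1) weighted by Net(m), for comparison."""
    pins = range(2, fo_max + 2)
    counts = [net_count(m, k, n_blocks, p) for m in pins]
    return sum((m - 1) * c for m, c in zip(pins, counts)) / sum(counts)

# Hypothetical parameters: k = 4, N = 1000 blocks, p = 0.64, FoMax = 10.
closed = avg_fanout_closed(0.64, 10)
direct = avg_fanout_direct(4.0, 1000, 0.64, 10)
print(closed, direct)
```

The two values agree because the sum of m * Net(m) telescopes to kN[1 - (FoMax + 1)^(p-1)], and kN cancels in the ratio, so the average fanout depends only on p and FoMax.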

Fanout distribution: Area-Minimization


Good Placement

Example

Fanout distribution: Area-Minimization

[Figure: area (transistors) per circuit; series: Ours, Length 1, Xilinx-like]

Fanout distribution: Area-Minimization

[Figure: critical path (ns) per circuit; series: Ours, Length 1, Xilinx-like]

Fanout distribution: Area-Delay Product


Performance!
Average fanout: Critical path model

Fanout distribution: Area-Delay Product


Timing-driven Placer and Router
[Figure: area-delay product per circuit (alu4, apex2, apex4, bigkey, des, diffeq, dsip, ex5p, misex3, s298, seq, tseng); series: Ours, Xilinx-like]

Fanout distribution: Performance
Using a Timing-driven Placer and Router
Normalized Critical Path
[Figure: normalized delay per circuit (alu4, apex2, apex4, bigkey, des, diffeq, dsip, ex5p, misex3, s298, seq, tseng, Avg.); series: Ours, Xilinx-like]

Conclusions & Future Work


FPGA Clustering for Regularity
Rent's rule

Fanout distribution based segment length planning


Area minimization
Reduced wire-length

Area-Delay minimization
Reduction of 29% over the state of the art

Fixed FPGA Architecture


20% better area-delay product

Future work: Metal layer assignment


Applying technique to a pipelined FPGA
