FPGA Interconnect Planning
Amit Singh
Malgorzata Marek-Sadowska
Xilinx Inc.
San Jose, CA
amit.singh@xilinx.com
Outline
Introduction
Previous Work
Architecture Model
Architecture & Design Rent's exponent
Clustering
Spatial regularity
Fanout distribution
Area minimization
Area-delay minimization
Performance
Conclusions
Introduction
Clustered FPGAs
System-on-Chip (SOC)
Previous Work
Rent's rule
Landman, Russo (1971), Donath (1979)
Clustered FPGAs
Clusters of LEs, connection boxes and switch-boxes
Regular 2-D mesh array
Example: Xilinx Virtex series, Altera APEX & Stratix
[Figure: a logic element (LE) and a cluster of LEs]
[Figure: routing model with clusters, C-Box, S-Box, segment length, and channel width W]
Routing Model
segments?)
Switch type (tri-state buffers or pass-transistor?)
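Rent's Rule
Rent's rule: Landman and Russo, 1971.
Relates the average number of terminals T to the number of blocks B per module in a partitioned design:
$T = t B^{p}$
t = average number of terminals per block
p = Rent exponent, (simple) $0 \le p \le 1$ (complex)
[Figure: log-log plot of average terminals per module versus blocks per module, following Rent's rule]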
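As a rough illustration of how the Rent exponent is usually read off such a plot, the sketch below evaluates $T = t B^{p}$ and recovers t and p by a least-squares line fit in log-log space. The function names and the sample points (generated from t = 4, p = 0.6) are hypothetical and not taken from the slides.

```python
import math

def rent_terminals(t, blocks, p):
    """Rent's rule: T = t * B**p."""
    return t * blocks ** p

def fit_rent(samples):
    """Fit log T = log t + p * log B by least squares over (B, T) pairs."""
    xs = [math.log(b) for b, _ in samples]
    ys = [math.log(T) for _, T in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                       # slope of the log-log line
    t = math.exp(mean_y - p * mean_x)   # intercept, mapped back from log space
    return t, p

# Hypothetical partition data generated from t = 4, p = 0.6
samples = [(b, rent_terminals(4.0, b, 0.6)) for b in (1, 10, 100)]
t_fit, p_fit = fit_rent(samples)
```

On real partitioning data the points scatter around the line, so the fit gives an estimate of p rather than an exact value.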
Rent's Rule
Definitions
Pd: Rent's parameter for the design
Pa: Rent's parameter for the architecture
Logic Clustering
Example
Separation: sum of all terminals of nets incident to an LE
Degree (d): number of nets incident to an LE
$c = \frac{\text{separation}}{d^{2}}$
Net weight: $w(x) = \frac{2}{r}$
G ( X ) = [2nw( x) (1 + )] k
$IO = (k + 1)\, n^{P_a}$
[Figure: two clustering examples for node A, one with nets absorbed = 1 and one with nets absorbed = 4]
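The per-LE quantities defined on this slide can be computed directly from a netlist. The sketch below is a minimal illustration on a made-up three-net netlist; it assumes that r in the net weight is the number of terminals on the net, which the slide does not state, and the names `le_metrics` and `net_weight` are hypothetical.

```python
def le_metrics(nets, le):
    """Return (separation, degree, c) for one logic element.

    nets maps a net name to the list of LEs it connects.
    """
    incident = [pins for pins in nets.values() if le in pins]
    degree = len(incident)                           # nets incident to the LE
    if degree == 0:
        raise ValueError(f"{le} is not connected to any net")
    separation = sum(len(pins) for pins in incident) # terminals on those nets
    c = separation / degree ** 2
    return separation, degree, c

def net_weight(pins):
    # Assumption: r is the number of terminals on the net.
    return 2 / len(pins)

# Hypothetical netlist with four LEs
nets = {"n1": ["A", "B"], "n2": ["A", "C", "D"], "n3": ["B", "C"]}
sep, deg, c = le_metrics(nets, "A")   # nets n1 and n2 touch A
```

Under this reading, two-terminal nets get the largest weight, which favors absorbing them into a cluster first.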
Regularity
Better clustering for increased routability
[Figure: clustering results for T-VPack and ours, plotted against # clusters; average fan-outs of 2.7 and 3.7; routing succeeded with channel width factors of 21 and 12]
Fanout Distribution
Inter-cluster wire requirement
Equivalent Rent's parameters of the system
$k_{eq} = \sqrt[N]{\prod_{i=1}^{n} k_i^{N_i}}$
$p_{eq} = \frac{\sum_{i=1}^{n} p_i N_i}{N}, \quad N = \sum_{i=1}^{n} N_i$
Netlist profile
Low-fanout nets → shorter segments
Global nets → longer segments (long lines)
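To make the equivalent Rent's parameters concrete, the sketch below combines n block types, each with its own k_i, p_i, and block count N_i. It assumes k_eq is the N_i-weighted geometric mean of the k_i and p_eq the N_i-weighted arithmetic mean of the p_i, with N the total block count; the function name and input values are hypothetical.

```python
import math

def equivalent_rent(blocks):
    """blocks: list of (k_i, p_i, N_i) tuples, one per block type.

    Returns (k_eq, p_eq) for the combined system.
    """
    total = sum(n_i for _, _, n_i in blocks)
    # N-th root of the product of k_i**N_i, computed in log space
    # to avoid overflow for large block counts.
    log_k = sum(n_i * math.log(k_i) for k_i, _, n_i in blocks) / total
    k_eq = math.exp(log_k)
    p_eq = sum(p_i * n_i for _, p_i, n_i in blocks) / total
    return k_eq, p_eq

# Hypothetical system: two block types with equal block counts
k_eq, p_eq = equivalent_rent([(2.0, 0.5, 1), (8.0, 0.7, 1)])
```

A system made of identical block types returns the same k and p it started with, which is a quick sanity check on any implementation.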
Fanout distribution
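[Figure: number of nets versus fanout for an array of clusters, with the predicted distribution overlaid]
$Net(m) = \frac{kN\left((m-1)^{p-1} - m^{p-1}\right)}{m}$
$Fo_{avg} = \frac{1 - (Fo_{Max}+1)^{p-1}}{1 - (Fo_{Max}+1)^{p-2} - \phi(p, Fo_{Max})} - 1$
$\phi(p, Fo_{Max}) = \sum_{n=1}^{Fo_{Max}} \frac{n^{p}}{n^{2}(n+1)}$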
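The average-fanout expression can be checked numerically against the net distribution it is derived from. The sketch below assumes m counts the terminals of a net, so that fanout is m - 1, and that the auxiliary sum is $\phi(p, Fo_{Max}) = \sum_{n=1}^{Fo_{Max}} n^{p} / (n^{2}(n+1))$; it then compares the closed form with a direct weighted average. Function names and parameter values are hypothetical.

```python
def net_count(m, k, N, p):
    """Predicted number of nets with m terminals (m >= 2)."""
    return k * N * ((m - 1) ** (p - 1) - m ** (p - 1)) / m

def phi(p, fo_max):
    """Auxiliary sum used by the closed-form average fanout."""
    return sum(n ** p / (n ** 2 * (n + 1)) for n in range(1, fo_max + 1))

def fo_avg_closed(p, fo_max):
    """Closed-form average fanout."""
    num = 1 - (fo_max + 1) ** (p - 1)
    den = 1 - (fo_max + 1) ** (p - 2) - phi(p, fo_max)
    return num / den - 1

def fo_avg_direct(k, N, p, fo_max):
    """Average fanout computed directly from the distribution.

    Assumption: a net with m terminals has fanout m - 1.
    """
    counts = {fo: net_count(fo + 1, k, N, p) for fo in range(1, fo_max + 1)}
    return sum(fo * c for fo, c in counts.items()) / sum(counts.values())

# Hypothetical parameters
closed = fo_avg_closed(0.6, 50)
direct = fo_avg_direct(4.0, 1000, 0.6, 50)
```

The factor kN cancels in the direct average, which is why the closed form depends only on p and the maximum fanout.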
Example
[Figure: area (transistors) per circuit, axis range 0 to 1.00E+07, for Ours, Length 1, and Xilinx-like]
[Figure: second per-circuit chart, axis range 0 to 3.00E-07, for Ours, Length 1, and Xilinx-like]
[Figure: area-delay product per circuit (alu4, apex2, apex4, bigkey, des, diffeq, dsip, ex5p, misex3, s298, seq, tseng), Ours vs. Xilinx-like]
Fanout distribution: Performance
Using a timing-driven placer and router
[Figure: normalized critical-path delay per circuit (alu4, apex2, apex4, bigkey, des, diffeq, dsip, ex5p, misex3, s298, seq, tseng, and average), Ours vs. Xilinx-like]
Area-Delay minimization
Reduction of 29% over the state of the art
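For reference, a percentage reduction in area-delay product is obtained by normalizing one architecture's area times delay against a baseline. The sketch below shows that arithmetic on toy numbers; the inputs are purely illustrative and are not measurements from this work, and the function name is hypothetical.

```python
def area_delay_reduction(ours, baseline):
    """ours, baseline: (area, delay) pairs in any consistent units.

    Returns (normalized area-delay product, percent reduction).
    """
    adp_ours = ours[0] * ours[1]
    adp_base = baseline[0] * baseline[1]
    normalized = adp_ours / adp_base
    return normalized, (1 - normalized) * 100

# Toy inputs, not data from the slides: 20% less area, 10% less delay
normalized, reduction = area_delay_reduction((80, 9), (100, 10))
```

A normalized product of 0.72 corresponds to a 28% reduction in this toy case; averaging across circuits is typically done on the normalized values.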