Low Leakage Cortex-A9 with Cadence CCopt
Brite
Arthur Liang, Titan Wang
Design overview
ARM dual-core Cortex-A9
32K I-cache and 32K D-cache, includes NEON
Uses SMIC 40 nm low-leakage process
Implementation with Cadence CCopt
CCopt Flow Methodology
Traditional flow: RTL -> Synthesis -> Netlist -> Placement -> Pre-CTS Opt -> CTS -> Post-CTS Optimization -> Routing & Post-route opt -> GDSII
New CCopt flow: RTL -> Synthesis -> Netlist -> Placement -> Pre-CTS Opt -> CCOpt (Clock Concurrent Optimization) -> Routing & Post-route opt -> GDSII
Traditional EDI CTS Methodology
Unnecessary: no fundamental timing requirement that clocks need to be balanced
Expensive: clock buffer explosion to minimize skew; other expensive options (e.g., mesh, spine, ...)
Severe IR drop: all flops/RAMs forced to trigger at the same time
[Figure: balanced CTS. Labels: T, clock, P[i-1], P[i], P[i+1], G[i-1]max, G[i]max.]
Traditional Timing Optimization
Critical path
Many iterations
Excessive runtime
Area explosion
Higher leakage
CCopt Clock Concurrent Optimization Flow
MM/MC/OCV: useful skew takes into account all timing aspects, including MM, MC, OCV, setup, and hold
Efficient: significant reduction in clock buffers (no explicit requirement to balance the tree)
Lower IR drop: flops/RAMs triggered at different times
Critical and non-critical sinks are skewed
[Figure: CCopt clock tree. Labels: T, clock, P[i-1], P[i], P[i+1], G[i-1]max, G[i]max.]
Concurrent useful skew and datapath optimization
Time borrowing
Faster timing closure
Higher performance
Lower area
Lower leakage
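Time borrowing through useful skew can be illustrated with a small numeric sketch. It assumes that P[...] in the figure is the clock arrival time at each flop and that G[...]max is the worst-case logic delay between neighboring flops; the slides do not define these symbols, so this reading is an assumption. The function name and all delay values are hypothetical, and only setup timing is modeled (hold, OCV, and multi-corner effects are ignored).

```python
def setup_slacks(period, clock_arrivals, logic_delays):
    """Return the setup slack of each pipeline stage.

    Stage k launches at flop k and captures at flop k+1, so
    slack[k] = period + P[k+1] - P[k] - G[k]max.
    """
    return [
        period + clock_arrivals[k + 1] - clock_arrivals[k] - logic_delays[k]
        for k in range(len(logic_delays))
    ]


# Illustrative numbers only (not taken from the A9 design).
T = 1.0                      # clock period
G = [1.2, 0.6]               # G[i-1]max, G[i]max

balanced = [0.0, 0.0, 0.0]   # P[i-1], P[i], P[i+1]: all flops trigger together
skewed = [0.0, 0.3, 0.0]     # delay the clock at flop i by 0.3

balanced_slacks = setup_slacks(T, balanced, G)
skewed_slacks = setup_slacks(T, skewed, G)

print("balanced:", [round(s, 3) for s in balanced_slacks])
print("skewed:  ", [round(s, 3) for s in skewed_slacks])
```

With balanced clocks the long first stage fails setup while the short second stage has slack to spare. Delaying the clock at the middle flop moves that spare time to the first stage, so both stages pass without touching the datapath.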
CCopt Technology
Extend physical optimization into the clocks
L + Gmax < T + C
[Figure: Traditional Physical Optimization vs. Clock Concurrent Optimization. Each term of the constraint is marked "fixed" or "variable"; clock concurrent optimization makes more terms variable, giving more degrees of freedom. Other labels: clock, skew, Gmax.]
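The constraint L + Gmax < T + C can be rearranged to show where the extra degrees of freedom come from. The sketch below assumes L is the clock delay to the launching flop, C is the clock delay to the capturing flop, Gmax is the worst-case logic delay between them, and T is the clock period; the slide does not spell these out, so this reading is an assumption.

```latex
\begin{align}
  L + G_{\max} &< T + C
    && \text{constraint from the slide} \\
  G_{\max} &< T + (C - L)
    && \text{budget available to the datapath} \\
  \text{slack} &= T + (C - L) - G_{\max} > 0
    && \text{equivalent slack form}
\end{align}
```

If the clock tree is balanced, C - L is close to zero and the only way to gain slack is to reduce Gmax. If the clock delays may also move, C - L becomes a second knob, so a path can be fixed either by speeding up logic or by shifting the clock arrival at its endpoints.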
A9 CPU Snapshot
Reference CCopt Script
setCCOptMode \
    -cts_buffer_cells {BUF_X16B_A12TR40 BUF_X13B_A12TR40 BUF_X11B_A12TR40 BUF_X6B_A12TR40} \
    -cts_inverter_cells {INV_X16B_A12TR40 INV_X13B_A12TR40 INV_X11B_A12TR40 INV_X6B_A12TR40} \
    -cts_clock_gating_cells {PREICG_X11B_A12TR40} \
    -cts_target_slew 0.08 \
    -cts_target_nonleaf_slew 0.08 \
    -cts_target_skew 0.15 \
    -io_opt off \
    -ccopt_auto_limit_insertion_delay_factor 1.2 \
    -ccopt_enable_downsizer true \
    -ercfix \
    -cts_use_inverters true
CCopt Timing Summary
STA Timing Summary With CCopt
CCopt Clock Tree Summary
STA Timing Summary With CCopt
Clock Tree Name          : "CLK"
Clock Period             : 1.10000
Number of Levels         : 21
Number of Sinks          : 54562
Number of CT Buffers     : 1262
Total Area of CT Buffers : 4689.66
Max Global Skew          : 0.2268
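A few ratios derived from the clock tree summary make the numbers easier to compare with other runs. This is a minimal sketch that only divides the reported values; the dictionary keys are hypothetical names, and the report does not state units, so the skew ratio is meaningful only if skew and period share the same time unit.

```python
# Values copied from the clock tree summary above; key names are illustrative.
summary = {
    "clock_period": 1.10000,
    "levels": 21,
    "sinks": 54562,
    "ct_buffers": 1262,
    "ct_buffer_area": 4689.66,
    "max_global_skew": 0.2268,
}

# Global skew as a fraction of the clock period (assumes both use the same unit).
skew_fraction = summary["max_global_skew"] / summary["clock_period"]

# Average number of sinks served per clock tree buffer.
sinks_per_buffer = summary["sinks"] / summary["ct_buffers"]

# Average area of one clock tree buffer, in the report's area unit.
area_per_buffer = summary["ct_buffer_area"] / summary["ct_buffers"]

print(f"skew / period    : {skew_fraction:.3f}")
print(f"sinks per buffer : {sinks_per_buffer:.1f}")
print(f"area per buffer  : {area_per_buffer:.3f}")
```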
Conclusion
CCopt is able to determine the proper clock offsets instead of manually skewing a clock in an iterative process
Increased the A9 CPU frequency
Can reduce the number of clock tree buffers
CCopt internally makes trade-offs between timing/power/schedule
Thanks