
Clock Tree Synthesis of SMIC 40nm Low Leakage Cortex-A9 With Cadence CCopt

Brite
Arthur Liang, Titan Wang

Design Overview

ARM dual-core Cortex-A9
32K I-cache and 32K D-cache, includes NEON
Uses SMIC 40nm low-leakage process
Implemented with Cadence CCopt

CCopt Flow Methodology

Traditional EDI Balanced Clocks Flow
RTL -> Synthesis -> Netlist -> Placement -> Pre-CTS Opt -> CTS -> Post-CTS Optimization -> Routing & Post-route opt -> GDSII

New CCopt Flow
RTL -> Synthesis -> Netlist -> Placement -> Pre-CTS Opt -> CCOpt (Clock Concurrent Optimization) -> Routing & Post-route opt -> GDSII

Traditional EDI CTS Methodology

[Figure: Balanced CTS. A clock with period T drives flops P[i-1], P[i] and P[i+1]; the paths between them have maximum delays G[i-1]max and G[i]max.]

Unnecessary
No fundamental timing requirement that clocks need to be balanced

Expensive
Clock buffer explosion to minimize skew
Other expensive options (e.g., mesh, spine, ...)

Severe IR Drop
All flops/RAMs forced to trigger at the same time

Traditional Timing Optimization
Critical path
Many iterations
Excessive runtime
Area explosion
Higher leakage

CCopt Clock Concurrent Optimization Flow

[Figure: CCopt. A clock with period T drives flops P[i-1], P[i] and P[i+1]; the paths between them have maximum delays G[i-1]max and G[i]max.]

MM/MC/OCV
Useful skew takes into account all timing aspects, including MM, MC, OCV, setup and hold

Efficient
Significant reduction in clock buffers (no explicit requirement to balance the tree)

Lower IR Drop
Flops/RAMs triggered at different times
Critical and non-critical sinks are skewed

Concurrent Useful Skew and Datapath Optimization
Time borrowing
Faster timing closure
Higher performance
Lower area
Lower leakage
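
The time borrowing listed above can be illustrated with a minimal sketch. It assumes a three-flop chain P[i-1], P[i], P[i+1] whose two outer clocks stay at zero offset, it ignores hold, setup time and OCV margins, and the delay values are made up. The function names are hypothetical and are not part of any Cadence tool.

```python
def min_period_balanced(g_prev_max, g_next_max):
    """Balanced clocks: the slower of the two stages alone sets the period."""
    return max(g_prev_max, g_next_max)


def min_period_useful_skew(g_prev_max, g_next_max):
    """Shift the clock of the middle flop P[i] by `offset` (positive = later).

    Setup constraints with the outer flops held at zero offset:
        g_prev_max          <= period + offset   (P[i-1] -> P[i])
        offset + g_next_max <= period            (P[i]   -> P[i+1])
    Both become tight when the two stages share the total delay equally.
    """
    period = (g_prev_max + g_next_max) / 2.0
    offset = (g_prev_max - g_next_max) / 2.0
    return period, offset


if __name__ == "__main__":
    g_prev, g_next = 1.3, 0.9  # hypothetical stage delays in ns
    print("balanced min period:", min_period_balanced(g_prev, g_next))
    period, offset = min_period_useful_skew(g_prev, g_next)
    print("useful-skew min period:", period)
    print("clock offset applied at P[i]:", offset)
```

In this toy case the slow stage borrows time from the fast one, so the achievable period drops below the delay of the slowest stage. A real tool also has to respect hold constraints and every mode and corner, which this sketch leaves out.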

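CCopt Technology

Traditional Physical Optimization
Constraint: Gmax < T
Gmax (maximum datapath delay): variable
T (clock period): fixed
Skew: fixed

Clock Concurrent Optimization
Extend physical optimization into the clocks
Constraint: L + Gmax < T + C
L (launch clock delay): variable
Gmax (maximum datapath delay): variable
T (clock period): fixed
C (capture clock delay): variable

More degrees of freedom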
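
As a rough illustration of the constraint L + Gmax < T + C, the sketch below computes setup slack for a single path. It is a simplification under stated assumptions: setup time, clock uncertainty and OCV derates are ignored, the function name `setup_slack` is hypothetical, and the latency and delay numbers are invented for the example (only the 1.10 period matches the clock period reported in this deck).

```python
def setup_slack(period, g_max, launch_latency=0.0, capture_latency=0.0):
    """Slack of the check L + Gmax < T + C; a positive value means it is met.

    period          -- T, the clock period (fixed)
    g_max           -- Gmax, the maximum datapath delay
    launch_latency  -- L, clock delay to the launching flop
    capture_latency -- C, clock delay to the capturing flop
    """
    return (period + capture_latency) - (launch_latency + g_max)


if __name__ == "__main__":
    period = 1.10  # clock period used in this design
    g_max = 1.20   # hypothetical datapath delay, longer than the period

    # Balanced tree: L == C, so the check reduces to Gmax < T and fails.
    balanced = setup_slack(period, g_max, launch_latency=0.50, capture_latency=0.50)

    # Useful skew: the capture clock arrives 0.15 later, so the path passes.
    skewed = setup_slack(period, g_max, launch_latency=0.50, capture_latency=0.65)

    print("balanced slack:", round(balanced, 3))
    print("skewed slack:", round(skewed, 3))
```

With equal launch and capture latencies the two clock terms cancel and only the datapath can be optimized. Letting L and C differ is what gives the optimizer its extra degrees of freedom.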

A9 CPU Snapshot

Reference CCopt Script
setCCOptMode \
    -cts_buffer_cells {BUF_X16B_A12TR40 BUF_X13B_A12TR40 BUF_X11B_A12TR40 BUF_X6B_A12TR40} \
    -cts_inverter_cells {INV_X16B_A12TR40 INV_X13B_A12TR40 INV_X11B_A12TR40 INV_X6B_A12TR40} \
    -cts_clock_gating_cells {PREICG_X11B_A12TR40} \
    -cts_target_slew 0.08 \
    -cts_target_nonleaf_slew 0.08 \
    -cts_target_skew 0.15 \
    -io_opt off \
    -ccopt_auto_limit_insertion_delay_factor 1.2 \
    -ccopt_enable_downsizer true \
    ercfix \
    -cts_use_inverters true

CCopt Timing Summary

STA Timing Summary With CCopt

STA Timing Summary With Traditional CTS Flow

CCopt Clock Tree Summary

STA Timing Summary With CCopt
Clock Tree Name          : "CLK"
Clock Period             : 1.10000
Number of Levels         : 21
Number of Sinks          : 54562
Number of CT Buffers     : 1262
Total Area of CT Buffers : 4689.66
Max Global Skew          : 0.2268

STA Timing Summary With Traditional CTS Flow
Clock Tree Name          : "CLK"
Clock Period             : 1.10000
Number of Levels         : 20
Number of Sinks          : 54562
Number of CT Buffers     : 1178
Total Area of CT Buffers : 4716.22
Max Global Skew          : 0.1356
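
To make the two reports easier to compare, the short sketch below computes the relative change of each CCopt figure against the traditional CTS figure. The numbers are copied from the summaries above; the dictionary keys and the helper `percent_change` are illustrative names only, and units are omitted because the reports do not state them.

```python
# Values copied from the two clock tree summaries above.
ccopt = {
    "levels": 21,
    "sinks": 54562,
    "ct_buffers": 1262,
    "ct_buffer_area": 4689.66,
    "max_global_skew": 0.2268,
}
traditional = {
    "levels": 20,
    "sinks": 54562,
    "ct_buffers": 1178,
    "ct_buffer_area": 4716.22,
    "max_global_skew": 0.1356,
}


def percent_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for key in ccopt:
        delta = percent_change(ccopt[key], traditional[key])
        print(f"{key:16s} {traditional[key]:>10} -> {ccopt[key]:>10} ({delta:+.2f}%)")
```

A larger global skew in the CCopt run is expected rather than alarming, since useful skew deliberately leaves the tree unbalanced.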

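Conclusion
CCopt is able to determine the proper clock offsets automatically, instead of the designer manually skewing clocks in an iterative process
Increased the A9 CPU frequency
Can reduce clock tree buffer area (4689.66 with CCopt versus 4716.22 with traditional CTS in this design, although the buffer count rose from 1178 to 1262)
CCopt internally makes trade-offs between timing, power and schedule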

Thanks
