
ICCAD 2014 Contest

Incremental Timing-driven Placement: Timing Modeling and File Formats


v1.1, April 14th, 2014

http://cad_contest.ee.ncu.edu.tw/CAD-Contest-at-ICCAD2014/problem_b/

1 Introduction

This document outlines the timing modeling concepts and file formats for the timer used in
the ICCAD 2014 timing-driven placement contest problem. The goal of this problem is to have
participants drive placement based on timing information. Section 2 provides the background,
motivation, and necessary conceptual understanding for Static Timing Analysis (STA). Section
3 details the timer, along with its input and output file formats. For contest logistics,
deadlines and any other updates, please refer to the contest website stated above.

2 Static Timing Analysis (STA)


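Figure 1: Generic circuit (left) and delay model representation of a circuit element (right).
[Figure not reproduced. Left: circuit elements and interconnect between primary inputs and
primary outputs. Right: a circuit element with input slews $s_i(A)$ and $s_i(B)$, delays
$d(A, Y)$ and $d(B, Y)$, and output slew $s_o(Y)$.]
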
A static timing analysis of a design typically provides a profile of the design's performance by
measuring the timing propagation from inputs to outputs. This analysis provides a pessimistic
bound, and thus facilitates further analysis of only the problematic portions of the design. Section
2.1 outlines the terminology used in the remainder of the document, Sections 2.2 and 2.3
describe delay modeling, and Section 2.4 outlines timing propagation.

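Figure 2: Combinational OR gate (left), its timing model (center) and capacitances (right).
[Figure not reproduced. The gate has inputs A and B with input pin capacitances $C_{IA}$ and
$C_{IB}$, output Y with load $C_L$, delays $d(A, Y)$ and $d(B, Y)$, input slews $s_i(A)$ and
$s_i(B)$, and output slew $s_o(Y)$.]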

2.1 Definitions

Timing analysis of a circuit computes the amount of time required for signals to propagate
from primary inputs (PIs) to primary outputs (POs) through various circuit elements and
interconnect. Signals arriving at an input of an element will be available at its output(s) at some
later time; each pin-to-pin connection therefore introduces a delay during signal propagation.
For example, as shown in Figure 1 (right), the delay across the circuit element from input A
to output Y is designated by $d(A, Y)$. A timing path is a set of directed connections through
circuit elements and interconnect, and its delay is the sum of those components' delays.

A signal transition is characterized by its input slew and output slew, where slew is defined
as the amount of time for the signal to transition from high-to-low or low-to-high (see
footnote 1). For example, in Figure 1, the input slew at A is denoted by $s_i(A)$, and the
output slew at Y by $s_o(Y)$.

To account for timing modeling limitations in considering design and electrical complexities,
as well as multiple sources of variability, such as manufacturing variations, temperature
fluctuation, voltage drops, and electromigration, timing analysis is typically done using an
early-late split, where each circuit node has an early (lower) bound and a late (upper) bound
on its time (see footnote 2). By convention, if the mode (early or late) is not explicitly
specified, both modes should be considered. Both slew and delay are computed separately in
early and late modes. For example, in early mode, an output slew $s_o^E$ is computed using the
input slew taken from the early mode, $s_i^E$, and, similarly, in late mode, the output slew
$s_o^L$ is computed using $s_i^L$.

Footnote 1: A low (high) signal is defined as 10% (90%) of the voltage.
Footnote 2: This is commonly accomplished by derating an existing delay value (e.g., by 5%).

2.2 Circuit Element Modeling

In this section, we will divide our discussion into combinational and sequential circuit elements.

Figure 3: Generic D flip-flop and its timing model (left), and two FFs in series and their timing
models (right).
[Figure not reproduced. Left: a flip-flop with pins D, CK and Q, clock-to-output delay
$d_{CKQ}$, and setup and hold constraints. Right: flip-flops FF1 and FF2 driven by CLOCK and
separated by combinational logic with delay $d_{comb}$.]

Combinational elements. For a given combinational cell, e.g., an OR gate, we let the delay $d$
and output slew $s_o$ for an input/output pin pair (see Figure 2) be

$$d = a + b\,C_L + c\,s_i \tag{1}$$
$$s_o = x + y\,C_L + z\,s_i \tag{2}$$

Here, $a$, $b$, $c$, $x$, $y$ and $z$ are cell-dependent constants, $C_L$ is the output load at the
output pin, and $s_i$ is the input slew at the input pin. $C_L$ denotes the equivalent downstream
capacitance seen from the output pin of the cell. For simplicity, we assume $C_L$ to be the sum of
all capacitances in the parasitic RC tree, including cell pin capacitances ($C_I$) at the taps:

$$C_L = \sum_{k \in N} C_k \tag{3}$$

where $N$ is the set of all nodes in the RC network. Following the example in Figure 4, the
output load is (trivially)

$$C_L = C_1 + C_2 + C_3 + C_4 + C_5 \tag{4}$$
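
As a minimal sketch of Equations (1) to (4), the Python snippet below sums the capacitances of an RC tree and evaluates the linear delay and slew models. The function names, capacitance values and cell coefficients are hypothetical and chosen only for illustration; they do not come from any contest library.

```python
def output_load(node_caps):
    """Equation (3): total capacitance of the RC tree, in Farad."""
    return sum(node_caps.values())


def cell_delay_and_slew(coeffs, c_load, s_in):
    """Equations (1) and (2) for one input/output pin pair."""
    a, b, c, x, y, z = coeffs
    delay = a + b * c_load + c * s_in
    slew_out = x + y * c_load + z * s_in
    return delay, slew_out


# Hypothetical capacitances C1..C5 (Farad) and cell coefficients a, b, c, x, y, z.
caps = {"C1": 1e-15, "C2": 2e-15, "C3": 1e-15, "C4": 3e-15, "C5": 2e-15}
coeffs = (1e-11, 2e3, 0.1, 5e-12, 1e3, 0.2)

c_load = output_load(caps)  # Equation (4)
delay, slew_out = cell_delay_and_slew(coeffs, c_load, s_in=2e-11)
print(f"CL={c_load:.3e} F, d={delay:.3e} s, so={slew_out:.3e} s")
```

In a full timer the same evaluation would be repeated per transition (fall and rise) and per mode (early and late), each with its own input slew.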
Sequential elements. Sequential circuits consist of combinational blocks interleaved by registers, usually implemented with flip-flops (FFs). Typically, sequential circuits are composed of
several stages, where a register captures data from the outputs of a combinational block from
a previous stage, and injects it into the inputs of the combinational block in the next stage.
Register operation is synchronized by clock signals generated by one or multiple clock sources.
Clock signals that reach distinct flip-flops, e.g., sinks in the clock tree, are delayed from the
clock source by a clock latency l.
A (D) flip-flop is a storage element that captures a given logic value at its input data pin
D, when a given clock edge is detected at its clock pin CK, and subsequently presents the
captured value and its complement at the output pins $Q$ and $\overline{Q}$. The flip-flop also
enables asynchronous preset (set) and clear (reset) of the output pins through the respective S
and R input pins (see footnote 3).

Setup and hold constraints. Proper operation of a flip-flop requires the logic value of the
input data pin to be stable for a specific period of time before the capturing clock edge in the
current clock cycle. This period of time is designated by the setup time $t_{setup}$. Additionally,
the logic value of the input data pin must also be stable for a specific period of time after the
capturing clock edge for the subsequent clock cycle. This period of time is designated by the
hold time $t_{hold}$. The flip-flop timing models are depicted in Figure 3 (left).
Footnote 3: The complement, preset and clear signals are stated here for completeness. For the
purposes of the contest, their behavior will be ignored.

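Figure 4: Generic interconnect (left), its timing model (center) and RC network (right).
[Figure not reproduced. Left: a net with port Z and taps $T_1$ and $T_2$. Center: input slew
$s_i(Z)$, port-to-tap delays $d(Z, T_1)$ and $d(Z, T_2)$, and output slews $s_o(T_1)$ and
$s_o(T_2)$. Right: an RC tree with resistors $R_A$ to $R_E$ and capacitors $C_1$ to $C_5$.]
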
Setup and hold constraints are respectively modeled as functions of the input slews at both
the clock pin CK and the data input pin D as

$$t_{setup} = g + h\,s_{iCK}^{\text{early}} + j\,s_{iD}^{\text{late}} \tag{5}$$
$$t_{hold} = m + n\,s_{iCK}^{\text{late}} + p\,s_{iD}^{\text{early}} \tag{6}$$

Here, $g$, $h$, $j$, $m$, $n$ and $p$ are flop-specific parameters, $s_{iCK}$ is the input slew at
CK, and $s_{iD}$ is the input slew at D.
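
The following Python sketch evaluates Equations (5) and (6). The point it illustrates is the mode pairing: the setup time uses the early clock slew with the late data slew, while the hold time uses the late clock slew with the early data slew. The function names and all numeric values are hypothetical.

```python
def setup_time(g, h, j, slew_ck_early, slew_d_late):
    """Equation (5): early slew at CK, late slew at D."""
    return g + h * slew_ck_early + j * slew_d_late


def hold_time(m, n, p, slew_ck_late, slew_d_early):
    """Equation (6): late slew at CK, early slew at D."""
    return m + n * slew_ck_late + p * slew_d_early


# Hypothetical flop parameters and slews, in seconds.
t_setup = setup_time(g=2e-11, h=0.5, j=0.25, slew_ck_early=1e-11, slew_d_late=4e-11)
t_hold = hold_time(m=1e-11, n=0.1, p=0.2, slew_ck_late=2e-11, slew_d_early=1e-11)
print(f"t_setup={t_setup:.3e} s, t_hold={t_hold:.3e} s")
```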

2.3 Interconnect Modeling

The basic instance of interconnect (wire) is a net, which is assumed to have an input pin (Port)
and one or more output pins (Taps), as illustrated in Figure 4 (left). Parasitic RC trees only
contain grounded capacitors and floating resistors (coupling capacitors and grounded resistors
are not discussed here).

Delay. The computation of port-to-tap delays can be accurately performed through electrical
simulation. However, for the sake of simplicity (and speed), we will assume the simpler
Elmore delay model [3], where the delay is approximated by the negative of the
first moment of the impulse response. To compute the delay of RC tree networks, we summarize
the topological method provided in [6].

In an RC network, consider any two nodes $e$ and $k$. Let $C_k$ be the lumped capacitance at
node $k$, and let $R_{ke}$ be the total resistance of the portion shared by the path from Port
to $e$ and the path from Port to $k$. For example, in Figure 4 (right), the resistance between
nodes 1 and $T_2$ ($R_{1T_2}$) is $R_A$, as that is the only common resistor between the paths
Z to 1 and Z to $T_2$. The Elmore delay at node $e$ is

$$d_e = \sum_{k \in N} R_{ke}\,C_k \tag{7}$$

where $N$ is the set of all nodes in the RC network. For the example net illustrated in Figure 4
(right), the delay at node $T_2$ (tap) is

$$d_{T_2} = R_A (C_1 + C_3 + C_4) + (R_A + R_B)\,C_2 + (R_A + R_B + R_E)\,C_5 \tag{8}$$
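
A possible way to evaluate Equation (7) for every node is the two-pass traversal sketched below in Python. It relies on the equivalent form in which each resistor contributes its resistance times the capacitance downstream of it. The tree topology (Z to node 1 through $R_A$, node 1 to node 2 through $R_B$, node 2 to $T_2$ through $R_E$, node 1 to node 3 through $R_C$, node 3 to $T_1$ through $R_D$) is an assumption that is consistent with Equation (8); the node names `n1` to `n3`, the function name and the component values are hypothetical.

```python
def elmore_delays(parent, res, cap, root):
    """Return {node: Elmore delay} for an RC tree.

    parent: child -> parent node
    res:    child -> resistance (Ohm) of the edge to its parent
    cap:    node  -> grounded capacitance (Farad)
    """
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    # Breadth-first order, so every parent precedes its children.
    order = [root]
    i = 0
    while i < len(order):
        order.extend(children.get(order[i], []))
        i += 1

    # Pass 1 (leaves to root): capacitance at or below each node.
    downstream = {node: cap.get(node, 0.0) for node in order}
    for node in reversed(order[1:]):
        downstream[parent[node]] += downstream[node]

    # Pass 2 (root to leaves): accumulate R_edge * downstream capacitance.
    delay = {root: 0.0}
    for node in order[1:]:
        delay[node] = delay[parent[node]] + res[node] * downstream[node]
    return delay


# Assumed topology and hypothetical values (Ohm, Farad).
RA, RB, RC, RD, RE = 100.0, 200.0, 150.0, 50.0, 300.0
C1, C2, C3, C4, C5 = 1e-15, 2e-15, 1e-15, 3e-15, 2e-15
parent = {"n1": "Z", "n2": "n1", "T2": "n2", "n3": "n1", "T1": "n3"}
res = {"n1": RA, "n2": RB, "T2": RE, "n3": RC, "T1": RD}
cap = {"n1": C1, "n2": C2, "n3": C3, "T1": C4, "T2": C5}

delays = elmore_delays(parent, res, cap, "Z")
eq8 = RA * (C1 + C3 + C4) + (RA + RB) * C2 + (RA + RB + RE) * C5
print(delays["T2"], eq8)
```

Both passes visit each node once, so the cost is linear in the size of the tree, whereas a direct evaluation of the double sum is quadratic.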

Figure 5: Modified RC network for output slew calculation.
[Figure not reproduced: the RC network of Figure 4 (right) with each capacitance $C_k$ replaced
by $C_k d_k$.]


Output slew. The value of the output slew ($s_o$) on any given tap node $T$ can be approximated
by a two-step process. First, compute the output slew of the impulse response on $T$, which was
observed [3, 4] to be well-approximated by

$$\hat{s}_{oT} \approx \sqrt{2\beta_T - d_T^2} \tag{9}$$

where $\beta_T$ is the second moment of the impulse response at node $T$, and $d_T$ is the
corresponding Elmore delay from Equation 7. Second, compute the slew of the response to the
input ramp by the expression given in [5]

$$s_{oT} \approx \sqrt{s_i^2 + \hat{s}_{oT}^2} \tag{10}$$

where $s_i$ is the input slew.

The value of $\beta_T$ can be computed through the efficient path-tracing algorithm for moment
computation proposed in [7], which is a generalization of the algorithm proposed in [3]. To
calculate $\beta_T$, first replace all capacitance values $C_k$ in the RC network by $C_k d_k$,
where $d_k$ is the Elmore delay from Equation 7 (see Figure 5). Second, follow the same
procedure as before for finding $\beta_e$:

$$\beta_e = \sum_{k \in N} R_{ke}\,C_k\,d_k \tag{11}$$

Following the example in Figure 5, at node $T_2$, we have

$$\beta_{T_2} = R_A (C_1 d_1 + C_3 d_3 + C_4 d_4) + (R_A + R_B)\,C_2 d_2 + (R_A + R_B + R_E)\,C_5 d_5 \tag{12}$$
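
The two-step slew computation can be sketched in Python as below. One helper evaluates the weighted sum over common-path resistances and is called twice: once with the capacitances (Equation 7) and once with each capacitance multiplied by its Elmore delay (Equation 11). The RC tree topology is an assumption that is consistent with the worked example for tap $T_2$; the node names, function names and values are hypothetical, and the clamp to zero is only a guard against rounding.

```python
import math


def path_weighted_sum(parent, res, weight, root):
    """Return {e: sum over k of R_ke * weight[k]} for every node e of an RC tree."""
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)
    order = [root]
    i = 0
    while i < len(order):
        order.extend(children.get(order[i], []))
        i += 1
    below = {node: weight.get(node, 0.0) for node in order}
    for node in reversed(order[1:]):
        below[parent[node]] += below[node]
    total = {root: 0.0}
    for node in order[1:]:
        total[node] = total[parent[node]] + res[node] * below[node]
    return total


def tap_slew(parent, res, cap, root, tap, s_in):
    d = path_weighted_sum(parent, res, cap, root)                 # Equation (7)
    cap_times_delay = {k: cap[k] * d[k] for k in cap}
    beta = path_weighted_sum(parent, res, cap_times_delay, root)  # Equation (11)
    step = math.sqrt(max(2.0 * beta[tap] - d[tap] ** 2, 0.0))     # Equation (9)
    return math.sqrt(s_in ** 2 + step ** 2)                       # Equation (10)


# Assumed topology and hypothetical values (Ohm, Farad, seconds).
parent = {"n1": "Z", "n2": "n1", "T2": "n2", "n3": "n1", "T1": "n3"}
res = {"n1": 100.0, "n2": 200.0, "T2": 300.0, "n3": 150.0, "T1": 50.0}
cap = {"n1": 1e-15, "n2": 2e-15, "n3": 1e-15, "T1": 3e-15, "T2": 2e-15}

slew_t2 = tap_slew(parent, res, cap, "Z", "T2", s_in=1e-12)
print(f"output slew at T2: {slew_t2:.5e} s")
```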

2.4 Timing Propagation

Starting from the primary input(s), the instant that a signal reaches an input or output of a
circuit element is quantified as the arrival time (at). Similarly, starting from the primary
output(s), the limit imposed on each arrival time to ensure proper circuit operation is quantified
as the required arrival time (rat). Given an arrival time and a required arrival time, the slack
at a circuit node quantifies how well timing constraints are met. That is, a positive slack means
the required time is satisfied, and a negative slack means the required time is violated.

Actual arrival time. Starting from the primary inputs, arrival times (at) are computed by
adding delays across a path, and taking the minimum (in early mode) or maximum (in
late mode) of such accumulated times at a convergence point. This establishes bounds on the
time that a signal transition can reach any given circuit node. For example, let $at^E(A)$ and
$at^E(B)$ denote the early arrival times at pins A and B, respectively, in Figure 1 (right). The
most pessimistic early mode arrival time at the output pin Y is:

$$at^E(Y) = \min\big(at^E(A) + d^E(A, Y),\; at^E(B) + d^E(B, Y)\big) \tag{13}$$

Conversely, in late mode, the latest time that a signal transition can reach any given circuit
node is computed. Following the same example in Figure 1 (right), the most pessimistic late
mode arrival time at Y is:

$$at^L(Y) = \max\big(at^L(A) + d^L(A, Y),\; at^L(B) + d^L(B, Y)\big) \tag{14}$$
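
Equations (13) and (14) amount to one reduction per convergence point. The Python sketch below shows that reduction for a single output pin; the function name, the `(arrival time, delay)` pair representation and the numeric values are illustrative assumptions.

```python
def arrival_time(fanin, mode):
    """Merge arrival times at a convergence point.

    fanin: list of (arrival time at input pin, delay from that pin) pairs
    mode:  "early" takes the minimum (Eq. 13), "late" the maximum (Eq. 14)
    """
    candidates = [at + delay for at, delay in fanin]
    if mode == "early":
        return min(candidates)
    if mode == "late":
        return max(candidates)
    raise ValueError("mode must be 'early' or 'late'")


# Hypothetical values in seconds for pins A and B of Figure 1 (right).
at_early_y = arrival_time([(1.0e-10, 3.0e-11), (1.2e-10, 2.0e-11)], "early")
at_late_y = arrival_time([(1.5e-10, 4.0e-11), (1.6e-10, 2.5e-11)], "late")
print(at_early_y, at_late_y)
```

Applying this function to the pins of a circuit in topological order propagates arrival times from the primary inputs to the primary outputs.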

Required arrival time. Starting from the primary outputs, required arrival times (rat) are
computed by subtracting the delays across a path, and taking the maximum (minimum)
in early (late) mode of such accumulated times at a convergence point. That is, in early mode,
the earliest time that a signal transition must reach any circuit node is computed. For example,
in Figure 4, the most pessimistic early mode required arrival time at the output pin Y is:

$$rat^E(Y) = \max\big(rat^E(A) - d^E(Y, A),\; rat^E(B) - d^E(Y, B)\big) \tag{15}$$

Conversely, in late mode, the latest time by which a signal transition must reach a given circuit
node is computed. Following the example in Figure 4, the most pessimistic late mode required
arrival time at the output pin Y is:

$$rat^L(Y) = \min\big(rat^L(A) - d^L(Y, A),\; rat^L(B) - d^L(Y, B)\big) \tag{16}$$
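
The backward counterpart of arrival time propagation can be sketched in the same way. In this Python snippet a pin Y drives two downstream pins, as the delay direction $d(Y, A)$ in Equations (15) and (16) suggests; the function name and the values are hypothetical.

```python
def required_arrival_time(fanout, mode):
    """Merge required arrival times at a pin that drives several pins.

    fanout: list of (rat at driven pin, delay to that pin) pairs
    mode:   "early" takes the maximum (Eq. 15), "late" the minimum (Eq. 16)
    """
    candidates = [rat - delay for rat, delay in fanout]
    if mode == "early":
        return max(candidates)
    if mode == "late":
        return min(candidates)
    raise ValueError("mode must be 'early' or 'late'")


# Hypothetical values in seconds for driven pins A and B.
rat_early_y = required_arrival_time([(2.0e-10, 3.0e-11), (1.8e-10, 2.0e-11)], "early")
rat_late_y = required_arrival_time([(9.0e-10, 4.0e-11), (8.5e-10, 2.5e-11)], "late")
print(rat_early_y, rat_late_y)
```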

Slack. For proper circuit operation, the following conditions must be met:

$$at^E \ge rat^E \tag{17}$$
$$at^L \le rat^L \tag{18}$$

To quantify how well timing constraints are met at each circuit node, slacks (slack) can be
computed based on the aforementioned conditions. That is, slacks are positive when the required
times are met, and negative otherwise.

$$slack^E = at^E - rat^E \tag{19}$$
$$slack^L = rat^L - at^L \tag{20}$$
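
As a small illustration of Equations (19) and (20), the Python sketch below computes both slacks for one node and lists the modes that are violated. The function name and the numbers are hypothetical.

```python
def slacks(at_early, at_late, rat_early, rat_late):
    """Equations (19) and (20); a negative value marks a violation."""
    return at_early - rat_early, rat_late - at_late


# Hypothetical arrival and required arrival times in seconds.
slack_early, slack_late = slacks(
    at_early=1.3e-10, at_late=1.9e-10, rat_early=1.7e-10, rat_late=8.25e-10
)
violated = [mode for mode, s in (("early", slack_early), ("late", slack_late)) if s < 0]
print(slack_early, slack_late, violated)
```

Here the early slack is negative because the signal arrives before its early required time, which is the condition that Equation (17) rules out.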

Slew propagation. As circuit element delays and interconnect delays are functions of input
slew ($s_i$), subsequent output slew ($s_o$) must be propagated. One approach to slew propagation
at a convergence point is worst-slew propagation, wherein the smallest (largest) slew in early
(late) mode is propagated. Following the example in Figure 1 (right), the early and late output
slew at output pin Y, respectively, are:

$$s_o^E(Y) = \min\big(s_o^E(A, Y),\; s_o^E(B, Y)\big) \tag{21}$$
$$s_o^L(Y) = \max\big(s_o^L(A, Y),\; s_o^L(B, Y)\big) \tag{22}$$

where $s_o^E(A, Y)$ is a function of $s_i^E(A)$, $s_o^E(B, Y)$ is a function of $s_i^E(B)$,
$s_o^L(A, Y)$ is a function of $s_i^L(A)$, and $s_o^L(B, Y)$ is a function of $s_i^L(B)$.

Sequential signal propagation. Signal transition between two flip-flops is illustrated in
Figure 3 (right). Assuming that the clock edge is generated at the source at time 0, it reaches
the injecting (launching) flip-flop $FF_1$ at time $d(CLOCK, FF_1{:}CK)$, making the data available
at the input of the combinational block $d(FF_1{:}CK, FF_1{:}Q)$ time later. If the propagation delay
in the combinational block is $d(comb)$, then the data is available at the input of the capturing
flip-flop $FF_2$ at time:

$$d(CLOCK, FF_1{:}CK) + d(FF_1{:}CK, FF_1{:}Q) + d(comb).$$

Assuming the clock period to be a constant $P$, the next clock edge reaches $FF_2$ at time
$P + d(CLOCK, FF_2{:}CK)$. For correct operation, the data must be available at the input pin
($FF_2{:}D$) at least the setup time $t_S$ before the next clock edge. Therefore, the late arrival
time and the late required arrival time at the data pin are:

$$at^L(FF_2{:}D) = d^L(CLOCK, FF_1{:}CK) + d^L(FF_1{:}CK, FF_1{:}Q) + d^L(comb) \tag{23}$$

$$\begin{aligned}
rat^{Setup}(FF_2{:}D) &= rat^L(FF_2{:}D) \\
&= P + at^E(FF_2{:}CK) - t_S \\
&= P + d^E(CLOCK, FF_2{:}CK) - t_S
\end{aligned} \tag{24}$$

A similar condition is derived for ensuring that the hold time is respected. The data
input pin D of $FF_2$ must remain stable for at least the hold time $t_H$ after the clock edge
reaches the corresponding CK pin. Therefore, the early arrival time and early required arrival
time at the data pin are:

$$at^E(FF_2{:}D) = d^E(CLOCK, FF_1{:}CK) + d^E(FF_1{:}CK, FF_1{:}Q) + d^E(comb) \tag{25}$$

$$\begin{aligned}
rat^{Hold}(FF_2{:}D) &= rat^E(FF_2{:}D) \\
&= at^L(FF_2{:}CK) + t_H \\
&= d^L(CLOCK, FF_2{:}CK) + t_H
\end{aligned} \tag{26}$$

The aforementioned arrival times from Equations (23) and (25) and required arrival times
from Equations (24) and (26) induce hold and setup slacks, which are derived from Equations
(19) and (20), respectively.
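
To make the flip-flop to flip-flop check concrete, the Python sketch below combines Equations (23) to (26) with the slack definitions of Equations (19) and (20). Each delay is passed as an `(early, late)` pair; the function name, the argument layout and the numeric values are assumptions made for this illustration.

```python
def ff_to_ff_slacks(period, clk_to_ff1, ck_to_q, comb, clk_to_ff2, t_setup, t_hold):
    """Return (setup slack, hold slack) at FF2:D; delays are (early, late) pairs."""
    early, late = 0, 1
    at_late = clk_to_ff1[late] + ck_to_q[late] + comb[late]      # Equation (23)
    rat_setup = period + clk_to_ff2[early] - t_setup              # Equation (24)
    at_early = clk_to_ff1[early] + ck_to_q[early] + comb[early]  # Equation (25)
    rat_hold = clk_to_ff2[late] + t_hold                          # Equation (26)
    setup_slack = rat_setup - at_late                             # Equation (20)
    hold_slack = at_early - rat_hold                              # Equation (19)
    return setup_slack, hold_slack


# Hypothetical values in seconds.
setup_slack, hold_slack = ff_to_ff_slacks(
    period=1e-9,
    clk_to_ff1=(9e-11, 1e-10),
    ck_to_q=(4e-11, 5e-11),
    comb=(2e-10, 6e-10),
    clk_to_ff2=(1.0e-10, 1.1e-10),
    t_setup=3e-11,
    t_hold=2e-11,
)
print(f"setup slack={setup_slack:.3e} s, hold slack={hold_slack:.3e} s")
```

The setup check pairs the late data arrival with the early capturing clock, and the hold check pairs the early data arrival with the late capturing clock, which is the pessimistic combination in each case.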

3 Timer

The timer used in the ICCAD 2014 Timing-driven Placement Contest is the winner of the
TAU 2013 Variation Aware Timing Contest. IITimer takes two input files and produces one
output file, all of which follow the same format as the TAU 2013 Contest [2]. Its usage is defined
as follows:

timer <cell library> <netlist file> <output file>

3.1 Input Files

There are two input files: (i) cell library (.lib), and (ii) a netlist (.net). The formats are adapted
from the TAU 2013 Contest document. As we will not be exercising the statistical capabilities
for this contest, all sensitivities will be 0.
Cell Library. The cell library file contains the timing information of each cell, and follows the
format below.
metal <sigma corner> <resistance scale factor> <capacitance scale factor>
...
cell <cell name>
pin <pin name> input <fall capacitance> <rise capacitance>
pin <pin name> output
pin <pin name> clock <fall capacitance> <rise capacitance>
...
timing <input pin name> <output pin name> <timing sense> \
<fall slew> <rise slew> <fall delay> <rise delay>
setup <clock pin name> <input pin name> <edge type> \
<fall constraint> <rise constraint>
hold <clock pin name> <input pin name> <edge type> \
<fall constraint> <rise constraint>
preset <input pin name> <output pin name> <edge type> <slew> <delay>
clear <input pin name> <output pin name> <edge type> <slew> <delay>
For all lines that start with metal, assume that every corner will be set with nominal values,
e.g., metal 0 1.0 1.0 or metal 3 1.0 1.0. The equations to calculate delay, output slew,
setup and hold constraint times are reproduced for convenience.
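
$$\begin{aligned}
d &= a + b\,C_L + c\,s_i \\
s_o &= x + y\,C_L + z\,s_i \\
t_{setup} &= g + h\,s_{iCK}^{\text{early}} + j\,s_{iD}^{\text{late}} \\
t_{hold} &= m + n\,s_{iCK}^{\text{late}} + p\,s_{iD}^{\text{early}}
\end{aligned} \tag{27}$$
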

Keywords:
metal: metal parameter scalars at given sigma corner.
cell: start of cell definition.
pin: start of pin definition.
input, output and clock: pin type.
timing: delay.
setup: setup time.
hold: hold time.
preset: preset time (output node will be set to high).
clear: clear time (output node will be set to low).
Variable fields:
<sigma corner> is the sigma corner value ($\sigma$) of metal parameter M for which resistance and capacitance scale factors are provided.
<resistance scale factor> is the value ($m_R$) by which the nominal interconnect resistance provided in the netlist should be scaled at the given metal sigma corner.
<capacitance scale factor> is the value ($m_C$) by which the nominal interconnect capacitance provided in the netlist should be scaled at the given metal sigma corner.
<cell name> is the name of the cell, of up to 32 characters in length, which can contain
only alphanumeric characters (the first character must be a letter).
<pin name> is the name of a pin of the cell, of up to 32 characters in length, which can
contain only alphanumeric characters (the first character must be a letter).
<fall capacitance> and <rise capacitance> are the values of the pin's input capacitances
in Farad, for fall/rise transitions, represented in scientific notation.
<input pin name>, <output pin name> and <clock pin name> are the names of the
input, output and clock pins of a given delay or constraint specification, of up to 32
characters in length, which can contain only alphanumeric characters (the first character
must be a letter).
<timing sense>, can be one of:
positive_unate, transition direction is maintained from input to output (rise→rise,
fall→fall).
negative_unate, transition direction is reversed from input to output (rise→fall,
fall→rise).
non_unate, transition direction cannot be inferred from a single input (take the
worst, among rise/fall).
<slew>, <fall slew> and <rise slew> are each given by 9 real numbers separated by white
spaces, where the first three correspond to the parameters x, y, z of Equation 2, and the
last six are assumed to be 0 (fall/rise refers to the transition direction at the output pin).
<delay>, <fall delay> and <rise delay> are each given by 9 real numbers separated
by white spaces, where the first three correspond to the parameters a, b, c of Equation
1, and the last six are assumed to be 0 (fall/rise refers to the transition direction at the
output pin).
<edge type>, can be one of:
falling, constraint applies to the falling clock edge;
rising, constraint applies to the rising clock edge;
<fall constraint> and <rise constraint>, are each given by 3 real numbers separated
by white spaces, which correspond to the parameters g, h and j of Equation 5, or m, n and
p of Equation 6, if we are dealing with setup constraints or hold constraints, respectively.
The preset and clear values are presented for completeness, and should be ignored for the
timing analysis task proposed in this contest.
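
As a rough sketch of how a `timing` line could be read, the Python function below splits the line on white space and keeps the first three values of each 9-value group. It assumes that the line has already been joined into a single line (no trailing backslash continuation) and that the timing sense is a single token without spaces; the function name, the dictionary keys and the sample line are hypothetical.

```python
def parse_timing_line(line):
    """Parse 'timing <in> <out> <sense>' followed by four groups of 9 numbers."""
    tokens = line.split()
    if not tokens or tokens[0] != "timing" or len(tokens) != 4 + 4 * 9:
        raise ValueError("expected a timing line with four 9-value groups")
    values = [float(t) for t in tokens[4:]]
    groups = [values[i:i + 9] for i in range(0, 36, 9)]
    names = ("fall_slew", "rise_slew", "fall_delay", "rise_delay")
    record = {"input": tokens[1], "output": tokens[2], "sense": tokens[3]}
    for name, group in zip(names, groups):
        # First three values: x, y, z (slew) or a, b, c (delay); the rest are 0.
        record[name] = tuple(group[:3])
    return record


# Hypothetical sample line; the six trailing sensitivities of each group are 0.
zeros = "0 0 0 0 0 0"
sample = (
    "timing A Y positive_unate "
    f"5e-12 1e3 0.2 {zeros} 6e-12 1.1e3 0.25 {zeros} "
    f"1e-11 2e3 0.1 {zeros} 1.2e-11 2.2e3 0.12 {zeros}"
)
arc = parse_timing_line(sample)
print(arc["fall_delay"], arc["rise_slew"])
```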
Netlist file. The netlist file contains the description of the circuit topology, and follows the
format below.
input <node>
output <node>
instance <cell name> <pin name>:<node> ... <pin name>:<node>
wire <port node> <tap node> ... <tap node>
res <node> <node> <resistance>
...
cap <node> <capacitance>
...
slew <node> <slew fall> <slew rise>
clock <node> <period>
at <node> <at fall early> <at fall late> <at rise early> <at rise late>
rat <node> <mode of operation> <rat fall> <rat rise>


Keywords:
input: primary input node.
output: primary output node.
instance: cell instance.
wire: interconnect net.
res, cap: resistor and capacitor of a parasitic RC tree (can appear in any order).
slew: input slew at the primary inputs.
clock: clock input information (node and period).
at: arrival time constraint (only used for primary input nodes).
rat: required arrival time constraint.
Variable fields:
<node>, <port node> and <tap node> are node names, of up to 64 characters in length,
which can contain alphanumeric characters, the underscore or the dash (the first character
must be a letter).
<cell name> is the name of the library cell (exactly as it will appear in the cell library
file), of up to 32 characters in length, which can contain only alphanumeric characters
(the first character must be a letter).
<pin name> is the name of a pin of the cell (exactly as it will appear in the cell library
file), of up to 32 characters in length, which can contain only alphanumeric characters.
<resistance> is the value of the resistance in Ohm, represented in scientific notation.
<capacitance> is the value of the capacitance in Farad, represented in scientific notation.
<slew fall> and <slew rise> are the fall and rise slews for the corresponding primary
input, in seconds, represented in scientific notation. Early and late slews at the inputs
are assumed to be identical.
<period> is the clock period in seconds, represented in scientific notation.
<at fall early>, <at fall late>, <at rise early> and <at rise late> are real
numbers, represented in scientific notation, which represent arrival time constraints for
fall/rise transitions in early/late mode, at the primary inputs, in seconds.
<mode of operation> is the mode of operation and can be either early or late.


<rat fall> and <rat rise> are real numbers, represented in scientific notation, which
represent required arrival time constraints for rise/fall transitions and early/late mode,
in seconds.
If no input slew is defined for any given primary input, it should be assumed to be 1e-12,
for both fall and rise transitions.
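
The boundary-condition lines of the netlist can be collected with a few lines of Python, as sketched below. The sketch handles only the input, output, slew, clock, at and rat keywords, skips everything else (instance, wire, res, cap), and applies the default input slew of 1e-12. The function name, the dictionary layout and the sample lines are hypothetical.

```python
DEFAULT_SLEW = 1e-12  # default fall and rise slew for primary inputs, in seconds


def parse_constraints(lines):
    """Collect primary inputs/outputs and timing constraints from netlist lines."""
    net = {"inputs": [], "outputs": [], "slew": {}, "at": {}, "rat": {}, "clock": None}
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue
        key, args = tokens[0], tokens[1:]
        if key == "input":
            net["inputs"].append(args[0])
        elif key == "output":
            net["outputs"].append(args[0])
        elif key == "slew":    # <node> <slew fall> <slew rise>
            net["slew"][args[0]] = (float(args[1]), float(args[2]))
        elif key == "clock":   # <node> <period>
            net["clock"] = (args[0], float(args[1]))
        elif key == "at":      # fall early, fall late, rise early, rise late
            net["at"][args[0]] = tuple(float(v) for v in args[1:5])
        elif key == "rat":     # <node> <mode> <rat fall> <rat rise>
            net["rat"][(args[0], args[1])] = (float(args[2]), float(args[3]))
    for node in net["inputs"]:
        net["slew"].setdefault(node, (DEFAULT_SLEW, DEFAULT_SLEW))
    return net


# Hypothetical netlist fragment.
sample = [
    "input a",
    "input b",
    "output y",
    "slew a 2e-11 3e-11",
    "at a 0 0 1e-11 1e-11",
    "rat y late 5e-10 5.5e-10",
    "clock ck 1e-9",
]
net = parse_constraints(sample)
print(net["slew"], net["rat"])
```

Note that the at line of the netlist lists fall before rise within each pair of early/late values, so the tuple order here follows the input format.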

3.2 Output File

The result of the timing analysis of a circuit will be produced in the specified output file format.
It consists of two consecutive sets of lines, which will list timing quantities like arrival times,
slews and slacks. The first set of lines will consist of a list of lines, starting with the at keyword,
one per primary output node, containing the node name followed by the corresponding early
and late arrival times, as well as the early and late slews. This list of nodes should be ordered
lexicographically in ascending order (using the ASCII code ordering sequence). The second set
of lines will consist of a list of lines, starting with the slack keyword, one per required arrival
time constraint, followed by the node name and either the early or late keyword, indicating
either early or late mode, respectively. Finally, the value of the slack should be printed. This
list of nodes should also be ordered lexicographically in ascending order (using the ASCII code
ordering sequence). The early slack will appear before the late slack, for nodes where both exist.
Slacks are induced either by explicit required arrival time constraints defined in the netlist file,
or by implicit setup and hold constraints, that must be considered for every flip-flop in the
circuit. The slacks (early, late, or both) should therefore be reported for all the nodes in the
fanin cone of any given node for which an explicit or implicit required arrival time constraint
must be considered.
All numerical results will be given in seconds and printed in scientific notation, with 5
decimal places (e.g. 1.23456e-10). All keywords and variable fields should be separated by a
white space.
at <node> <at early fall> <at early rise> <at late fall> <at late rise>
<slew early fall> <slew early rise> <slew late fall> <slew late rise>
...
slack <node> early <slack early fall> <slack early rise>
slack <node> late <slack late fall> <slack late rise>
...
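
The formatting rules above (scientific notation with 5 decimal places, ASCII ordering of node names, early slack before late slack) can be sketched in Python as follows. The sketch assumes that each at record is written on a single line and that node names are ASCII, so that Python's default string ordering matches the ASCII ordering; the function names and the sample data are hypothetical.

```python
def sci(value):
    """Scientific notation with 5 decimal places, e.g. 1.23456e-10."""
    return f"{value:.5e}"


def report(at_records, slack_records):
    """Build the output lines.

    at_records:    node -> (at values, slew values), each ordered as
                   (early fall, early rise, late fall, late rise)
    slack_records: (node, mode) -> (fall slack, rise slack)
    """
    lines = []
    for node in sorted(at_records):
        at, slew = at_records[node]
        fields = ["at", node] + [sci(v) for v in at] + [sci(v) for v in slew]
        lines.append(" ".join(fields))
    rank = {"early": 0, "late": 1}
    for node, mode in sorted(slack_records, key=lambda k: (k[0], rank[k[1]])):
        fall, rise = slack_records[(node, mode)]
        lines.append(f"slack {node} {mode} {sci(fall)} {sci(rise)}")
    return lines


# Hypothetical results in seconds.
at_records = {
    "y": ((1.3e-10, 1.35e-10, 1.9e-10, 1.95e-10), (2e-11, 2.1e-11, 3e-11, 3.1e-11)),
    "Z": ((1e-10, 1e-10, 2e-10, 2e-10), (2e-11, 2e-11, 3e-11, 3e-11)),
}
slack_records = {("y", "late"): (6.35e-10, 6e-10), ("y", "early"): (-4e-11, 1e-11)}
lines = report(at_records, slack_records)
print("\n".join(lines))
```

Upper-case letters precede lower-case letters in ASCII, so node Z is listed before node y in this example.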


Updates
v1.1, April 14th, 2014: changed Equation 9 to minus.
v1.0, March 11th, 2014: initial version.


References
[1] J. Hu, D. Sinha and I. Keller, "TAU 2014 Contest on Removing Common Path Pessimism
during Timing Analysis", to appear at ISPD, 2014.
https://sites.google.com/site/taucontest2014/
[2] D. Sinha, L. Guerra e Silva, J. Wang, S. Raghunathan, D. Netrabile and A. Shebaita, "TAU
2013 variation aware timing analysis contest", ISPD, 2013, pp. 171-178.
https://sites.google.com/site/taucontest2013/
[3] W. C. Elmore, "The Transient Response of Damped Linear Networks with Particular
Regard to Wide-band Amplifiers", Journal of Applied Physics, 19(1)(1948), pp. 55-63.
[4] R. Gupta, B. Tutuianu and L. T. Pileggi, "The Elmore Delay as a Bound for RC Trees with
Generalized Input Signals", IEEE Transactions on Computer-aided Design of Integrated
Circuits and Systems, 16(1)(1997), pp. 95-104.
[5] C. V. Kashyap, C. J. Alpert, F. Liu and A. Devgan, "Closed-form Expressions for Extending
Step Delay and Slew Metrics to Ramp Inputs for RC Trees", IEEE Transactions on
Computer-aided Design of Integrated Circuits and Systems, 23(4)(2004), pp. 509-516.
[6] P. Penfield Jr. and J. Rubinstein, "Signal Delay in RC Tree Networks", Proc. Design
Automation Conference, 1981, pp. 613-617.
[7] C. L. Ratzlaff and L. T. Pillage, "RICE: Rapid Interconnect Circuit Evaluation Using
AWE", IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems,
13(6)(1994), pp. 763-776.
