
H I G H - L E V E L S Y N T H E S I S

Introduction to the Scheduling Problem

ROBERT A. WALKER
SAMIT CHAUDHURI
Rensselaer Polytechnic Institute

IEEE Design & Test of Computers, Summer 1995. 0740-7475/95/$04.00 © 1995 IEEE

HIGH-LEVEL SYNTHESIS (sometimes called behavioral synthesis) is the design task of mapping an abstract behavioral description of a digital system onto a register-transfer-level design to implement that behavior. Introduced in the first article in this series by Daniel Gajski and Loganath Ramachandran,¹ high-level synthesis can greatly improve designer productivity and design space exploration. As that introductory article defined them, the three central synthesis tasks in a typical high-level synthesis system are

- scheduling: determining the sequence in which operations execute to produce a control step schedule that specifies the operations executing in each control step, or state
- allocation: setting aside the appropriate number of functional, storage, and interconnection units
- binding: assigning operations to functional units and values to storage units, and interconnecting those components to form a complete data path

In most high-level synthesis systems, scheduling and functional unit allocation occur simultaneously, followed by the remaining allocation tasks and binding. This tutorial discusses the scheduling problem, and the next article in this series will discuss the remaining allocation problems and the binding problem.

Basic scheduling problems

Early on, a typical high-level synthesis system converts the input behavioral description of the desired digital system, written in a hardware description language such as VHDL or Verilog, into a control/data-flow graph (CDFG). In the CDFG, nodes represent operations in the behavioral description, such as additions and multiplications. Edges represent values: inputs to the expression, temporary results, and the output of the expression. In more complex behaviors, the CDFG can also represent conditional branches, loops, and so forth (hence the name control/data-flow graph).

Consider the arithmetic expression G = AB + CD + EF. This expression might be part of a larger description, such as a digital signal processor description, but for simplicity, we consider only that one expression here. We can parse this expression to build a CDFG, shown in Figure 1, that internally represents this behavior in a high-level synthesis system. Scheduling, then, determines the order of execution for these operations, scheduling each operation into an appropriate control step.

Basic concepts. Suppose we schedule the CDFG of the arithmetic expression G = AB + CD + EF, as shown in Figure 1. If we begin by scheduling operation 1 into control step 1, we can also schedule operation 2 into that same control step, since the two operations do not depend on each other. However, we cannot schedule operation 3 into control step 1: Operation 3 depends upon the results of operations 1 and 2, which are not available until the end of control step 1. Thus operation 3 must delay until control step 2 or later (let's assume that it's scheduled into control step 2). As we did with operation 2, we can schedule operation 4 into control step 1 as well, since it does not depend upon the result of either operation 1 or 2. Finally, we must schedule operation 5 into control step 3 or later, because that operation depends upon the result of operation 3.

[Figure 1. Schedule for expression G = AB + CD + EF. Inputs A through F, control steps 1 to 3, output G.]

Let $O$ be the set of all operations in the CDFG. If operation $o_j \in O$ uses the result of operation $o_i \in O$, operation $o_i$ must execute before $o_j$ can begin. Thus there is a data dependency between the two operations. We would also say that $o_i$ is an immediate predecessor of $o_j$, and $o_j$ is an immediate successor of $o_i$. We represent this data dependency during the synthesis process as a precedence constraint between the two operations, which the control step schedule must satisfy.

In generating the control step schedule for a particular CDFG, we may generate a feasible schedule (any "legal" schedule), or we may want to find one that is optimal with respect to some objective function (for example, the schedule with the smallest functional unit area). In this latter case, we need to know the type of the functional unit allocated to each operation. With that information, we can compute the overall functional unit area as the sum of the areas of the maximum number of functional units of each type used in any one control step.

For a given functional unit in a particular module library, the functional unit type indicates its functionality (such as addition, multiplication, or addition/multiplication). Let $K$ be the set of available types, and $a_k$ and $m_k$ be the area and number of functional units of type $k \in K$.

For a given operation, a type function $\tau: O \to K$ determines the type of operation, where $\tau(i) = k$ means that operation $o_i \in O$ executes on a type $k \in K$ functional unit. We call the problem of determining this type function, and thus the type of functional unit on which each operation will execute, the type mapping problem. Since we need to know the execution delay of each operation to solve the scheduling problem, we must solve this type mapping problem before (or simultaneously with) the scheduling problem. In this tutorial, we assume that the type mapping for each operation is known prior to scheduling.

These concepts let us define the unconstrained scheduling (UCS) problem, the basic scheduling problem in high-level synthesis:

Given: a set $O$ of operations; a set $K$ of functional unit types; a type function $\tau: O \to K$; and a partial order on $O$ determined by the precedence constraints.
Find: a feasible (or optimal) schedule for $O$ that obeys the precedence constraints.

Time-constrained scheduling. Although the UCS problem captures the basic elements of the scheduling problem in high-level synthesis, particular design goals may in practice require additional constraints. For example, we could limit the design's execution time by constraining the overall length of the control step schedule. This process adds a time constraint, or deadline, on the overall schedule length that the control step schedule must satisfy.

By adding this time constraint to the UCS problem, we can define the time-constrained scheduling (TCS) problem as follows:

Given: a set $O$ of operations; a set $K$ of functional unit types; a type function $\tau: O \to K$; a partial order on $O$ determined by the precedence constraints; and a time constraint (deadline) $D$ on the overall schedule length.
Find: a feasible (or optimal) schedule for $O$ that obeys the precedence constraints and that meets the deadline $D$.

Further, Figure 1 showed that imposing an overall time constraint on the schedule may force some operations into specific control steps. For example, if we constrain the schedule to a total length of three control steps, operations 1 and 2 must be scheduled into control step 1, operation 3 into control step 2, and operation 5 into control step 3. Since we have no freedom in scheduling these operations (without violating the time constraint), we say they are on the critical path.

In general, there is a continuous range $S_i$ of control steps, called the schedule interval, over which we can schedule an operation $o_i$. The length of this interval is the mobility of the operation. In Figure 1, the schedule interval

for operations 1 and 2 is [1,1], and their mobility is 1; the schedule interval for operation 3 is [2,2], and its mobility is also 1; and the schedule interval for operation 4 is [1,2], and its mobility is 2. As we will see later, this information can be of great benefit during the scheduling process.

Resource-constrained scheduling. Another set of constraints commonly added to the UCS problem, reflecting the design goal of limiting the chip area, are constraints on the number of functional units of each type. For example, two multipliers may fit within the available chip area, while eight multipliers may not. In this case we may need to impose a resource (functional unit) constraint on the design, limiting the number of multipliers to two.

Consider the effect of adding resource constraints to the example of Figure 1. Generated without resource constraints, that schedule requires at least three multipliers and one adder. However, if we constrain the number of multipliers to one, the schedule shown in Figure 2 might result. That schedule would require substantially less functional unit area. (Notice that although this schedule uses two fewer functional units, the schedule length increases by one, illustrating the classic serial-parallel trade-off that often arises in scheduling problems when trading execution time for the number of resources.)

[Figure 2. Schedule with a resource constraint of one multiplier. Control steps 1 to 4, output G.]

Adding these resource constraints to the UCS problem, we can define the resource-constrained scheduling (RCS) problem as follows:

Given: a set $O$ of operations; a set $K$ of functional unit types; a type function $\tau: O \to K$; resource constraints $m_k$, $1 \le k \le |K|$, for each functional unit type; and a partial order on $O$ determined by the precedence constraints.
Find: a feasible (or optimal) schedule for $O$ that obeys the precedence constraints and satisfies the resource constraints for each functional unit type.

Note that we can impose a separate resource constraint on each functional unit type.

Time- and resource-constrained scheduling. Finally, we can combine the TCS and RCS problems to define the time- and resource-constrained scheduling (TRCS) problem, constraining both the overall schedule length and the number of functional units of each type:

Given: a set $O$ of operations; a set $K$ of functional unit types; a type function $\tau: O \to K$; resource constraints $m_k$, $1 \le k \le |K|$, for each functional unit type; a partial order on $O$ determined by the precedence constraints; and a time constraint (deadline) $D$ on the overall schedule length.
Find: a feasible (or optimal) schedule for $O$ that obeys the precedence constraints, meets the deadline $D$, and satisfies the resource constraints for each functional unit type.

Advanced scheduling topics

To introduce the various scheduling problems concisely, we have made a number of overly simplistic assumptions. Now we turn to several advanced topics, all of which must be considered by any practical scheduling algorithm.

Chaining and multicycling. Until now, we have assumed that each operation type requires the same amount of time to execute, and that the control step length (the clock period) equals that execution time. In practice, different operation types may have different execution times, because the functional unit types onto which each is mapped may have different propagation delays. Thus, an overly restrictive scheduling model coupled with a poor choice for the control step length can result in poorly used functional units and an overly long schedule.

Consider the CDFG of Figure 3a, which maps the multiplication and addition operations onto a multiplier and adder with 100 ns and 50 ns propagation delays. (For simplicity, assume that these functional unit propagation delays include any necessary register setup times.) If we set the control step length to 100 ns, and schedule each operation into exactly one control step, the overall length of the schedule will be 200 ns, and the multiplier will be used only half of the time.

However, as Figure 3b shows, packing the two additions into a single control step will increase the multiplier usage and decrease the schedule length. These two additions are chained operations; we implement these at the register-transfer level by connecting the output of the first adder directly to the input of the second adder (that is, without the intervening register that would



otherwise latch the result of the first adder at the control step boundary). In this example, chaining increases the multiplier usage and decreases the schedule length to 100 ns, but at the cost of an additional adder.

As Figure 3c shows, setting the clock length to 50 ns (the shorter propagation delay) and executing the multiplication over two control steps will also improve the schedule of Figure 3a. Such a multiplication is a multicycle operation: It must execute continuously throughout its entire execution time, and its input values must be latched throughout that period. In this example, multicycling decreases the schedule length to 100 ns, just like chaining, but without the cost of another adder. However, multicycling does use twice as many control steps as chaining, which may result in a larger controller.

[Figure 3. Chained and multicycle operations: no chaining or multicycling (a); two chained additions (b); and a multicycle multiplication (c). Time axis from 0 ns to 200 ns.]

Control constructs. The basic scheduling problems presented just now were all defined for a single basic block: one section of straight-line code with only one entry and one exit point. Since most hardware description languages support conditionals, loops, and other control constructs, the scheduler must consider those constructs during the scheduling process. When scheduling conditional branches, the scheduler should exploit any potential parallelism by sharing functional units between mutually exclusive branches; for example, the same adder can serve in both the then and else clauses of an if statement. When scheduling loops, the scheduler should exploit any potential parallelism by loop folding: overlapping the loop executions in a pipelined fashion.

Timing constraints. The basic scheduling problems also capture a variety of constraints on the execution time of the schedule: the length of the schedule, the schedule intervals of the operations, and the precedence constraints between them. However, few digital systems work in isolation, so designers may also need to specify more detailed timing constraints on certain operations. There are minimum timing constraints, which specify that one operation must execute at least a specified amount of time after another operation, and maximum timing constraints, which specify that one operation must execute no more than a specified amount of time after another operation. Most schedulers handle these timing constraints by adding constraint edges to the CDFG, then treating those additional edges in much the same manner as other constraints.

Common scheduling algorithms

This tutorial has defined the four basic scheduling problems in high-level synthesis. To solve those problems, systems use either heuristic algorithms, which find feasible (possibly suboptimal) solutions, or exact algorithms, which find optimal solutions. Four scheduling algorithms commonly used by high-level synthesis systems are

- as-soon-as-possible/as-late-as-possible (ASAP/ALAP) scheduling
- list scheduling
- force-directed scheduling
- integer linear programming (ILP)

The first three are constructive heuristic algorithms that iteratively select and schedule one operation at a time into an appropriate control step. Since these greedy strategies make a series of local decisions, selecting at each point the single "best" operation/control step pairing without backtracking or lookahead, they may miss the globally optimal solution. However, they do produce results quickly, and those results may be sufficient for practical application.

The fourth scheduling algorithm, based on solving an ILP formulation, is an exact algorithm, guaranteed to find the globally optimal schedule, although at the cost of more processing time. In contrast to the first three, which schedule one operation at a time, this algorithm produces a schedule for all operations simultaneously.

ASAP/ALAP scheduling. ASAP and ALAP scheduling are the two simplest scheduling algorithms used in high-level synthesis to solve the UCS problem. ASAP scheduling (see Figure


4) schedules each operation, one at a time, into the earliest possible control step. ALAP scheduling is similar, but schedules each operation into the latest possible control step.

[Figure 4. As-soon-as-possible scheduling (pseudocode listing).]

Although limited by their greedy nature, these algorithms quickly solve the UCS problem. However, to find acceptable solutions to the RCS and TCS problems, we need more sophisticated algorithms.

List scheduling. A common choice for solving the RCS problem is list scheduling (see Figure 5),² a venerable algorithm based on work done over 30 years ago by Hu,³ and long used in project management and microcode compaction. Unlike ASAP/ALAP scheduling, which processes each operation in a fixed order, list scheduling processes each control step sequentially, choosing in each iteration the best operation from all appropriate operations to place into the control step, subject to resource constraints.

[Figure 5. List scheduling.]

During the scheduling process, list scheduling uses a ready list (hence the name) to keep track of data-ready operations. Data-ready operations are unscheduled operations the algorithm can schedule into the current control step without violating the precedence constraints (those operations whose immediate predecessors have been scheduled into earlier control steps). As long as the ready list contains data-ready operations that meet the resource constraints, the algorithm chooses operations from that list and schedules them into the current control step.

In choosing which ready-list operation to schedule, the algorithm sorts the ready list according to some priority function, always choosing the highest priority operation for scheduling into the current control step. One common priority function is based on mobility, defined earlier as the length of an operation's schedule interval. Operations with smaller mobility rate a higher priority, since there are fewer possible control steps into which those operations can be scheduled. Also, delaying them to a later control step would more likely increase the overall length of the schedule; this is especially true for those operations with a mobility of 1, which we regard as on the critical path.

The priority function selected clearly biases the results of the list-scheduling algorithm. Some systems give higher priority to operations with lower mobility. Others give higher priority to operations with more immediate successors, arguing that scheduling them in the current control step would make the largest number of operations data ready, thereby allowing the earliest possible consideration of each operation. Unfortunately, there is no agreement on which priority function is best, and furthermore, the choice often depends on the CDFG structure.

[Figure 6. Force-directed scheduling.]

The list-scheduling algorithm may also vary in its treatment of the ready list. As Figure 5 showed, this algorithm constructs the ready list only once per control step. It could instead construct the ready list every time it chooses a data-ready operation, thereby choosing an operation from a more up-to-date list at the cost of additional computation. Another variation is to maintain a separate ready list for each functional unit type $k \in K$, thus making it easier to consider only those operations that meet the resource constraints.

Although computationally less efficient than ASAP scheduling, list scheduling remains a common choice for solving the RCS problem because of its more global selection of the next operation to schedule and its simple yet intuitive priority function.
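As a rough illustration of ASAP and ALAP scheduling, the following Python sketch schedules the five operations of G = AB + CD + EF and derives each operation's mobility from the two schedules. It assumes unit-delay operations and no resource limits; the function names `asap` and `alap` and the `PREDS` dictionary are illustrative choices, not part of the original article.

```python
# Immediate predecessors of each operation in G = AB + CD + EF:
# 1 = A*B, 2 = C*D, 3 = (1)+(2), 4 = E*F, 5 = (3)+(4).
PREDS = {1: [], 2: [], 3: [1, 2], 4: [], 5: [3, 4]}


def asap(preds):
    """Earliest control step per operation (every operation takes one step)."""
    step = {}

    def visit(op):
        if op not in step:
            # One step after the latest immediate predecessor (step 1 if none).
            step[op] = 1 + max((visit(p) for p in preds[op]), default=0)
        return step[op]

    for op in preds:
        visit(op)
    return step


def alap(preds, deadline):
    """Latest control step per operation that still meets the deadline."""
    succs = {op: [] for op in preds}
    for op, ps in preds.items():
        for p in ps:
            succs[p].append(op)
    step = {}

    def visit(op):
        if op not in step:
            # One step before the earliest immediate successor
            # (the deadline itself if the operation has no successors).
            step[op] = min((visit(s) for s in succs[op]), default=deadline + 1) - 1
        return step[op]

    for op in preds:
        visit(op)
    return step


early = asap(PREDS)
late = alap(PREDS, deadline=max(early.values()))
# Mobility is the length of the schedule interval [ASAP, ALAP].
mobility = {op: late[op] - early[op] + 1 for op in PREDS}
print(early, late, mobility)
```

With a three-step deadline, only operation 4 has any freedom (interval [1,2]), which matches the critical-path discussion of Figure 1.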

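The list-scheduling loop described above can be sketched in Python as follows. This is a minimal sketch, assuming unit-delay operations, a ready list rebuilt once per control step, and a mobility-based priority; the names `list_schedule`, `OP_TYPE`, and `MOBILITY` are hypothetical, and the mobility values are the ones given in the text for a three-step schedule of G = AB + CD + EF.

```python
PREDS = {1: [], 2: [], 3: [1, 2], 4: [], 5: [3, 4]}
OP_TYPE = {1: "mult", 2: "mult", 3: "add", 4: "mult", 5: "add"}
MOBILITY = {1: 1, 2: 1, 3: 1, 4: 2, 5: 1}


def list_schedule(preds, op_type, limits, priority):
    """Resource-constrained list scheduling.

    Assumes an acyclic graph and at least one unit of every type in use;
    otherwise the loop would never finish.
    """
    schedule = {}
    step = 0
    while len(schedule) < len(preds):
        step += 1
        # Ready list: unscheduled operations whose immediate predecessors
        # were all scheduled into earlier control steps.
        ready = [op for op in preds
                 if op not in schedule
                 and all(schedule.get(p, step) < step for p in preds[op])]
        ready.sort(key=priority)  # highest priority (smallest key) first
        used = {k: 0 for k in limits}
        for op in ready:
            k = op_type[op]
            if used[k] < limits[k]:  # a unit of this type is still free
                schedule[op] = step
                used[k] += 1
    return schedule


def priority(op):
    # Smaller mobility first; ties broken by operation number.
    return (MOBILITY[op], op)


one_mult = list_schedule(PREDS, OP_TYPE, {"mult": 1, "add": 1}, priority)
print(one_mult)
```

With a single multiplier the sketch needs four control steps, one more than the unconstrained schedule, which is the serial-parallel trade-off noted for Figure 2. Swapping in a different `priority` function (for example, number of immediate successors) changes only the sort key.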


[Figure 7. Initial schedule intervals (a) and multiplication histogram (b). The histogram values for control steps 1 to 4 are 2.83, 2.33, 0.83, and 0.00.]

Force-directed scheduling. Force-directed scheduling (see Figure 6), originally developed as part of Carleton University's HAL system,⁴ is a popular constructive algorithm that solves the TCS problem by uniformly distributing the operations of each type across a time-constrained schedule. Balancing the operations in this manner results in higher functional unit usage, and thus minimizes the number of functional units of each type.

Since force-directed scheduling solves the TCS problem, the algorithm must first determine the schedule length (the overall time constraint). Constructing an ASAP schedule and measuring its length will provide a good approximation of the schedule length. Force-directed scheduling also considers the schedule interval of each operation, so this algorithm also constructs an ALAP schedule, using the two schedules to determine the schedule intervals.

[Figure 8. Data-flow graph and ASAP schedule for the differential equation (DiffEq) example. Control steps 1 to 4.]

Now consider a particular operation $o_i$ that the algorithm can theoretically schedule into any control step $s$ in its schedule interval $S_i$. We denote its ASAP control step as $\mathit{ASAP}_i$ and its ALAP control step as $\mathit{ALAP}_i$. If we assume that it has a uniform probability of being scheduled into any control step in the range $[\mathit{ASAP}_i, \mathit{ALAP}_i]$, the probability $p_{i,j}$ of scheduling operation $o_i$ into a particular control step $s_j \in S_i$ is $p_{i,j} = 1/(\mathit{ALAP}_i - \mathit{ASAP}_i + 1)$.

Figure 7a illustrates these probabilities, where the width of each operation box represents the probability of scheduling the corresponding operation in the data-flow graph in Figure 8 into that control step. Operations 1, 2, 5, 7, and 8 each have a mobility of 1, so their probability of being scheduled into a particular control step is 1. Figure 7a shows each operation wholly within that control step, represented by a box of width 1. Operation 3, however, has a schedule interval of [1,2]; thus its probability of being scheduled into either control step in that range is 0.5. Figure 7a represents operation 3 with a box of width 0.5 spanning control steps 1 and 2. Operation 6 is similar, and operations 4, 9, 10, and 11 each have a 0.33 probability of being scheduled into a particular control step in their schedule interval.

Given these probabilities, we can construct a histogram for each functional unit type $k$, showing the expected cost of performing all operations of that type in each control step. For a functional unit type $k$, the expected functional unit cost in control step $s_j \in S$ is

$$\mathit{FCost}_{k,j} = c_k \sum_{i \in I_k} p_{i,j}$$

where $c_k$ is the cost of a functional unit


of type $k$, and $I_k$ is the index set of all operations of type $k$.

Figure 7b shows the histogram for multiplication operations, assuming a unit multiplier cost. This histogram takes the following form:

$$\mathit{FCost}_{mult,1} = p_{1,1} + p_{2,1} + p_{3,1} + p_{4,1} + p_{5,1} + p_{6,1} = 1 + 1 + 0.5 + 0.33 + 0 + 0 = 2.83$$
$$\mathit{FCost}_{mult,2} = p_{1,2} + p_{2,2} + p_{3,2} + p_{4,2} + p_{5,2} + p_{6,2} = 0 + 0 + 0.5 + 0.33 + 1 + 0.5 = 2.33$$
$$\mathit{FCost}_{mult,3} = 0 + 0 + 0 + 0.33 + 0 + 0.5 = 0.83$$
$$\mathit{FCost}_{mult,4} = 0$$

The algorithm can now compute the expected number $m_k$ of functional units of type $k$ as the maximum number of functional units of that type in any control step

$$m_k = \max_{j \in S} \left( \mathit{FCost}_{k,j} \right)$$

where $S$ is the index set of all control steps. Thus this example should require $\lceil 2.83 \rceil = 3$ multipliers.

Force-directed scheduling minimizes the number of functional units of each type by uniformly distributing the operations of that type across the schedule (that is, by balancing the histogram). Consider the effect of scheduling operation 3 into either control step 1 or 2. If the algorithm schedules operation 3 into control step 1, the maximum expected multiplier cost will be 3.33. Therefore, the design will require four multipliers. However, if it schedules operation 3 into control step 2 (changing operation 6's schedule interval to [3,3]), the maximum expected multiplier cost will be 2.33, and the design will require only three multipliers. The latter choice is preferable, as it tends to reduce the expected multiplier cost and more uniformly distributes multiplications across the schedule.

Force-directed scheduling iteratively builds a control step schedule, keeping the schedule balanced as follows. First, it creates the initial histograms. Then it computes the expected functional unit cost of scheduling each unscheduled operation into each control step in its schedule interval and makes the operation/control step scheduling that results in the smallest increase (or largest decrease) in cost. It then updates the histograms, and the process continues until there are no more unscheduled operations.

In force-directed scheduling, we compute the increase in expected functional unit cost that results from assigning an operation to a particular control step as the sum of a set of forces (hence the name). The direct force of an operation $o_i$ with schedule interval $S_i$ being scheduled into control step $s_j \in S_i$ is

$$\mathit{Force}_{i,k,j} = \mathit{FCost}_{k,j} - \frac{1}{|S_i|} \sum_{s_l \in S_i} \mathit{FCost}_{k,l}$$

For a particular operation (and therefore a particular functional unit type), the direct force thus is the difference between the expected functional unit cost in that control step and the average expected functional unit cost over that operation's schedule interval.

Since scheduling an operation into a particular control step may affect the schedule intervals of other operations (for example, operation 6 described earlier), we must consider those indirect costs as well. Thus we must compute the total force associated with an operation being scheduled into a particular control step as the sum of 1) its direct force, and 2) the indirect force on any other operation whose schedule interval that scheduling affects.

In our example, the total force associated with scheduling operation 3 into control step 1 (see Figure 7) is only that direct force, since no other schedule intervals are affected.

$$\mathit{Total\_Force}_{3,mult,1} = \mathit{Force}_{3,mult,1} = 2.83 - (2.83 + 2.33)/2 = 0.25$$

However, the total force associated with scheduling operation 3 into control step 2 is that direct force plus the indirect force of scheduling operation 6 into control step 3

$$\mathit{Total\_Force}_{3,mult,2} = \mathit{Force}_{3,mult,2} + \mathit{Force}_{6,mult,3} = 2.33 - (2.83 + 2.33)/2 + 0.83 - (2.33 + 0.83)/2 = -1.0$$

Given these two choices, the algorithm would choose the operation/control step scheduling that results in the largest decrease in cost (force). In this example, it would schedule operation 3 into control step 2, as we conjectured earlier.

Although computationally less efficient than either ASAP or list scheduling, force-directed scheduling is a common choice for solving the TCS problem because of its global selection of the next operation to schedule and its effectiveness in uniformly distributing the operations across the schedule.

Integer linear programming formulations. Mathematical programming formulations, among them integer linear programming (ILP), have solved a wide range of problems in high-level synthesis, beginning with Hafer's early scheduling formulation.⁵ Here we discuss ILP formulations for optimally solving the TCS, RCS, and TRCS problems.

The biggest advantage of these formulations is the solution quality. Unlike the constructive heuristics described earlier, a commercial ILP solver is guaranteed to find an optimal schedule from these formulations. Unfortunately, this guarantee comes at a price: ILPs cannot, in general, be solved in polynomial time. Thus the trade-off is between guarantees of solution quality and algorithm runtime. Fortunately, a carefully designed ILP formulation produces results acceptably quickly for small and medium-sized problems; ongoing research on bounding techniques may soon allow larger problems to be solved as well.

ILP problems⁶ are problems that either maximize or minimize some objective function of many variables, subject to linear equality and inequality constraints, and integrality restrictions on all of the variables. These problems also commonly use linear objective functions and require nonnegative variables. We write an ILP as

$$z_{IP} = \min \{ c^T x \mid x \in P_F;\ x\ \text{integer} \}$$

where $P_F = \{ x : Ax \le b,\ x \in \mathbb{R}^n_+ \}$, and where $\mathbb{R}^n_+$ is the set of nonnegative real $(n \times 1)$ vectors, $c$ is an $(n \times 1)$ real vector, $b$ is an $(m \times 1)$ integer vector, and $A$ is an $(m \times n)$ integer matrix.

[Figure 9. A CDFG (a) and its constraint graph (b), assuming a schedule of length 5 and a resource constraint of two adders. Panel (b) arranges operations 1 to 3 against control steps 1 to 5 and marks the precedence, assignment, and resource constraints.]

Set of feasible schedules. Consider the set of nodes $V = \{ (i,s) \mid i \in I,\ s \in S_i \}$ as illustrated in Figure 9b, where a node $(i,s)$ corresponds to operation $o_i$ being scheduled in control step $s$, $I$ is the index set of all operations, and $S_i$ is the schedule interval over which an operation $o_i$ can be scheduled.

Each operation $o_i$ can conceivably be scheduled anywhere in its schedule interval $S_i$, so corresponding to each operation $o_i$ is a set of nodes $V_i = \{ (i,s) \mid s \in S_i \}$. For example, in Figure 9b, assuming a deadline of five control steps, the scheduler can conceivably schedule operation 2 into control step 2, 3, or 4, as shown by the horizontal shaded oval labeled $V_2$.

Furthermore, for each control step $s$, each functional unit type $k$ corresponds to a set of nodes $V_{k,s} = \{ (i,s) \mid s \in S_i;\ \tau(i) = k \}$, which can map onto that functional unit type during that control step. For example, in Figure 9b, assuming all three operations map onto adders, operations 1, 2, and 3 might all be scheduled onto adders during control step 3, as the vertical shaded oval in the third column shows.

Each feasible schedule contains exactly one node from each set of nodes $V_i$, satisfies all the precedence constraints between operations, and uses no more than the available number of functional units of each type. Clearly, to find a feasible schedule, we need some way to determine which $x_{i,s}$ variables are 1 and which are 0 (where $x_{i,s} = 1$ if node $(i,s)$ belongs to the schedule, and 0 otherwise). We will determine these values by specifying a set of equality and inequality constraints on the scheduling problem and then constructing an ILP formulation of the problem. Solving the ILP formulation using an ILP solver will give us an optimal solution for a specified objective function.

Constraints on the scheduling problem. To characterize the constraints on the scheduling problem, we need to construct a constraint graph $G_c$ as follows (see Figure 9b). The nodes of $G_c$ are the nodes $V$ that we have already seen.

We can now define assignment constraints on the scheduling problem, thus ensuring that a feasible schedule has exactly one node per operation:

$$\sum_{s \in S_i} x_{i,s} = 1, \quad \forall i \in I$$

In Figure 9b, the horizontal shaded oval labeled $V_2$ represents the assignment constraint for operation 2, constraining it to be scheduled into exactly one control step in the range [2, 4].

Earlier, we defined precedence constraints between two operators: a precedence clique $C_p$ is a clique in $G_c$ with at least one precedence edge (constraint) connecting two of its nodes. (A clique is a fully connected subgraph: a graph in which each node connects to all other nodes of the subgraph.) This concept enables us to define precedence constraints on the scheduling problem, thus preventing two nodes in precedence conflict from being in the same feasible schedule:

$$\sum_{(i,s) \in C_p} x_{i,s} \le 1 \quad \text{for every precedence clique } C_p$$


In Figure 9b, the shaded vertical oval in column 3 represents a precedence clique stating that if any one of operations 1, 2, or 3 is scheduled into control step 3, the two remaining operations cannot be scheduled into that control step as well.

Finally, we can define resource constraints on the scheduling problem, ensuring that in each control step the number of operations of each type does not exceed the available number of functional units of that type:

$$\sum_{(i,s) \in V,\ o_i \text{ of type } k} x_{i,s} \le m_k, \qquad s \in S,\ \forall k$$

In Figure 9b, the resource constraint at the bottom of column 3 states that no more than two addition operations can be scheduled into control step 3.

We can represent these constraints succinctly in the following form:

$$M_a x = 1; \qquad M_p x \le 1; \qquad M_r x \le m$$

where $M_a$ is the coefficient matrix due to the assignment constraints, $M_p$ is the coefficient matrix due to the precedence constraints, and $M_r$ is the coefficient matrix due to the resource constraints.

ILP formulation. We can now use these constraints to construct ILP formulations representing the various scheduling problems. We can easily construct the formulation of the TRCS problem by combining formulations of the TCS and RCS problems. Given these formulations, a commercial ILP solver can produce an optimal solution.

For the TCS problem, we can minimize a function calculating the number of functional units of each type:

$$\min \sum_{k \in K} a_k m_k$$

where, for functional units of type $k \in K$, $a_k$ is a weight (usually based on area), and $m_k$ is the number of functional units of that type.

For the RCS problem, we can minimize the number of control steps by introducing a dummy operation $o_d$, adding edges to ensure that $o_d$ is scheduled after all other operations, and scheduling $o_d$ as early as possible:

$$\min \sum_{s \in S_d} s \, x_{d,s}$$

where $x_{d,s}$ was defined earlier, and $S_d$ is the schedule interval of operation $o_d$.

Rensselaer Polytechnic Institute's RPI-ILP system⁷ uses this formulation to solve the TRCS problem directly, and the TCS and RCS problems indirectly, quickly producing guaranteed optimal solutions to each. Tsing Hua University's THEDA System⁸ and the University of Waterloo's OASIC System⁹ use similar formulations.

THIS TUTORIAL ATTEMPTS to define the more common variations on the scheduling problem in high-level synthesis and describes several commonly used scheduling techniques. The scheduling problem will undoubtedly remain an area of research for years to come, as we begin to explore various related problems now that we understand the basic scheduling problem. In the future, we will continue to improve our understanding of the relationship between scheduling, allocation, and binding, and will explore the relationships between scheduling and clock determination, type mapping, and time and resource bounding.

References
1. D.D. Gajski and L. Ramachandran, "Introduction to High-Level Synthesis," IEEE Design & Test of Computers, Vol. 11, No. 4, Winter 1994, pp. 44-54.
2. B.M. Pangrle and D.D. Gajski, "Design Tools for Intelligent Silicon Compilation," IEEE Trans. Computer-Aided Design, Vol. CAD-6, No. 6, Nov. 1987, pp. 1098-1112.
3. T.C. Hu, "Parallel Sequencing and Assembly Line Problems," Operations Research, Vol. 9, No. 6, Nov. 1961, pp. 841-848.
4. P.G. Paulin and J.P. Knight, "Algorithms for High-Level Synthesis," IEEE Design & Test of Computers, Vol. 6, No. 4, Dec. 1989, pp. 18-31.
5. L.J. Hafer and A.C. Parker, "A Formal Method for the Specification, Analysis, and Design of Register-Transfer Level Digital Logic," IEEE Trans. Computer-Aided Design, Vol. CAD-2, No. 1, Jan. 1983, pp. 4-18.
6. G.L. Nemhauser and L.A. Wolsey, Integer and Combinatorial Optimization, John Wiley & Sons, New York, 1988.
7. S. Chaudhuri and R.A. Walker, "Analyzing and Exploiting the Structure of the Constraints in the ILP Approach to the Scheduling Problem," IEEE Trans. VLSI Systems, Vol. 2, No. 4, Dec. 1994, pp. 456-471.
8. C.-T. Hwang, J.-H. Lee, and Y.-C. Hsu, "A Formal Approach to the Scheduling Problem in High-Level Synthesis," IEEE Trans. Computer-Aided Design, Vol. 10, No. 4, Apr. 1991, pp. 464-475.
9. C.H. Gebotys, "Optimal Scheduling and Allocation of Embedded VLSI Chips," Proc. 29th Design Automation Conf., IEEE Computer Society Press, Los Alamitos, Calif., 1992, pp. 116-119.

Robert A. Walker is an assistant professor of computer science at Rensselaer Polytechnic Institute. He is the coauthor of two books on high-level synthesis and is particularly interested in those problems connected with scheduling. He has served on the Advisory Board of the ACM SIGDA, has been actively involved with its university booth program, and also serves as SIGDA secretary-treasurer. Walker received the MS and PhD degrees from Carnegie Mellon University. He is a senior member of the IEEE, and a member of the Computer Society, the ACM, ACM SIGDA, and Sigma Xi.


Samit Chaudhuri is working on his PhD at Rensselaer Polytechnic Institute. His research interests include design automation, high-level synthesis, and combinatorial optimization. He received the BE degree in electronics and telecommunications engineering from the University of Calcutta and the MTech degree in electrical engineering from the Indian Institute of Technology, Kanpur, India.

Address correspondence about this tutorial to Robert A. Walker at Rensselaer Polytechnic Institute, Computer Science Department, Troy, NY 12180; walkerb@cs.rpi.edu.
