Introduction to the
Scheduling Problem
ROBERT A. WALKER
SAMIT CHAUDHURI
Rensselaer Polytechnic Institute

HIGH-LEVEL SYNTHESIS (sometimes called behavioral synthesis) is the design task of mapping an abstract behavioral description of a digital system onto a register-transfer-level design to implement that behavior. Introduced in the first article in this series by Daniel Gajski and Loganath Ramachandran,¹ high-level synthesis can greatly improve designer productivity and design space exploration. As that introductory article defined them, the three central synthesis tasks in a typical high-level synthesis system are

■ scheduling: determining the sequence in which operations execute to produce a control step schedule that specifies the operations executing in each control step, or state
■ allocation: setting aside the appropriate number of functional, storage, and interconnection units
■ binding: assigning operations to functional units and values to storage units, and interconnecting those components to form a complete data path

In most high-level synthesis systems, scheduling and functional unit allocation occur simultaneously, followed by the remaining allocation tasks and binding. This tutorial discusses the scheduling problem, and the next article in this series will discuss the remaining allocation problems and the binding problem.

Basic scheduling problems
Early on, a typical high-level synthesis system converts the input behavioral description of the desired digital system, written in a hardware description language such as VHDL or Verilog, into a control/data-flow graph (CDFG). In the CDFG, nodes represent operations in the behavioral description, such as additions and multiplications. Edges represent values: inputs to the expression, temporary results, and the output of the expression. In more complex behaviors, the CDFG can also represent conditional branches, loops, and so forth, hence the name control/data-flow graph.

Consider the arithmetic expression G = AB + CD + EF. This expression might be part of a larger description, such as a digital signal processor description, but for simplicity, we consider only that one expression here. We can parse this expression to build a CDFG, shown in Figure 1, that internally represents this behavior in a high-level synthesis system. Scheduling, then, determines the order of execution for these operations, scheduling each operation into an appropriate control step.

Basic concepts. Suppose we schedule the CDFG of the arithmetic expres-
SUMMER 1995 61
H I G H - L E V E L S Y N T H E S I S
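As a minimal sketch of the CDFG described above for G = AB + CD + EF, the following Python fragment stores operations as nodes and values as edges. The class name `CDFG`, the temporary value names `t1` to `t4`, and the operation numbering (1, 2, and 4 as multiplications; 3 and 5 as additions) are assumptions made for illustration, since Figure 1 is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class CDFG:
    """Minimal data-flow graph: nodes are operations, edges carry values."""
    ops: dict = field(default_factory=dict)       # op id -> operation type
    edges: list = field(default_factory=list)     # (source op or None, dest op, value)
    producer: dict = field(default_factory=dict)  # value name -> op that produces it

    def add_op(self, op_id, op_type, inputs, output):
        """Add an operation that consumes `inputs` and produces `output`."""
        self.ops[op_id] = op_type
        for value in inputs:
            # A source of None marks a primary input such as A or B.
            self.edges.append((self.producer.get(value), op_id, value))
        self.producer[output] = op_id

    def predecessors(self, op_id):
        """Operations whose results this operation consumes."""
        return sorted({src for src, dst, _ in self.edges
                       if dst == op_id and src is not None})


def build_example():
    """CDFG for G = AB + CD + EF (operation numbering is hypothetical)."""
    g = CDFG()
    g.add_op(1, "*", ["A", "B"], "t1")
    g.add_op(2, "*", ["C", "D"], "t2")
    g.add_op(3, "+", ["t1", "t2"], "t3")
    g.add_op(4, "*", ["E", "F"], "t4")
    g.add_op(5, "+", ["t3", "t4"], "G")
    return g
```

The precedence constraints used by a scheduler fall out of `predecessors`: an operation cannot start before every operation it depends on has produced its value.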
Figure 2. Schedule with a resource constraint of one multiplier.

for operations 1 and 2 is [1,1], and their mobility is 1; the schedule interval for operation 3 is [2,2], and its mobility is also 1; and the schedule interval for operation 4 is [1,2], and its mobility is 2. As we will see later, this information can be of great benefit during the scheduling process.

Resource-constrained scheduling. Another set of constraints commonly added to the UCS problem, reflecting the design goal of limiting the chip area, are constraints on the number of functional units of each type. For example, two multipliers may fit within the available chip area, while eight multipliers may not. In this case we may need to impose a resource (functional unit) constraint on the design, limiting the number of multipliers to two.

Consider the effect of adding resource constraints to the example of Figure 1. Generated without resource constraints, that schedule requires at least three multipliers and one adder. However, if we constrain the number of multipliers to one, the schedule shown in Figure 2 might result. That schedule would require substantially

resource-constrained scheduling (RCS) problem as follows:

Given: a set $O$ of operations; a set $K$ of functional unit types; a type function $T : O \to K$; resource constraints $m_k$, $1 \le k \le |K|$, for each functional unit type; and a partial order on $O$ determined by the precedence constraints.
Find: a feasible (or optimal) schedule for $O$ that obeys the precedence constraints and satisfies the resource constraints for each functional unit type.

Note that we can impose a separate resource constraint on each functional unit type.

Time- and resource-constrained scheduling. Finally, we can combine the TCS and RCS problems to define the time- and resource-constrained scheduling (TRCS) problem, constraining both the overall schedule length and the number of functional units of each type:

Given: a set $O$ of operations; a set $K$ of functional unit types; a type function $T : O \to K$; resource constraints $m_k$, $1 \le k \le |K|$, for each functional unit type; a partial order on $O$ determined by the precedence constraints; and a time constraint (deadline) $D$ on the overall schedule length.
Find: a feasible (or optimal) schedule for $O$ that obeys the prece-

topics, all of which must be considered by any practical scheduling algorithm.

Chaining and multicycling. Until now, we have assumed that each operation type requires the same amount of time to execute, and that the control step length (the clock period) equals that execution time. In practice, different operation types may have different execution times, because the functional unit types onto which each is mapped may have different propagation delays. Thus, an overly restrictive scheduling model coupled with a poor choice for the control step length can result in poorly used functional units and an overly long schedule.

Consider the CDFG of Figure 3a, which maps the multiplication and addition operations onto a multiplier and adder with 100- and 50-ns propagation delays. (For simplicity, assume that these functional unit propagation delays include any necessary register setup times.) If we set the control step length to 100 ns, and schedule each operation into exactly one control step, the overall length of the schedule will be 200 ns, and the multiplier will be used only half of the time.

However, as Figure 3b shows, packing the two additions into a single control step will increase the multiplier usage and decrease the schedule length. These two additions are chained operations; we implement these at the register-transfer level by connecting the output of the first adder directly to the input of the second adder (that is, without the intervening register that would
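To illustrate the resource-constrained scheduling (RCS) problem defined in this section, here is a small sketch of a list-style scheduler in Python. It is not the algorithm from the article: the function name `list_schedule`, the priority rule (lowest operation number first), the unit-delay assumption, and the operation numbering for G = AB + CD + EF are all assumptions chosen to keep the example short.

```python
def list_schedule(op_type, preds, limits):
    """Greedy resource-constrained scheduling with unit-delay operations.

    op_type: op id -> functional unit type (the type function T)
    preds:   op id -> list of predecessor op ids (precedence constraints)
    limits:  functional unit type -> available units (the constraints m_k)
    Returns a dict mapping op id -> control step (1-based).
    """
    if any(limits.get(k, 0) < 1 for k in set(op_type.values())):
        raise ValueError("every functional unit type in use needs at least one unit")
    step_of = {}
    step = 0
    while len(step_of) < len(op_type):
        step += 1
        used = {k: 0 for k in limits}
        # Ready operations: unscheduled, with all predecessors in earlier steps.
        ready = [o for o in sorted(op_type)
                 if o not in step_of
                 and all(p in step_of and step_of[p] < step for p in preds[o])]
        for o in ready:
            k = op_type[o]
            if used[k] < limits[k]:  # respect the resource constraint m_k
                used[k] += 1
                step_of[o] = step
    return step_of


# Hypothetical numbering for G = AB + CD + EF: ops 1, 2, 4 multiply; 3, 5 add.
OP_TYPE = {1: "mul", 2: "mul", 3: "add", 4: "mul", 5: "add"}
PREDS = {1: [], 2: [], 3: [1, 2], 4: [], 5: [3, 4]}

one_multiplier = list_schedule(OP_TYPE, PREDS, {"mul": 1, "add": 1})
three_multipliers = list_schedule(OP_TYPE, PREDS, {"mul": 3, "add": 1})
```

Under these assumptions, one multiplier stretches the schedule to four control steps, while three multipliers allow three control steps, which is consistent with the trade-off between functional units and schedule length discussed above.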
Figure 4. As-soon-as-possible scheduling.
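Because the listing for Figure 4 did not survive, the following is only a sketch of as-soon-as-possible (ASAP) scheduling and its mirror image, as-late-as-possible (ALAP) scheduling, assuming unit-delay operations. The function names, the operation numbering for G = AB + CD + EF, and the definition of mobility as ALAP − ASAP + 1 (which matches the mobility values quoted earlier) are assumptions for illustration.

```python
def asap(preds):
    """Earliest control step of each operation (unit delays, steps start at 1)."""
    step = {}

    def visit(o):
        if o not in step:
            # One step after the latest predecessor; step 1 if there are none.
            step[o] = 1 + max((visit(p) for p in preds[o]), default=0)
        return step[o]

    for o in preds:
        visit(o)
    return step


def alap(preds, deadline):
    """Latest control step of each operation that still meets the deadline."""
    succs = {o: [] for o in preds}
    for o, ps in preds.items():
        for p in ps:
            succs[p].append(o)
    step = {}

    def visit(o):
        if o not in step:
            # One step before the earliest successor; the deadline if there are none.
            step[o] = min((visit(s) for s in succs[o]), default=deadline + 1) - 1
        return step[o]

    for o in preds:
        visit(o)
    return step


# Hypothetical numbering: ops 1, 2, 4 multiply; op 3 adds t1 + t2; op 5 produces G.
PREDS = {1: [], 2: [], 3: [1, 2], 4: [], 5: [3, 4]}
early = asap(PREDS)
late = alap(PREDS, deadline=3)
intervals = {o: (early[o], late[o]) for o in PREDS}
mobility = {o: late[o] - early[o] + 1 for o in PREDS}
```

With this assumed numbering and a deadline of three control steps, operations 1 and 2 get the interval [1,1], operation 3 gets [2,2], and operation 4 gets [1,2] with mobility 2.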
ular control step $s_j \in S_i$ is $f_{i,j} = 1 / (\mathit{ALAP}_i - \mathit{ASAP}_i + 1)$.

in that range is 0.5. Figure 7a represents operation 3 with a box of width 0.5 span-

where $c_k$ is the cost of a functional unit
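The probability formula above can be turned into an expected functional unit cost per control step. The sketch below does this in Python for a single functional unit type and also computes a direct force, taken here as the expected cost in the chosen step minus the average expected cost over the operation's schedule interval. The function names, the unit cost of 1, and the example intervals (multiplications 1, 2, and 4 of a hypothetical G = AB + CD + EF numbering) are assumptions; indirect forces are not modeled.

```python
import math


def probabilities(intervals):
    """f[i][j] = 1 / (ALAP_i - ASAP_i + 1) for every step j in op i's interval."""
    f = {}
    for op, (asap_step, alap_step) in intervals.items():
        p = 1.0 / (alap_step - asap_step + 1)
        f[op] = {j: p for j in range(asap_step, alap_step + 1)}
    return f


def expected_cost(intervals, steps, unit_cost=1.0):
    """Expected functional unit cost in each control step for one unit type."""
    f = probabilities(intervals)
    return {j: unit_cost * sum(row.get(j, 0.0) for row in f.values())
            for j in steps}


def units_needed(intervals, steps):
    """Ceiling of the largest expected cost, with a unit cost of 1."""
    return math.ceil(max(expected_cost(intervals, steps).values()))


def direct_force(intervals, steps, op, j):
    """Expected cost in step j minus the average over op's schedule interval."""
    cost = expected_cost(intervals, steps)
    lo, hi = intervals[op]
    average = sum(cost[s] for s in range(lo, hi + 1)) / (hi - lo + 1)
    return cost[j] - average


STEPS = range(1, 4)
MULT_INTERVALS = {1: (1, 1), 2: (1, 1), 4: (1, 2)}  # hypothetical example
before = expected_cost(MULT_INTERVALS, STEPS)
force_step1 = direct_force(MULT_INTERVALS, STEPS, op=4, j=1)
force_step2 = direct_force(MULT_INTERVALS, STEPS, op=4, j=2)
after = dict(MULT_INTERVALS)
after[4] = (2, 2)  # commit operation 4 to the step with the lower force
```

In this small example the lower force favors control step 2 for operation 4, which flattens the histogram and lowers the estimated multiplier count from three to two.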
$$m_k = \max_{j \in S} \left( \mathit{FCost}_{k,j} \right)$$

where $S$ is the index set of all control steps. Thus this example should require $\lceil 2.83 \rceil = 3$ multipliers.

Force-directed scheduling minimizes the number of functional units of each type by uniformly distributing the operations of that type across the schedule (that is, to balance the histogram). Consider the effect of scheduling operation 3 into either control step 1 or 2. If the algorithm schedules operation 3 into control step 1, the maximum expected multiplier cost will be 3.33. Therefore, the design will require four multipliers. However, if it schedules operation 3 into control step 2 (changing operation 6's schedule interval to [3,3]), the maximum expected multiplier cost will be 2.33, and the design will require only three multipliers.

For a particular operation (and therefore a particular functional unit type), the direct force thus is the difference between the expected functional unit cost in that control step and the average expected functional unit cost over that operation's schedule interval.

Since scheduling an operation into a particular control step may affect the schedule intervals of other operations (for example, operation 6 described earlier), we must consider those indirect costs as well. Thus we must compute the total force associated with an operation being scheduled into a particular control step as the sum of 1) its direct force, and 2) the indirect force on any other operation whose schedule interval that scheduling affects.

In our example, the total force asso-

tionally than either ASAP or list scheduling, due to its global selection of the next operation to schedule and to its effectiveness in uniformly distributing the operations across the schedule, force-directed scheduling is a common choice for solving the TCS problem.

Integer linear programming formulations. Mathematical programming formulations, among them integer linear programming (ILP), have solved a wide range of problems in high-level synthesis, beginning with Hafer's early scheduling formulation.⁵ Here we discuss ILP formulations for optimally solving the TCS, RCS, and TRCS problems.

The biggest advantage of these formulations is the solution quality. Unlike the constructive heuristics described earlier, a commercial ILP solver is guar-
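To make the structure of an ILP scheduling formulation concrete, the sketch below encodes a tiny TRCS instance with 0/1 variables `x[i, j]` (operation i scheduled into control step j) and checks assignment, precedence, and resource constraints. It deliberately swaps in brute-force enumeration for an ILP solver, so it only suits toy instances. The function name `solve_trcs`, the unit-delay assumption, and the operation numbering for G = AB + CD + EF are assumptions made for illustration.

```python
from itertools import product


def solve_trcs(op_type, preds, limits, deadline):
    """Exhaustively search for the shortest schedule meeting all constraints.

    Returns (length, schedule) or None if no schedule fits the deadline.
    """
    ops = sorted(op_type)
    steps = range(1, deadline + 1)
    best = None
    for choice in product(steps, repeat=len(ops)):
        step_of = dict(zip(ops, choice))
        # 0/1 decision variables; each op has exactly one 1 by construction,
        # which plays the role of the assignment constraints.
        x = {(i, j): int(step_of[i] == j) for i in ops for j in steps}
        # Precedence: every predecessor finishes in an earlier control step.
        precedence_ok = all(step_of[p] < step_of[o]
                            for o in ops for p in preds[o])
        # Resource: per step and type, scheduled ops do not exceed m_k.
        resource_ok = all(
            sum(x[i, j] for i in ops if op_type[i] == k) <= limits[k]
            for j in steps for k in limits)
        if precedence_ok and resource_ok:
            length = max(choice)  # objective: number of control steps used
            if best is None or length < best[0]:
                best = (length, step_of)
    return best


# Hypothetical numbering: ops 1, 2, 4 multiply; ops 3, 5 add.
OP_TYPE = {1: "mul", 2: "mul", 3: "add", 4: "mul", 5: "add"}
PREDS = {1: [], 2: [], 3: [1, 2], 4: [], 5: [3, 4]}

tight = solve_trcs(OP_TYPE, PREDS, {"mul": 1, "add": 1}, deadline=4)
too_tight = solve_trcs(OP_TYPE, PREDS, {"mul": 1, "add": 1}, deadline=3)
relaxed = solve_trcs(OP_TYPE, PREDS, {"mul": 2, "add": 1}, deadline=3)
```

Minimizing the largest control step used stands in for the objective; a real ILP model would hand the same three constraint families to a solver instead of enumerating every assignment.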
In Figure 9b, the shaded vertical oval in column 3 represents a precedence clique stating that if either operation 1, 2, or 3 is scheduled into control step 3, the two remaining operations cannot be scheduled into that control step as well.

Finally, we can define resource constraints on the scheduling problem, ensuring that in each control step the number of operations of each type does not exceed the available number of functional units of that type:

$$\sum_{o_i :\, T(o_i) = k} x_{i,j} \le m_k, \quad \forall s_j \in S,\ \forall k$$

In Figure 9b, the resource constraint at the bottom of column 3 states that no more than two addition operations can be scheduled into control step 3.

We can represent these constraints succinctly in the following form:

$$M_a x = 1; \quad M_p x \le 1; \quad M_r x \le m$$

where $M_a$ is the coefficient matrix due to the assignment constraints, $M_p$ is the coefficient matrix due to the precedence constraints, and $M_r$ is the coefficient matrix due to the resource constraints.

ILP formulation. We can now use these constraints to construct ILP formulations representing the various scheduling problems. We can easily construct the formulation of the TRCS problem by combining formulations of the TCS and RCS problems. Given these formulations, a commercial ILP solver can produce an optimal solution.

For the TCS problem, we can minimize a function calculating the number of functional units of each type.

where, for functional units of type $k \in K$, $a_k$ is a weight (usually based on area), and $m_k$ is the number of functional units of that type.

For the RCS problem, we can minimize the number of control steps by introducing a dummy operation $o_d$, adding edges to ensure that $o_d$ is scheduled after all other operations, and scheduling $o_d$ as early as possible.

where PA@ was defined earlier, and $S_d$ is the schedule interval of operation $o_d$.

Rensselaer Polytechnic Institute's RPI-ILP system⁷ uses this formulation to solve the TRCS directly, and the TCS and RCS problems indirectly, quickly producing guaranteed optimal solutions to each. Tsing Hua University's THEDA System⁸ and the University of Waterloo's OASIC System⁹ use similar formulations.

THIS TUTORIAL ATTEMPTS to define the more common variations on the scheduling problem in high-level synthesis and describes several commonly used scheduling techniques. The scheduling problem will undoubtedly remain an area of research for years to come, as we begin to explore various related problems now that we understand the basic scheduling problem. In the future, we will continue to improve our understanding of the relationship between scheduling, allocation, and binding, and will explore the relationships between scheduling and clock determination, type mapping, and time and resource bounding.

References
1. D.D. Gajski and L. Ramachandran, "Introduction to High-Level Synthesis," IEEE Design & Test of Computers, Vol. 11, No. 4, Winter 1994, pp. 44-54.
2. B.M. Pangrle and D.D. Gajski, "Design Tools for Intelligent Silicon Compilation," IEEE Trans. Computer-Aided Design, Vol. CAD-6, No. 6, Nov. 1987, pp. 1098-1112.
3. T.C. Hu, "Parallel Sequencing and Assembly Line Problems," Operations Research, Vol. 9, No. 6, Nov. 1961, pp. 841-848.
4. P.G. Paulin and J.P. Knight, "Algorithms for High-Level Synthesis," IEEE Design & Test of Computers, Vol. 6, No. 4, Dec. 1989, pp. 18-31.
5. L.J. Hafer and A.C. Parker, "A Formal Method for the Specification, Analysis, and Design of Register-Transfer Level Digital Logic," IEEE Trans. Computer-Aided Design, Vol. CAD-2, No. 1, Jan. 1983, pp. 4-18.
6. G.L. Nemhauser and L.A. Wolsey, Integer and Combinatorial Optimization, John Wiley & Sons, New York, 1988.
7. S. Chaudhuri and R.A. Walker, "Analyzing and Exploiting the Structure of the Constraints in the ILP Approach to the Scheduling Problem," IEEE Trans. VLSI Systems, Vol. 2, No. 4, Dec. 1994, pp. 456-471.
8. C.-T. Hwang, J.-H. Lee, and Y.-C. Hsu, "A Formal Approach to the Scheduling Problem in High-Level Synthesis," IEEE Trans. Computer-Aided Design, Vol. 10, No. 4, Apr. 1991, pp. 464-475.
9. C.H. Gebotys, "Optimal Scheduling and Allocation of Embedded VLSI Chips," Proc. 29th Design Automation Conf., IEEE Computer Society Press, Los Alamitos, Calif., 1992, pp. 116-119.

Robert A. Walker is an assistant professor of computer science at Rensselaer Polytechnic Institute. He is the coauthor of two books on high-level synthesis and is particularly interested in those problems con-