PAPER
DESC: A Hardware-Software Codesign Methodology for
Distributed Embedded Systems
Trong-Yen LEE† , Regular Member, Pao-Ann HSIUNG†† , and Sao-Jie CHEN† , Nonmembers
Manuscript received January 19, 2000.
Manuscript revised September 18, 2000.
† The authors are with the Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.
†† The author is with the Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan, R.O.C.

SUMMARY The hardware-software codesign of distributed embedded systems is a particularly challenging task, because each phase of codesign, such as copartitioning, cosynthesis, cosimulation, and coverification, must consider the physical restrictions imposed by the distributed characteristics of such systems. Distributed systems often contain several similar parts for which design reuse techniques can be applied. An object-oriented (OO) codesign approach, which allows for physical restrictions and object design reuse, is adopted in our newly proposed Distributed Embedded System Codesign (DESC) methodology. The DESC methodology uses three types of models: Object Modeling Technique (OMT) models for system description and input, Linear Hybrid Automata (LHA) models for internal modeling and verification, and SES/workbench simulation models for performance evaluation. A two-level partitioning algorithm is proposed specifically for distributed systems. Software is synthesized by task scheduling and hardware is synthesized by system-level and object-oriented techniques. Design alternatives for synthesized hardware-software systems are then checked for design feasibility through rapid prototyping using hardware-software emulators. Through a case study on a Vehicle Parking Management System (VPMS), we depict each design phase of the DESC methodology to show the benefits of OO codesign and the necessity of a two-level partitioning algorithm.
key words: distributed embedded systems, emulation, two-level partitioning, object-oriented codesign, software scheduling

1. Introduction

Distributed systems such as distance teaching facilities, vehicle parking systems, auditorium air-conditioning, coal-mine signal systems, and others abound in our everyday life, and almost all of them are embedded with some sort of integrated chips and/or processors for running software. These systems are difficult to design due to their physical restrictions and distributed behavior. Unlike in hardware-software codesign techniques for centralized systems, symmetries in the distributed structure, the hierarchical system architecture, and physical restrictions must all be considered in our newly proposed Distributed Embedded System Codesign (DESC) methodology. DESC offers design reuse through its object-oriented synthesis, considers physical constraints through its hierarchical system partitioning, and generates rapid prototypes through co-emulation.

When a system consists of parts that must be located at different physical locations, it is called a distributed system. The design of distributed systems has always been a challenging task. Codesign of distributed systems must solve not only hardware-software communication issues within a single embedded unit, but also the communication issues between different parts of a distributed system (which may consist of either hardware, or software, or both). Hardware-software copartitioning must take into consideration the physical restrictions of a distributed system. Besides timing and cost constraints, part modularity and physical restrictions are also important factors that must be taken into account in the codesign of distributed systems. Interfaces must be synthesized not only between the hardware and the software within a system part, but also between the different parts of a distributed system.

In this paper, we propose a complete codesign methodology for distributed embedded systems called the Distributed Embedded System Codesign (DESC) methodology. To allow maximum exposure to all the design phases in DESC, we have omitted some technical details so that the reader can obtain an overall picture of how distributed systems are codesigned. DESC uses three types of semantic models for different purposes, namely Object Modeling Technique (OMT) models for system description and input, Linear Hybrid Automata (LHA) models for internal modeling and verification, and SES/workbench simulation models for performance evaluation. The two-level partitioning technique used in DESC includes: (1) design space exploration to determine the number of processors for software execution and the hardware cost in a distributed embedded system, and (2) hardware-software copartitioning to produce a final system partition result. The SES/workbench simulation tool [1] is used to evaluate the performance of the copartition results. Then, hardware is synthesized using a recently proposed system-level object-oriented (OO) design methodology called the Intelligent Concurrent Object-Oriented Synthesis (ICOS) methodology [2], and software is synthesized by task scheduling on the system processor(s). The synthesized results are further validated for feasibility by rapid prototyping through hardware-software emulators. Finally, a network of LHA models for a synthesized system is used to formally verify the
LEE et al.: A HARDWARE-SOFTWARE CODESIGN METHODOLOGY
user-given system constraints.

The contributions of our proposed DESC methodology are described as follows. From the given system specification, DESC detects real physical restrictions of distributed embedded systems to achieve a final best design, whereas these physical restrictions are not considered explicitly by existing methods [3]–[6]. As for system modeling, the applications of three existing models in the codesign of distributed systems are newly proposed by DESC for the following reasons. (1) The formal LHA model is suited for partitioning and analysis, and (2) the modeling capability of most existing methods based on task graphs [7]–[12] is quite limited, compared to the OMT models used in DESC. Last but not least, a new partitioning method is proposed in DESC, which has many desirable features, such as simplicity, efficiency, and hierarchy.

The article is organized as follows. Section 2 describes some previous and related work. Section 3 describes our hardware-software codesign methodology for distributed embedded systems and the design phases of cosynthesis and emulation. A case study, the Vehicle Parking Management System (VPMS), is given in Sect. 4. Finally, a conclusion is drawn in Sect. 5.

2. Related Work

A distributed system has more than one physical location, each of which may have different characteristics. Previous works related to hardware-software codesign [7], [13] have not distinguished between the different physical locations of a distributed system and their characteristics. For example, a processor for software execution, when placed at different locations, could affect the overall system performance. Further, previous work has also not considered how design parts may be reused in a distributed system.

As far as hardware design is concerned, methodologies for the system-level synthesis of general-purpose multi-processor systems have been proposed recently. Performance Synthesis Methodology (PSM) [16], the Intelligent Concurrent Object-Oriented Synthesis (ICOS) methodology [2], and the Parallel Object-Oriented Synthesis Environment (POSE) [17] are three of the most recently proposed methodologies. A new cosynthesis methodology for parallel systems called Cosynthesis Methodology for Application-Oriented Parallel Systems (CMAPS) [18] has also been recently proposed. Some other successful methodologies for hardware design include the MICON system [19], [20] and the Megallan System [21].

As far as distributed system codesign is concerned, several related works can be found. Prakash and Parker [14] formulated heterogeneous multiprocessor system synthesis as a mixed integer linear program in SOS, a formal synthesis approach developed at the University of Southern California. Further, based on Prakash and Parker’s formulation, Wolf [6] developed a heuristic algorithm for architectural co-synthesis of distributed embedded computing systems. Haworth et al. [22] proposed an algorithm that chooses parts from a part library (or catalog) to implement a set of functions meeting cost bounds. D’Ambrosio and Hu [23] presented a hardware-software co-synthesis approach at the configuration level.

Recently, the design of an n-CPU/m-ASIC topology has become a major concern. Yen and Wolf [13] proposed a sensitivity-driven method for the co-synthesis of real-time distributed embedded systems. The cosynthesis algorithm selects the number of processing elements (PEs) and the type of each PE, allocates functions to PEs, and schedules their executions. Dick and Jha [9] proposed an algorithm for hardware-software cosynthesis of distributed embedded systems, namely MOGAC, which partitions and schedules embedded system specifications consisting of multiple periodic task graphs. Recently, Dave, Lakshminarayana, and Jha [10] proposed a heuristic-based constructive cosynthesis technique called COSYN, which includes allocation, scheduling, and performance estimation steps. Another constructive cosynthesis system, called COFTA, was proposed in [11], which targets fault-tolerant distributed architectures and addresses reliability and availability of the embedded system during cosynthesis. In [12], a heuristic-based cosynthesis technique, called COHRA, was proposed, which takes as input an embedded system specification in terms of hierarchical acyclic task graphs and generates an efficient hierarchical hardware-software architecture that meets the real-time constraints. In [24], Dick and Jha proposed a system synthesis algorithm, called MOCSYN, which synthesizes real-time heterogeneous single-chip hardware-software architectures using an adaptive multiobjective genetic algorithm. A cosynthesis system, called CORDS, which synthesizes multi-rate, real-time, periodic distributed embedded systems containing dynamically re-configurable FPGAs was proposed in [25].

In comparison to previous work on hardware-software cosynthesis, which was based on task graphs as the system model, our cosynthesis method uses an object-oriented (OO) hierarchy, LHA, and SES as system models. The task graph model basically assumes an arbitrary system architecture, without considering any restrictions on subsystem locations. In contrast, our OO model takes realistic physical restrictions into consideration for the target system architecture. This is more appropriate for distributed systems because of their inherent architectural restrictions. Our proposed DESC is a more complete design methodology that not only carries out hardware and software syntheses, but also includes two-level partitioning, performance analysis, and emulation of distributed embedded systems.
IEICE TRANS. INF. & SYST., VOL.E84–D, NO.3 MARCH 2001
Fig. 3 The flow diagram of copartitioning methodology.

is the value of the performance bound associated with part x. Here, it is assumed that each part is associated with only one performance bound. The denominator in CPD(x) is a normalization of the performance difference, which is required for a fair comparison between different parts and performance bounds. The CPD metric is a useful hardware-software partitioning criterion. All objects are sorted in ascending order of their CPD ratios and placed from left to right in a horizontal one-dimensional array called the Movable Linear Array (MLA).

The three models in OMT, namely object, dynamic, and functional, are used for calculating the CPD value of an object. The hardware cost and software cost are given as data attributes for each object. The hardware performance and software performance of an object are evaluated by calling the corresponding functional methods in a class description. Dynamic and functional models are, in fact, implemented as functional methods in a class description. Hence, all three models are required for CPD calculation.

The copartitioning method begins somewhere around the middle of the sorted sequence of objects in MLA. A median object is selected as an initial divider. As shown in Fig. 5, the role played by a divider is that all objects to the right of the divider, including the divider, are implemented in software and the rest (to the left of the divider) are implemented in hardware. The reason that such an implementation is correct is twofold. Firstly, the objects in the left part of the sequence have a greater gain in performance if implemented as hardware (i.e., a larger difference between hardware and software performance) and at a smaller

HW/SW PARTITION(N1, N2, N3, ..., Ni, ..., Nr)
/* N1, N2, N3, ..., Ni, ..., Nr, where Ni is an object */
Generate Immovable Linear Array (ILA) and Movable Linear Array (MLA).
Calculate Cost-Performance Difference (CPD) ratio for each MLA object.
Sort all MLA objects in an ascending order of their CPD ratios such that MLA = M1, M2, M3, ..., Mm
u := Number of PE; Flag := true;
PSS = {} /* PSS: Partition Solution Set */
for (p = 0, p ≤ u, p++)
{
    HW Cost := Max Cost − Cost(PE) × p;
    k := 1, j := m; /* where k is called the lower bound object index, j is called the upper bound object index */
    i := m/2 /* where i represents divider object index */
    Use software to implement objects from Mi to Mj and use hardware to implement objects from Mk to Mi−1
    while Flag = true do
    {
        switch (cost and performance estimations)
        {
        Case 1: Cost constraint not satisfied, but performance constraints satisfied
            j := i;
            i := i − (i − k)/2;
            break;
        Case 2: Cost and performance constraints satisfied
            if the partition result s is heuristically optimal {
                Flag := false; PSS := PSS ∪ {s}; break; }
            else { if performance is more important {
                    k := i;
                    i := i + (j − i)/2; }
                else { j := i; /* cost is more important */
                    i := i − (i − k)/2; }
            }
            break;
        Case 3: Cost constraint satisfied, but performance constraints not satisfied
            k := i;
            i := i + (j − i)/2;
            break;
        Case 4: No satisfactory partition
            Flag := false; break;
        } /* end of switch */
    } /* end of while */
    Flag := true;
}
if PSS = {} print “No partition found”;
else output heuristically optimal partition from PSS;

Fig. 4 Two-level partitioning algorithm (Algorithm 1).
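The search structure of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names Part, bsc, tlp, and toy_estimate, the toy cost and timing model, and all numeric values are hypothetical and chosen only for illustration. The CPD ratios are assumed to be precomputed, and the sketch replaces the index updates of Fig. 4 with a standard closed-interval binary search over the number of hardware parts, since the rounding used in the figure is not stated here. The outer loop plays the role of the iteration over the number of PEs, and the inner search classifies each candidate into the four cases of the switch statement.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Part:
    name: str
    cpd: float      # CPD ratio, assumed to be computed elsewhere
    hw_cost: float  # hypothetical cost when built as hardware
    hw_time: float  # hypothetical latency as hardware
    sw_time: float  # hypothetical latency as software


# An estimator returns (cost_ok, perf_ok) for one candidate partition.
Estimator = Callable[[List[Part], List[Part], int, float], Tuple[bool, bool]]


def bsc(mla: List[Part], p: int, hw_budget: float, estimate: Estimator,
        prefer_performance: bool = True) -> List[Tuple[int, int]]:
    """Binary search over h, the number of leftmost MLA parts built as hardware."""
    feasible = []
    lo, hi = 0, len(mla)
    while lo <= hi:
        h = (lo + hi) // 2
        cost_ok, perf_ok = estimate(mla[:h], mla[h:], p, hw_budget)
        if cost_ok and perf_ok:   # Case 2: record, then keep refining
            feasible.append((p, h))
            if prefer_performance:
                lo = h + 1        # try more hardware
            else:
                hi = h - 1        # try less hardware
        elif perf_ok:             # Case 1: cost violated, use less hardware
            hi = h - 1
        elif cost_ok:             # Case 3: performance violated, use more hardware
            lo = h + 1
        else:                     # Case 4: both violated, stop searching
            break
    return feasible


def tlp(parts: List[Part], max_cpus: int, max_cost: float, cpu_cost: float,
        estimate: Estimator, prefer_performance: bool = True):
    """Outer loop over the CPU count p; each p leaves a smaller hardware budget."""
    mla = sorted(parts, key=lambda part: part.cpd)  # ascending CPD order
    pss = []                                        # partition solution set
    for p in range(max_cpus + 1):
        hw_budget = max_cost - cpu_cost * p
        pss.extend(bsc(mla, p, hw_budget, estimate, prefer_performance))
    return mla, pss


def toy_estimate(hw, sw, p, hw_budget, deadline=10.0):
    """Hypothetical model: additive cost and latency; software needs a CPU."""
    cost_ok = sum(x.hw_cost for x in hw) <= hw_budget
    if sw and p == 0:
        perf_ok = False
    else:
        total = sum(x.hw_time for x in hw) + sum(x.sw_time for x in sw)
        perf_ok = total <= deadline
    return cost_ok, perf_ok


# Hypothetical numbers; only the part names come from the VPMS case study.
PARTS = [
    Part("counter", 1.0, 5.0, 1.0, 4.0),
    Part("motor driver", 2.0, 6.0, 1.0, 3.0),
    Part("sensor driver", 0.5, 4.0, 1.0, 5.0),
]

if __name__ == "__main__":
    mla, pss = tlp(PARTS, max_cpus=3, max_cost=12.0, cpu_cost=3.0,
                   estimate=toy_estimate)
    print([part.name for part in mla])
    print(pss)  # list of (number of CPUs, number of hardware parts)
```

Each entry of the returned solution set is a pair (p, h), meaning that p CPUs are used and the h leftmost MLA objects are implemented in hardware. Selecting the final partition from this set according to cost or response time is left to the designer, as in the text.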
THEOREM 2: Once a feasible partition is found, the final heuristically optimal partition found by the copartitioning algorithm is always feasible.
Proof: Suppose a feasible partition Z is found and the final partition found, Z′, is not feasible. Here, one of three cases may occur:
Case (1) Only cost(Z′) does not satisfy the bound: If divider(Z′) is on the left of divider(Z), then BSC will return either some partition Z″ with divider(Z″) between divider(Z′) and divider(Z) or Z itself. If divider(Z′) is on the right of divider(Z), then this implies that cost(Z) also does not satisfy the cost bound. In either case, a contradiction arises.
Case (2) Only performance(Z′) does not satisfy the bound: If divider(Z′) is on the left of divider(Z), then this implies that performance(Z) also does not satisfy the performance bound. If divider(Z′) is on the right of divider(Z), then BSC will return either some partition Z″ with divider(Z″) between divider(Z′) and divider(Z) or Z itself. In either case, a contradiction arises.
Case (3) Both cost(Z′) and performance(Z′) do not satisfy the bounds: By Theorem 3, there is no feasible partition for the system. A contradiction follows.
From the above three cases, we have a contradiction arising in each case. Hence, our assumption is wrong, that is, if a feasible partition is ever found, the final partition found by BSC would be feasible.

THEOREM 3: If a completely infeasible partition (both cost and performance constraints are not satisfied) is ever found during BSC, then there exists no feasible partition for the system. Hence, the partitioning algorithm can stop searching.
Proof: Suppose a completely infeasible partition Z is found during the binary search and suppose there exists a feasible partition Z′ for the system. The following two cases occur:
Case (1) Divider(Z′) is on the left of divider(Z): This implies that performance(Z′) does not satisfy the minimum performance bound of the system. A contradiction follows.
Case (2) Divider(Z′) is on the right of divider(Z): This implies that cost(Z′) does not satisfy the maximum cost bound of the system. A contradiction follows.

When all objects in a system satisfy the two assumptions on hardware-software cost and performance, TLP will find a heuristically optimal solution. But, if there are one or more objects whose hardware-software cost and performance do not satisfy the two assumptions, then TLP will not be able to find a feasible solution as TLP is not an exhaustive approach.

The hierarchical approach adopted in TLP is different from the traditional task-graph based hardware-software partitioning [7]–[12]. The conventional approach uses task-graphs to represent a distributed embedded system such that each task graph represents a sequential execution of processes with possible branching, either deterministic or non-deterministic. Usually cycles are not allowed in task-graphs. Two or more task-graphs execute concurrently and may communicate with each other, exchanging data. Hierarchy among task-graphs is generally represented as flattened-out graphs. This often incurs redundancy in representation and unnecessarily repeated procedures in the partitioning process.

TLP is based on object-oriented design modeling and is a hierarchical process, which is more appropriate for distributed architectures because they are inherently hierarchical. Distributed architectures normally have physical constraints related to the environment; for example, some subsystems must be located within some pre-specified distance, and some subsystems may or may not share an ASIC or a PE. Such physical restrictions are either difficult to model in task-graphs or not considered in previous work on partitioning. TLP considers all such physical constraints and is thus more realistic when distributed architectures are the targets.

In summary, the two-level partitioning algorithm presented in this section is more appropriate for distributed embedded systems than conventional partitioning algorithms due to its consideration of the distributed characteristics of the systems.

3.3 Cosynthesis and Emulation Phase

In this section, we introduce how to cosynthesize and emulate the resulting system obtained from partitioning. Cosynthesis includes hardware design and software design. Checking for design feasibility through rapid prototyping using a hardware-software emulator is introduced in the emulation phase of codesign.

Given the partition results from the previous phase, hardware is synthesized by an object-oriented system-level design methodology called ICOS [2] and software by scheduling tasks on processors. The synthesized system designs are then checked for feasibility by rapid prototyping, which includes hardware and software emulation and testing. This design phase, presented below, produces design results in the form of feasible synthesized distributed hardware-software systems.

3.3.1 Hardware Synthesis

The Intelligent Concurrent Object-oriented Synthesis (ICOS) [2] methodology was used for hardware synthesis in DESC for the following three reasons. (1)
driver, entry sensor driver, exit motor driver, and entry motor driver. In this example, we assume that the ENTRY Management Subsystem and the EXIT Management Subsystem are located near each other such that the two sensor drivers can be implemented as one component (ASIC or CPU) and the two motor drivers can also be implemented as another component, both to be shared by the two subsystems. Henceforth, we will consider only three parts (counter, sensor driver, and motor driver) for copartitioning.

When applied to VPMS, TLP assumes by default that at most three CPUs are used for software implementation. In Table 1, the CPD values for each of the three parts were calculated according to Eq. (1) and the objects were arranged in ascending order in the MLA array. The resulting order is sensor driver, counter, motor driver. In the CSE level, TLP iterates through 0, 1, 2, and 3 CPUs. For each selected number of CPUs, the BSC level determines feasible partitions.

Table 2 shows each iteration of applying TLP to the VPMS example. An analysis of this application shows that an exhaustive search for the heuristically optimal partitioning requires evaluating 20 different partitions. This number would be much greater when larger examples are considered.

From Table 2, we observe that when all three parts are implemented as hardware (i.e., # CPU = 0), there is no feasible partition. Here, feasibility implies satisfaction of all cost and performance constraints. For each iteration of the CSE level, the divider is selected to be the counter part. We see that out of 20 possible partitions, TLP evaluates only 6 partitions, two of which are feasible. The heuristically optimal partition is then selected to be the better of the two feasible partitions. Here, “best” is defined in terms of minimum cost or minimum response time, as preferred by the designer. From Table 3, it is observed that at least for VPMS, TLP finds the heuristically optimal partitioning by evaluating only 30% of
[23] J.G. D’Ambrosio and X. Hu, “Configuration-level hardware/software partitioning for real-time embedded systems,” Proc. International Workshop on Hardware-Software Co-Design, pp.34–41, 1994.
[24] R.P. Dick and N.K. Jha, “MOCSYN: Multiobjective core-based single-chip system synthesis,” Design Automation and Test, Europe (DATE’99), pp.263–270, ACM Press, March 1999.
[25] R.P. Dick and N.K. Jha, “CORDS: Hardware-software co-synthesis of reconfigurable real-time distributed embedded systems,” IEEE/ACM International Conference on Computer-Aided Design (ICCAD’98), pp.62–68, ACM Press, Nov. 1998.
[26] J. Rumbaugh, M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen, Object-oriented modeling and design, Prentice-Hall, Englewood Cliffs, NJ, USA, 1991.
[27] T.A. Henzinger, P.-H. Ho, and H. Wong-Toi, “A user guide to HyTech,” Proc. International Conference on Tools and Algorithms for the Construction and Analysis of Systems, LNCS, vol.1019, pp.41–71, Springer Verlag, 1995.
[28] P.-A. Hsiung, “Timing coverification of concurrent embedded real-time systems,” Proc. 7th International Workshop on Hardware/Software Codesign, pp.110–114, ACM Press, May 1999.
[29] J.-M. Fu and S.-J. Chen, “Hardware-software coverification of distributed embedded systems,” Proc. 1999 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’99), vol.6, pp.2995–3001, June 1999.
[30] D. Landskov, S. Davidson, B. Shriver, and P. Mallet, “Local microcode compaction techniques,” ACM Computing Surveys, vol.12, no.3, pp.261–294, Sept. 1980.
[31] J.Y.-T. Leung and J. Whitehead, “On the complexity of fixed-priority scheduling of periodic, real-time tasks,” Performance Evaluation, vol.2, pp.237–250, 1982.
[32] K. Ramamritham, “Allocation and scheduling of complex periodic tasks,” Proc. International Conference on Distributed Computing Systems, pp.108–115, 1990.
[33] D.-T. Peng and K.G. Shin, “Static allocation of periodic tasks with precedence constraints,” Proc. International Conference on Distributed Computing Systems, pp.190–198, 1989.
[34] D.W. Leinbaugh and M.-R. Yamani, “Guaranteed response times in a distributed hard-real-time environment,” Proc. Real-Time Systems Symposium, pp.157–169, 1982.
[35] J.P. Lehoczky and Lui Sha, “Performance of real-time bus scheduling algorithms,” ACM Performance Evaluation review, May 1986.
[36] J.-F. Lin, W.-B. See, and S.-J. Chen, “Performance bounds on scheduling parallel tasks with communication cost,” IEICE Trans. Inf. & Syst., vol.E78-D, no.3, pp.263–268, March 1995.
[37] J.-F. Lin and S.-J. Chen, “An analysis of multiprocessor tasks scheduling,” Computer Systems Science and Engineering, vol.11, no.2, pp.117–120, March 1996.
[38] J.-F. Lin and S.-J. Chen, “Scheduling algorithm for non-preemptive multiprocessor tasks,” Computers and Mathematics with Applications, vol.28, no.4, pp.85–92, 1994.
[39] T.-Y. Lee, P.-A. Hsiung, and S.-J. Chen, “A case study in hardware-software codesign of distributed systems - Vehicle parking management system,” Proc. 1999 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA’99), vol.6, pp.2982–2987, June 1999.
[40] Intel, Embedded microcontrollers and processors, vol.1, Intel Corporation, 1993.

Trong-Yen Lee received the B.S. and M.S. degrees in industrial education (major in electronic engineering) from National Taiwan Normal University, Taiwan, ROC, in 1981 and 1988, respectively, and the Ph.D. degree in electrical engineering from the National Taiwan University, Taiwan, ROC, in Jan. 2001. He has been teaching electronic engineering in Nankang Vocational High School, Taipei, Taiwan since 1980. His current research interests include hardware-software codesign, parallel architectures, simulation, and design automation systems.

Pao-Ann Hsiung received the B.S. degree in mathematics and the Ph.D. degree in electrical engineering from the National Taiwan University (NTU), Taipei, Taiwan, ROC, in 1991 and 1996, respectively. From 1993 to 1996, he was a Teaching Assistant and System Administrator in the Department of Mathematics, NTU. From 1996 to 2000, he was a post-doctoral researcher at the Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC. Currently, he is an assistant professor at the Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan. His main research interests include: hardware-software codesign, real-time systems, formal verification, system-level design automation of multiprocessor systems, and object-oriented technology transfer.

Sao-Jie Chen received the B.S. and M.S. degrees in electrical engineering from the National Taiwan University, Taipei, Taiwan, ROC, in 1977 and 1982 respectively, and the Ph.D. degree in electrical engineering from the Southern Methodist University, Dallas, USA, in 1988. Since 1982, he has been a member of the faculty in the Department of Electrical Engineering, National Taiwan University, where he is currently a full professor. From 1985 to 1988, he was on leave from National Taiwan University and working toward his Ph.D. at Southern Methodist University. During the fall of 1999, he was a visiting scholar in the Department of Computer Science and Engineering, University of California, San Diego. His current research interests include: VLSI circuits design, VLSI physical design automation, object-oriented software engineering, and multiprocessor architecture design and simulation. Dr. Chen is a member of the Chinese Institute of Engineers, the Association for Computing Machinery, the IEEE, and the IEEE Computer Society.