
Introduction

• What are embedded computing systems?


• Challenges in embedded computing
system design.
• Design methodologies.

Overheads for Computers as Components, 2nd ed. © 2008 Wayne Wolf
Definition
• Embedded computing system: any device
that includes a programmable computer
but is not itself a general-purpose
computer.
• Take advantage of application
characteristics to optimize the design:
• don’t need all the general-purpose bells and
whistles.

Embedding a computer

[Figure: an embedded computer (CPU and memory) with analog input and analog output.]
Examples
• Cell phone.
• Printer.
• Automobile: engine, brakes, dash, etc.
• Airplane: engine, flight controls,
nav/comm.
• Digital television.
• Household appliances.

Early history
• Late 1940’s: MIT Whirlwind computer was
designed for real-time operations.
• Originally designed to control an aircraft
simulator.
• First microprocessor was Intel 4004 in
early 1970’s.
• HP-35 calculator used several chips to
implement a microprocessor in 1972.

Early history, cont’d.
• Automobiles used microprocessor-based
engine controllers starting in 1970’s.
• Control fuel/air mixture, engine timing, etc.
• Multiple modes of operation: warm-up,
cruise, hill climbing, etc.
• Provides lower emissions, better fuel
efficiency.

Microprocessor varieties
• Microcontroller: includes I/O devices, on-
board memory.
• Digital signal processor (DSP):
microprocessor optimized for digital signal
processing.
• Typical embedded word sizes: 8-bit, 16-
bit, 32-bit.

Application examples
• Simple control: front panel of microwave
oven, etc.
• Canon EOS 3 has three microprocessors.
• 32-bit RISC CPU runs autofocus and eye
control systems.
• Digital TV: programmable CPUs +
hardwired logic for video/audio decode,
menus, etc.

Automotive embedded systems
• Today’s high-end automobile may have
100 microprocessors:
• 4-bit microcontroller checks seat belt;
• microcontrollers run dashboard devices;
• 16/32-bit microprocessor controls engine.

BMW 850i brake and stability
control system
• Anti-lock brake system (ABS): pumps
brakes to reduce skidding.
• Automatic stability control (ASC+T):
controls engine to improve stability.
• ABS and ASC+T communicate.
• ABS was introduced first, so ASC+T needed to
interface to the existing ABS module.

BMW 850i, cont’d.

[Figure: the ABS unit and its hydraulic pump connected to four brakes, each brake with its own sensor.]
Characteristics of embedded
systems
• Sophisticated functionality.
• Real-time operation.
• Low manufacturing cost.
• Low power.
• Designed to tight deadlines by small
teams.

Functional complexity
• Often have to run sophisticated
algorithms or multiple algorithms.
• Cell phone, laser printer.
• Often provide sophisticated user
interfaces.

Real-time operation
• Must finish operations by deadlines.
• Hard real time: missing deadline causes
failure.
• Soft real time: missing deadline results in
degraded performance.
• Many systems are multi-rate: must handle
operations at widely varying rates.

Non-functional requirements
• Many embedded systems are mass-
market items that must have low
manufacturing costs.
• Limited memory, microprocessor power, etc.
• Power consumption is critical in battery-
powered devices.
• Excessive power consumption increases
system cost even in wall-powered devices.

Design teams
• Often designed by a small team of
designers.
• Often must meet tight deadlines.
• A 6-month market window is common.
• Can’t miss back-to-school window for
calculator.

Why use microprocessors?
• Alternatives: field-programmable gate
arrays (FPGAs), custom logic, etc.
• Microprocessors are often very efficient:
can use same logic to perform many
different functions.
• Microprocessors simplify the design of
families of products.

The performance paradox
• Microprocessors use much more logic to
implement a function than does custom
logic.
• But microprocessors are often at least as
fast:
• heavily pipelined;
• large design teams;
• aggressive VLSI technology.

Power
• Custom logic uses less power, but CPUs have
advantages:
• Modern microprocessors offer features to
help control power consumption.
• Software design techniques can help reduce
power consumption.
• Heterogeneous systems: some custom logic for
well-defined functions, CPUs+software for
everything else.

Platforms
• Embedded computing platform: hardware
architecture + associated software.
• Many platforms are multiprocessors.
• Examples:
• Single-chip multiprocessors for cell phone
baseband.
• Automotive network + processors.

The physics of software
• Computing is a physical act.
• Software doesn’t do anything without
hardware.
• Executing software consumes energy,
requires time.
• To understand the dynamics of software
(time, energy), we need to characterize
the platform on which the software runs.

What does “performance”
mean?
• In general-purpose computing,
performance often means average-case
performance, which may not be well defined.
• In real-time systems, performance means
meeting deadlines.
• Missing the deadline by even a little is bad.
• Finishing ahead of the deadline may not help.

Characterizing performance
• We need to analyze the system at several
levels of abstraction to understand
performance:
• CPU.
• Platform.
• Program.
• Task.
• Multiprocessor.

Challenges in embedded
system design
• How much hardware do we need?
• How big is the CPU? Memory?
• How do we meet our deadlines?
• Faster hardware or cleverer software?
• How do we minimize power?
• Turn off unnecessary logic? Reduce memory
accesses?

Challenges, etc.
• Does it really work?
• Is the specification correct?
• Does the implementation meet the spec?
• How do we test for real-time characteristics?
• How do we test on real data?
• How do we work on the system?
• Observability, controllability?
• What is our development platform?

Design methodologies
• A procedure for designing a system.
• Understanding your methodology helps
you ensure you didn’t skip anything.
• Compilers, software engineering tools,
computer-aided design (CAD) tools, etc.,
can be used to:
• help automate methodology steps;
• keep track of the methodology itself.

Design goals
• Performance.
• Overall speed, deadlines.
• Functionality and user interface.
• Manufacturing cost.
• Power consumption.
• Other requirements (physical size, etc.)

Levels of abstraction

requirements
specification
architecture
component design
system integration
Top-down vs. bottom-up
• Top-down design:
• start from most abstract description;
• work to most detailed.
• Bottom-up design:
• work from small components to big system.
• Real design uses both techniques.

Stepwise refinement
• At each level of abstraction, we must:
• analyze the design to determine
characteristics of the current state of the
design;
• refine the design to add detail.

Requirements
• Plain language description of what the
user wants and expects to get.
• May be developed in several ways:
• talking directly to customers;
• talking to marketing representatives;
• providing prototypes to users for comment.

Functional vs. non-functional
requirements
• Functional requirements:
• output as a function of input.
• Non-functional requirements:
• time required to compute output;
• size, weight, etc.;
• power consumption;
• reliability;
• etc.

Our requirements form

name
purpose
inputs
outputs
functions
performance
manufacturing cost
power
physical size/weight

Example: GPS moving map
requirements

• Moving map obtains position from GPS, paints
map from local database.
[Figure: sample map display showing I-78 and Scotch Road, with the readout lat: 40 13 lon: 32 19.]
GPS moving map needs
• Functionality: For automotive use. Show major
roads and landmarks.
• User interface: At least 400 x 600 pixel screen.
Three buttons max. Pop-up menu.
• Performance: Map should scroll smoothly. No
more than 1 sec power-up. Lock onto GPS
within 15 seconds.
• Cost: $120 street price = approx. $30 cost of
goods sold.

GPS moving map needs, cont’d.
• Physical size/weight: Should fit in hand.
• Power consumption: Should run for 8
hours on four AA batteries.

GPS moving map
requirements form
name                  GPS moving map
purpose               consumer-grade moving map for driving
inputs                power button, two control buttons
outputs               back-lit LCD, 400 x 600
functions             5-receiver GPS; three resolutions; displays current lat/lon
performance           updates screen within 0.25 sec of movement
manufacturing cost    $100 cost of goods sold
power                 100 mW
physical size/weight  no more than 2" x 6", 12 oz.
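
The power and run-time figures can be cross-checked with a little arithmetic. The sketch below multiplies the 100 mW from the form by the 8-hour run time stated earlier. The function name is illustrative only, and real battery capacity depends on chemistry and load, which this ignores.

```python
def energy_budget_wh(power_mw: float, hours: float) -> float:
    """Energy in watt-hours needed to run at power_mw for the given hours."""
    return power_mw / 1000.0 * hours

total_wh = energy_budget_wh(100, 8)   # whole device over 8 hours
per_cell_wh = total_wh / 4            # share carried by each of four AA cells
```

Under these figures the device needs about 0.8 Wh in total, or about 0.2 Wh per cell.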

Specification
• A more precise description of the system:
• should not imply a particular architecture;
• provides input to the architecture design
process.
• May include functional and non-functional
elements.
• May be executable or may be in
mathematical form for proofs.

GPS specification
• Should include:
• What is received from GPS;
• map data;
• user interface;
• operations required to satisfy user requests;
• background operations needed to keep the
system running.

Architecture design
• What major components go into satisfying the
specification?
• Hardware components:
• CPUs, peripherals, etc.
• Software components:
• major programs and their operations.
• Must take into account functional and
non-functional specifications.

GPS moving map block diagram

[Figure: block diagram with GPS receiver, search engine, renderer, display, database, and user interface.]

GPS moving map hardware
architecture

[Figure: hardware architecture with CPU, memory, GPS receiver, frame buffer, display, and panel I/O.]

GPS moving map software
architecture

[Figure: software architecture with position, database search, renderer, pixels, user interface, and timer.]

Designing hardware and
software components
• Must spend time architecting the system
before you start coding.
• Some components are ready-made, some
can be modified from existing designs,
others must be designed from scratch.

System integration
• Put together the components.
• Many bugs appear only at this stage.
• Have a plan for integrating components to
uncover bugs quickly, test as much
functionality as early as possible.

Summary
• Embedded computers are all around us.
• Many systems have complex embedded
hardware and software.
• Embedded systems pose many design
challenges: design time, deadlines, power,
etc.
• Design methodologies help us manage the
design process.

Introduction
• Object-oriented design.
• Unified Modeling Language (UML).

System modeling
• Need languages to describe systems:
• useful across several levels of abstraction;
• understandable within and between
organizations.
• Block diagrams are a start, but don’t cover
everything.

Object-oriented design
• Object-oriented (OO) design: A
generalization of object-oriented
programming.
• Object = state + methods.
• State provides each object with its own
identity.
• Methods provide an abstract interface to the
object.

Objects and classes
• Class: object type.
• Class defines the object’s state elements
but state values may change over time.
• Class defines the methods used to interact
with all objects of that type.
• Each object has its own state.
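
A minimal sketch of the class/object distinction, written in Python for illustration. The Display name and its pixels state are borrowed from the display example in these overheads; the constructor arguments and the list-of-lists storage are assumptions made here.

```python
class Display:
    """A class: defines the state elements and methods shared by all displays."""

    def __init__(self, width: int, height: int) -> None:
        # State: each object gets its own 2-D pixel array.
        self.pixels = [[0] * width for _ in range(height)]

    def set_pixel(self, x: int, y: int, value: int) -> None:
        """Method: part of the interface used to change the object's state."""
        self.pixels[y][x] = value

    def pixel(self, x: int, y: int) -> int:
        """Method: read one element of the object's state."""
        return self.pixels[y][x]

d1 = Display(4, 3)
d2 = Display(4, 3)
d1.set_pixel(1, 2, 255)   # changes d1's state only; d2 is untouched
```

Both objects share the methods defined by the class, but a write to d1 is not visible in d2.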

OO design principles
• Some objects will closely correspond to
real-world objects.
• Some objects may be useful only for
description or implementation.
• Objects provide interfaces to read/write
state, hiding the object’s implementation
from the rest of the system.

UML
• Developed by Booch et al.
• Goals:
• object-oriented;
• visual;
• useful at many levels of abstraction;
• usable for all aspects of design.

UML object
[Figure: UML object d1: Display, where d1 is the object name and Display is the class name. Attributes: pixels: array[] of pixels, elements, menu_items. Comment: pixels is a 2-D array.]
UML class

[Figure: UML class with class name Display. Attributes: pixels, elements, menu_items. Operations: mouse_click(), draw_box().]

The class interface
• The operations provide the abstract
interface between the class’s
implementation and other classes.
• Operations may have arguments, return
values.
• An operation can examine and/or modify
the object’s state.

Choose your interface properly
• If the interface is too small/specialized:
• object is hard to use for even one application;
• even harder to reuse.
• If the interface is too large:
• class becomes too cumbersome for designers to
understand;
• implementation may be too slow;
• spec and implementation are probably buggy.

Relationships between objects
and classes
• Association: objects communicate but one
does not own the other.
• Aggregation: a complex object is made of
several smaller objects.
• Composition: aggregation in which owner
does not allow access to its components.
• Generalization: define one class in terms
of another.

Class derivation

• May want to define one class in terms of
another.
• Derived class inherits attributes, operations of
base class.
[Figure: UML generalization from Derived_class to Base_class.]

Class derivation example

[Figure: base class Display (attributes pixels, elements, menu_items; operations pixel(), set_pixel(), mouse_click(), draw_box()) with derived classes BW_display and Color_map_display.]
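
The derivation in this example can be sketched in Python as follows. The class names follow the figure (respelled in Python style); the dictionary storage, the color_map attribute, and the color() operation are assumptions added for illustration, not part of the original diagram.

```python
class Display:
    """Base class: state and operations shared by every display."""

    def __init__(self):
        self.pixels = {}        # (x, y) -> stored value
        self.menu_items = []

    def set_pixel(self, x, y, value):
        self.pixels[(x, y)] = value

    def pixel(self, x, y):
        return self.pixels.get((x, y), 0)


class BWDisplay(Display):
    """Derived class: overrides set_pixel to store only 0 or 1."""

    def set_pixel(self, x, y, value):
        super().set_pixel(x, y, 1 if value else 0)


class ColorMapDisplay(Display):
    """Derived class: inherits the pixel state and adds a color map."""

    def __init__(self, color_map):
        super().__init__()
        self.color_map = color_map   # index -> (r, g, b)

    def color(self, x, y):
        return self.color_map[self.pixel(x, y)]


bw = BWDisplay()
bw.set_pixel(0, 0, 200)
cm = ColorMapDisplay({0: (0, 0, 0), 1: (255, 0, 0)})
cm.set_pixel(2, 3, 1)
```

Both derived classes reuse pixel() unchanged, which is the point of inheriting from the base class.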
Multiple inheritance
[Figure: derived class Multimedia_display inherits from two base classes, Speaker and Display.]

Links and associations
• Link: describes relationships between
objects.
• Association: describes relationship
between classes.

Link example

• Link defines the contains relationship:
[Figure: a message set object (count = 2) linked to two message objects: msg = msg1, length = 1102 and msg = msg2, length = 2114.]

Association example

[Figure: association "contains" between class message (msg: ADPCM_stream, length : integer) and class message set (count : integer). Multiplicities: 0..* contained messages, 1 containing message set.]
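
One way to picture the association and its links in code is sketched below. The attribute names come from the diagram; the use of bytes to stand in for ADPCM_stream, the add() method, and deriving count from the list are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Message:
    msg: bytes        # stands in for the ADPCM_stream attribute
    length: int

@dataclass
class MessageSet:
    # The 0..* end of the association: a set may hold any number of messages.
    messages: List[Message] = field(default_factory=list)

    def add(self, message: Message) -> None:
        """Create a link: this set now contains the message."""
        self.messages.append(message)

    @property
    def count(self) -> int:
        return len(self.messages)

mset = MessageSet()
mset.add(Message(b"msg1", 1102))
mset.add(Message(b"msg2", 2114))
```

The class definitions play the role of the association; the two add() calls create the individual links shown in the link example.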

Stereotypes
• Stereotype: recurring combination of
elements in an object or class.
• Example:
• <<foo>>

Behavioral description
• Several ways to describe behavior:
• internal view;
• external view.

State machines

[Figure: two states, each labeled with its state name (a and b), joined by a transition.]

Event-driven state machines
• Behavioral descriptions are written as
event-driven state machines.
• Machine changes state when receiving an
input.
• An event may come from inside or outside
of the system.
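
A table-driven sketch of an event-driven state machine. The state and event names here are hypothetical, loosely modeled on a mouse-driven display; ignoring undefined events is a design choice of this sketch, not something UML requires.

```python
class StateMachine:
    def __init__(self, start, transitions):
        self.state = start
        self.transitions = transitions   # (state, event) -> next state

    def handle(self, event):
        """Change state if the event is defined for the current state; otherwise stay put."""
        self.state = self.transitions.get((self.state, event), self.state)
        return self.state

TRANSITIONS = {
    ("start", "mouse_click"): "region_found",
    ("region_found", "region_is_menu"): "got_menu_item",
    ("region_found", "region_is_drawing"): "found_object",
}

m = StateMachine("start", TRANSITIONS)
m.handle("mouse_click")
m.handle("timer")            # not defined for this state, so nothing changes
m.handle("region_is_menu")
```

Keeping the transitions in a table makes it easy to compare the code against the diagram one arrow at a time.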

Types of events
• Signal: asynchronous event.
• Call: synchronized communication.
• Timer: activated by time.

Signal event

<<signal>>
[Figure: declaration of the signal mouse_click with attributes leftorright: button and x, y: position. The event description mouse_click(x,y,button) labels the transition from state a to state b.]
Call event

[Figure: the call event draw_box(10,5,3,2,blue) labels the transition from state c to state d.]

Timer event

[Figure: the timer event tm(time-value) labels the transition from state e to state f.]

Example state machine

[Figure: state machine with transitions labeled input/output. From start, mouse_click(x,y,button)/find_region(region) leads to region found. If region = menu, which_menu(i) leads to got menu item and call_menu(i) leads to called menu item. If region = drawing, find_object(objid) leads to found object and highlight(objid) leads to object highlighted. Both paths end at finish.]
Sequence diagram
• Shows sequence of operations over time.
• Relates behaviors of multiple objects.

Sequence diagram example

[Figure: sequence diagram with objects m: Mouse, d1: Display, and u: Menu. In time order: mouse_click(x,y,button), which_menu(x,y,i), call_menu(i).]

Summary
• Object-oriented design helps us organize
a design.
• UML is a transportable system design
language.
• Provides structural and behavioral description
primitives.

Introduction
• Example: model train controller.

Overheads for Computers as Components, 2nd ed. © 2000 Morgan Kaufmann
Purposes of example
• Follow a design through several levels of
abstraction.
• Gain experience with UML.

Model train setup

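[Figure: model train setup. A console, fed by the power supply, sends messages over the track to the receiver (rcvr) and motor in each train. Message fields: header, address, command, ECC.]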


Requirements
• Console can control 8 trains on 1 track.
• Throttle has at least 63 levels.
• Inertia control adjusts responsiveness
with at least 8 levels.
• Emergency stop button.
• Error detection scheme on messages.

Requirements form

name                  model train controller
purpose               control speed of <= 8 model trains
inputs                throttle, inertia, emergency stop, train #
outputs               train control signals
functions             set engine speed w. inertia; emergency stop
performance           can update train speed at least 10 times/sec
manufacturing cost    $50
power                 wall powered
physical size/weight  console comfortable for 2 hands; < 2 lbs.

Digital Command Control
• DCC created by model railroad hobbyists,
picked up by industry.
• Defines way in which model trains,
controllers communicate.
• Leaves many system design aspects open,
allowing competition.
• This is a simple example of a big trend:
• Cell phones, digital TV rely on standards.

DCC documents
• Standard S-9.1, DCC Electrical Standard.
• Defines how bits are encoded on the rails.
• Standard S-9.2, DCC Communication
Standard.
• Defines packet format and semantics.

DCC electrical standard

• Voltage moves around the power supply
voltage; adds no DC component.
• 1 is 58 µs, 0 is at least 100 µs.
[Figure: waveform over time showing a logic 1 (58 µs) and a logic 0 (>= 100 µs).]

DCC communication standard
• Basic packet format: PSA(sD)+E.
• P: preamble = 1111111111.
• S: packet start bit = 0.
• A: address data byte.
• s: data byte start bit.
• D: data byte (data payload).
• E: packet end bit = 1.
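
The packet format above can be sketched as a function that assembles the bit string. This is an illustration under stated assumptions, not a conforming DCC implementation: the data byte start bit is taken to be 0, bytes are sent most significant bit first, and the last data byte of a baseline packet is taken to be the XOR of the bytes before it. The function names are invented here.

```python
PREAMBLE = "1" * 10   # P: the preamble shown above

def byte_bits(value: int) -> str:
    """Eight bits, most significant bit first (assumed bit order)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("byte out of range")
    return format(value, "08b")

def dcc_packet(address: int, data: list) -> str:
    """Return the packet PSA(sD)+E as a string of '0'/'1' characters."""
    if not data:
        raise ValueError("at least one data byte is required")
    bits = PREAMBLE + "0" + byte_bits(address)   # P, S, A
    for d in data:
        bits += "0" + byte_bits(d)               # s (assumed 0), D
    return bits + "1"                            # E

def baseline_packet(address: int, instruction: int) -> str:
    # Assumption: the error byte is the XOR of the address and instruction bytes.
    return dcc_packet(address, [instruction, address ^ instruction])

packet = baseline_packet(0x03, 0x64)
```

With one address byte and two data bytes, the packet is 10 + 1 + 8 + 2 x 9 + 1 = 38 bits long.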

DCC packet types
• Baseline packet: minimum packet that
must be accepted by all DCC
implementations.
• Address data byte gives receiver address.
• Instruction data byte gives basic instruction.
• Error correction data byte gives ECC.

Conceptual specification
• Before we create a detailed specification,
we will make an initial, simplified
specification.
• Gives us practice in specification and UML.
• Good idea in general to identify potential
problems before investing too much effort in
detail.

Basic system commands

command name   parameters
set-speed      speed (positive/negative)
set-inertia    inertia-value (non-negative)
estop          none

Typical control sequence

[Figure: sequence diagram of messages sent from :console to :train_rcvr, in time order: set-inertia, set-speed, set-speed, estop, set-speed.]

Message classes

[Figure: class command with derived classes set-speed (value: integer), set-inertia (value: unsigned-integer), and estop.]

Roles of message classes
• Implemented message classes derived
from message class.
• Attributes and operations will be filled in for
detailed specification.
• Implemented message classes specify
message type by their class.
• May have to add type as parameter to data
structure in implementation.

Subsystem collaboration
diagram

Shows relationship between console and
receiver (ignores role of track):
[Figure: :console sends 1..n: command to :receiver.]

System structure modeling
• Some classes define non-computer
components.
• Denote by name* (for example, sender*).
• Choose important systems at this point to
show basic relationships.

Major subsystem roles
• Console:
• read state of front panel;
• format messages;
• transmit messages.
• Train:
• receive message;
• interpret message;
• control the train.

Console system classes

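[Figure: a console is made up of one panel, one formatter, and one transmitter (each 1 to 1). Below them, with 1-to-1 multiplicities, are the physical objects receiver* and sender*.]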

Console class roles
• panel: describes analog knobs and
interface hardware.
• formatter: turns knob settings into bit
streams.
• transmitter: sends data on track.

Train system classes

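[Figure: a train set contains 1..t trains. Each train has one receiver, one controller, and one motor interface (each 1 to 1). Below them, with 1-to-1 multiplicities, are the physical objects detector* and pulser*.]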

Train class roles
• receiver: digitizes signal from track.
• controller: interprets received commands
and makes control decisions.
• motor interface: generates signals
required by motor.

Detailed specification
• We can now fill in the details of the
conceptual specification:
• more classes;
• behaviors.
• Sketching out the spec first helps us
understand the basic relationships in the
system.

Train speed control

• Motor controlled by pulse width modulation:
[Figure: pulse waveform of voltage V between + and -, with the duty cycle marked.]
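
A small sketch of how a speed setting might map to a duty cycle. The linear mapping, the 63-level default (taken from the throttle requirement), and the 12 V supply in the usage line are all assumptions for illustration; a real motor's speed is not exactly proportional to duty cycle.

```python
def duty_cycle(speed_level: int, max_level: int = 63) -> float:
    """Fraction of each period the pulse is high, for a speed level 0..max_level."""
    if not 0 <= speed_level <= max_level:
        raise ValueError("speed level out of range")
    return speed_level / max_level

def average_voltage(duty: float, supply_volts: float) -> float:
    """Mean voltage over one period of an ideal pulse train."""
    return duty * supply_volts

one_third = average_voltage(duty_cycle(21), 12.0)   # hypothetical 12 V supply
```

Level 21 of 63 gives a duty cycle of one third, so the motor sees about 4 V on average from the assumed 12 V supply.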

Console physical object
classes

knobs*
  train-knob: integer
  speed-knob: integer
  inertia-knob: unsigned-integer
  emergency-stop: boolean

pulser*
  pulse-width: unsigned-integer
  direction: boolean

sender*
  send-bit()

detector*
  read-bit() : integer

Panel and motor interface
classes

panel
  train-number() : integer
  speed() : integer
  inertia() : integer
  estop() : boolean
  new-settings()

motor-interface
  speed: integer

Class descriptions
• panel class defines the controls.
• new-settings() behavior reads the controls.
• motor-interface class defines the motor
speed held as state.

Transmitter and receiver
classes

transmitter
  send-speed(adrs: integer, speed: integer)
  send-inertia(adrs: integer, val: integer)
  set-estop(adrs: integer)

receiver
  current: command
  new: boolean
  read-cmd()
  new-cmd() : boolean
  rcv-type(msg-type: command)
  rcv-speed(val: integer)
  rcv-inertia(val: integer)
Class descriptions
• transmitter class has one behavior for
each type of message sent.
• receiver class provides methods to:
• detect a new message;
• determine its type;
• read its parameters (estop has no
parameters).

Formatter class

formatter
current-train: integer
current-speed[ntrains]: integer
current-inertia[ntrains]:
unsigned-integer
current-estop[ntrains]: boolean
send-command()
panel-active() : boolean
operate()

Formatter class description
• Formatter class holds state for each train,
setting for current train.
• The operate() operation performs the
basic formatting task.

Control input cases
• Use a soft panel to show current panel
settings for each train.
• Changing train number:
• must change soft panel settings to reflect
current train’s speed, etc.
• Controlling throttle/inertia/estop:
• read panel, check for changes, perform
command.

Control input sequence
diagram

[Figure: sequence diagram with objects :knobs, :panel, :formatter, and :transmitter. Change in speed/inertia/estop: a change in control settings is picked up by read panel, which returns panel settings; panel-active detects the change and send-command leads to send-speed, send-inertia, or send-estop. Change in train number: read panel returns panel settings, then new-settings and set-knobs update the panel.]
Formatter operate behavior

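[Figure: formatter operate state machine. From idle, panel-active() is evaluated; a new train number leads to update-panel(), any other change leads to send-command(), and both return to idle.]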

Panel-active behavior

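[Figure: panel-active flowchart. Test panel*:read-train(): if true (T), current-train = train-knob, update-screen, changed = true; if false (F), continue. Test panel*:read-speed(): if true (T), current-speed = throttle, changed = true; if false (F), continue. Similar tests follow for the remaining controls.]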
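
The panel-active behavior can be sketched in Python as below. This is a simplified illustration: only the train knob and throttle are checked (the inertia and estop tests would follow the same pattern), the screen update is left as a comment, and the Panel class with its two fields is a stand-in invented for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Panel:
    train_knob: int = 0
    throttle: int = 0

@dataclass
class Formatter:
    ntrains: int = 8
    current_train: int = 0
    current_speed: list = field(default_factory=list)

    def __post_init__(self):
        if not self.current_speed:
            self.current_speed = [0] * self.ntrains

    def panel_active(self, panel: Panel) -> bool:
        """Return True if any checked control differs from the stored settings."""
        changed = False
        if panel.train_knob != self.current_train:
            self.current_train = panel.train_knob
            changed = True   # the flowchart also calls update-screen here
        if panel.throttle != self.current_speed[self.current_train]:
            self.current_speed[self.current_train] = panel.throttle
            changed = True
        return changed

f = Formatter()
p = Panel(train_knob=2, throttle=10)
first = f.panel_active(p)    # settings differ from the stored state
second = f.panel_active(p)   # nothing has changed since the last call
```

Calling panel_active() twice with the same panel settings reports a change only the first time.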
Controller class

controller
current-train: integer
current-speed[ntrains]: integer
current-direction[ntrains]: boolean
current-inertia[ntrains]:
unsigned-integer

operate()
issue-command()

Setting the speed
• Don’t want to change speed
instantaneously.
• Controller should change speed gradually
by sending several commands.
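
One possible way to turn a single set-speed request into several gradual steps is sketched below. The fixed step size is an assumption; in the train controller it would presumably be derived from the inertia setting, but these overheads do not say how.

```python
def speed_steps(current: int, target: int, step: int) -> list:
    """Intermediate speed settings from current to target, at most `step` apart."""
    if step <= 0:
        raise ValueError("step must be positive")
    steps = []
    while current != target:
        # Move toward the target, but never by more than one step at a time.
        delta = max(-step, min(step, target - current))
        current += delta
        steps.append(current)
    return steps

ramp_up = speed_steps(0, 10, 4)
ramp_down = speed_steps(10, 0, 5)
```

Each value in the returned list corresponds to one command sent to the motor interface.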

Sequence diagram for set-
speed command

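[Figure: sequence diagram with objects :receiver, :controller, :motor-interface, and :pulser*. new-cmd, cmd-type, and rcv-speed pass between receiver and controller; set-speed goes to the motor interface, which issues a series of set-pulse messages to the pulser.]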

Controller operate behavior

[Figure: controller operate state machine. Wait for a command from receiver; receive-command() is followed by issue-command(), then the machine returns to waiting.]

Refined command classes

command
  type: 3-bits
  address: 3-bits
  parity: 1-bit

set-speed (derived from command)
  type=010
  value: 7-bits

set-inertia (derived from command)
  type=001
  value: 3-bits

estop (derived from command)
  type=000
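
The refined classes fix the field widths and type codes but not the field order or the parity rule. The sketch below assumes the order type, address, value, parity, treats value as a raw unsigned field, and uses even parity over the preceding bits. All three are assumptions made here for illustration.

```python
TYPE_CODES = {"set-speed": 0b010, "set-inertia": 0b001, "estop": 0b000}
VALUE_BITS = {"set-speed": 7, "set-inertia": 3, "estop": 0}

def encode_command(name: str, address: int, value: int = 0) -> str:
    """Return the command as a bit string: type, address, value, parity."""
    width = VALUE_BITS[name]
    if not 0 <= address < 8:
        raise ValueError("address must fit in 3 bits")
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in this command's value field")
    bits = format(TYPE_CODES[name], "03b") + format(address, "03b")
    if width:
        bits += format(value, "0{}b".format(width))
    parity = bits.count("1") % 2   # assumed: even parity over the preceding bits
    return bits + str(parity)

estop_bits = encode_command("estop", 5)
inertia_bits = encode_command("set-inertia", 1, 7)
speed_bits = encode_command("set-speed", 0, 63)
```

Under these assumptions an estop command is 7 bits long, a set-inertia command 10 bits, and a set-speed command 14 bits.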

Summary
• Separate specification and programming.
• Small mistakes are easier to fix in the spec.
• Big mistakes in programming cost a lot of
time.
• You can’t completely separate
specification and architecture.
• Make a few tasteful assumptions.

Instruction sets
• Computer architecture taxonomy.
• Assembly language.

von Neumann architecture
• Memory holds data, instructions.
• Central processing unit (CPU) fetches
instructions from memory.
• Separate CPU and memory distinguishes
programmable computer.
• CPU registers help out: program counter
(PC), instruction register (IR), general-
purpose registers, etc.

CPU + memory

[Figure: CPU and memory connected by address and data lines. The PC holds 200; memory location 200 holds ADD r5,r1,r3, which is fetched into the IR.]
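
The fetch step in this picture can be mimicked with a toy model. The sketch below is not an ARM simulator: it understands only the one ADD form shown, keeps instructions as text, advances the PC by one per instruction, and uses made-up register values.

```python
import re

def step(cpu: dict, memory: dict) -> None:
    """One fetch-execute cycle of a toy CPU that understands only ADD rd,rn,rm."""
    cpu["IR"] = memory[cpu["PC"]]    # fetch: IR <- memory[PC]
    cpu["PC"] += 1                   # advance to the next (toy) address
    m = re.fullmatch(r"ADD r(\d+),r(\d+),r(\d+)", cpu["IR"])
    if m is None:
        raise ValueError("unsupported instruction: " + cpu["IR"])
    rd, rn, rm = (int(g) for g in m.groups())
    cpu["r"][rd] = cpu["r"][rn] + cpu["r"][rm]   # execute

memory = {200: "ADD r5,r1,r3"}
cpu = {"PC": 200, "IR": None, "r": [0] * 16}
cpu["r"][1], cpu["r"][3] = 7, 35    # hypothetical register contents
step(cpu, memory)
```

After the step, the IR holds the fetched instruction and r5 holds the sum of r1 and r3.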

Harvard architecture

[Figure: CPU (with PC) connected to a data memory and a separate program memory, each with its own address and data lines.]

von Neumann vs. Harvard
• Harvard can’t use self-modifying code.
• Harvard allows two simultaneous memory
fetches.
• Most DSPs use Harvard architecture for
streaming data:
• greater memory bandwidth;
• more predictable bandwidth.

RISC vs. CISC
• Complex instruction set computer (CISC):
• many addressing modes;
• many operations.
• Reduced instruction set computer (RISC):
• load/store;
• pipelinable instructions.

Instruction set characteristics
• Fixed vs. variable length.
• Addressing modes.
• Number of operands.
• Types of operands.

Programming model
• Programming model: registers visible to
the programmer.
• Some registers are not visible (IR).

Multiple implementations
• Successful architectures have several
implementations:
• varying clock speeds;
• different bus widths;
• different cache sizes;
• etc.

Assembly language
• One-to-one with instructions (more or
less).
• Basic features:
• One instruction per line.
• Labels provide names for addresses (usually
in first column).
• Instructions often start in later columns.
• Comments run to end of line.

ARM assembly language
example
label1 ADR r4,c
LDR r0,[r4] ; a comment
ADR r4,d
LDR r1,[r4]
SUB r0,r0,r1 ; comment

Pseudo-ops
• Some assembler directives don’t
correspond directly to instructions:
• Define current address.
• Reserve storage.
• Constants.

CPUs
• Input and output.
• Supervisor mode, exceptions, traps.
• Co-processors.

I/O devices

• Usually includes some non-digital component.
• Typical digital interface to CPU:

[Figure: the CPU connects to the device's status register and data register, which sit in front of the device mechanism.]
Application: 8251 UART
• Universal asynchronous receiver/transmitter
(UART): provides serial communication.
• 8251 functions are integrated into
standard PC interface chip.
• Allows many communication parameters
to be programmed.

Serial communication

• Characters are transmitted separately:

[Figure: timing diagram of one character over time: line idle (no char), start bit, bit 0, bit 1, ..., bit n-1, stop bit.]

Serial communication
parameters
• Baud (bit) rate.
• Number of bits per character.
• Parity/no parity.
• Even/odd parity.
• Length of stop bit (1, 1.5, 2 bits).
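
These parameters fix how many bit times each character costs on the line. A minimal C sketch follows; the function names are illustrative assumptions (not part of the 8251 interface), and the model assumes exactly one start bit per character.

```c
/* Sketch: bits on the wire for one character frame.
   Assumes one start bit; names are hypothetical. */
double frame_bits(int data_bits, int parity_enabled, double stop_bits)
{
    return 1.0 + data_bits + (parity_enabled ? 1.0 : 0.0) + stop_bits;
}

/* Characters per second at a given bit rate in bits/s. */
double chars_per_second(double bit_rate, int data_bits,
                        int parity_enabled, double stop_bits)
{
    return bit_rate / frame_bits(data_bits, parity_enabled, stop_bits);
}
```

For example, 8 data bits, no parity, and 1 stop bit give a 10-bit frame, so a 9600 bit/s line carries 960 characters per second.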

8251 CPU interface

[Figure: the CPU connects to the 8251 through an 8-bit status register and an 8-bit data register; the 8251's xmit/rcv logic drives the serial port.]

Programming I/O
• Two types of instructions can support I/O:
• special-purpose I/O instructions;
• memory-mapped load/store instructions.
• Intel x86 provides in, out instructions.
Most other CPUs use memory-mapped
I/O.
• I/O instructions do not preclude memory-
mapped I/O.

ARM memory-mapped I/O
• Define location for device:
DEV1 EQU 0x1000
• Read/write code:
LDR r1,=DEV1 ; set up device adrs
LDR r0,[r1] ; read DEV1
MOV r0,#8 ; set up value to write
STR r0,[r1] ; write value to device

Peek and poke
• Traditional HLL interfaces:
int peek(char *location) {
  return *location; }

void poke(char *location, char newval) {
  (*location) = newval; }
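
On a modern C compiler, device registers are normally accessed through volatile pointers so that the reads and writes are not optimized away. A minimal sketch of that variant follows; the names peek8 and poke8 are hypothetical, chosen to avoid clashing with the functions above.

```c
#include <stdint.h>

/* Sketch: volatile forces a real load or store on every call,
   which is what a memory-mapped device register needs. */
static inline uint8_t peek8(volatile uint8_t *location)
{
    return *location;
}

static inline void poke8(volatile uint8_t *location, uint8_t newval)
{
    *location = newval;
}
```

On real hardware the pointer would be a fixed device address cast to `volatile uint8_t *`; for experimentation, the address of an ordinary variable works the same way.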

Busy/wait output
• Simplest way to program device.
• Use instructions to test when device is ready.
current_char = mystring;
while (*current_char != '\0') {
  poke(OUT_CHAR,*current_char);
  while (peek(OUT_STATUS) != 0);
  current_char++;
}

Simultaneous busy/wait input
and output
while (TRUE) {
/* read */
while (peek(IN_STATUS) == 0);
achar = (char)peek(IN_DATA);
/* write */
poke(OUT_DATA,achar);
poke(OUT_STATUS,1);
while (peek(OUT_STATUS) != 0);
}
Interrupt I/O
• Busy/wait is very inefficient.
• CPU can’t do other work while testing device.
• Hard to do simultaneous I/O.
• Interrupts allow a device to change the
flow of control in the CPU.
• Causes subroutine call to handle device.

Interrupt interface

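[Figure: the CPU (with PC and IR) connects to the device over a data/address bus and over intr request and intr ack lines; the device contains status and data registers in front of its mechanism.]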

Interrupt behavior
• Based on subroutine call mechanism.
• Interrupt forces next instruction to be a
subroutine call to a predetermined
location.
• Return address is saved to resume executing
foreground program.

Interrupt physical interface
• CPU and device are connected by CPU
bus.
• CPU and device handshake:
• device asserts interrupt request;
• CPU asserts interrupt acknowledge when it
can handle the interrupt.

Example: character I/O
handlers
void input_handler() {
achar = peek(IN_DATA);
gotchar = TRUE;
poke(IN_STATUS,0);
}
void output_handler() {
}

Example: interrupt-driven main
program
main() {
while (TRUE) {
if (gotchar) {
poke(OUT_DATA,achar);
poke(OUT_STATUS,1);
gotchar = FALSE;
}
}
}
Example: interrupt I/O with
buffers

• Queue for characters:

[Figure: a queue with a head pointer and a tail pointer; the tail moves as a character is added.]

Buffer-based input handler
void input_handler() {
  char achar;
  if (full_buffer()) error = 1;
  else { achar = peek(IN_DATA);
         add_char(achar); }
  poke(IN_STATUS,0);
  if (nchars == 1)
    { poke(OUT_DATA,remove_char());
      poke(OUT_STATUS,1); }
}
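
The handler relies on full_buffer(), add_char(), remove_char(), and nchars, which the slide does not define. One possible sketch is a circular buffer; the size BUF_SIZE, the head and tail indices, and empty_buffer() are assumptions made for illustration.

```c
#include <stdbool.h>

#define BUF_SIZE 8              /* assumed capacity */

static char buf[BUF_SIZE];
static int head = 0;            /* index of the oldest character */
static int tail = 0;            /* index of the next free slot */
static int nchars = 0;          /* number of characters queued */

bool full_buffer(void)  { return nchars == BUF_SIZE; }
bool empty_buffer(void) { return nchars == 0; }

/* Caller must check full_buffer() first. */
void add_char(char c)
{
    buf[tail] = c;
    tail = (tail + 1) % BUF_SIZE;
    nchars++;
}

/* Caller must check empty_buffer() first. */
char remove_char(void)
{
    char c = buf[head];
    head = (head + 1) % BUF_SIZE;
    nchars--;
    return c;
}
```

In real interrupt code these shared variables would typically be declared volatile, and foreground accesses would need to be protected against the handler running in the middle of an update.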

I/O sequence diagram

[Figure: sequence diagram with lifelines :foreground, :input, :output, and :queue. The queue contents shown over time are empty, a, empty, bc, c.]
Debugging interrupt code
• What if the handler forgets to save and
restore the registers it changes?
• Foreground program can exhibit mysterious
bugs.
• Bugs will be hard to reproduce because they
depend on interrupt timing.

Priorities and vectors
• Two mechanisms allow us to make
interrupts more specific:
• Priorities determine what interrupt gets CPU
first.
• Vectors determine what code is called for
each type of interrupt.
• Mechanisms are orthogonal: most CPUs
provide both.

Prioritized interrupts

[Figure: devices 1 through n each drive their own interrupt request line (L1, L2, ..., Ln) into the CPU and share an interrupt acknowledge line from the CPU.]

Interrupt prioritization
• Masking: interrupt with priority lower than
current priority is not recognized until
pending interrupt is complete.
• Non-maskable interrupt (NMI): highest-
priority, never masked.
• Often used for power-down.

Example: Prioritized I/O

[Figure: sequence diagram with lifelines :interrupts, :foreground, :A, :B, and :C; one marked event is labeled A,B.]

Interrupt vectors

• Allow different devices to be handled by
different code.
• Interrupt vector table:

[Figure: the interrupt vector table head points to a table whose entries are handler 0, handler 1, handler 2, handler 3.]

Interrupt vector acquisition

[Figure: sequence diagram between :CPU and :device: the CPU receives the request, the device receives the ack, and the CPU receives the vector.]

Generic interrupt mechanism

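• Interrupt pending? If no, continue execution.
• Interrupt priority > current priority? If no,
ignore the interrupt.
• Assume priority selection is handled before
this point.
• Acknowledge the interrupt.
• Vector received? If no, check for timeout;
a timeout is a bus error.
• Call table[vector].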
Interrupt sequence
• CPU acknowledges request.
• Device sends vector.
• CPU calls handler.
• Software processes request.
• CPU restores state to foreground
program.
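
The sequence can be mimicked in software to see how priorities and vectors interact. The following C sketch is a simulation only, not any CPU's actual mechanism; the table size, the return codes, and the names dispatch_interrupt and demo_handler are assumptions.

```c
#include <stddef.h>

#define NUM_VECTORS 4                 /* assumed table size */

typedef void (*handler_t)(void);

static handler_t vector_table[NUM_VECTORS];
static int current_priority = 0;
static int handler_calls = 0;

/* Hypothetical handler used for demonstration. */
static void demo_handler(void) { handler_calls++; }

/* Returns 1 if a handler ran, 0 if the request was masked,
   -1 if no valid vector was supplied (stands in for a bus error). */
int dispatch_interrupt(int priority, int vector)
{
    if (priority <= current_priority)
        return 0;                     /* masked: not recognized yet */
    if (vector < 0 || vector >= NUM_VECTORS || vector_table[vector] == NULL)
        return -1;
    int saved = current_priority;     /* save state */
    current_priority = priority;
    vector_table[vector]();           /* call table[vector] */
    current_priority = saved;         /* restore foreground state */
    return 1;
}
```

Installing demo_handler at vector 2 and calling dispatch_interrupt(1, 2) runs the handler once; raising current_priority to 3 first causes the same request to be masked.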

Sources of interrupt overhead
• Handler execution time.
• Interrupt mechanism overhead.
• Register save/restore.
• Pipeline-related penalties.
• Cache-related penalties.

ARM interrupts
• ARM7 supports two types of interrupts:
• Fast interrupt requests (FIQs).
• Interrupt requests (IRQs).
• Interrupt table starts at location 0.

ARM interrupt procedure
• CPU actions:
• Save PC. Copy CPSR to SPSR.
• Force bits in CPSR to record interrupt.
• Force PC to vector.
• Handler responsibilities:
• Restore proper PC.
• Restore CPSR from SPSR.
• Clear interrupt disable flags.

ARM interrupt latency
• Worst-case latency to respond to interrupt
is 27 cycles:
• Two cycles to synchronize external request.
• Up to 20 cycles to complete current
instruction.
• Three cycles for data abort.
• Two cycles to enter interrupt handling state.

C55x interrupts
• Latency is between 7 and 13 cycles.
• Maskable interrupt sequence:
• Interrupt flag register is set.
• Interrupt enable register is checked.
• Interrupt mask register is checked.
• Interrupt flag register is cleared.
• Appropriate registers are saved.
• INTM set to 1, DBGM set to 1, EALLOW set to 0.
• Branch to ISR.
• Two styles of return: fast and slow.

Supervisor mode
• May want to provide protective barriers
between programs.
• Avoid memory corruption.
• Need supervisor mode to manage the
various programs.
• SHARC does not have a supervisor mode.

ARM supervisor mode
• Use SWI instruction to enter supervisor
mode, similar to subroutine:
SWI CODE_1
• Sets PC to 0x08.
• Argument to SWI is passed to supervisor
mode code.
• Saves CPSR in SPSR.

Exception
• Exception: internally detected error.
• Exceptions are synchronous with
instructions but unpredictable.
• Build exception mechanism on top of
interrupt mechanism.
• Exceptions are usually prioritized and
vectorized.

Trap
• Trap (software interrupt): an exception
generated by an instruction.
• Call supervisor mode.
• ARM uses SWI instruction for traps.
• SHARC offers three levels of software
interrupts.
• Called by setting bits in IRPTL register.

Co-processor
• Co-processor: added function unit that is
called by instruction.
• Floating-point units are often structured as
co-processors.
• ARM allows up to 16 designer-selected co-
processors.
• Floating-point co-processor uses units 1, 2.
• C55x uses co-processors as well.

C55x image/video hardware
extensions
• Available in 5509 and 5510.
• Equivalent C-callable functions for other
devices.
• Available extensions:
• DCT/IDCT.
• Pixel interpolation.
• Motion estimation.

DCT/IDCT

• 2-D DCT/IDCT is computed from two 1-D
DCT/IDCT.
• Put data in different banks to maximize
throughput.

[Figure: a block passes through the column DCT to produce an interim result, which passes through the row DCT to produce the DCT.]

C55 DCT/IDCT coprocessor
extensions
• Load, compute, transfer to accumulators:
• ACy=copr(k8,ACx,Xmem,Ymem)
• Compute, transfer, mem write:
• ACy=copr(k8,ACx,ACy), Lmem=ACz
• Special:
• ACy=copr(k8,ACx,ACy)

Software pipelined
load/compute/store for DCT
[Figure: software-pipelined schedule with iterations i-1, i, and i+1 overlapped. Each iteration shows Dual_load, compute, and Long_store phases separated by empty slots; the figure labels spans of 4, 3, 8, and 4.]

Operations listed beside the schedule:
op_i(0), load_i+1(0,1)
op_i(1), store_i-1(0,1)
op_i(2), store_i-1(2,3)
op_i(2), store_i-1(4,5)
op_i(2), store_i-1(6,7)
op_i(2), load_i+1(2,3)
C55 motion estimation
• Search strategy:
• Full vs. non-full.
• Accuracy:
• Full-pixel vs. half-pixel.
• Number of returned motion vectors:
• 1 (one 16x16) vs. 4 (four 8x8).
• Algorithms:
• 3-step algorithm (distance 4,2,1).
• 4-step algorithm (distance 8,4,2,1).
• 4-step with half-pixel refinement.

Four-step motion estimation
breakdown

d = {8,4,2,1};
for (i=0; i<4; i++) {
  compute 3 upper differences for d[i];
  compute 3 middle differences for d[i];
  compute 3 lower differences for d[i];
  compute minimum value;
  move to next d;
}
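
A runnable sketch of the same four-step search in plain C follows. It is a software illustration, not the C55x accelerator code: the cost function is passed in as a callback, and demo_cost is a made-up stand-in for a real block difference measure. The center point is evaluated once up front, so each step examines the eight surrounding points (the upper, middle, and lower rows minus the center).

```c
#include <stdlib.h>

typedef int (*cost_fn)(int dx, int dy);

/* Hypothetical cost with its minimum at motion vector (5,-3). */
int demo_cost(int dx, int dy)
{
    return abs(dx - 5) + abs(dy + 3);
}

/* Four-step search with distances 8, 4, 2, 1 around a moving center. */
void four_step_search(cost_fn cost, int *best_dx, int *best_dy)
{
    static const int d[4] = {8, 4, 2, 1};
    int cx = 0, cy = 0;
    int best = cost(0, 0);

    for (int i = 0; i < 4; i++) {
        int bx = cx, by = cy;
        for (int sy = -1; sy <= 1; sy++) {      /* upper, middle, lower rows */
            for (int sx = -1; sx <= 1; sx++) {
                if (sx == 0 && sy == 0)
                    continue;                   /* center already evaluated */
                int x = cx + sx * d[i];
                int y = cy + sy * d[i];
                int c = cost(x, y);
                if (c < best) {                 /* keep the minimum value */
                    best = c;
                    bx = x;
                    by = y;
                }
            }
        }
        cx = bx;                                /* move to next d */
        cy = by;
    }
    *best_dx = cx;
    *best_dy = cy;
}
```

With demo_cost, the search moves from (0,0) to (8,0), then (4,-4), stays there for distance 2, and ends at (5,-3).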

C55 motion estimation
accelerator
• Includes 3 16-bit pixel data paths, 3 16-bit
absolute differences (ADs).
• Basic operation:
• [ACx,ACy] = copr(k8,ACx,ACy,Xmem,Ymem,Coeff)
• K8 = control bits (enable AD units, etc.)
• ACx, ACy = accumulated absolute differences
• Xmem, Ymem = pointers to odd, even lines of the
search window
• Coeff = pointer to two adjacent pixels from the reference window

C55 pixel interpolation

• Given four pixels A, B, C, D, interpolate
three half-pixels:

A U B
M R
C D
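
A software sketch of the interpolation in C follows. The geometry and rounding are assumptions: U is taken as the point between A and B, M as the point between A and C, R as the center of all four pixels, and averages are rounded to nearest. The struct and function names are illustrative.

```c
#include <stdint.h>

typedef struct {
    uint8_t u;   /* between A and B (assumed) */
    uint8_t m;   /* between A and C (assumed) */
    uint8_t r;   /* center of A, B, C, D (assumed) */
} half_pixels;

/* Rounded averages of the neighboring full pixels. */
half_pixels interpolate(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    half_pixels h;
    h.u = (uint8_t)((a + b + 1) / 2);
    h.m = (uint8_t)((a + c + 1) / 2);
    h.r = (uint8_t)((a + b + c + d + 2) / 4);
    return h;
}
```

For A=10, B=20, C=30, D=40 this gives U=15, M=20, R=25.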

Pixel interpolation coprocessor
operations
• Load pixels and compute:
• ACy=copr(k8,ACx,Lmem)
• Load pixels, compute, and store:
• ACy=copr(k8,ACx,Lmem) || Lmem=ACz

CPUs
• Caches.
• Memory management.

Caches and CPUs

[Figure: the CPU sends addresses to the cache controller, which manages the cache and main memory; data returns to the CPU from the cache or from main memory.]

Cache operation
• Many main memory locations are mapped
onto one cache entry.
• May have caches for:
• instructions;
• data;
• data + instructions (unified).
• Memory access time is no longer
deterministic.

Terms
• Cache hit: required location is in cache.
• Cache miss: required location is not in
cache.
• Working set: set of locations used by
program in a time interval.

Types of misses
• Compulsory (cold): location has never
been accessed.
• Capacity: working set is too large.
• Conflict: multiple locations in working set
map to same cache entry.

Memory system performance
• h = cache hit rate.
• t_cache = cache access time, t_main = main
memory access time.
• Average memory access time:
• t_av = h t_cache + (1 - h) t_main

Multiple levels of cache

[Figure: the CPU connects to the L1 cache, which is backed by the L2 cache.]

Multi-level cache access time
• h1 = L1 cache hit rate.
• h2 = rate for miss on L1, hit on L2.
• Average memory access time:
• t_av = h1 t_L1 + h2 t_L2 + (1 - h1 - h2) t_main
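
A small C sketch of both average access time calculations follows. It assumes h2 is the fraction of all accesses that miss L1 and hit L2, so the three weights h1, h2, and 1 - h1 - h2 sum to 1; the function names are illustrative.

```c
/* Single-level cache: weighted average of hit and miss times. */
double amat_one_level(double h, double t_cache, double t_main)
{
    return h * t_cache + (1.0 - h) * t_main;
}

/* Two-level cache. Assumes h2 is the fraction of all accesses that
   miss in L1 and hit in L2, so the weights sum to 1. */
double amat_two_level(double h1, double h2,
                      double t_l1, double t_l2, double t_main)
{
    return h1 * t_l1 + h2 * t_l2 + (1.0 - h1 - h2) * t_main;
}
```

With illustrative values h = 0.9, t_cache = 1, t_main = 10, the single-level average is about 1.9 cycles; with h1 = 0.9, h2 = 0.08, t_L1 = 1, t_L2 = 5, t_main = 50, the two-level average is about 2.3 cycles.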

Replacement policies
• Replacement policy: strategy for choosing
which cache entry to throw out to make
room for a new memory location.
• Two popular strategies:
• Random.
• Least-recently used (LRU).

Cache organizations
• Fully-associative: any memory location
can be stored anywhere in the cache
(almost never implemented).
• Direct-mapped: each memory location
maps onto exactly one cache entry.
• N-way set-associative: each memory
location can go into one of n sets.

Cache performance benefits
• Keep frequently-accessed locations in fast
cache.
• Cache retrieves more than one word at a
time.
• Sequential accesses are faster after first
access.

Direct-mapped cache

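[Figure: the address is split into tag, index, and offset fields. The index selects a cache block holding a valid bit (1), a tag (0xabcd), and data bytes. The stored tag is compared (=) with the address tag to produce hit; the offset selects the byte returned as value.]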
Write operations
• Write-through: immediately copy write to
main memory.
• Write-back: write to main memory only
when location is removed from cache.

Direct-mapped cache locations
• Many locations map onto the same cache
block.
• Conflict misses are easy to generate:
• Array a[] uses locations 0, 1, 2, …
• Array b[] uses locations 1024, 1025, 1026, …
• Operation a[i] + b[i] generates conflict
misses.

Set-associative cache

• A set of direct-mapped caches:

[Figure: Set 1, Set 2, ..., Set n are accessed in parallel and combined to produce hit and data.]
Example: direct-mapped vs.
set-associative

address data
000 0101
001 1111
010 0000
011 0110
100 1000
101 0001
110 1010
111 0100
Direct-mapped cache behavior

• After 001 access: • After 010 access:


block tag data block tag data
00 - - 00 - -
01 0 1111 01 0 1111
10 - - 10 0 0000
11 - - 11 - -

Direct-mapped cache behavior,
cont’d.

• After 011 access: • After 100 access:


block tag data block tag data
00 - - 00 1 1000
01 0 1111 01 0 1111
10 0 0000 10 0 0000
11 0 0110 11 0 0110

Direct-mapped cache behavior,
cont’d.

• After 101 access: • After 111 access:


block tag data block tag data
00 1 1000 00 1 1000
01 1 0001 01 1 0001
10 0 0000 10 0 0000
11 0 0110 11 1 0100
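
The direct-mapped example can be replayed in a few lines of C. This is a sketch under the example's assumptions (3-bit addresses and four one-word blocks, so a 2-bit index and a 1-bit tag); the names cache_line and cache_access are illustrative.

```c
#include <stdbool.h>

#define NUM_BLOCKS 4

typedef struct {
    bool valid;
    unsigned tag;
    unsigned data;
} cache_line;

/* Memory contents from the example, addresses 000 through 111. */
static const unsigned memory[8] = {
    0x5, 0xF, 0x0, 0x6, 0x8, 0x1, 0xA, 0x4
};

static cache_line cache[NUM_BLOCKS];

/* Access a 3-bit address; returns true on a hit, fills the block on a miss. */
bool cache_access(unsigned addr)
{
    unsigned index = addr & 0x3;         /* low two bits select the block */
    unsigned tag = (addr >> 2) & 0x1;    /* high bit is the tag */
    cache_line *line = &cache[index];

    if (line->valid && line->tag == tag)
        return true;
    line->valid = true;
    line->tag = tag;
    line->data = memory[addr];
    return false;
}
```

Accessing 001, 010, 011, 100, 101, 111 in order produces six misses and leaves the cache in the final state shown for the 111 access.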

2-way set-associative cache
behavior
• Final state of cache (twice as big as
direct-mapped):
set blk 0 tag blk 0 data blk 1 tag blk 1 data
00 1 1000 - -
01 0 1111 1 0001
10 0 0000 - -
11 0 0110 1 0100

2-way set-associative cache
behavior
• Final state of cache (same size as direct-
mapped):
set blk 0 tag blk 0 data blk 1 tag blk 1 data
0 01 0000 10 1000
1 10 0001 11 0100

Example caches
• StrongARM:
• 16 Kbyte, 32-way, 32-byte block instruction
cache.
• 16 Kbyte, 32-way, 32-byte block data cache
(write-back).
• SHARC:
• 32-instruction, 2-way instruction cache.

Memory management units

• Memory management unit (MMU)
translates addresses:

[Figure: the CPU sends a logical address to the memory management unit, which sends a physical address to main memory.]

Memory management tasks
• Allows programs to move in physical
memory during execution.
• Allows virtual memory:
• memory images kept in secondary storage;
• images returned to main memory on demand
during execution.
• Page fault: request for location not
resident in memory.

Address translation
• Requires some sort of register/table to
allow arbitrary mappings of logical to
physical addresses.
• Two basic schemes:
• segmented;
• paged.
• Segmentation and paging can be
combined (x86).

Segments and pages

[Figure: a memory map containing page 1, page 2, segment 1, and segment 2.]

Segment address translation

[Figure: the segment base address is added to the logical address to form the physical address. A range check against the segment lower bound and segment upper bound raises a range error.]
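
Segment translation can be sketched in C as an add followed by a bounds check. The struct layout and the name segment_translate are assumptions for illustration; real MMUs do this in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t base;    /* segment base address */
    uint32_t lower;   /* lowest legal physical address */
    uint32_t upper;   /* highest legal physical address */
} segment;

/* Returns true and writes *physical on success;
   returns false on a range error. */
bool segment_translate(const segment *seg, uint32_t logical,
                       uint32_t *physical)
{
    uint32_t addr = seg->base + logical;
    if (addr < seg->lower || addr > seg->upper)
        return false;
    *physical = addr;
    return true;
}
```

For a segment with base 0x1000 and bounds 0x1000 to 0x1FFF, logical address 0x10 maps to 0x1010, while logical address 0x2000 fails the range check.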

Page address translation

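[Figure: the logical address is split into a page number and an offset. The page number selects the page i base, which is concatenated with the offset to form the physical address.]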
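
Page translation replaces the add with a table lookup and a concatenation. The sketch below assumes 4-Kbyte pages and a flat 16-entry table; PAGE_BITS, NUM_PAGES, page_base, and page_translate are all hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS 12            /* assumed 4-Kbyte pages */
#define NUM_PAGES 16            /* assumed flat table size */

/* Physical base of each logical page; low PAGE_BITS bits are zero. */
static uint32_t page_base[NUM_PAGES];

/* Returns false if the page number is outside the table. */
bool page_translate(uint32_t logical, uint32_t *physical)
{
    uint32_t page = logical >> PAGE_BITS;
    uint32_t offset = logical & ((1u << PAGE_BITS) - 1u);

    if (page >= NUM_PAGES)
        return false;
    *physical = page_base[page] | offset;   /* concatenate base and offset */
    return true;
}
```

If page 2 is mapped to base 0x7000, logical address 0x2ABC translates to 0x7ABC. Because page bases are aligned, the concatenation needs no adder.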

Page table organizations

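[Figure: two page table organizations. Flat: a single table of page descriptors. Tree: page descriptors reached through multiple levels of tables.]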

Caching address translations
• Large translation tables require main
memory access.
• Translation lookaside buffer (TLB): cache for address translations.
• Typically small.

ARM memory management
• Memory region types:
• section: 1 Mbyte block;
• large page: 64 kbytes;
• small page: 4 kbytes.
• An address is marked as section-mapped
or page-mapped.
• Two-level translation scheme.

ARM address translation

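[Figure: the logical address is split into 1st index, 2nd index, and offset. The translation table base register and the 1st index are concatenated to select a descriptor in the 1st level table; that descriptor and the 2nd index are concatenated to select a descriptor in the 2nd level table; that descriptor and the offset are concatenated to form the physical address.]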
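
The address split can be written out in C. This sketch assumes the classic ARM layout for 4-Kbyte small pages: a 12-bit first-level index, an 8-bit second-level index, and a 12-bit offset. The struct and function names are illustrative, and the table walk itself is omitted.

```c
#include <stdint.h>

typedef struct {
    uint32_t first_index;    /* bits 31..20 */
    uint32_t second_index;   /* bits 19..12 */
    uint32_t offset;         /* bits 11..0  */
} va_fields;

/* Split a 32-bit virtual address for a small-page mapping. */
va_fields split_small_page_address(uint32_t va)
{
    va_fields f;
    f.first_index = va >> 20;
    f.second_index = (va >> 12) & 0xFFu;
    f.offset = va & 0xFFFu;
    return f;
}
```

Address 0x12345678 splits into first index 0x123, second index 0x45, and offset 0x678.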

CPUs
• CPU performance.
• CPU power consumption.

Elements of CPU performance
• Cycle time.
• CPU pipeline.
• Memory system.

Pipelining
• Several instructions are executed
simultaneously at different stages of
completion.
• Various conditions can cause pipeline
bubbles that reduce utilization:
• branches;
• memory system delays;
• etc.

Performance measures
• Latency: time it takes for an instruction to
get through the pipeline.
• Throughput: number of instructions
executed per time period.
• Pipelining increases throughput without
reducing latency.

ARM7 pipeline
• ARM7 has 3-stage pipe:
• fetch instruction from memory;
• decode opcode and operands;
• execute.

ARM pipeline execution

[Figure: pipeline diagram over time. add r0,r1,#5 is fetched, decoded, and executed in cycles 1 to 3; sub r2,r3,r6 follows one cycle later; cmp r2,#3 follows one cycle after that.]

Pipeline stalls
• If every step cannot be completed in the
same amount of time, pipeline stalls.
• Bubbles introduced by stall increase
latency, reduce throughput.

ARM multi-cycle LDMIA
instruction

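[Figure: pipeline diagram over time.
ldmia r0,{r2,r3}: fetch, decode, ex ld r2, ex ld r3.
sub r2,r3,r6: fetch, decode, ex sub.
cmp r2,#3: fetch, decode, ex cmp.
The two execute cycles of ldmia delay the instructions behind it.]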

Control stalls
• Branches often introduce stalls (branch
penalty).
• Stall time may depend on whether branch is
taken.
• May have to squash instructions that
already started executing.
• Don’t know what to fetch until condition is
evaluated.

ARM pipelined branch

[Figure: pipeline diagram over time.
bne foo: fetch, decode, ex bne, ex bne, ex bne.
sub r2,r3,r6: fetch, decode (not executed).
foo add r0,r1,r2: fetch, decode, ex add.]

Delayed branch
• To increase pipeline efficiency, a delayed
branch mechanism requires that the n
instructions after the branch always be
executed, whether or not the branch is taken.
• SHARC supports delayed and non-delayed
branches.
• Specified by bit in branch instruction.
• 2-instruction branch delay slot.

Example: ARM execution time
• Determine execution time of FIR filter:
for (i=0; i<N; i++)
f = f + c[i]*x[i];
• Only branch in loop test may take more
than one cycle.
• BLT loop takes 1 cycle best case, 3 worst
case.

FIR filter ARM code

; loop initiation code
        MOV r0,#0       ; use r0 for i, set to 0
        MOV r8,#0       ; use a separate index for arrays
        ADR r2,N        ; get address for N
        LDR r1,[r2]     ; get value of N
        MOV r2,#0       ; use r2 for f, set to 0
        ADR r3,c        ; load r3 with address of base of c
        ADR r5,x        ; load r5 with address of base of x
; loop body
loop    LDR r4,[r3,r8]  ; get value of c[i]
        LDR r6,[r5,r8]  ; get value of x[i]
        MUL r4,r4,r6    ; compute c[i]*x[i]
        ADD r2,r2,r4    ; add into running sum
; update loop counter and array index
        ADD r8,r8,#4    ; add one word offset to array index
        ADD r0,r0,#1    ; add 1 to i
; test for exit
        CMP r0,r1
        BLT loop        ; if i < N, continue loop
loopend ...

FIR filter performance by block

Block           Variable   # instructions  # cycles
Initialization  t_init     7               7
Body            t_body     4               4
Update          t_update   2               2
Test            t_test     2               [2,4]

t_loop = t_init + N(t_body + t_update) + (N-1) t_test,worst + t_test,best

• Loop test succeeding is the worst case.
• Loop test failing is the best case.
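
The loop time formula can be evaluated directly. The C sketch below plugs in the cycle counts from the table (7, 4, 2, and a test of 2 to 4 cycles); the function name is illustrative and N is assumed to be at least 1.

```c
/* Cycle count for the FIR loop using the per-block numbers above.
   Assumes n >= 1. */
int fir_loop_cycles(int n)
{
    const int t_init = 7;
    const int t_body = 4;
    const int t_update = 2;
    const int t_test_worst = 4;   /* branch taken: loop continues */
    const int t_test_best = 2;    /* branch not taken: loop exits */

    return t_init + n * (t_body + t_update)
         + (n - 1) * t_test_worst + t_test_best;
}
```

For N = 4 this gives 7 + 24 + 12 + 2 = 45 cycles.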
C55x pipeline
• C55x has 7-stage pipe:
• fetch;
• decode;
• address: computes data/branch addresses;
• access 1: reads data;
• access 2: finishes data read;
• read: puts operands on internal
busses;
• execute.
C55x organization
[Figure: C55x organization. The instruction unit, program flow unit, address unit, and data unit are connected by these busses: 3 data read busses (B, C, D; 16 bits), 3 data read address busses (24 bits), program address bus (24 bits), program read bus (32 bits), 2 data write busses (16 bits), and 2 data write address busses (24 bits). The busses carry instruction fetches, single operand reads, dual operand reads, dual-multiply coefficient reads, and writes.]

C55x pipeline hazards
• Processor structure:
• Three computation units.
• 14 operators.
• Can perform two operations per
instruction.
• Some combinations of operators are not
legal.

C55x hazards
• A-unit ALU/A-unit ALU.
• A-unit swap/A-unit swap.
• D-unit ALU,shifter,MAC/D-unit ALU,shifter,MAC
• D-unit shifter/D-unit shift, store
• D-unit shift, store/D-unit shift, store
• D-unit swap/D-unit swap
• P-unit control/P-unit control

Memory system performance
• Caches introduce indeterminacy in
execution time.
• Depends on order of execution.
• Cache miss penalty: added time due to a
cache miss.

Types of cache misses
• Compulsory miss: location has not been
referenced before.
• Conflict miss: two locations are fighting
for the same block.
• Capacity miss: working set is too large.

CPU power consumption
• Most modern CPUs are designed with
power consumption in mind to some
degree.
• Power vs. energy:
• heat depends on power consumption;
• battery life depends on energy consumption.

CMOS power consumption
• Voltage drops: power consumption
proportional to V^2.
• Toggling: more activity means more
power.
• Leakage: basic circuit characteristics; can
be eliminated by disconnecting power.

CPU power-saving strategies
• Reduce power supply voltage.
• Run at lower clock frequency.
• Disable function units with control signals
when not in use.
• Disconnect parts from power supply when
not in use.

C55x low power features
• Parallel execution units allow longer idle
shutdown times.
• Multiple data widths:
• 16-bit ALU vs. 40-bit ALU.
• Instruction cache minimizes main memory
accesses.
• Power management:
• Function unit idle detection.
• Memory idle detection.
• User-configurable IDLE domains allow programmer
control of what hardware is shut down.

Power management styles
• Static power management: does not
depend on CPU activity.
• Example: user-activated power-down mode.
• Dynamic power management: based on
CPU activity.
• Example: disabling unused function units.

Application: PowerPC 603
energy features
• Provides doze, nap, sleep modes.
• Dynamic power management features:
• Uses static logic.
• Can shut down unused execution units.
• Cache organized into subarrays to minimize
amount of active circuitry.

PowerPC 603 activity
• Percentage of time units are idle for SPEC
integer/floating-point:
unit             SPECint92  SPECfp92
D cache          29%        28%
I cache          29%        17%
load/store       35%        17%
fixed-point      38%        76%
floating-point   99%        30%
system register  89%        97%

Power-down costs
• Going into a power-down mode costs:
• time;
• energy.
• Must determine if going into mode is
worthwhile.
• Can model CPU power states with power
state machine.
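
Whether a power-down is worthwhile can be estimated by comparing energies over the idle interval. The C sketch below is a simplified model: it assumes the CPU draws a single transition power during both entry and exit, and all function and parameter names are hypothetical.

```c
/* Energy (J) if the CPU stays in its current state for the interval. */
double stay_energy(double p_stay_w, double interval_s)
{
    return p_stay_w * interval_s;
}

/* Energy (J) if the CPU powers down for the interval.
   Assumption: p_trans_w is drawn during both transitions. */
double powerdown_energy(double p_down_w, double p_trans_w,
                        double t_enter_s, double t_exit_s,
                        double interval_s)
{
    double t_trans = t_enter_s + t_exit_s;
    return p_trans_w * t_trans + p_down_w * (interval_s - t_trans);
}

/* Returns 1 when the power-down fits in the interval and saves energy. */
int powerdown_worthwhile(double p_stay_w, double p_down_w, double p_trans_w,
                         double t_enter_s, double t_exit_s,
                         double interval_s)
{
    if (interval_s < t_enter_s + t_exit_s)
        return 0;
    return powerdown_energy(p_down_w, p_trans_w, t_enter_s, t_exit_s,
                            interval_s)
         < stay_energy(p_stay_w, interval_s);
}
```

With illustrative values of 50 mW when staying put, 0.16 mW when powered down, 400 mW during transitions, 90 µs to enter, and 160 ms to exit, powering down does not pay off for a 1 s idle interval but does for a 10 s interval.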

Application: StrongARM SA-1100 power saving
• Processor takes two supplies:
• VDD is main 3.3V supply.
• VDDX is 1.5V.
• Three power modes:
• Run: normal operation.
• Idle: stops CPU clock, with logic still powered.
• Sleep: shuts off most of chip activity; 3 steps, each
about 30 µs; wakeup takes > 10 ms.

SA-1100 power state machine

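[Figure: power state machine with three states: run (P_run = 400 mW), idle (P_idle = 50 mW), and sleep (P_sleep = 0.16 mW). Transition times: run to idle 10 µs, idle to run 10 µs, run to sleep 90 µs, idle to sleep 90 µs, sleep to run 160 ms.]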


CPUs
• Example: data compressor.

Goals
• Compress data transmitted over serial
line.
• Receives byte-size input symbols.
• Produces output symbols packed into bytes.
• Will build software module only here.

Collaboration diagram for
compressor

[Figure: collaboration diagram. :input sends input symbols (1..n) to :data compressor, which sends packed output symbols (1..m) to :output.]

Huffman coding
• Early statistical text compression algorithm.
• Select non-uniform size codes.
• Use shorter codes for more common symbols.
• Use longer codes for less common symbols.
• To allow decoding, codes must have unique
prefixes.
• No code can be a prefix of a longer valid code.

Huffman example

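character  P
a          .45
b          .24
c          .11
d          .08
e          .07
f          .05

[Figure: Huffman tree. e and f merge into a node with P=.12; c and d merge into a node with P=.19; those two nodes merge into P=.31; that node merges with b into P=.55; that node merges with a into the root, P=1.]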

Example Huffman code
• Read code from root to leaves:
a 1
b 01
c 0000
d 0001
e 0010
f 0011
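
A table-driven encoder for this code can be sketched in C. For readability the sketch writes the bits as '0' and '1' characters instead of packing them into bytes, which a real compressor would do; the names huff_code and huff_encode are illustrative.

```c
#include <stddef.h>
#include <string.h>

/* Code table from the example; returns NULL for unknown symbols. */
static const char *huff_code(char sym)
{
    switch (sym) {
    case 'a': return "1";
    case 'b': return "01";
    case 'c': return "0000";
    case 'd': return "0001";
    case 'e': return "0010";
    case 'f': return "0011";
    default:  return NULL;
    }
}

/* Encodes in[] into out[] as a string of '0'/'1' characters.
   Returns the number of bits, or -1 on an unknown symbol or
   if out[] is too small. */
int huff_encode(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return -1;
    for (; *in != '\0'; in++) {
        const char *code = huff_code(*in);
        if (code == NULL)
            return -1;
        size_t len = strlen(code);
        if (n + len + 1 > out_size)
            return -1;
        memcpy(out + n, code, len);
        n += len;
    }
    out[n] = '\0';
    return (int)n;
}
```

Encoding "abc" yields the 7-bit string 1010000. Because no code is a prefix of another, a decoder can recover the symbols by reading bits until a complete code is matched.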

Huffman coder requirements table
name                  data compression module
purpose               code module for Huffman compression
inputs                encoding table, uncoded byte-size inputs
outputs               packed compression output symbols
functions             Huffman coding
performance           fast
manufacturing cost    N/A
power                 N/A
physical size/weight  N/A

Building a specification
• Collaboration diagram shows only steady-
state input/output.
• A real system must:
• Accept an encoding table.
• Allow a system reset that flushes the
compression buffer.

data-compressor class

data-compressor
buffer: data-buffer
table: symbol-table
current-bit: integer

encode(): boolean,
data-buffer
flush()
new-symbol-table()

data-compressor behaviors
• encode: Takes one-byte input, generates
packed encoded symbols and a Boolean
indicating whether the buffer is full.
• new-symbol-table: installs new symbol
table in object, throws away old table.
• flush: returns current state of buffer,
including number of valid bits in buffer.

Auxiliary classes

data-buffer
databuf[databuflen] : character
len : integer

insert()
length() : integer

symbol-table
symbols[nsymbols] : data-buffer
len : integer

value() : symbol
load()

Auxiliary class roles
• data-buffer holds both packed and
unpacked symbols.
• Longest Huffman code for 8-bit inputs is 256
bits.
• symbol-table indexes encoded version of
each symbol.
• load() puts data in a new symbol table.

Class relationships

data-compressor 1 --- 1 data-buffer
data-compressor 1 --- 1 symbol-table

Encode behavior

input symbol -> encode -> buffer filled?
  T: create new buffer -> add to buffers -> return true
  F: add to buffer -> return false

Insert behavior

input symbol -> fills buffer?
  T: pack into this buffer
  F: pack bottom bits into this buffer, top bits into overflow buffer
then: update length
Program design
• In an object-oriented language, we can
reflect the UML specification in the code
more directly.
• In a non-object-oriented language, we
must either:
• add code to provide object-oriented features;
• diverge from the specification structure.

C++ classes
class data_buffer {
    char databuf[databuflen];
    int len;    // length in bits
    int length_in_chars() { return len/bitsperbyte; }
public:
    void insert(data_buffer,data_buffer&);
    int length() { return len; }
    int length_in_bytes() { return (int)ceil(len/8.0); }
    int initialize();
    ...

C++ classes, cont’d.
class data_compressor {
data_buffer buffer;
int current_bit;
symbol_table table;
public:
boolean encode(char,data_buffer&);
void new_symbol_table(symbol_table);
int flush(data_buffer&);
data_compressor();
~data_compressor();
};

C code
struct data_compressor_struct {
    data_buffer buffer;
    int current_bit;
    sym_table table;
};
typedef struct data_compressor_struct data_compressor,
    *data_compressor_ptr;
boolean data_compressor_encode(data_compressor_ptr mycmptrs,
    char isymbol, data_buffer *fullbuf) ...

Testing

• Test by encoding, then decoding:

input symbols -> encoder -> decoder -> result
(the symbol table feeds both encoder and decoder; compare result with input symbols)

Code inspection tests
• Look at the code for potential problems:
• Can we run past end of symbol table?
• What happens when the next symbol does
not fill the buffer? Does fill it?
• Do very long encoded symbols work
properly? Very short symbols?
• Does flush() work properly?

Bus-Based Computer Systems
• Busses.
• Memory devices.
• I/O devices:
• serial links
• timers and counters
• keyboards
• displays
• analog I/O

The CPU bus
• Bus allows CPU, memory, devices to
communicate.
• Shared communication medium.
• A bus is:
• A set of wires.
• A communications protocol.

Bus protocols
• Bus protocol determines how devices
communicate.
• Devices on the bus go through sequences
of states.
• Protocols are specified by state machines,
one state machine per actor in the protocol.
• May contain asynchronous logic behavior.

Four-cycle handshake

[Timing diagram: device 1 drives enq and device 2 drives ack across time steps 1 to 4.]

Four-cycle handshake, cont’d.
1. Device 1 raises enq.
2. Device 2 responds with ack.
3. Device 2 lowers ack once it has finished.
4. Device 1 lowers enq.
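
The four steps can be simulated with one small state machine per device, in the spirit of the "one state machine per actor" rule. This is a sketch under stated assumptions: the state names, the `work_left` counter that models how long device 2 stays busy, and the lock-step `run_handshake` driver are all hypothetical; only `enq` and `ack` come from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Shared wires; the names enq and ack come from the slides. */
typedef struct { bool enq, ack; } wires;

typedef enum { D1_IDLE, D1_WAIT_ACK, D1_WAIT_ACK_LOW } dev1_state;
typedef enum { D2_IDLE, D2_BUSY, D2_WAIT_ENQ_LOW } dev2_state;

/* Device 1 (initiator): drives enq, watches ack. */
static dev1_state dev1_step(dev1_state s, wires *w, bool want_transfer) {
    switch (s) {
    case D1_IDLE:
        if (want_transfer) { w->enq = true; return D1_WAIT_ACK; }  /* step 1 */
        return D1_IDLE;
    case D1_WAIT_ACK:
        return w->ack ? D1_WAIT_ACK_LOW : D1_WAIT_ACK;   /* saw step 2 */
    case D1_WAIT_ACK_LOW:
        if (!w->ack) { w->enq = false; return D1_IDLE; }           /* step 4 */
        return D1_WAIT_ACK_LOW;
    }
    return s;
}

/* Device 2 (responder): drives ack, watches enq. */
static dev2_state dev2_step(dev2_state s, wires *w, int *work_left) {
    switch (s) {
    case D2_IDLE:
        if (w->enq) { w->ack = true; return D2_BUSY; }             /* step 2 */
        return D2_IDLE;
    case D2_BUSY:
        if (*work_left > 0) { (*work_left)--; return D2_BUSY; }
        w->ack = false;                                            /* step 3 */
        return D2_WAIT_ENQ_LOW;
    case D2_WAIT_ENQ_LOW:
        /* Wait for enq to fall so a stale request is not answered twice. */
        return w->enq ? D2_WAIT_ENQ_LOW : D2_IDLE;
    }
    return s;
}

/* Run one complete transfer; returns the number of simulation steps. */
static int run_handshake(int work) {
    wires w = { false, false };
    dev1_state s1 = D1_IDLE;
    dev2_state s2 = D2_IDLE;
    bool started = false;
    int steps = 0;
    do {
        s1 = dev1_step(s1, &w, !started);
        started = true;
        s2 = dev2_step(s2, &w, &work);
        steps++;
        assert(steps < 1000);            /* guard against a stuck protocol */
    } while (s1 != D1_IDLE || s2 != D2_IDLE);
    assert(!w.enq && !w.ack);            /* both wires return low */
    return steps;
}
```

Because each side only advances when it sees the other side's signal change, the protocol tolerates a responder of any speed: a larger `work` value simply stretches the time between steps 2 and 3.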

Microprocessor busses

• Clock provides
synchronization.
• R/W is true when
reading (R/W’ is false
when reading).
• Address is a-bit bundle
of address lines.
• Data is n-bit bundle of
data lines.
• Data ready signals
when n-bit data is
ready.

Timing diagrams

Bus read

State diagrams for bus read

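[State diagrams: the CPU machine has states Wait, Adrs, See ack, Get data, and Done, and leaves Wait on start; the device machine has states Wait, Adrs, Ack, Send data, and Release ack.]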

Bus wait state

Bus burst read

Bus multiplexing

[Diagram: CPU and device connected by shared lines that carry both data and adrs; the data enable and Adrs enable signals select which is driven.]

DMA

• Direct memory access (DMA) performs data transfers without executing instructions.
• CPU sets up transfer.
• DMA engine fetches, writes.
• DMA controller is a separate unit.
Bus mastership
• By default, CPU is bus master and initiates
transfers.
• DMA must become bus master to perform
its work.
• CPU can’t use bus while DMA operates.
• Bus mastership protocol:
• Bus request.
• Bus grant.

DMA operation

• CPU sets DMA registers for start address, length.
• DMA status register controls the unit.
• Once DMA is bus master, it transfers automatically.
• May run continuously until complete.
• May use every nth bus cycle.

Bus transfer sequence diagram

System bus configurations

• Multiple busses allow parallelism:
• Slow devices on one bus.
• Fast devices on separate bus.
• A bridge connects two busses.

[Diagram: CPU, memory, and a high-speed device share one bus; a bridge links it to a second bus that carries the slow devices.]

Bridge state diagram

ARM AMBA bus

• Two varieties:
• AHB is high-performance.
• APB is lower-speed, lower
cost.
• AHB supports pipelining,
burst transfers, split
transactions, multiple bus
masters.
• All devices are slaves on
APB.

Memory components

• Several different
types of memory:
• DRAM.
• SRAM.
• Flash.
• Each type of memory
comes in varying:
• Capacities.
• Widths.
Random-access memory
• Dynamic RAM is dense, requires refresh.
• Synchronous DRAM is dominant type.
• SDRAM uses clock to improve performance,
pipeline memory accesses.
• Static RAM is faster, less dense, consumes
more power.

SDRAM operation

Read-only memory
• ROM may be programmed at factory.
• Flash is dominant form of field-
programmable ROM.
• Electrically erasable, must be block erased.
• Random access, but write/erase is much
slower than read.
• NOR flash is more flexible.
• NAND flash is more dense.

Timers and counters
• Very similar:
• a timer is incremented by a periodic signal;
• a counter is incremented by an
asynchronous, occasional signal.
• Rollover causes interrupt.

Watchdog timer
• Watchdog timer is periodically reset by
system timer.
• If watchdog is not reset, it generates an
interrupt to reset the host.

[Diagram: host CPU sends reset to the watchdog timer; the watchdog timer sends interrupt to the host CPU.]
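
The watchdog can be modeled in software as a countdown that healthy code keeps restarting. This is a minimal simulation sketch, not a driver for any real device: the `watchdog` struct, its function names, and the tick-based timeout are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    unsigned timeout;    /* ticks allowed between resets (assumed parameter) */
    unsigned remaining;  /* ticks left before the watchdog fires */
} watchdog;

static void watchdog_init(watchdog *w, unsigned timeout_ticks) {
    assert(timeout_ticks > 0);
    w->timeout = timeout_ticks;
    w->remaining = timeout_ticks;
}

/* Called periodically by healthy software to restart the countdown. */
static void watchdog_reset(watchdog *w) {
    w->remaining = w->timeout;
}

/* Called once per hardware tick; returns true when the host should be
   interrupted and reset because nobody restarted the countdown in time. */
static bool watchdog_tick(watchdog *w) {
    if (w->remaining > 0) w->remaining--;
    return w->remaining == 0;
}
```

If the software hangs and stops calling `watchdog_reset`, `watchdog_tick` eventually returns true, which corresponds to the interrupt that resets the host.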

Switch debouncing
• A switch must be debounced to eliminate
multiple contacts caused by mechanical
bouncing:
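
Debouncing is often done in software by sampling the switch at a fixed rate and accepting a new state only after it has been stable for several samples. The sketch below shows that counting approach; it is one common technique, not necessarily the circuit on the slide, and `DEBOUNCE_SAMPLES`, the `debouncer` struct, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define DEBOUNCE_SAMPLES 4   /* assumed: equal samples needed to accept a change */

typedef struct {
    bool stable;     /* debounced switch state */
    bool last_raw;   /* most recent raw sample */
    int  count;      /* consecutive samples equal to last_raw */
} debouncer;

static void debounce_init(debouncer *d, bool initial) {
    d->stable = initial;
    d->last_raw = initial;
    d->count = 0;
}

/* Call at a fixed sampling period, for example from a timer interrupt.
   Returns the debounced state. */
static bool debounce_sample(debouncer *d, bool raw) {
    if (raw != d->last_raw) {
        d->last_raw = raw;       /* input moved: restart the count */
        d->count = 1;
    } else if (d->count < DEBOUNCE_SAMPLES) {
        d->count++;
    }
    if (d->count >= DEBOUNCE_SAMPLES)
        d->stable = d->last_raw; /* input held long enough: accept it */
    return d->stable;
}
```

The sampling period times `DEBOUNCE_SAMPLES` should exceed the longest bounce interval expected from the switch.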

Encoded keyboard
• An array of switches is read by an
encoder.
• N-key rollover remembers multiple key
depressions.


LED

• Must use resistor to limit current:

7-segment LCD display
• May use parallel or multiplexed input.

Types of high-resolution display
• Liquid crystal display (LCD) is dominant
form.
• Plasma, OLED, etc.
• Frame buffer holds current display
contents.
• Written by processor.
• Read by video.

Touchscreen
• Includes input and output device.
• Input device is a two-dimensional
voltmeter:

Touchscreen position sensing

[Diagram: the voltage at the touch point is read by an ADC.]

Digital-to-analog conversion
• Use resistor tree:

bn   -- R  --+
bn-1 -- 2R --+
bn-2 -- 4R --+
bn-3 -- 8R --+--> Vout

Flash A/D conversion

• N-bit result requires 2^n comparators:

[Diagram: Vin feeds a bank of comparators whose outputs drive an encoder.]
Dual-slope conversion
• Use counter to time required to
charge/discharge capacitor.
• Charging, then discharging eliminates
non-linearities.

Vin
timer

Sample-and-hold
• Samples data:

Vin converter

Bus-Based Computer Systems
• Designing with microprocessors.
• Development and debugging.
• System-level performance analysis.

System architectures
• Architectures and components:
• software;
• hardware.
• Some software is very hardware-
dependent.

Hardware platform architecture
Contains several elements:
• CPU;
• bus;
• memory;
• I/O devices: networking, sensors,
actuators, etc.
How big/fast must each one be?

Software architecture
Functional description must be broken into
pieces:
• division among people;
• conceptual organization;
• performance;
• testability;
• maintenance.

Hardware and software
architectures
Hardware and software are intimately
related:
• software doesn’t run without hardware;
• how much hardware you need is
determined by the software requirements:
• speed;
• memory.

Evaluation boards
• Designed by CPU manufacturer or others.
• Includes CPU, memory, some I/O devices.
• May include prototyping section.
• CPU manufacturer often gives out
evaluation board netlist---can be used as
starting point for your custom board
design.

Adding logic to a board
• Programmable logic devices (PLDs)
provide low/medium density logic.
• Field-programmable gate arrays (FPGAs)
provide more logic and multi-level logic.
• Application-specific integrated circuits
(ASICs) are manufactured for a single
purpose.

The PC as a platform
• Advantages:
• cheap and easy to get;
• rich and familiar software environment.
• Disadvantages:
• requires a lot of hardware resources;
• not well-adapted to real-time.

Typical PC hardware platform

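[Block diagram: CPU and memory sit on the CPU bus. A bus interface connects the CPU bus to a high-speed bus with its device; a second bus interface connects to a low-speed bus with its device. Also shown: intr ctrl, DMA controller, timers.]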
Typical busses
• PCI: standard for high-speed interfacing
• 33 or 66 MHz.
• PCI Express.
• USB (Universal Serial Bus), Firewire (IEEE
1394): relatively low-cost serial interface
with high speed.

Software elements
• IBM PC uses BIOS (Basic I/O System) to
implement low-level functions:
• boot-up;
• minimal device drivers.
• BIOS has become a generic term for the
lowest-level system software.

Example: StrongARM
• StrongARM system includes:
• CPU chip (3.686 MHz clock)
• system control module (32.768 kHz clock).
• Real-time clock;
• operating system timer
• general-purpose I/O;
• interrupt controller;
• power manager controller;
• reset controller.

Debugging embedded systems
• Challenges:
• target system may be hard to observe;
• target may be hard to control;
• may be hard to generate realistic inputs;
• setup sequence may be complex.

Host/target design

• Use a host system to prepare software for target system:

[Diagram: host system connected to target system by a serial line.]
Host-based tools
• Cross compiler:
• compiles code on host for target system.
• Cross debugger:
• displays target state, allows target system to
be controlled.

Software debuggers
• A monitor program residing on the target
provides basic debugger functions.
• Debugger should have a minimal footprint
in memory.
• User program must be careful not to
destroy debugger program, but debugger
should be able to recover from some damage
caused by user code.

Breakpoints
• A breakpoint allows the user to stop
execution, examine system state, and
change state.
• Replace the breakpointed instruction with
a subroutine call to the monitor program.

ARM breakpoints

uninstrumented code:
0x400 MUL r4,r6,r6
0x404 ADD r2,r2,r4
0x408 ADD r0,r0,#1
0x40c B loop

code with breakpoint:
0x400 MUL r4,r6,r6
0x404 ADD r2,r2,r4
0x408 ADD r0,r0,#1
0x40c BL bkpoint

Breakpoint handler actions
• Save registers.
• Allow user to examine machine.
• Before returning, restore system state.
• Safest way to execute the instruction is to
replace it and execute in place.
• Put another breakpoint after the replaced
breakpoint to allow restoring the original
breakpoint.

In-circuit emulators
• A microprocessor in-circuit emulator is a
specially-instrumented microprocessor.
• Allows you to stop execution, examine
CPU state, modify registers.

Logic analyzers

• A logic analyzer is an array of low-grade oscilloscopes:

Logic analyzer architecture

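[Block diagram: the UUT microprocessor feeds a sample memory indexed by a vector address; a controller, driven by the system clock or an internal clock gen (state or timing mode), runs the capture; a keypad and display form the user interface.]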
Boundary scan

• Simplifies testing of
multiple chips on a
board.
• Registers on pins can
be configured as a
scan chain.
• Used for debuggers,
in-circuit emulators.

How to exercise code
• Run on host system.
• Run on target system.
• Run in instruction-level simulator.
• Run on cycle-accurate simulator.
• Run in hardware/software co-simulation
environment.

Debugging real-time code
• Bugs in drivers can cause non-
deterministic behavior in the foreground
program.
• Bugs may be timing-dependent.

System-level performance
analysis

• Performance depends on all the elements of the system:
• CPU.
• Cache.
• Bus.
• Main memory.
• I/O device.

[Diagram: CPU with cache, connected to memory.]

Bandwidth as performance
• Bandwidth applies to several components:
• Memory.
• Bus.
• CPU fetches.
• Different parts of the system run at
different clock rates.
• Different components may have different
widths (bus, memory).

Bandwidth and data transfers
• Video frame: 320 x 240 x 3 = 230,400
bytes.
• Transfer in 1/30 sec.
• Transfer 1 byte/µsec, 0.23 sec per frame.
• Too slow.
• Increase bandwidth:
• Increase bus width.
• Increase bus clock rate.
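
The arithmetic behind this slide can be checked with a few helper functions. This is a sketch with hypothetical function names; it assumes the 0.23 sec figure corresponds to a rate of one byte per microsecond (1e6 bytes/sec), since 230,400 bytes at that rate take 0.2304 sec, well over the 1/30 sec frame period.

```c
#include <assert.h>

/* Bytes in one frame: width x height x bytes per pixel. */
static long frame_bytes(long width, long height, long bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

/* Seconds needed to move nbytes at a given rate in bytes per second. */
static double transfer_seconds(long nbytes, double bytes_per_sec) {
    assert(bytes_per_sec > 0.0);
    return (double)nbytes / bytes_per_sec;
}

/* Returns 1 if one frame can be moved within one frame period. */
static int meets_frame_rate(long nbytes, double bytes_per_sec,
                            double frames_per_sec) {
    return transfer_seconds(nbytes, bytes_per_sec) <= 1.0 / frames_per_sec;
}
```

Widening the bus or raising its clock both raise `bytes_per_sec`; at 30 frames/sec a 230,400-byte frame needs at least 6,912,000 bytes/sec.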

Bus bandwidth

• T: # bus cycles.
• P: time/bus cycle.
• Total time for transfer:
• t = TP.
• D: data payload length.
• O1 + O2 = overhead O.
• Tbasic(N) = (D+O)N/W

[Diagram: one transfer of width W consists of overhead O1, data D, then overhead O2.]
Bus burst transfer bandwidth

• T: # bus cycles.
• P: time/bus cycle.
• Total time for transfer:
• t = TP.
• D: data payload length.
• O1 + O2 = overhead O.
• Tburst(N) = (BD+O)N/(BW)

[Diagram: a burst of B data transfers (1, 2, …, B), each of width W, followed by overhead O.]
Memory aspect ratios

[Figure: the same memory capacity in three aspect ratios: 64 M x 1, 16 M x 4, and 8 M x 8.]
Memory access times
• Memory component access times comes
from chip data sheet.
• Page modes allow faster access for
successive transfers on same page.
• If data doesn’t fit naturally into physical
words:
• A = [(E/w)mod W]+1

Bus performance bottlenecks

• Transfer 320 x 240 video frame @ 30 frames/sec = 612,000 bytes/sec.
• Is performance bottleneck bus or memory?

[Diagram: CPU connected to memory.]

Bus performance bottlenecks,
cont’d.
• Bus: assume 1 MHz bus, D=1, O=3:
• Tbasic = (1+3)612,000/2 = 1,224,000 cycles =
1.224 sec.
• Memory: try burst mode B=4, width
w=0.5, D=1, O=4.
• Tmem = (4*1+4)612,000/(4*0.5) = 2,448,000
cycles = 0.02448 sec at a 10 ns clock period.

Performance spreadsheet

bus memory
clock period 1.00E-06 clock period 1.00E-08
W 2 W 0.5
D 1 D 1
O 3 O 4
B 4
N 612000 N 612000

T_basic 1224000 T_mem 2448000


t 1.22E+00 t 2.45E-02
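
The spreadsheet rows can be reproduced directly from the Tbasic and Tburst formulas. The sketch below uses hypothetical function names and takes all parameters as doubles so that a fractional width such as W = 0.5 works.

```c
#include <assert.h>

/* T_basic(N) = (D + O) * N / W: bus cycles to move N bytes when every
   transfer of D data cycles pays the overhead O. */
static double t_basic(double D, double O, double N, double W) {
    assert(W > 0.0);
    return (D + O) * N / W;
}

/* T_burst(N) = (B*D + O) * N / (B*W): the overhead O is paid once per
   burst of B transfers. */
static double t_burst(double B, double D, double O, double N, double W) {
    assert(B > 0.0 && W > 0.0);
    return (B * D + O) * N / (B * W);
}

/* t = T * P: convert a cycle count to seconds for clock period P. */
static double cycles_to_seconds(double cycles, double period) {
    return cycles * period;
}
```

With the spreadsheet values, the bus needs 1,224,000 cycles (about 1.22 sec at a 1 µs period) and the memory needs 2,448,000 cycles (about 0.0245 sec at a 10 ns period), so the bus is the bottleneck in this configuration.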

Parallelism

• Speed things up by
running several units
at once.
• DMA provides
parallelism if CPU
doesn’t need the bus:
• DMA + bus.
• CPU.

Bus-Based Computer Systems
• Example: alarm clock

Alarm clock interface

[Front panel: Alarm on and Alarm off buttons, buzzer, PM indicator, Alarm ready light, and the set time, set alarm, hour, and minute buttons.]
Operations
• Set time: hold set time, depress hour,
minute.
• Set alarm time: hold set alarm, depress
hour, minute.
• Turn alarm on/off: depress alarm on/off.

Alarm clock requirements

name                  alarm clock
purpose               24-hour digital clock with one alarm
inputs                set time, set alarm, hour, minute, alarm on/off
outputs               four-digit display, PM indicator, alarm ready, buzzer
functions             keep time, set time, set alarm, turn alarm on/off,
                      activate buzzer by alarm
performance           hours and minutes, no seconds; not high precision
manufacturing cost    consumer product
power                 AC
physical size/weight  fits on stand

Alarm clock class diagram

Lights* 1 --- 1 Display 1 --- 1 Mechanism
Mechanism 1 --- 1 Buttons*
Mechanism 1 --- 1 Speaker*

Alarm clock physical classes

Lights*
digit-val()
digit-scan()
alarm-on-light()
PM-light()

Buttons*
set-time(): boolean
set-alarm(): boolean
alarm-on(): boolean
alarm-off(): boolean
minute(): boolean
hour(): boolean

Speaker*
buzz()

Display class

Display
time[4]: integer
alarm-indicator: boolean
PM-indicator: boolean

set-time()
alarm-light-on()
alarm-light-off()
PM-light-on()
PM-light-off()

Mechanism class
Mechanism
Seconds: integer
PM: boolean
tens-hours, ones-hours: boolean
tens-minutes, ones-minutes: boolean
alarm-ready: boolean
alarm-tens-hours, alarm-ones-hours:
boolean
alarm-tens-minutes, alarm-ones-minutes:
boolean
scan-keyboard()
update-time()
Update-time behavior

1. Update seconds with rollover.
2. Rollover? If T: update hh:mm with rollover; on AM->PM set PM=true, on PM->AM set PM=false.
3. display.set-time(current time).
4. Time >= alarm and alarm-on? If T: alarm.buzzer(true).
Scan-keyboard behavior

1. Compute button activations.
2. Act on the activated button:
   • Alarm-on: alarm-ready=true.
   • Alarm-off: alarm-ready=false, alarm.buzzer(false).
   • Set-time and not set-alarm and hours: increment time tens w. rollover and AM/PM.
   • Set-time and not set-alarm and minutes: increment time ones w. rollover and AM/PM.
3. Save button states.
System architecture
• Includes:
• periodic behavior (clock);
• aperiodic behavior (buttons, buzzer
activation).
• Two major software components:
• interrupt-driven routine updates time;
• foreground program deals with buttons,
commands.

Interrupt-driven routine
• Timer probably can’t handle one-minute
interrupt interval.
• Use software variable to convert interrupt
frequency to seconds.
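
One way to realize the software variable is a tick counter inside the timer interrupt handler that triggers the time update once per second. This is a sketch under assumptions: the 100 Hz interrupt rate, the `clock_state` layout, and the function names are hypothetical, and a 12-hour display with a PM flag is assumed.

```c
#include <assert.h>
#include <stdbool.h>

#define TICKS_PER_SECOND 100   /* assumed timer interrupt rate in Hz */

typedef struct {
    int  ticks;     /* interrupts seen in the current second */
    int  seconds;   /* 0..59 */
    int  minutes;   /* 0..59 */
    int  hours;     /* 1..12 */
    bool pm;        /* PM indicator */
} clock_state;

/* Advance the time by one second, rolling over minutes, hours, AM/PM. */
static void update_time(clock_state *c) {
    if (++c->seconds < 60) return;
    c->seconds = 0;
    if (++c->minutes < 60) return;
    c->minutes = 0;
    c->hours = (c->hours == 12) ? 1 : c->hours + 1;
    if (c->hours == 12) c->pm = !c->pm;   /* 11:59 -> 12:00 flips AM/PM */
}

/* Body of the timer interrupt handler: count ticks in software and
   update the time once per second. */
static void timer_tick(clock_state *c) {
    if (++c->ticks >= TICKS_PER_SECOND) {
        c->ticks = 0;
        update_time(c);
    }
}
```

Keeping the handler this short leaves the button handling and alarm check to the foreground program.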

Foreground program
• Operates as while loop:
while (TRUE) {
read_buttons(button_values);
process_command(button_values);
check_alarm();
}

Testing
• Component testing:
• test interrupt code on the platform;
• can test foreground program using a mock-
up.
• System testing:
• relatively few components to integrate;
• check clock accuracy;
• check recognition of buttons, buzzer, etc.

Program design and analysis
• Software components.
• Representations of programs.
• Assembly and linking.

Software state machine
• State machine keeps internal state as a
variable, changes state based on inputs.
• Uses:
• control-dominated code;
• reactive systems.

State machine example

[State diagram: states idle, seated, belted, buzzer. Transition labels (input/output): seat/timer on; no seat/-; no seat/buzzer off; no belt and no timer/-; Belt/buzzer on; belt/-; belt/buzzer off; no belt/timer on.]

C implementation
#define IDLE 0
#define SEATED 1
#define BELTED 2
#define BUZZER 3
switch (state) {
case IDLE: if (seat) { state = SEATED; timer_on = TRUE; }
break;
case SEATED: if (belt) state = BELTED;
else if (timer) state = BUZZER;
break;

}
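
The fragment above covers only two of the four states. A complete version might look like the following sketch. The BELTED and BUZZER cases, the seat check in SEATED, and the clearing of `timer_on` are assumptions based on one reading of the state diagram, and the `belt_fsm` struct and `belt_step` function are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { IDLE, SEATED, BELTED, BUZZER } belt_state;

typedef struct {
    belt_state state;
    bool timer_on;    /* output: timer running */
    bool buzzer_on;   /* output: buzzer sounding */
} belt_fsm;

/* One evaluation of the machine.
   Inputs: seat occupied, belt fastened, timer expired. */
static void belt_step(belt_fsm *m, bool seat, bool belt, bool timer) {
    switch (m->state) {
    case IDLE:
        if (seat) { m->state = SEATED; m->timer_on = true; }
        break;
    case SEATED:
        if (!seat)      { m->state = IDLE;   m->timer_on = false; }
        else if (belt)  { m->state = BELTED; m->timer_on = false; }
        else if (timer) { m->state = BUZZER; m->timer_on = false;
                          m->buzzer_on = true; }
        break;
    case BELTED:
        if (!seat)      { m->state = IDLE; }
        else if (!belt) { m->state = SEATED; m->timer_on = true; }
        break;
    case BUZZER:
        if (!seat)      { m->state = IDLE;   m->buzzer_on = false; }
        else if (belt)  { m->state = BELTED; m->buzzer_on = false; }
        break;
    }
}
```

The state variable is the only memory the machine keeps between calls, which is what makes this style a good fit for control-dominated, reactive code.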

Signal processing and circular
buffer
• Commonly used in signal processing:
• new data constantly arrives;
• each datum has a limited lifetime.
• Use a circular buffer to hold the data stream.

[Diagram: data stream d1 d2 d3 d4 d5 d6 d7 with a window that advances from time t to time t+1.]
Circular buffer

[Diagram: data stream x1 x2 x3 x4 x5 x6 sampled at times t1, t2, t3, and a four-entry circular buffer in which newer samples (x5, x6, x7) overwrite the oldest ones.]

Circular buffers

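• Indexes locate currently used data, current input data:

[Diagram: at time t1 the buffer holds d1, d2, d3, d4 with the input index at d1 and the use index at d4; at time t1+1, d5 has replaced d1, the use index is at d5, and the input index is at d2.]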


Circular buffer implementation:
FIR filter
int circ_buffer[N], circ_buffer_head = 0;
int c[N]; /* coefficients */

int ibuf, ic, f;
for (f=0, ibuf=circ_buffer_head, ic=0;
     ic<N; ibuf=(ibuf==N-1?0:ibuf+1), ic++)
    f = f + c[ic]*circ_buffer[ibuf];
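
The loop only reads the buffer; a full filter also needs a routine that stores each new sample. The sketch below adds one under stated assumptions: the filter length of 4, the coefficient values, and the convention that the head index points at the newest sample and moves backward on each insert are all chosen for illustration, and `circ_insert` and `fir_output` are hypothetical names.

```c
#include <assert.h>

#define N 4   /* assumed filter length for this sketch */

static int circ_buffer[N];         /* the N most recent samples */
static int circ_buffer_head = 0;   /* index of the newest sample */
static int c[N] = {1, 2, 3, 4};    /* hypothetical coefficients */

/* Store a new sample over the oldest one and make it the head. */
static void circ_insert(int sample) {
    circ_buffer_head = (circ_buffer_head == 0) ? N - 1 : circ_buffer_head - 1;
    circ_buffer[circ_buffer_head] = sample;
}

/* f = sum of c[i] * x[n-i], walking from the newest sample to the oldest
   and wrapping the index at the end of the array. */
static int fir_output(void) {
    int f = 0;
    int ibuf = circ_buffer_head;
    for (int ic = 0; ic < N; ic++) {
        f += c[ic] * circ_buffer[ibuf];
        ibuf = (ibuf == N - 1) ? 0 : ibuf + 1;
    }
    return f;
}
```

Moving the head instead of shifting the data means each new sample costs one store, regardless of the filter length.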

Queues
• Elastic buffer: holds data that arrives
irregularly.

Buffer-based queues

#define Q_SIZE 32
#define Q_MAX (Q_SIZE-1)
int q[Q_SIZE], head, tail;  /* indexes run 0..Q_MAX; one slot stays empty */

void initialize_queue() { head = tail = 0; }

void enqueue(int val) {
    if (((tail+1)%Q_SIZE) == head) error();  /* queue full */
    q[tail]=val;
    if (tail == Q_MAX) tail = 0;
    else tail++;
}

int dequeue() {
    int returnval;
    if (head == tail) error();  /* queue empty */
    returnval = q[head];
    if (head == Q_MAX) head = 0;
    else head++;
    return returnval;
}
Models of programs
• Source code is not a good representation
for programs:
• clumsy;
• leaves much information implicit.
• Compilers derive intermediate
representations to manipulate and optimize
the program.

Data flow graph
• DFG: data flow graph.
• Does not represent control.
• Models basic block: code with a single entry and a
single exit (no entry or exit in the middle).
• Describes the minimal ordering
requirements on operations.

Single assignment form

original basic block:
x = a + b;
y = c - d;
z = x * y;
y = b + d;

single assignment form:
x = a + b;
y = c - d;
z = x * y;
y1 = b + d;

Data flow graph

single assignment form:
x = a + b;
y = c - d;
z = x * y;
y1 = b + d;

DFG: a, b -> (+) -> x; c, d -> (-) -> y; x, y -> (*) -> z; b, d -> (+) -> y1
DFGs and partial orders

Partial order:
• a+b, c-d; b+d, x*y
Can do pairs of operations in any order.

DFG: a, b -> (+) -> x; c, d -> (-) -> y; x, y -> (*) -> z; b, d -> (+) -> y1

Control-data flow graph
• CDFG: represents control and data.
• Uses data flow graphs as components.
• Two types of nodes:
• decision;
• data flow.

Data flow node

Encapsulates a data flow graph:

x = a + b;
y=c+d

Write operations in basic block form for


simplicity.

Control

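[Diagram: equivalent forms of a decision node: a two-way branch on cond with T and F edges, and a multiway branch on value with edges v1, v2, v3, v4.]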

CDFG example

if (cond1) bb1();
else bb2();
bb3();
switch (test1) {
case c1: bb4(); break;
case c2: bb5(); break;
case c3: bb6(); break;
}

[CDFG: decision cond1 with T edge to bb1() and F edge to bb2(); both lead to bb3(); decision test1 with edges c1 to bb4(), c2 to bb5(), c3 to bb6().]
for loop

for loop:
for (i=0; i<N; i++)
    loop_body();

equivalent:
i=0;
while (i<N) {
    loop_body(); i++; }

[CDFG: i=0, then decision i<N; on T run loop_body() and return to the decision; on F exit the loop.]
Assembly and linking

• Last steps in compilation:

HLL -> compile -> assembly -> assemble -> link -> executable
(each HLL file is compiled and assembled separately; the link step combines the results)

Multiple-module programs
• Programs may be composed from several
files.
• Addresses become more specific during
processing:
• relative addresses are measured relative to
the start of a module;
• absolute addresses are measured relative to
the start of the CPU address space.

Assemblers
• Major tasks:
• generate binary for symbolic instructions;
• translate labels into addresses;
• handle pseudo-ops (data, etc.).
• Generally one-to-one translation.
• Assembly labels:
ORG 100
label1 ADR r4,c

Symbol table

assembly code:
      ADD r0,r1,r2
xx    ADD r3,r4,r5
      CMP r0,r3
yy    SUB r5,r6,r7

symbol table:
xx    0x8
yy    0x10

Symbol table generation
• Use program location counter (PLC) to
determine address of each location.
• Scan program, keeping count of PLC.
• Addresses are generated at assembly
time, not execution time.

Symbol table example
PLC=0x7          ADD r0,r1,r2
PLC=0x7    xx    ADD r3,r4,r5
PLC=0x7          CMP r0,r3
PLC=0x7    yy    SUB r5,r6,r7

symbol table:
xx    0x8
yy    0x10

Two-pass assembly
• Pass 1:
• generate symbol table
• Pass 2:
• generate binary instructions
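
Pass 1 can be sketched as a single scan that advances the program location counter and records the address of every label. The code below is an illustration with hypothetical types and function names; it assumes fixed 4-byte instructions, and the usage note's starting address of 0x4 is chosen so that the results match the xx and yy entries in the symbol table example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INSTR_BYTES 4    /* assumed fixed instruction size */
#define MAX_SYMBOLS 16   /* assumed capacity of the caller's table */

typedef struct { const char *label; const char *text; } asm_line; /* label may be NULL */
typedef struct { const char *label; unsigned address; } symbol;

/* Pass 1: scan the program, keeping count of the PLC, and record each
   label's address. Returns the number of symbols found. */
static size_t build_symbol_table(const asm_line *prog, size_t nlines,
                                 unsigned origin, symbol *table) {
    unsigned plc = origin;
    size_t nsyms = 0;
    for (size_t i = 0; i < nlines; i++) {
        if (prog[i].label != NULL) {
            assert(nsyms < MAX_SYMBOLS);
            table[nsyms].label = prog[i].label;
            table[nsyms].address = plc;
            nsyms++;
        }
        plc += INSTR_BYTES;   /* every line here is one instruction */
    }
    return nsyms;
}

/* Pass 2 would use this to turn a label into an address.
   Returns 1 on success, 0 if the label is not in the table. */
static int lookup_symbol(const symbol *table, size_t nsyms,
                         const char *label, unsigned *address) {
    for (size_t i = 0; i < nsyms; i++)
        if (strcmp(table[i].label, label) == 0) {
            *address = table[i].address;
            return 1;
        }
    return 0;
}
```

Starting the four-instruction example at 0x4 gives xx = 0x8 and yy = 0x10. A label that `lookup_symbol` cannot find in pass 2 is a candidate external reference for the linker to resolve.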

Relative address generation
• Some label values may not be known at
assembly time.
• Labels within the module may be kept in
relative form.
• Must keep track of external labels---can’t
generate full binary for instructions that
use external labels.

Pseudo-operations
• Pseudo-ops do not generate instructions:
• ORG sets program location.
• EQU generates symbol table entry without
advancing PLC.
• Data statements define data blocks.

Linking
• Combines several object modules into a
single executable module.
• Jobs:
• put modules in order;
• resolve labels across modules.

Externals and entry points
first module:
xxx   ADD r1,r2,r3      (xxx: entry point)
      B a               (a: external reference)
yyy   %1

second module:
a     ADR r4,yyy
      ADD r3,r4,r5

Module ordering

• Code modules must be placed in absolute positions in the memory space.
• Load map or linker flags control the order of modules.

[Memory layout: module1, module2, module3 placed one after another.]
Dynamic linking
• Some operating systems link modules
dynamically at run time:
• shares one copy of library among all
executing programs;
• allows programs to be updated with new
versions of libraries.

Program design and analysis
• Compilation flow.
• Basic statement translation.
• Basic optimizations.
• Interpreters and just-in-time compilers.

Compilation
• Compilation strategy (Wirth):
• compilation = translation + optimization
• Compiler determines quality of code:
• use of CPU resources;
• memory access scheduling;
• code size.

Basic compilation phases

HLL

parsing, symbol table

machine-independent
optimizations

machine-dependent
optimizations

assembly
Statement translation and
optimization
• Source code is translated into
intermediate form such as CDFG.
• CDFG is transformed/optimized.
• CDFG is translated into instructions with
optimization decisions.
• Instructions are further optimized.

Arithmetic expressions

• Expression: a*b + 5*(c-d)
[Figure: DFG of the expression, with inputs a, b, c, d, the constant 5, and operator nodes * and -.]
Arithmetic expressions, cont’d.

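DFG nodes: 1: a*b, 2: c-d, 3: 5*(c-d), 4: sum.

        ADR r4,a
        LDR r1,[r4]     ; load a
        ADR r4,b
        LDR r2,[r4]     ; load b
        MUL r3,r1,r2    ; node 1: a*b
        ADR r4,c
        LDR r1,[r4]     ; load c
        ADR r4,d
        LDR r5,[r4]     ; load d
        SUB r6,r1,r5    ; node 2: c-d
        MUL r7,r6,#5    ; node 3: 5*(c-d) (shorthand: ARM MUL takes register operands only)
        ADD r8,r7,r3    ; node 4: a*b + 5*(c-d)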
Control code generation

if (a+b > 0)
    x = 5;
else
    x = 7;

[Figure: CDFG with decision node a+b>0 branching to x=5 and x=7.]

Control code generation,
cont’d.
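CDFG blocks: 1: test a+b>0, 2: x=5, 3: x=7.

        ADR r5,a
        LDR r1,[r5]     ; load a
        ADR r5,b
        LDR r2,[r5]     ; load b
        ADDS r3,r1,r2   ; block 1: a+b, setting the condition codes
        BLE label3      ; a+b <= 0: go to else branch
        MOV r3,#5       ; block 2: x = 5
        ADR r5,x
        STR r3,[r5]
        B stmtend
label3  MOV r3,#7       ; block 3: x = 7
        ADR r5,x
        STR r3,[r5]
stmtend ...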
Procedure linkage
• Need code to:
• call and return;
• pass parameters and results.
• Parameters and returns are passed on
stack.
• Procedures with few parameters may use
registers.

Procedure stacks

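proc1(int a) {
    proc2(5);
}

[Figure: stack holding the frames of proc1 and proc2, with an arrow showing the direction of growth. FP is the frame pointer and SP is the stack pointer; the argument 5 is accessed relative to SP.]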

ARM procedure linkage
• APCS (ARM Procedure Call Standard):
• r0-r3 pass parameters into procedure. Extra
parameters are put on stack frame.
• r0 holds return value.
• r4-r7 hold register variables.
• r11 is frame pointer, r13 is stack pointer.
• r10 holds limiting address on stack size to
check for stack overflows.

Data structures
• Different types of data structures use
different data layouts.
• Some offsets into data structure can be
computed at compile time, others must be
computed at run time.

One-dimensional arrays

• C array name points to 0th element:

a a[0]
a[1] = *(a + 1)
a[2]

Two-dimensional arrays

• Row-major layout (N rows, M columns), as used by C:
a[0,0]
a[0,1]
...
a[1,0]
a[1,1]
...
• a[i,j] = a[i*M+j]
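The index formula a[i*M+j] can be sketched in C as follows. This is a minimal illustration, not code from the slides: the helper names row_major_index and read_2d are hypothetical, and the array is assumed to be stored as one contiguous block with m columns per row.

```c
#include <stddef.h>

/* Offset of element (i, j) in a row-major array with m columns. */
size_t row_major_index(size_t i, size_t j, size_t m)
{
    return i * m + j;
}

/* Read element (i, j) of an array stored as one flat block. */
int read_2d(const int *flat, size_t i, size_t j, size_t m)
{
    return *(flat + row_major_index(i, j, m));
}
```

For an array declared as int a[2][3], read_2d(&a[0][0], 1, 2, 3) reads the same element as a[1][2].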

Structures

• Fields within structures are static offsets:

struct mystruct {
    int field1;     /* 4 bytes, at aptr */
    char field2;    /* at byte offset 4 */
};

struct mystruct a, *aptr = &a;

• field2 is read as *((char *)aptr + 4), assuming a 4-byte int.
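A small sketch of how a field offset is fixed at compile time, using the standard offsetof macro. The helper names field2_offset and read_field2 are hypothetical, and the exact offset depends on the size of int on the target.

```c
#include <stddef.h>

struct mystruct {
    int field1;
    char field2;
};

/* Byte offset of field2 within the structure, known at compile time. */
size_t field2_offset(void)
{
    return offsetof(struct mystruct, field2);
}

/* Read field2 through a byte pointer plus the static offset. */
char read_field2(const struct mystruct *aptr)
{
    return *((const char *)aptr + offsetof(struct mystruct, field2));
}
```

The cast to a byte pointer matters: adding 4 to a struct mystruct pointer directly would advance by four whole structures.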

Expression simplification
• Constant folding:
• 8+1 = 9
• Algebraic:
• a*b + a*c = a*(b+c)
• Strength reduction:
• a*2 = a<<1

Dead code elimination

• Dead code:
#define DEBUG 0
if (DEBUG) dbg(p1);
• Can be eliminated by analysis of control
flow, constant folding.
[Figure: CDFG in which the test on the constant 0 guards dbg(p1); the 1 branch is never taken.]

Procedure inlining
• Eliminates procedure linkage overhead:

int foo(a,b,c) { return a + b - c;}

z = foo(w,x,y);

becomes

z = w + x - y;

Loop transformations
• Goals:
• reduce loop overhead;
• increase opportunities for pipelining;
• improve memory system performance.

Loop unrolling
• Reduces loop overhead, enables some
other optimizations.
for (i=0; i<4; i++)
a[i] = b[i] * c[i];


for (i=0; i<2; i++) {
a[i*2] = b[i*2] * c[i*2];
a[i*2+1] = b[i*2+1] * c[i*2+1];
}
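The slide's example assumes an even trip count. As a hedged sketch, the same unrolling by 2 for an arbitrary n needs one cleanup iteration; the function names mul_rolled and mul_unrolled2 are illustrative only.

```c
/* Reference loop: one multiply per iteration. */
void mul_rolled(int *a, const int *b, const int *c, int n)
{
    for (int i = 0; i < n; i++)
        a[i] = b[i] * c[i];
}

/* Unrolled by 2: half as many loop tests and increments. */
void mul_unrolled2(int *a, const int *b, const int *c, int n)
{
    int i;
    for (i = 0; i + 1 < n; i += 2) {
        a[i]     = b[i]     * c[i];
        a[i + 1] = b[i + 1] * c[i + 1];
    }
    if (i < n)                  /* cleanup iteration when n is odd */
        a[i] = b[i] * c[i];
}
```

Both functions produce the same array; the unrolled version trades code size for fewer branch and increment instructions.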

Loop fusion and distribution
• Fusion combines two loops into 1:
for (i=0; i<N; i++) a[i] = b[i] * 5;
for (j=0; j<N; j++) w[j] = c[j] * d[j];
=> for (i=0; i<N; i++) {
a[i] = b[i] * 5; w[i] = c[i] * d[i];
}
• Distribution breaks one loop into two.
• Changes optimizations within loop body.

Loop tiling
• Breaks one loop into a nest of loops.
• Changes order of accesses within array.
• Changes cache behavior.

Loop tiling example

Original:
for (i=0; i<N; i++)
    for (j=0; j<N; j++)
        c[i] = a[i,j]*b[i];

Tiled (2 x 2 tiles):
for (i=0; i<N; i+=2)
    for (j=0; j<N; j+=2)
        for (ii=i; ii<min(i+2,N); ii++)
            for (jj=j; jj<min(j+2,N); jj++)
                c[ii] = a[ii,jj]*b[ii];
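A runnable sketch of the tiled loop nest, written so that each inner loop starts at the origin of its tile. The names scale_plain, scale_tiled, min_int, and TILE are assumptions for illustration, and a is stored as a flat row-major array.

```c
#define TILE 2

static int min_int(int x, int y) { return x < y ? x : y; }

/* Untiled reference: c[i] ends up as a[i][n-1] * b[i]. */
void scale_plain(int *c, const int *a, const int *b, int n)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            c[i] = a[i * n + j] * b[i];
}

/* Same computation, visiting a in TILE x TILE blocks. */
void scale_tiled(int *c, const int *a, const int *b, int n)
{
    for (int i = 0; i < n; i += TILE)
        for (int j = 0; j < n; j += TILE)
            for (int ii = i; ii < min_int(i + TILE, n); ii++)
                for (int jj = j; jj < min_int(j + TILE, n); jj++)
                    c[ii] = a[ii * n + jj] * b[ii];
}
```

The min_int bounds handle sizes that are not a multiple of the tile size. Only the order of accesses changes, which is what alters cache behavior.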

Array padding

• Add array elements to change mapping
into cache:

before:
a[0,0] a[0,1] a[0,2]
a[1,0] a[1,1] a[1,2]

after (one padding element added to each row):
a[0,0] a[0,1] a[0,2] (pad)
a[1,0] a[1,1] a[1,2] (pad)
Register allocation
• Goals:
• choose register to hold each variable;
• determine lifespan of variable in the register.
• Basic case: within basic block.

Register lifetime graph

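w = a + b;   /* t=1 */
x = c + w;   /* t=2 */
y = c + d;   /* t=3 */

[Figure: lifetime graph with one row per variable (a, b, c, d, w, x, y) and time steps 1, 2, 3 on the horizontal axis.]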

Instruction scheduling
• Non-pipelined machines do not need
instruction scheduling: any order of
instructions that satisfies data
dependencies runs equally fast.
• In pipelined machines, execution time of
one instruction depends on the nearby
instructions: opcode, operands.

Reservation table

• A reservation table relates
instructions/time to CPU resources.

Time/instr A B
instr1 X
instr2 X X
instr3 X
instr4 X

Software pipelining
• Schedules instructions across loop
iterations.
• Reduces instruction latency in iteration i
by inserting instructions from iteration
i+1.

Instruction selection

• May be several ways to implement an
operation or sequence of operations.
• Represent operations as graphs, match
possible instruction sequences onto
graph.
[Figure: an expression graph built from * and + nodes, and the instruction templates MUL (*), ADD (+), and MADD (* feeding +).]
Using your compiler
• Understand various optimization levels
(-O1, -O2, etc.)
• Look at mixed compiler/assembler output.
• Modifying compiler output requires care:
• correctness;
• loss of hand-tweaked code.

Interpreters and JIT compilers
• Interpreter: translates and executes
program statements on-the-fly.
• JIT compiler: compiles small sections of
code into instructions during program
execution.
• Eliminates some translation overhead.
• Often requires more memory.

Program design and analysis
• Program-level performance analysis.
• Optimizing for:
• Execution time.
• Energy/power.
• Program size.
• Program validation and testing.

Program-level performance
analysis

• Need to understand
performance in detail:
• Real-time behavior, not
just typical.
• On complex platforms.
• Program performance ≠
CPU performance:
• Pipeline, cache are
windows into program.
• We must analyze the entire
program.

Complexities of program
performance
• Varies with input data:
• Different-length paths.
• Cache effects.
• Instruction-level performance variations:
• Pipeline interlocks.
• Fetch times.

How to measure program
performance
• Simulate execution of the CPU.
• Makes CPU state visible.
• Measure on real CPU using timer.
• Requires modifying the program to control
the timer.
• Measure on real CPU using logic analyzer.
• Requires events visible on the pins.

Program performance metrics
• Average-case execution time.
• Typically used in application programming.
• Worst-case execution time.
• A component in deadline satisfaction.
• Best-case execution time.
• Task-level interactions can cause best-case
program behavior to result in worst-case
system behavior.

Elements of program
performance
• Basic program execution time formula:
• execution time = program path + instruction timing
• Solving these problems independently helps
simplify analysis.
• Easier to separate on simpler CPUs.
• Accurate performance analysis requires:
• Assembly/binary code.
• Execution platform.

Data-dependent paths in an if
statement

if (a || b) {           /* T1 */
    if ( c )            /* T2 */
        x = r*s+t;      /* A1 */
    else y=r+s;         /* A2 */
    z = r+s+u;          /* A3 */
}
else {
    if ( c )            /* T3 */
        y = r-t;        /* A4 */
}

a b c   path
0 0 0   T1=F, T3=F: no assignments
0 0 1   T1=F, T3=T: A4
0 1 0   T1=T, T2=F: A2, A3
0 1 1   T1=T, T2=T: A1, A3
1 0 0   T1=T, T2=F: A2, A3
1 0 1   T1=T, T2=T: A1, A3
1 1 0   T1=T, T2=F: A2, A3
1 1 1   T1=T, T2=T: A1, A3

Paths in a loop

for (i=0, f=0; i<N; i++)
    f = f + c[i] * x[i];

[Figure: CDFG of the loop. i=0; f=0 -> test i=N -> (N) f = f + c[i] * x[i] -> i=i+1 -> back to the test; (Y) exit.]

Instruction timing
• Not all instructions take the same amount of time.
• Multi-cycle instructions.
• Fetches.
• Execution times of instructions are not independent.
• Pipeline interlocks.
• Cache effects.
• Execution times may vary with operand value.
• Floating-point operations.
• Some multi-cycle integer operations.

Measurement-driven
performance analysis
• Not so easy as it sounds:
• Must actually have access to the CPU.
• Must know data inputs that give worst/best
case performance.
• Must make state visible.
• Still an important method for performance
analysis.

Feeding the program
• Need to know the desired input values.
• May need to write software scaffolding to
generate the input values.
• Software scaffolding may also need to
examine outputs to generate feedback-
driven inputs.

Trace-driven measurement
• Trace-driven:
• Instrument the program.
• Save information about the path.
• Requires modifying the program.
• Trace files are large.
• Widely used for cache analysis.

Physical measurement
• In-circuit emulator allows tracing.
• Affects execution timing.
• Logic analyzer can measure behavior at pins.
• Address bus can be analyzed to look for events.
• Code can be modified to make events visible.
• Particularly important for real-world input
streams.

CPU simulation
• Some simulators are less accurate.
• Cycle-accurate simulator provides
accurate clock-cycle timing.
• Simulator models CPU internals.
• Simulator writer must know how CPU works.

SimpleScalar FIR filter
simulation

int x[N] = {8, 17, … };
int c[N] = {1, 2, … };
main() {
    int i, k, f;
    for (k=0; k<COUNT; k++)
        for (i=0; i<N; i++)
            f += c[i]*x[i];
}

N        total sim cycles   sim cycles per filter execution
100      25854              259
1,000    155759             156
10,000   1451840            145

Performance optimization
motivation
• Embedded systems must often meet
deadlines.
• Faster may not be fast enough.
• Need to be able to analyze execution
time.
• Worst-case, not typical.
• Need techniques for reliably improving
execution time.

Programs and performance
analysis
• Best results come from analyzing
optimized instructions, not high-level
language code:
• non-obvious translations of HLL statements
into instructions;
• code may move;
• cache effects are hard to predict.

Loop optimizations
• Loops are good targets for optimization.
• Basic loop optimizations:
• code motion;
• induction-variable elimination;
• strength reduction (x*2 -> x<<1).

Code motion

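Before:
for (i=0; i<N*M; i++)
    z[i] = a[i] + b[i];

After (N*M is computed once, before the loop):
i=0; X = N*M;
while (i<X) {
    z[i] = a[i] + b[i];
    i = i+1;
}

[Figure: CDFGs before and after. The loop test changes from i<N*M to i<X.]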
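A minimal C sketch of code motion on this loop. The function names add_before and add_after are hypothetical; a compiler would normally perform this transformation itself, so the two versions are shown only to make the effect visible.

```c
/* Before: the bound n*m appears in the loop test, so it is
 * conceptually recomputed on every iteration. */
void add_before(int *z, const int *a, const int *b, int n, int m)
{
    for (int i = 0; i < n * m; i++)
        z[i] = a[i] + b[i];
}

/* After: the loop-invariant product is hoisted into x. */
void add_after(int *z, const int *a, const int *b, int n, int m)
{
    int x = n * m;
    for (int i = 0; i < x; i++)
        z[i] = a[i] + b[i];
}
```

The move is legal because neither n nor m changes inside the loop body.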

Induction variable elimination
• Induction variable: loop index.
• Consider loop:
for (i=0; i<N; i++)
for (j=0; j<M; j++)
z[i,j] = b[i,j];
• Rather than recompute i*M+j for each
array in each iteration, share induction
variable between arrays, increment at end
of loop body.
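The sharing described above can be sketched in C as follows. This is an illustration under assumptions: both arrays are stored flat in row-major order, and the names copy_indexed, copy_induct, and k are hypothetical.

```c
/* Straightforward version: computes i*m + j for each array access. */
void copy_indexed(int *z, const int *b, int n, int m)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            z[i * m + j] = b[i * m + j];
}

/* One induction variable k shared by both arrays and
 * incremented at the end of the loop body. */
void copy_induct(int *z, const int *b, int n, int m)
{
    int k = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++) {
            z[k] = b[k];
            k++;
        }
}
```

The multiply in the index expression is replaced by a single increment per iteration.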
Cache analysis
• Loop nest: set of loops, one inside other.
• Perfect loop nest: no conditionals in nest.
• Because loops use large quantities of
data, cache conflicts are common.

Array conflicts in cache

[Figure: a[0,0] at main memory address 1024 and b[0,0] at address 4099 compete for the same place in the cache.]


Array conflicts, cont’d.
• Array elements conflict because they are
in the same line, even if not mapped to
same location.
• Solutions:
• move one array;
• pad array.

Performance optimization hints
• Use registers efficiently.
• Use page mode memory accesses.
• Analyze cache behavior:
• instruction conflicts can be handled by
rewriting code, rescheduling;
• conflicting scalar data can easily be moved;
• conflicting array data can be moved, padded.

Energy/power optimization
• Energy: ability to do work.
• Most important in battery-powered systems.
• Power: energy per unit time.
• Important even in wall-plug systems---power
becomes heat.

Measuring energy consumption

• Execute a small loop, measure current:


I

while (TRUE)
a();

Sources of energy consumption
• Relative energy per operation (Catthoor et
al):
• memory transfer: 33
• external I/O: 10
• SRAM write: 9
• SRAM read: 4.4
• multiply: 3.6
• add: 1

Cache behavior is important
• Energy consumption has a sweet spot as
cache size changes:
• cache too small: program thrashes, burning
energy on external memory accesses;
• cache too large: cache itself burns too much
power.

Cache sweet spot

[Li98] © 1998 IEEE

Optimizing for energy
• First-order optimization:
• high performance = low energy.
• Not many instructions trade speed for
energy.

Optimizing for energy, cont’d.
• Use registers efficiently.
• Identify and eliminate cache conflicts.
• Moderate loop unrolling eliminates some
loop overhead instructions.
• Eliminate pipeline stalls.
• Inlining procedures may help: reduces
linkage, but may increase cache
thrashing.

Efficient loops
• General rules:
• Don’t use function calls.
• Keep loop body small to enable local repeat
(only forward branches).
• Use unsigned integer for loop counter.
• Use <= to test loop counter.
• Make use of compiler---global optimization,
software pipelining.

Single-instruction repeat loop
example
STM #4000h,AR2      ; load pointer to source
STM #100h,AR3       ; load pointer to destination
RPT #(1024-1)
MVDD *AR2+,*AR3+    ; move

Optimizing for program size
• Goal:
• reduce hardware cost of memory;
• reduce power consumption of memory units.
• Two opportunities:
• data;
• instructions.

Data size minimization
• Reuse constants, variables, data buffers in
different parts of code.
• Requires careful verification of correctness.
• Generate data using instructions.

Reducing code size
• Avoid function inlining.
• Choose CPU with compact instructions.
• Use specialized instructions where
possible.

Program validation and testing
• But does it work?
• Concentrate here on functional
verification.
• Major testing strategies:
• Black box doesn’t look at the source code.
• Clear box (white box) does look at the source
code.

Clear-box testing
• Examine the source code to determine whether
it works:
• Can you actually exercise a path?
• Do you get the value you expect along a path?
• Testing procedure:
• Controllability: provide program with inputs.
• Execute.
• Observability: examine outputs.

Controlling and observing
programs

firout = 0.0;
for (j=curr, k=0; j<N; j++, k++)
    firout += buff[j] * c[k];
for (j=0; j<curr; j++, k++)
    firout += buff[j] * c[k];
if (firout > 100.0) firout = 100.0;
if (firout < -100.0) firout = -100.0;

• Controllability:
• Must fill circular buffer with desired N values.
• Other code governs how we access the buffer.
• Observability:
• Want to examine firout before limit testing.
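One way to make firout observable before the limit tests is to return the unclipped sum alongside the limited result. The sketch below assumes a filter length of 4 and a hypothetical function name fir_limited; the extra raw output parameter is an addition for testing, not part of the original code.

```c
#define N 4  /* illustrative filter length */

/* Circular-buffer FIR starting at index curr. The unclipped sum is
 * written to *raw so a test can observe it before limiting. */
float fir_limited(const float buff[N], const float c[N], int curr, float *raw)
{
    float firout = 0.0f;
    int j, k;
    for (j = curr, k = 0; j < N; j++, k++)
        firout += buff[j] * c[k];
    for (j = 0; j < curr; j++, k++)
        firout += buff[j] * c[k];
    *raw = firout;                          /* observation point */
    if (firout > 100.0f) firout = 100.0f;
    if (firout < -100.0f) firout = -100.0f;
    return firout;
}
```

Controllability comes from the caller filling buff and choosing curr directly, so the test does not depend on the code that normally advances the buffer.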
Execution paths and testing
• Paths are important in functional testing
as well as performance analysis.
• In general, an exponential number of
paths through the program.
• Show that some paths dominate others.
• Heuristically limit paths.

Choosing the paths to test

• Possible criteria:
• Execute every statement at least once.
• Execute every branch direction at least once.
• Equivalent for structured programs.
• Not true for gotos.
[Figure: CDFG with one branch marked "not covered".]
Basis paths

• Approximate CDFG
with undirected
graph.
• Undirected graphs
have basis paths:
• All paths are linear
combinations of basis
paths.

Cyclomatic complexity

• Cyclomatic complexity
is a bound on the size
of basis sets:
• e = # edges
• n = # nodes
• p = number of graph
components
• M = e – n + 2p.
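As a worked sketch of the formula, the function below evaluates M for a given graph. The name cyclomatic is hypothetical. For a single if-then-else drawn as four nodes (test, then, else, join) and four edges in one component, M = 4 - 4 + 2 = 2.

```c
/* Cyclomatic complexity M = e - n + 2p of a control flow graph. */
int cyclomatic(int edges, int nodes, int components)
{
    return edges - nodes + 2 * components;
}
```

Straight-line code with two nodes and one edge gives M = 1, matching the single path through it.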

Branch testing
• Heuristic for testing branches.
• Exercise true and false branches of
conditional.
• Exercise every simple condition at least once.

Branch testing example

• Correct:
• if (a || (b >= c)) { printf("OK\n"); }
• Incorrect:
• if (a && (b >= c)) { printf("OK\n"); }
• Test:
• a = F
• (b >= c) = T
• Example:
• Correct: [0 || (3 >= 2)] = T
• Incorrect: [0 && (3 >= 2)] = F

Another branch testing
example

• Correct:
• if ((x == good_pointer) && (x->field1 == 3)) {
printf("got the value\n"); }
• Incorrect:
• if ((x = good_pointer) && (x->field1 == 3)) {
printf("got the value\n"); }
• Incorrect code changes pointer.
• Assignment returns new LHS in C.
• Test that catches error:
• (x != good_pointer) && (x->field1 == 3)

Domain testing

• Heuristic test for linear inequalities.
• Test on each side of the inequality and on its boundary.

Def-use pairs

• Variable def-use:
• Def when value is
assigned (defined).
• Use when used on
right-hand side.
• Exercise each def-use
pair.
• Requires testing
correct path.

Loop testing
• Loops need specialized tests to be tested
efficiently.
• Heuristic testing strategy:
• Skip loop entirely.
• One loop iteration.
• Two loop iterations.
• # iterations much below max.
• n-1, n, n+1 iterations where n is max.

Black-box testing
• Complements clear-box testing.
• May require a large number of tests.
• Tests software in different ways.

Black-box test vectors
• Random tests.
• May weight distribution based on software
specification.
• Regression tests.
• Tests of previous versions, bugs, etc.
• May be clear-box tests of previous versions.

How much testing is enough?
• Exhaustive testing is impractical.
• One important measure of test quality---bugs
escaping into field.
• Good organizations can test software to give
very low field bug report rates.
• Error injection measures test quality:
• Add known bugs.
• Run your tests.
• Determine % injected bugs that are caught.

Program design and analysis
• Software modem.

Theory of operation

• Frequency-shift keying:
• separate frequencies for 0 and 1.
0 1

time

FSK encoding
• Generate waveforms based on current bit:

0110101 bit-controlled
waveform
generator

FSK decoding

A/D converter
zero filter detector 0 bit

one filter detector 1 bit

Transmission scheme

• Send data in 8-bit bytes. Arbitrary spacing
between bytes.
• Byte starts with 0 start bit.
• Receiver measures length of start bit to
synchronize itself to remaining 8 bits.
start (0) bit 1 bit 2 bit 3 ... bit 8

Requirements
Inputs: Analog sound input, reset button.
Outputs: Analog sound output, LED bit display.
Functions: Transmitter: sends data from memory in 8-bit bytes plus start bit. Receiver: automatically detects bytes and reads bits; displays current bit on LED.
Performance: 1200 baud.
Manufacturing cost: Dominated by microprocessor and analog I/O.
Power: Powered by AC.
Physical size/weight: Small desktop object.

Specification

Line-in* Receiver
1 1

sample-in()
input()
bit-out()

Transmitter Line-out*
1 1

bit-in()
output()
sample-out()

System architecture
• Interrupt handlers for samples:
• input and output.
• Transmitter.
• Receiver.

Transmitter

• Waveform generation by table lookup.


• float sine_wave[N_SAMP] = { 0.0, 0.5, 0.866,
1, 0.866, 0.5, 0.0, -0.5, -0.866, -1.0, -0.866,
-0.5, 0};
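A hedged sketch of table-lookup waveform generation for FSK. Everything beyond the sine table values is an assumption: the table is trimmed to one 12-sample period (dropping the repeated end point), the step sizes STEP_ZERO and STEP_ONE are arbitrary, and fsk_next_sample is a hypothetical name. A larger step through the same table yields a higher output frequency.

```c
#define N_SAMP 12  /* one full period, without the repeated end point */

static const float sine_wave[N_SAMP] = {
    0.0f, 0.5f, 0.866f, 1.0f, 0.866f, 0.5f,
    0.0f, -0.5f, -0.866f, -1.0f, -0.866f, -0.5f
};

/* Assumed step sizes: a 1 bit walks the table twice as fast as a 0 bit. */
#define STEP_ZERO 1
#define STEP_ONE  2

/* Return the sample at *phase, then advance *phase for the current bit. */
float fsk_next_sample(int bit, unsigned *phase)
{
    float s = sine_wave[*phase];
    *phase = (*phase + (bit ? STEP_ONE : STEP_ZERO)) % N_SAMP;
    return s;
}
```

An output interrupt handler could call this once per sample period, keeping phase between calls so the waveform stays continuous when the bit changes.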

time

Receiver
• Filters (FIR for simplicity) use circular
buffers to hold data.
• Timer measures bit length.
• State machine recognizes start bits, data
bits.

Hardware platform
• CPU.
• A/D converter.
• D/A converter.
• Timer.

Component design and testing
• Easy to test transmitter and receiver on
host.
• Transmitter can be verified with speaker
outputs.
• Receiver verification tasks:
• start bit recognition;
• data bit recognition.

System integration and testing
• Use loopback mode to test components
against each other.
• Loopback in software or by connecting D/A
and A/D converters.

Processes and operating
systems
• Multiple tasks and multiple processes.
• Specifications of process timing.
• Preemptive real-time operating systems.
• Processes and UML.

Reactive systems
• Respond to external events.
• Engine controller.
• Seat belt monitor.
• Requires real-time response.
• System architecture.
• Program implementation.
• May require a chain reaction among
multiple processors.

Tasks and processes

• A task is a functional description of a
connected set of operations.
• (Task can also mean a collection of
processes.)
• A process is a unique execution of a program.
• Several copies of a program may run
simultaneously or at different times.
• A process has its own state:
• registers;
• memory.
• The operating system manages processes.
Why multiple processes?
• Multiple tasks means multiple processes.
• Processes help with timing complexity:
• multiple rates
• multimedia
• automotive
• asynchronous input
• user interfaces
• communication systems

Multi-rate systems
• Tasks may be synchronous or
asynchronous.
• Synchronous tasks may recur at different
rates.
• Processes run at different rates based on
computational needs of the tasks.

Example: engine control

• Tasks:
• spark control
• crankshaft sensing
• fuel/air mixture
• oxygen sensor
• Kalman filter
[Figure: engine controller.]

Typical rates in engine
controllers
Variable               Full range time (ms)   Update period (ms)
Engine spark timing    300                    2
Throttle               40                     2
Air flow               30                     4
Battery voltage        80                     4
Fuel flow              250                    10
Recycled exhaust gas   500                    25
Status switches        100                    20
Air temperature        Seconds                400
Barometric pressure    Seconds                1000
Spark (dwell)          10                     1
Fuel adjustment        80                     8
Carburetor             500                    25
Mode actuators         100                    100
Real-time systems
• Perform a computation to conform to external
timing constraints.
• Deadline frequency:
• Periodic.
• Aperiodic.
• Deadline type:
• Hard: failure to meet deadline causes system failure.
• Soft: failure to meet deadline causes degraded
response.
• Firm: late response is useless but some late
responses can be tolerated.

Timing specifications on
processes
• Release time: time at which process
becomes ready.
• Deadline: time at which process must
finish.

Release times and deadlines

[Figure: timelines for process P1 showing its deadline and period. An aperiodic process is released by an initiating event; a periodic process is initiated at the start of its period.]
Rate requirements on
processes

• Period: interval between process
activations.
• Rate: reciprocal of period.
• Processes may be initiated faster than a single
copy can finish: several copies of the
process run at once.
[Figure: copies P11, P12, P13, P14 of a process running on CPU 1 through CPU 4, overlapped in time.]

Timing violations
• What happens if a process doesn’t finish
by its deadline?
• Hard deadline: system fails if missed.
• Soft deadline: user may notice, but system
doesn’t necessarily fail.

Example: Space Shuttle
software error
• Space Shuttle’s first launch was delayed
by a software timing error:
• Primary control system PASS and backup
system BFS.
• BFS failed to synchronize with PASS.
• Change to one routine added delay that
threw off start time calculation.
• 1 in 67 chance of timing problem.

Task graphs

• Tasks may have data dependencies: must
execute in certain order.
• Task graph shows data/control
dependencies between processes.
• Task: connected set of processes.
• Task set: One or more tasks.
[Figure: a task set made of task 1 (P1, P2, P3, P4) and task 2 (P5, P6).]
Communication between tasks

• Task graph assumes that all processes in each task
run at the same rate, tasks do not communicate.
• In reality, some amount of inter-task
communication is necessary.
• It’s hard to require immediate response for
multi-rate communication.
[Figure: MPEG system layer connected to MPEG audio and MPEG video.]
Process execution
characteristics
• Process execution time Ti.
• Execution time in absence of preemption.
• Possible time units: seconds, clock cycles.
• Worst-case, best-case execution time may be useful
in some cases.
• Sources of variation:
• Data dependencies.
• Memory system.
• CPU pipeline.

Utilization
• CPU utilization:
• Fraction of the CPU that is doing useful work.
• Often calculated assuming no scheduling
overhead.
• Utilization:
• U = (CPU time for useful work) / (total available CPU time)
    = \frac{\sum_{t_1 \le t \le t_2} T(t)}{t_2 - t_1}
    = T/t

State of a process

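• A process can be in one of three states:
• executing on the CPU;
• ready to run;
• waiting for data.
[Figure: state diagram. ready -> executing (gets CPU); executing -> ready (preempted); executing -> waiting (needs data); waiting -> ready (gets data); waiting -> executing (gets data and CPU); ready -> waiting (needs data).]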

The scheduling problem
• Can we meet all deadlines?
• Must be able to meet deadlines in all cases.
• How much CPU horsepower do we need
to meet our deadlines?

Scheduling feasibility

• Resource constraints make schedulability
analysis NP-hard.
• Must show that the deadlines are met for
all timings of resource requests.
[Figure: processes P1 and P2 sharing an I/O device.]

Simple processor feasibility

• Assume:
• No resource conflicts.
• Constant process execution times.
• Require:
• T \ge \sum_i T_i
• Can’t use more than 100% of the CPU.
[Figure: execution times T1, T2, T3 laid end to end within the period T.]
Hyperperiod
• Hyperperiod: least common multiple
(LCM) of the task periods.
• Must look at the hyperperiod schedule to
find all task interactions.
• Hyperperiod can be very long if task
periods are not chosen carefully.
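The hyperperiod of a set of periods can be computed with Euclid's algorithm, as in the sketch below. The function names gcd_ul and hyperperiod are illustrative, and all periods are assumed to be nonzero integers in the same time unit.

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned long gcd_ul(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hyperperiod: least common multiple of n task periods. */
unsigned long hyperperiod(const unsigned long *periods, int n)
{
    unsigned long h = 1;
    for (int i = 0; i < n; i++)
        h = h / gcd_ul(h, periods[i]) * periods[i];  /* divide first to limit overflow */
    return h;
}
```

Periods that share large common factors keep the result small, while pairwise coprime periods multiply together.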

Hyperperiod example
• Long hyperperiod:
• P1 7 ms.
• P2 11 ms.
• P3 15 ms.
• LCM = 1155 ms.
• Shorter hyperperiod:
• P1 8 ms.
• P2 12 ms.
• P3 16 ms.
• LCM = 48 ms.

Simple processor feasibility
example

• P1 period 1 ms, CPU time 0.1 ms.
• P2 period 1 ms, CPU time 0.2 ms.
• P3 period 5 ms, CPU time 0.3 ms.

LCM = 5.00E-03 (times in seconds)

                 period     CPU time   CPU time/LCM
P1               1.00E-03   1.00E-04   5.00E-04
P2               1.00E-03   2.00E-04   1.00E-03
P3               5.00E-03   3.00E-04   3.00E-04
total CPU/LCM                          1.80E-03
utilization                            3.60E-01
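The table's arithmetic can be reproduced with a short sketch. The struct proc type and the function names are hypothetical, and times are kept as integer microseconds (an assumption made to avoid rounding in the totals).

```c
/* A periodic process; times in microseconds. */
struct proc {
    unsigned long period_us;
    unsigned long cpu_us;
};

/* Total CPU time demanded over one hyperperiod of length lcm_us. */
unsigned long cpu_per_hyperperiod(const struct proc *p, int n,
                                  unsigned long lcm_us)
{
    unsigned long total = 0;
    for (int i = 0; i < n; i++)
        total += (lcm_us / p[i].period_us) * p[i].cpu_us;
    return total;
}

/* Utilization as a fraction; a single CPU needs a value <= 1.0. */
double utilization(const struct proc *p, int n, unsigned long lcm_us)
{
    return (double)cpu_per_hyperperiod(p, n, lcm_us) / (double)lcm_us;
}
```

With the three processes above and a 5000 microsecond hyperperiod, the total is 1800 microseconds of CPU time, a utilization of 0.36.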

Cyclostatic/TDMA

• Schedule in time slots.
• Same process activation irrespective of
workload.
• Time slots may be equal size or unequal.
[Figure: time slots T1, T2, T3 repeated in each period P.]
TDMA assumptions

• Schedule based on least common multiple (LCM) of the process periods.
• Trivial scheduler -> very small scheduling overhead.
[Figure: within one LCM period (PLCM), P1 runs three times and P2 runs twice.]
TDMA schedulability
• Always same CPU utilization (assuming
constant process execution times).
• Can’t handle unexpected loads.
• Must schedule a time slot for aperiodic
events.

TDMA schedulability example

• TDMA period = 10 ms.
• P1 CPU time 1 ms.
• P2 CPU time 3 ms.
• P3 CPU time 2 ms.
• P4 CPU time 2 ms.

TDMA period: 1.00E-02 s

Process   CPU time (s)
P1        1.00E-03
P2        3.00E-03
P3        2.00E-03
P4        2.00E-03
Total     8.00E-03

Utilization: 8.00E-01

Round-robin

• Schedule process only if ready.
• Always test processes in the same order.
• Variations:
• Constant system period.
• Start round-robin again after finishing a round.
[Figure: two periods P; T1, T2, T3 run in the first, only T2 and T3 in the second.]
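
One round of this policy can be sketched in C as follows. The process table, the ready flag, and the example bodies t1 to t3 are hypothetical names introduced for illustration; how a process becomes ready again (timer, I/O event) is left as an assumption.

```c
/* Hypothetical process table entry: a ready flag and a subroutine body. */
struct rr_proc {
    int ready;            /* nonzero if the process has work to do */
    void (*run)(void);    /* process body, called as a subroutine  */
};

/* Example process bodies that only count how often they ran. */
int run_count[3];
void t1(void) { run_count[0]++; }
void t2(void) { run_count[1]++; }
void t3(void) { run_count[2]++; }

/* One round: test every process in the same fixed order and run it only
 * if it is ready. Returns how many processes ran, so the caller can tell
 * how many slots in the round went unused. */
int round_robin_round(struct rr_proc *table, int n)
{
    int ran = 0;
    for (int i = 0; i < n; i++) {
        if (table[i].ready) {
            table[i].run();
            table[i].ready = 0;  /* assumption: set again by a timer or I/O event */
            ran++;
        }
    }
    return ran;
}
```

The "constant system period" variation would call round_robin_round() from a periodic timer; the other variation would call it again as soon as it returns.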

Round-robin assumptions
• Schedule based on least common multiple
(LCM) of the process periods.
• Best done with equal time slots for
processes.
• Simple scheduler -> low scheduling
overhead.
• Can be implemented in hardware.

Round-robin schedulability
• Can bound maximum CPU load.
• May leave unused CPU cycles.
• Can be adapted to handle unexpected
load.
• Use time slots at end of period.

Schedulability and overhead
• The scheduling process consumes CPU
time.
• Not all CPU time is available for processes.
• Scheduling overhead must be taken into
account for exact schedule.
• May be ignored if it is a small fraction of total
execution time.

Running periodic processes
• Need code to control execution of
processes.
• Simplest implementation: process =
subroutine.

while loop implementation

• Simplest implementation has one loop.
• No control over execution timing.

while (TRUE) {
    p1();
    p2();
}

Timed loop implementation

• Encapsulate the set of all processes in a single function that implements the task set.
• Use timer to control execution of the task.
• No control over timing of individual processes.

void pall(){
    p1();
    p2();
}
Multiple timers implementation

• Each task has its own function.
• Each task has its own timer.
• May not have enough timers to implement all the rates.

void pA(){ /* rate A */
    p1();
    p3();
}
void pB(){ /* rate B */
    p2();
    p4();
    p5();
}

Timer + counter
implementation

• Use a software count to divide the timer.
• Only works for clean multiples of the timer period.

int p2count = 0;
void pall(){
    p1();
    if (p2count >= 2) {
        p2();
        p2count = 0;
    }
    else p2count++;
    p3();
}
Implementing processes
• All of these implementations are
inadequate.
• Need better control over timing.
• Need a better mechanism than
subroutines.

Processes and operating
systems
• Operating systems.

© 2000 Morgan Kaufmann, Overheads for Computers as Components
Operating systems
• The operating system controls resources:
• who gets the CPU;
• when I/O takes place;
• how much memory is allocated.
• The most important resource is the CPU
itself.
• CPU access controlled by the scheduler.

Process state

• A process can be in one of three states:
• executing on the CPU;
• ready to run;
• waiting for data.
[Figure: process state diagram with states executing, ready, and waiting; transitions labeled gets CPU, preempted, needs data, gets data, and gets data and CPU.]

Operating system structure
• OS needs to keep track of:
• process priorities;
• scheduling state;
• process activation record.
• Processes may be created:
• statically before system starts;
• dynamically during execution.

Embedded vs. general-purpose
scheduling
• Workstations try to avoid starving
processes of CPU access.
• Fairness = access to CPU.
• Embedded systems must meet deadlines.
• Low-priority processes may not run for a long
time.

Priority-driven scheduling
• Each process has a priority.
• CPU goes to highest-priority process that
is ready.
• Priorities determine scheduling policy:
• fixed priority;
• time-varying priorities.

Priority-driven scheduling
example
• Rules:
• each process has a fixed priority (1 highest);
• highest-priority ready process gets CPU;
• process continues until done or preempted by a higher-priority process.
• Processes
• P1: priority 1, execution time 10
• P2: priority 2, execution time 30
• P3: priority 3, execution time 20

Priority-driven scheduling
example

• P2 ready at t=0, P1 ready at t=15, P3 ready at t=18.
[Figure: timeline from 0 to 60 showing the execution order P2, P1, P2, P3.]
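
The timeline can be checked with a small simulation. This is a sketch, assuming a higher-priority process preempts a running lower-priority one (which is what the execution order P2, P1, P2, P3 implies) and that time advances in whole units; the struct layout and the name simulate are illustrative.

```c
struct proc {
    int priority;    /* 1 is highest */
    int exec_time;   /* total CPU time needed; assumed greater than 0 */
    int ready_time;  /* time at which the process becomes ready */
    int remaining;   /* CPU time still needed (set by simulate) */
    int finish;      /* completion time (set by simulate) */
};

/* Simulate preemptive fixed-priority scheduling one time unit at a time:
 * in every unit the highest-priority ready process gets the CPU. */
void simulate(struct proc *p, int n)
{
    int done = 0;
    for (int i = 0; i < n; i++) {
        p[i].remaining = p[i].exec_time;
        p[i].finish = -1;
    }
    for (int t = 0; done < n; t++) {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (p[i].remaining > 0 && p[i].ready_time <= t &&
                (best < 0 || p[i].priority < p[best].priority)) {
                best = i;
            }
        }
        if (best < 0) {
            continue;                 /* nothing ready: CPU idles this unit */
        }
        if (--p[best].remaining == 0) {
            p[best].finish = t + 1;   /* finished at the end of this unit */
            done++;
        }
    }
}
```

For the three processes of the example this gives P2 running from 0 to 15, P1 from 15 to 25, P2 again from 25 to 40, and P3 from 40 to 60.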
Process initiation disciplines
• Periodic process: executes on (almost)
every period.
• Aperiodic process: executes on demand.
• Analyzing aperiodic process sets is harder: must consider worst-case combinations of process activations.

Timing requirements on
processes
• Period: interval between process
activations.
• Initiation rate: reciprocal of period.
• Initiation time: time at which process
becomes ready.
• Deadline: time at which process must
finish.

Timing violations
• What happens if a process doesn’t finish
by its deadline?
• Hard deadline: system fails if missed.
• Soft deadline: user may notice, but system
doesn’t necessarily fail.

Example: Space Shuttle
software error
• Space Shuttle’s first launch was delayed
by a software timing error:
• Primary control system PASS and backup
system BFS.
• BFS failed to synchronize with PASS.
• Change to one routine added delay that
threw off start time calculation.
• 1 in 67 chance of timing problem.

Interprocess communication
• Interprocess communication (IPC): OS
provides mechanisms so that processes
can pass data.
• Two types of semantics:
• blocking: sending process waits for response;
• non-blocking: sending process continues.

IPC styles
• Shared memory:
• processes have some memory in common;
• must cooperate to avoid destroying/missing
messages.
• Message passing:
• processes send messages along a
communication channel---no common
address space.

Shared memory

• Shared memory on a bus:
[Figure: CPU 1 and CPU 2 connected to a shared memory over a bus.]

Race condition in shared
memory
• Problem when two CPUs try to write the
same location:
• CPU 1 reads flag and sees 0.
• CPU 2 reads flag and sees 0.
• CPU 1 sets flag to one and writes location.
• CPU 2 sets flag to one and overwrites
location.

Atomic test-and-set
• Problem can be solved with an atomic
test-and-set:
• single bus operation reads memory location,
tests it, writes it.
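• ARM test-and-set provided by SWP:
        ADR r0,SEMAPHORE    ; get semaphore address
        MOV r1,#1           ; value to store: 1 = taken
GETFLAG SWP r1,r1,[r0]      ; atomically swap r1 with the semaphore
        CMP r1,#0           ; was the old value 0 (free)?
        BNE GETFLAG         ; no: already taken, try again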
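
The same spin-on-test-and-set idea can be expressed portably in C. This is a sketch that swaps in the C11 <stdatomic.h> atomic_flag type instead of the ARM SWP instruction; the names semaphore_flag, try_acquire, acquire, and release are illustrative.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Clear means free, set means taken. */
atomic_flag semaphore_flag = ATOMIC_FLAG_INIT;

/* Try once: atomically set the flag and report whether it was free. */
bool try_acquire(void)
{
    return !atomic_flag_test_and_set(&semaphore_flag);
}

/* Spin until the flag is acquired, like the GETFLAG loop. */
void acquire(void)
{
    while (atomic_flag_test_and_set(&semaphore_flag)) {
        /* busy-wait: another CPU or process holds the flag */
    }
}

/* Give the flag back so that a waiting acquirer can proceed. */
void release(void)
{
    atomic_flag_clear(&semaphore_flag);
}
```

Because the read and the write happen in one indivisible operation, two CPUs can no longer both observe the flag as 0 and both proceed.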

Critical regions
• Critical region: section of code that cannot
be interrupted by another process.
• Examples:
• writing shared memory;
• accessing I/O device.

Semaphores
• Semaphore: OS primitive for controlling
access to critical regions.
• Protocol:
• Get access to semaphore with P().
• Perform critical region operations.
• Release semaphore with V().

Message passing

• Message passing on a network:
[Figure: CPU 1 and CPU 2, each with a message buffer, exchanging a message over the network.]

Process data dependencies

• One process may not be able to start until another finishes.
• Data dependencies defined in a task graph.
• All processes in one task run at the same rate.
[Figure: task graph with processes P1, P2, P3, and P4.]
Other operating system
functions
• Date/time.
• File system.
• Networking.
• Security.

Processes and operating
systems
• Scheduling policies:
• RMS;
• EDF.
• Scheduling modeling assumptions.

Metrics
• How do we evaluate a scheduling policy:
• Ability to satisfy all deadlines.
• CPU utilization---percentage of time devoted
to useful work.
• Scheduling overhead---time required to make
scheduling decision.

Rate monotonic scheduling
• RMS (Liu and Layland): widely-used,
analyzable scheduling policy.
• Analysis is known as Rate Monotonic
Analysis (RMA).

RMA model
• All processes run on a single CPU.
• Zero context switch time.
• No data dependencies between processes.
• Process execution time is constant.
• Deadline is at end of period.
• Highest-priority ready process runs.

Process parameters

• $T_i$ is computation time of process i; $\tau_i$ is period of process i.
[Figure: timeline for process Pi showing its period $\tau_i$ and computation time $T_i$.]

Rate-monotonic analysis
• Response time: time required to finish
process.
• Critical instant: scheduling state that gives
worst response time.
• Critical instant occurs when all higher-
priority processes are ready to execute.

Critical instant
[Figure: timeline of the interfering processes P1, P2, and P3, marking the critical instant for P4.]

RMS priorities
• Optimal (fixed) priority assignment:
• shortest-period process gets highest priority;
• priority inversely proportional to period;
• break ties arbitrarily.
• No fixed-priority scheme does better.
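
The priority assignment rule amounts to sorting the processes by period. The following is a minimal sketch using the standard qsort(); the struct rms_task type and the convention that priority 1 is highest are assumptions made for the example.

```c
#include <stdlib.h>

struct rms_task {
    int id;
    unsigned period;   /* shorter period -> higher priority */
    int priority;      /* 1 is highest; filled in below */
};

/* Order tasks by increasing period. */
static int by_period(const void *a, const void *b)
{
    const struct rms_task *x = a;
    const struct rms_task *y = b;
    return (x->period > y->period) - (x->period < y->period);
}

/* Sort by period and number the tasks 1..n. Tasks with equal periods
 * end up in an unspecified relative order, which the policy allows
 * (ties may be broken arbitrarily). */
void assign_rms_priorities(struct rms_task *tasks, int n)
{
    qsort(tasks, (size_t)n, sizeof tasks[0], by_period);
    for (int i = 0; i < n; i++) {
        tasks[i].priority = i + 1;
    }
}
```

Since the priorities are fixed, this runs once when the task set is configured rather than at every scheduling decision.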

RMS example

[Figure: RMS example timeline from 0 to 10. P1 runs three times, once per P1 period; P2 runs once within its longer period.]
RMS CPU utilization
• Utilization for n processes is
• $U = \sum_{i=1}^{n} T_i / \tau_i$
• As number of tasks approaches infinity, maximum utilization approaches 69%.
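
The 69% figure is the limit of the Liu and Layland utilization bound. Written out, with $T_i$ the computation time and $\tau_i$ the period of process i, a set of n processes is guaranteed schedulable under RMS if:

```latex
U = \sum_{i=1}^{n} \frac{T_i}{\tau_i} \le n\left(2^{1/n} - 1\right),
\qquad
\lim_{n \to \infty} n\left(2^{1/n} - 1\right) = \ln 2 \approx 0.69
```

For n = 1 the bound is 1.0, for n = 2 it is about 0.83, and for n = 3 about 0.78. The test is sufficient but not necessary: a task set above the bound may still be schedulable.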

RMS CPU utilization, cont’d.
• RMS cannot use 100% of CPU, even with
zero context switch overhead.
• Must keep idle cycles available to handle
worst-case scenario.
• However, RMS guarantees all processes will always meet their deadlines as long as utilization stays within the RMS bound.

RMS implementation
• Efficient implementation:
• scan processes;
• choose highest-priority active process.

Earliest-deadline-first
scheduling
• EDF: dynamic priority scheduling scheme.
• Process closest to its deadline has highest
priority.
• Requires recalculating process priorities at every timer interrupt.

EDF analysis
• EDF can use 100% of CPU.
• But EDF may fail to meet a deadline.

EDF implementation
• On each timer interrupt:
• compute time to deadline;
• choose process closest to deadline.
• Generally considered too expensive to use
in practice.
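
The selection step run on each timer interrupt can be sketched as a linear scan. The struct edf_proc type and the use of absolute deadlines in timer ticks are assumptions; comparing absolute deadlines at the same instant picks the same process as comparing times to deadline.

```c
struct edf_proc {
    int ready;               /* nonzero if the process can run     */
    unsigned long deadline;  /* absolute deadline, in timer ticks  */
};

/* Return the index of the ready process with the earliest deadline,
 * or -1 if no process is ready. Ties go to the lowest index. */
int edf_select(const struct edf_proc *p, int n)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (p[i].ready && (best < 0 || p[i].deadline < p[best].deadline)) {
            best = i;
        }
    }
    return best;
}
```

The scan is linear in the number of processes and must be repeated whenever deadlines change, which is the source of the extra overhead compared with fixed priorities.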

Fixing scheduling problems
• What if your set of processes is
unschedulable?
• Change deadlines in requirements.
• Reduce execution times of processes.
• Get a faster CPU.

Priority inversion
• Priority inversion: low-priority process
keeps high-priority process from running.
• Improper use of system resources can
cause scheduling problems:
• Low-priority process grabs I/O device.
• High-priority process needs I/O device, but can’t get it until low-priority process is done.
• Can cause deadlock.

Solving priority inversion
• Give priorities to system resources.
• Have process inherit the priority of a
resource that it requests.
• Low-priority process inherits priority of device
if higher.

Data dependencies

• Data dependencies allow us to improve utilization.
• Restrict combination of processes that can run simultaneously.
• P1 and P2 can’t run simultaneously.
[Figure: task graph in which P2 depends on P1.]

Context-switching time
• Non-zero context switch time can push
limits of a tight schedule.
• Hard to calculate effects---depends on
order of context switches.
• In practice, OS context switch overhead is small (hundreds of clock cycles) relative to many common task periods (ms to μs).

Processes and operating
systems
• Interprocess communication.
• Operating system performance.
• Power management.

Interprocess communication
• OS provides interprocess communication
mechanisms:
• various efficiencies;
• communication power.

Signals in UML

• More general than Unix signal: may carry arbitrary data.
[Figure: UML diagram in which class someClass, with operation sigbehavior(), has a <<send>> dependency on the <<signal>> aSig, which carries p : integer.]

Evaluating RTOS performance
• Simplifying assumptions:
• Context switch costs no CPU time.
• We know the exact execution time of
processes.
• WCET/BCET don’t depend on context
switches.

Scheduling and context switch
overhead

Process   Execution time   Deadline
P1        3                5
P2        3                10

With context switch overhead of 1, no feasible schedule:
2TP1 + TP2 = 2*(1+3) + (1+3) = 12, which exceeds P2's deadline of 10.

Process execution time
• Process execution time is not constant.
• Extra CPU time can be good.
• Extra CPU time can also be bad:
• Next process runs earlier, causing new
preemption.

Processes and caches
• Processes can cause additional caching
problems.
• Even if individual processes are well-behaved,
processes may interfere with each other.
• Worst-case execution time with bad
behavior is usually much worse than
execution time with good cache behavior.

Effects of scheduling on the
cache

Process   WCET   Avg. CPU time
P1        8      6
P2        4      3
P3        4      3

[Figure: Schedule 1 (LRU cache).]
[Figure: Schedule 2 (half of cache reserved for P1).]

Power optimization
• Power management: determining how
system resources are scheduled/used to
control power consumption.
• OS can manage for power just as it
manages for time.
• OS reduces power by shutting down units.
• May have partial shutdown modes.

Power management and
performance
• Power management and performance are
often at odds.
• Entering power-down mode consumes
• energy,
• time.
• Leaving power-down mode consumes
• energy,
• time.

Simple power management
policies
• Request-driven: power up once request is
received. Adds delay to response.
• Predictive shutdown: try to predict how
long you have before next request.
• May start up in advance of request in
anticipation of a new request.
• If you predict wrong, you will incur additional
delay while starting up.

Probabilistic shutdown
• Assume service requests are probabilistic.
• Optimize expected values:
• power consumption;
• response time.
• Simple probabilistic: shut down after time
Ton, turn back on after waiting for Toff.

Advanced Configuration and
Power Interface
• ACPI: open standard for power
management services.

[Figure: ACPI layers, top to bottom: applications; OS kernel with power management; device drivers; ACPI BIOS; hardware platform.]
ACPI global power states
• G3: mechanical off
• G2: soft off
• G1: sleeping state
• S1: low wake-up latency with no loss of context
• S2: low latency with loss of CPU/cache state
• S3: low latency with loss of all state except memory
• S4: lowest-power state with all devices off
• G0: working state

Processes and operating
systems
• Telephone answering machine.

Theory of operation

• Compress audio using adaptive differential pulse code modulation (ADPCM).
[Figure: analog waveform versus time, and the corresponding ADPCM code stream versus time with code values 3, 2, 1, -1, -2, -3.]
ADPCM coding
• Coded in a small alphabet with positive
and negative values.
• {-3,-2,-1,1,2,3}
• Minimize error between predicted value
and actual signal value.
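
The coding idea can be illustrated with a deliberately simplified sketch. It is not full ADPCM: the step size STEP is fixed rather than adaptive, and the predictor is just the previously reconstructed value. STEP, quantize, encode_sample, and decode_sample are all hypothetical names chosen for the example.

```c
#define STEP 4   /* hypothetical fixed step size; real ADPCM adapts it */

/* Map a prediction error onto the alphabet {-3,-2,-1,1,2,3}. */
int quantize(int error)
{
    int mag = (error < 0 ? -error : error) / STEP;
    if (mag < 1) mag = 1;   /* the alphabet has no zero code */
    if (mag > 3) mag = 3;   /* clamp to the largest code     */
    return error < 0 ? -mag : mag;
}

/* Encoder: code the difference between the sample and the predicted
 * value, then update the prediction exactly as the decoder will. */
int encode_sample(int sample, int *predicted)
{
    int code = quantize(sample - *predicted);
    *predicted += code * STEP;
    return code;
}

/* Decoder: rebuild the signal by accumulating the decoded steps. */
int decode_sample(int code, int *predicted)
{
    *predicted += code * STEP;
    return *predicted;
}
```

Because the encoder tracks the decoder's reconstruction rather than the true signal, quantization errors do not accumulate between the two ends.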

ADPCM compression system

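[Figure: ADPCM compression system. Encoder: samples enter a summing node (Σ) followed by a quantizer, with an inverse quantizer and integrator in the feedback path. Decoder: inverse quantizer followed by an integrator.]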
Telephone system terms
• Subscriber line: line to phone.
• Central office: telephone switching
system.
• Off-hook: phone active.
• On-hook: phone inactive.

Real and simulated subscriber
line
• Real subscriber line:
• 90V RMS ringing signal;
• companded analog signals;
• lightning protection, etc.
• Simulated subscriber line:
• microphone input;
• speaker output;
• switches for ring, off-hook, etc.

Requirements
Inputs                 Telephone: voice samples, ring.
                       User interface: microphone, play messages button, record OGM button.
Outputs                Telephone: voice samples, on-hook/off-hook command.
                       User interface: speaker, # messages indicator, message light.
Functions              Default mode: detects ring, signals off-hook, plays OGM, records ICM.
                       Playback: play all messages, wait 5 seconds for new playback.
                       OGM editing: OGM up to 10 sec.
Performance            About 30 minutes voice (@ 8 kHz).
Manufacturing cost     Consumer product range ($50).
Power                  AC plug.
Physical size/weight   Comparable to desk phone.
Comments on analysis
• DRAM requirement influenced by DRAM
price.
• Details of user interface protocol could be
tested on a PC-based prototype.

Answering machine class
diagram
[Figure: class diagram relating Controls, Record, and Playback to Microphone*, Line-in*, Line-out*, Buttons*, Lights, Speaker*, Outgoing-message, and Incoming-message, with 1 and * multiplicities.]

Physical interface classes

• Microphone*: sample()
• Line-in*: sample(), ring-indicator()
• Line-out*: sample(), pick-up()
• Buttons*: record-OGM, play
• Lights*: messages, num-messages
• Speaker*: sample()

Message classes

• Message: length, start-adrs, next-msg, samples
• Incoming-message (derived from Message): msg-time
• Outgoing-message (derived from Message): length=30 sec

Operational classes

• Controls: operate()
• Record: record-msg()
• Playback: playback-msg()

Software components
• Front panel module.
• Speaker module.
• Telephone line module.
• Telephone input and output modules.
• Compression module.
• Decompression module.

Controls activate behavior

[Figure: flowchart. Compute buttons and line activations, then branch on the activation: play OGM, record OGM, play ICM, erase, or answer. The answer path plays the OGM, waits for timeout, allocates an ICM, and records the ICM.]
Record-msg/playback-msg
behaviors

• record-msg: set nextadrs = 0; repeat msg.samples[nextadrs] = sample(source) and nextadrs++ until End(source) is true.
• playback-msg: set nextadrs = 0; repeat speaker.samples() = msg.samples[nextadrs] and nextadrs++ until nextadrs = msg.length.
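
The two behaviors translate almost directly into C. This is a sketch under simplifying assumptions: the source is modeled as an array, the speaker as a callback, and the buffer size MAX_SAMPLES and the stub_speaker helper are hypothetical. Unlike the flowcharts, the loops test before the first sample so that an empty source or message is handled.

```c
#include <stddef.h>

#define MAX_SAMPLES 16   /* hypothetical buffer size for this sketch */

struct message {
    size_t length;               /* number of valid samples */
    int samples[MAX_SAMPLES];
};

/* record-msg: store samples until the source ends or the buffer is full. */
void record_msg(struct message *msg, const int *source, size_t source_len)
{
    size_t nextadrs = 0;
    while (nextadrs < source_len && nextadrs < MAX_SAMPLES) {
        msg->samples[nextadrs] = source[nextadrs];
        nextadrs++;
    }
    msg->length = nextadrs;
}

/* playback-msg: send every stored sample to the speaker in order. */
void playback_msg(const struct message *msg, void (*speaker)(int sample))
{
    for (size_t nextadrs = 0; nextadrs < msg->length; nextadrs++) {
        speaker(msg->samples[nextadrs]);
    }
}

/* Stand-in for the speaker: remembers what it was asked to play. */
int spoken[MAX_SAMPLES];
size_t spoken_count;
void stub_speaker(int sample)
{
    if (spoken_count < MAX_SAMPLES) {
        spoken[spoken_count++] = sample;
    }
}
```

The buffer-full check in record_msg corresponds to the memory overflow condition that component testing needs to exercise.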
Hardware platform
• CPU.
• Memory.
• Front panel.
• 2 A/Ds:
• subscriber line, microphone.
• 2 D/As:
• subscriber line, speaker.

Component design and testing
• Must test performance as well as functionality.
• Compression time shouldn’t dominate other
tasks.
• Test for error conditions:
• memory overflow;
• try to delete empty message set, etc.

System integration and testing
• Can test partial integration on host
platform; full testing requires integration
on target platform.
• Simulate phone line for tests:
• it’s legal;
• easier to produce test conditions.

Multiprocessors
• Why multiprocessors?
• CPUs and accelerators.
• Multiprocessor performance analysis.

Why multiprocessors?

• Better cost/performance.
• Match each CPU to its tasks or use custom logic (smaller, cheaper).
• CPU cost is a non-linear function of performance.
[Figure: cost versus performance curve.]
Why multiprocessors? cont’d.
• Better real-time performance.
• Put time-critical functions on less-loaded processing elements.
• Remember RMS utilization: extra CPU cycles must be reserved to meet deadlines.
[Figure: cost versus performance curve marking the deadline and the deadline with RMS overhead.]
Why multiprocessors? cont’d.

• Using specialized processors or custom logic saves power.
• Desktop uniprocessors are not power-efficient enough for battery-powered applications.
[Figure from [Aus04], © 2004 IEEE Computer Society.]

Why multiprocessors? cont’d.
• Good for processing I/O in real-time.
• May consume less energy.
• May be better at streaming data.
• May not be able to do all the work on
even the largest single CPU.

Accelerated systems
• Use additional computational unit
dedicated to some functions?
• Hardwired logic.
• Extra CPU.
• Hardware/software co-design: joint design
of hardware and software architectures.

Accelerated system
architecture

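[Figure: accelerated system architecture. The CPU sends requests and data to the accelerator and receives results; CPU, accelerator, memory, and I/O are connected by the data path.]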

Accelerator vs. co-processor
• A co-processor executes instructions.
• Instructions are dispatched by the CPU.
• An accelerator appears as a device on the
bus.
• The accelerator is controlled by registers.

Accelerator implementations
• Application-specific integrated circuit.
• Field-programmable gate array (FPGA).
• Standard component.
• Example: graphics processor.

System design tasks
• Design a heterogeneous multiprocessor
architecture.
• Processing element (PE): CPU, accelerator,
etc.
• Program the system.

Accelerated system design
• First, determine that the system really
needs to be accelerated.
• How much faster is the accelerator on the
core function?
• How much data transfer overhead?
• Design the accelerator itself.
• Design CPU interface to accelerator.

Accelerated system platforms
• Several off-the-shelf boards are available
for acceleration in PCs:
• FPGA-based core;
• PC bus interface.

Accelerator/CPU interface
• Accelerator registers provide control
registers for CPU.
• Data registers can be used for small data
objects.
• Accelerator may include special-purpose
read/write logic.
• Especially valuable for large data transfers.

System integration and
debugging
• Try to debug the CPU/accelerator
interface separately from the accelerator
core.
• Build scaffolding to test the accelerator.
• Hardware/software co-simulation can be
useful.

Caching problems
• Main memory provides the primary data
transfer mechanism to the accelerator.
• Programs must ensure that caching does
not invalidate main memory data.
• CPU reads location S.
• Accelerator writes location S.
• CPU writes location S.
BAD

Synchronization
• As with cache, main memory writes to
shared memory may cause invalidation:
• CPU reads S.
• Accelerator writes S.
• CPU reads S.

Multiprocessor performance
analysis
• Effects of parallelism (and lack of it):
• Processes.
• CPU and bus.
• Multiple processors.

Accelerator speedup
• Critical parameter is speedup: how much
faster is the system with the accelerator?
• Must take into account:
• Accelerator execution time.
• Data transfer time.
• Synchronization with the master CPU.

Accelerator execution time
• Total accelerator execution time:
• $t_{accel} = t_{in} + t_x + t_{out}$
• $t_{in}$: data input; $t_x$: accelerated computation; $t_{out}$: data output.

Accelerator speedup
• Assume loop is executed n times.
• Compare accelerated system to non-accelerated system:
• $S = n(t_{CPU} - t_{accel})$
• $= n[t_{CPU} - (t_{in} + t_x + t_{out})]$
• $t_{CPU}$: execution time on CPU.
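
The speedup formula is easy to evaluate when exploring design options. This is a small sketch with illustrative function names; all times are assumed to be in the same unit.

```c
/* Time for one accelerated iteration: data in + computation + data out. */
double accel_time(double t_in, double t_x, double t_out)
{
    return t_in + t_x + t_out;
}

/* S = n * (t_CPU - t_accel): total time saved over n loop iterations.
 * A negative result means the accelerator makes the system slower. */
double accel_speedup(int n, double t_cpu,
                     double t_in, double t_x, double t_out)
{
    return n * (t_cpu - accel_time(t_in, t_x, t_out));
}
```

For example, with n = 10, a CPU time of 50 per iteration, and accelerator times of 5, 20, and 5, the saving is 10 * (50 - 30) = 200 time units. If the CPU time per iteration were only 20, the data transfer overhead would make the result negative.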

Single- vs. multi-threaded
• One critical factor is available parallelism:
• single-threaded/blocking: CPU waits for accelerator;
• multithreaded/non-blocking: CPU continues to
execute along with accelerator.
• To multithread, CPU must have useful work to
do.
• But software must also support multithreading.

Total execution time

[Figure: two schedules of processes P1, P2, P3, P4 and accelerator operation A1, one single-threaded and one multi-threaded.]
Execution time analysis

• Single-threaded:
• Count execution time of all component processes.
• Multi-threaded:
• Find longest path through execution.
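
Both analyses can be sketched in C. The single-threaded time is a plain sum; the multi-threaded time is the longest path through the dependency graph. The adjacency-matrix representation, the MAXN limit, and the requirement that nodes be numbered in topological order are assumptions made to keep the example short.

```c
#define MAXN 8   /* hypothetical upper limit on graph size */

/* Single-threaded: every process runs in sequence, so add them all up. */
int single_threaded_time(const int *t, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += t[i];
    }
    return total;
}

/* Multi-threaded: length of the longest path through the task graph.
 * Nodes must be numbered in topological order; dep[j][i] != 0 means
 * node i cannot start until node j has finished. */
int multi_threaded_time(const int *t, int n, int dep[MAXN][MAXN])
{
    int finish[MAXN];
    int longest = 0;
    for (int i = 0; i < n; i++) {
        int start = 0;
        for (int j = 0; j < i; j++) {
            if (dep[j][i] && finish[j] > start) {
                start = finish[j];   /* wait for the latest predecessor */
            }
        }
        finish[i] = start + t[i];
        if (finish[i] > longest) {
            longest = finish[i];
        }
    }
    return longest;
}
```

As a made-up example (not taken from the slides' figure), take times 2, 3, 5, 1, 2 for P1, P2, A1, P3, P4 with dependencies P1 -> P2, P2 -> A1, P2 -> P3, A1 -> P4, and P3 -> P4. The single-threaded time is 13, while the multi-threaded time is 12 because P3 overlaps with A1.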

Sources of parallelism
• Overlap I/O and accelerator computation.
• Perform operations in batches, read in second
batch of data while computing on first batch.
• Find other work to do on the CPU.
• May reschedule operations to move work
after accelerator initiation.

Data input/output times
• Bus transactions include:
• flushing register/cache values to main
memory;
• time required for CPU to set up transaction;
• overhead of data transfers by bus packets,
handshaking, etc.

Scheduling and allocation
• Must:
• schedule operations in time;
• allocate computations to processing
elements.
• Scheduling and allocation interact, but
separating them helps.
• Alternatively allocate, then schedule.

Example: scheduling and
allocation

[Figure: task graph in which P1 sends d1 and P2 sends d2 to P3; hardware platform with processing elements M1 and M2.]

First design

• Allocate P1, P2 -> M1; P3 -> M2.

M1 P1 P1C P2 P2C

M2 P3

time

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 625
Second design

• Allocate P1 -> M1; P2, P3 -> M2:

M1 P1 P1C

M2 P2 P3

time

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 626
Example: adjusting messages
to reduce delay

• Task graph: P1 (execution time 3) sends d1 to P3; P2 (execution time 3) sends d2 to P3; P3 has execution time 4.
• Network: P1 allocated to M1, P2 to M2, P3 to M3.
• Transmission time = 4.
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 627
Initial schedule

M1 P1

M2 P2

M3 P3

network d1 d2
Time = 15

0 5 10 15 20 time
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 628
New design
• Modify P3:
• reads one packet of d1, one packet of d2
• computes partial result
• continues to next packet

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 629
New schedule

M1 P1

M2 P2

M3
P3 P3 P3 P3
network d1d2d1d2d1d2d1d2
Time = 12

0 5 10 15 20 time
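
The two finish times (15 and 12) can be reproduced with a small calculation. A sketch under stated assumptions: one shared network carries one message at a time, each message is split into `npkt` equal packets, and P3's work divides evenly across packet pairs. The function names are hypothetical.

```c
#include <assert.h>

static int max2(int a, int b) { return a > b ? a : b; }

/* Initial design: P3 starts only after both whole messages have arrived. */
int initial_finish(int t_p1, int t_p2, int t_msg, int t_p3)
{
    int d1_done = t_p1 + t_msg;                 /* d1 follows P1 on the network   */
    int d2_done = max2(d1_done, t_p2) + t_msg;  /* d2 waits for P2 and the network */
    return d2_done + t_p3;
}

/* New design: P3 reads one packet of d1 and one of d2, computes a partial
 * result, then continues with the next packet pair. */
int pipelined_finish(int t_p1, int t_p2, int t_msg, int t_p3, int npkt)
{
    assert(npkt > 0 && t_msg % npkt == 0 && t_p3 % npkt == 0);
    int t_pkt = t_msg / npkt;      /* network time per packet            */
    int t_part = t_p3 / npkt;      /* P3 time per partial result         */
    int net = max2(t_p1, t_p2);    /* packets flow once P1 and P2 finish */
    int cpu = 0;                   /* time at which M3 becomes free      */
    for (int k = 0; k < npkt; k++) {
        net += 2 * t_pkt;                  /* packet k of d1, then of d2 */
        cpu = max2(cpu, net) + t_part;     /* partial result k           */
    }
    return cpu;
}
```

With execution times 3, 3, 4, transmission time 4, and four packets per message, `initial_finish` gives 15 and `pipelined_finish` gives 12.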
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 630
Buffering and performance
• Buffering may sequentialize operations.
• Next process must wait for data to enter
buffer before it can continue.
• Buffer policy (queue, RAM) affects
available parallelism.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 631
Buffers and latency

• Three processes
separated by buffers:

B1 A B2 B B3 C

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 632
Buffers and latency schedules

• Each buffer filled completely before the next process runs (must wait for all of A before getting any B):
A[0], A[1], …, B[0], B[1], …, C[0], C[1], …
• Element-by-element hand-off:
A[0], B[0], C[0], A[1], B[1], C[1], …

Multiprocessors
• Consumer electronics systems.
• Cell phones.
• CDs and DVDs.
• Audio players.
• Digital still cameras.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 634
Consumer electronics use
cases

• Multimedia: stored in
compressed form,
decompressed on
viewing.
• Data storage and
management: keep track
of your multimedia, etc.
• Communication:
download, upload, chat.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 635
Non-functional requirements
for CE
• Often battery-operated, strict power
budget.
• Very inexpensive.
• User interface must be capable but
inexpensive.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 636
CE devices and hosts

• Many devices talk to host system.
• PC host does things that are hard to do on the device.
• Increasingly, CE devices communicate directly over the network, avoiding the host for access.
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 637
Platforms and operating
systems

• Many CE devices use a DSP for signal processing and a RISC CPU for other tasks.
• I/O devices include buttons, screen, USB.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 638
Flash file systems
• Flash is widely used for mass storage.
• Flash wears out on writing (up to 1 million
cycles).
• Directory is most often written, wears out
first.
• Flash file system has layer that moves
contents to level wear.
• Hides wear leveling from API.
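
One way a wear-leveling layer can work is to remap each rewritten logical block onto the least-worn free physical block. The sketch below is an illustration only, not the mechanism of any particular flash file system; the structure, names, and block count are assumptions.

```c
#include <assert.h>

#define NBLOCKS 8   /* hypothetical number of physical flash blocks */

struct flash_map {
    int phys_of[NBLOCKS];           /* logical -> physical block, -1 if unmapped */
    int in_use[NBLOCKS];            /* physical block holds live data            */
    unsigned erase_count[NBLOCKS];  /* wear on each physical block               */
};

void flash_map_init(struct flash_map *m)
{
    for (int i = 0; i < NBLOCKS; i++) {
        m->phys_of[i] = -1;
        m->in_use[i] = 0;
        m->erase_count[i] = 0;
    }
}

/* Rewrite a logical block: put the new contents in the least-worn free
 * physical block, then release the old one. Returns the physical block
 * chosen, or -1 if no free block exists. */
int flash_write(struct flash_map *m, int logical)
{
    assert(logical >= 0 && logical < NBLOCKS);
    int best = -1;
    for (int p = 0; p < NBLOCKS; p++) {
        if (!m->in_use[p] &&
            (best < 0 || m->erase_count[p] < m->erase_count[best]))
            best = p;
    }
    if (best < 0)
        return -1;
    m->erase_count[best]++;          /* block is erased before programming */
    m->in_use[best] = 1;
    int old = m->phys_of[logical];
    if (old >= 0)
        m->in_use[old] = 0;          /* old copy becomes free space */
    m->phys_of[logical] = best;
    return best;
}
```

Callers only ever name the logical block, so the remapping stays hidden behind the interface. Rewriting one logical block (such as a directory) eight times spreads one erase onto each of the eight physical blocks instead of eight erases onto one.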

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 639
Cell phones

• Most popular CE
device in history;
most widely used
computing device.
• 1 billion sold per year.
• Handset talks to cell.
• Cells hand off
handset as it moves.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 640
Cell phone platforms

• Today’s cell phones use analog front end, digital baseband processing.
• Future cell phones will perform IF processing with DSP.
• Baseband processing in DSP:
• Voice compression.
• Network protocol.
• Other processing:
• Multimedia functions.
• User interface.
• File system.
• Applications (contacts, etc.)
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 641
CD/MP3 player

[Figure: CD/MP3 player block diagram. Blocks: audio CPU with memory, jog memory, display, error corrector, DAC and amp (analog out), servo CPU with memory, analog in with amp, drive (head, motor). Signals: focus, tracking, sled; FE, TE; I2S.]
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 642
CD medium
• Linear velocity: 1.2-1.4 m/s (constant linear velocity, CLV).
• Track pitch: 1.6 microns.
• Diameter: 120 mm.
• Pit length: 0.8-3 microns.
• Pit depth: 0.11 microns.
• Pit width: 0.5 microns.
• Laser wavelength: 780 nm.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 643
CD mechanism

• Laser, lens, sled:

[Figure: CD mechanism. Labels: CD, focus, track, detectors, diffraction grating, sled, laser.]
Overheads for Computers as 644
© 2008 Wayne Wolf Components 2nd ed.
Laser focus

• Focus controlled by vertical position of lens.
• Unfocused beam causes irregular spot:

Out of focus In focus Out of focus


Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 645
Laser pickup

[Figure: detector layout with quadrant detectors A, B, C, D and side spot detectors E, F.]
• Level: A+B+C+D
• Focus error: (A+C)-(B+D)
• Tracking error: E-F
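
The three pickup signals are simple sums and differences of the detector outputs. A minimal sketch, assuming each detector reading is already available as an integer sample; the struct and function names are illustrative.

```c
/* One sample from the six photodetectors in the pickup. */
struct pickup_sample {
    int a, b, c, d;   /* quadrant detectors  */
    int e, f;         /* side spot detectors */
};

/* Signal level: total light on the quadrant detectors. */
int pickup_level(const struct pickup_sample *s)
{
    return s->a + s->b + s->c + s->d;
}

/* Focus error: difference between the two diagonal pairs. */
int pickup_focus_error(const struct pickup_sample *s)
{
    return (s->a + s->c) - (s->b + s->d);
}

/* Tracking error: difference between the side spot detectors. */
int pickup_tracking_error(const struct pickup_sample *s)
{
    return s->e - s->f;
}
```

Both error signals are zero when the diagonal pairs, and the side spots, are equally illuminated.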

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 646
Servo control
• Four main signals:
• focus (laser) @ 245 kHz;
• tracking (laser) @ 245 kHz;
• sled (motor) @ 800 Hz;
• disc motor.
[Figure: optical pickup.]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 647
EFM

• Eight-to-fourteen modulation:
• Fourteen-bit code guarantees a maximum
distance between transitions.

00000011 00100100000000

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 648
Error correction
• CD capacity: 6.99 GB raw, 700 MB formatted.
• Reed-Solomon code:
• $g(x) = (x - a)(x - a^2) \cdots (x - a^{n-k-1})(x - a^{n-k})$
• Produces data, erasure bits.
• Time to solve varies greatly depending on noise.
• CD interleaves Reed-Solomon blocks to reduce
effects of large data gaps.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 649
Control and error correction
• Skips caused by physical disturbance.
• Wait for disturbance to subside.
• Retry.
• Read errors caused by disc/servo problems.
• Detect error.
• Choose location for retry.
• Retry.
• Fail and interpolate.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 650
MPEG audio standards
• Layer 1:
• Lossless compression of subbands + optional
simple masking model
• Layer 2:
• More advanced masking model.
• Layer 3:
• Additional processing for lower bit rates.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 651
MPEG audio rates
• Input sampling rates:
• 32, 44.1, 48 kHz.
• Output bit rates:
• 32, 48, 64, 96, 112, 128, 192, 256, 384
kbits/sec.
• Output can be mono, dual-channel
(bilingual, etc.), stereo.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 652
Other standards
• Dolby Digital (AC-3):
• Uses modified discrete cosine transform.
• ATRAC (MiniDisc):
• Uses subband + modified DCT.
• MPEG-2 AAC.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 653
MPEG Layer 1
• 384 samples/block at all frequencies.
• Equals 8 ms at 48 kHz.
• Optional masking model.
• Driven by separate FFT for better accuracy.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 654
MPEG Layer 1 data frame

• Bit allocation codes specify word length in each subband.
• Scale factors give gain for each band.

Frame layout: header | CRC | bit allocation | scale factors | subband samples | aux data

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 655
MPEG Layer 1 encoder

[Figure: MPEG Layer 1 encoder block diagram. Blocks: filter bank, FFT, masking model, choose scale factor, scaling (*), requantize, mux, output bitstream (0101..).]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 656
MPEG Layer 1 decoder

[Figure: MPEG Layer 1 decoder block diagram. Blocks: input bitstream (0101..), demux, expand, inverse quantize, scale factor and step size multipliers (*), inverse filter bank.]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 657
MP3
• Decoding is easier than encoding, but
requires:
• decompression;
• filtering.
• Basic CD standard for data discs.
• No standards for MP3 disc file structure:
player must understand Windows, Mac,
Unix discs.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 658
Audio players

• Audio players may use flash, hard disk, or CD for mass storage.
• Decompression requires small amount of CPU:
• 10% of ARM7.
• File system must be compatible (FAT).

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 659
Digital still cameras

• DSC must determine exposure before taking picture.
• After taking picture:
• Improve image quality.
• Compress.
• Save as file.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 660
Digital still camera
architecture

• DSC uses CPU for general-purpose processing, DSP for image processing.
• Internal memory buffers the passes on the image.
• Display is lower resolution than image sensor.
• Image must be downsampled.
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 661
Image capture

• Before taking picture:
• Determine exposure.
• Determine focus.
• Optimize white balance.

[Figure: Bayer pattern.]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 662
Image processing
• Must perform basic processing to get
usable picture:
• Bayer->RGB interpolation.
• DSCs perform many functions formerly
performed by photoprocessors for film:
• Image sharpening.
• Color balance.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 663
File management
• EXIF standard gives format for digital
pictures:
• Format of data in a file.
• Directory structure.
• EXIF file includes:
• Image (JPEG, etc.)
• Thumbnail.
• Metadata (camera type, date/time, etc.)

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 664
Accelerators
• Example: video accelerator

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 665
Concept

• Build accelerator for block motion estimation, one step in video compression.
• Perform two-dimensional correlation:

[Figure: block f2 overlaid at several positions on Frame 1.]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 666
Block motion estimation
• MPEG divides frame into 16 x 16
macroblocks for motion estimation.
• Search for best match within a search
range.
• Measure similarity with sum-of-absolute-
differences (SAD):
• $SAD = \sum_{i,j} | M(i,j) - S(i - o_x, j - o_y) |$

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 667
Best match

• Best match produces motion vector for motion block:

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 668
Full search algorithm
bestx = 0; besty = 0;
bestsad = MAXSAD;
/* Try every offset in the search range, both end points included. */
for (ox = -SEARCHSIZE; ox <= SEARCHSIZE; ox++) {
    for (oy = -SEARCHSIZE; oy <= SEARCHSIZE; oy++) {
        int result = 0;
        /* Sum of absolute differences over the macroblock. */
        for (i = 0; i < MBSIZE; i++) {
            for (j = 0; j < MBSIZE; j++) {
                result += iabs(mb[i][j] -
                    search[i - ox + XCENTER][j - oy + YCENTER]);
            }
        }
        /* Keep the offset with the smallest SAD seen so far. */
        if (result <= bestsad) {
            bestsad = result; bestx = ox; besty = oy;
        }
    }
}

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 670
Computational requirements
• Let MBSIZE = 16, SEARCHSIZE = 8.
• Search area is 8 + 8 + 1 in each
dimension.
• Must perform:
• nops = (16 x 16) x (17 x 17) = 73984 ops
• CIF format has 352 x 288 pixels -> 22 x
18 macroblocks.
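
The operation count can be checked against a self-contained version of the full search. This is a sketch under assumptions: the search area is stored as a 32 x 32 array (the smallest that covers every offset for these sizes), pixels are 8-bit, ties keep the first minimum found, and all names are illustrative.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MBSIZE 16
#define SEARCHSIZE 8
#define AREASIZE (MBSIZE + 2 * SEARCHSIZE)   /* 32 pixels per side */

struct motion_vector { int x, y, sad; };

/* Full search: evaluate the SAD at every offset in [-SEARCHSIZE, SEARCHSIZE]. */
struct motion_vector full_search(unsigned char mb[MBSIZE][MBSIZE],
                                 unsigned char area[AREASIZE][AREASIZE])
{
    struct motion_vector best = { 0, 0, INT_MAX };
    for (int ox = -SEARCHSIZE; ox <= SEARCHSIZE; ox++) {
        for (int oy = -SEARCHSIZE; oy <= SEARCHSIZE; oy++) {
            int sad = 0;
            for (int i = 0; i < MBSIZE; i++)
                for (int j = 0; j < MBSIZE; j++)
                    sad += abs(mb[i][j] -
                               area[i - ox + SEARCHSIZE][j - oy + SEARCHSIZE]);
            if (sad < best.sad) {
                best.x = ox; best.y = oy; best.sad = sad;
            }
        }
    }
    return best;
}

/* Absolute-difference operations for one macroblock: 16*16 * 17*17. */
long sad_ops_per_macroblock(void)
{
    long positions = 2 * SEARCHSIZE + 1;
    return (long)MBSIZE * MBSIZE * positions * positions;
}

/* Operations for one frame of the given size (CIF is 352 x 288). */
long sad_ops_per_frame(int width, int height)
{
    assert(width % MBSIZE == 0 && height % MBSIZE == 0);
    return (long)(width / MBSIZE) * (height / MBSIZE) * sad_ops_per_macroblock();
}
```

For a CIF frame this gives 396 macroblocks and roughly 29.3 million absolute-difference operations per frame, which is the load that motivates an accelerator.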

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 671
Accelerator requirements
name                  block motion estimator
purpose               block motion est. in PC
inputs                macroblocks, search areas
outputs               motion vectors
functions             compute motion vectors with full search
performance           as fast as possible
manufacturing cost    hundreds of dollars
power                 from PC power supply
physical size/weight  PCI card

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 672
Accelerator data types, basic
classes

Motion-vector: x, y : pos
Macroblock: pixels[] : pixelval
Search-area: pixels[] : pixelval
PC: memory[]
Motion-estimator: compute-mv()

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 673
Sequence diagram

:PC :Motion-estimator

compute-mv()
Search area memory[]

memory[]

macroblocks memory[]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 674
Architectural considerations
• Requires large amount of memory:
• macroblock has 256 pixels;
• search area has 1,089 pixels.
• May need external memory (especially if
buffering multiple macroblocks/search
areas).

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 675
Motion estimator organization

[Figure: motion estimator organization. Blocks: address generator, ctrl, search area network, macroblock network, processing elements PE 0 through PE 15, comparator, motion vector output.]
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 676
Pixel schedules

step  PE 0             PE 1             PE 2
0     |M(0,0)-S(0,0)|
1     |M(0,1)-S(0,1)|  |M(0,0)-S(0,1)|
2     |M(0,2)-S(0,2)|  |M(0,1)-S(0,2)|  |M(0,0)-S(0,2)|

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 677
System testing
• Testing requires a large amount of data.
• Use simple patterns with obvious answers
for initial tests.
• Extract sample data from JPEG pictures
for more realistic tests.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 678
Networking for Embedded
Systems
• Why we use networks.
• Network abstractions.
• Example networks.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 679
Network elements

distributed computing platform:

PE
PE

communication link
network

PE
PEs may be CPUs or ASICs.
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 680
Networks in embedded
systems

initial processing
more processing
PE sensor
PE

PE actuator

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 681
Why distributed?
• Higher performance at lower cost.
• Physically distributed activities---time
constants may not allow transmission to
central site.
• Improved debugging---use one CPU in
network to debug others.
• May buy subsystems that have embedded
processors.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 682
Network abstractions
• International Organization for Standardization
(ISO) developed the Open Systems
Interconnection (OSI) model to describe
networks:
• 7-layer model.
• Provides a standard way to classify
network components and operations.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 683
OSI model

application   end-use interface
presentation  data format
session       application dialog control
transport     connections
network       end-to-end service
data link     reliable data transport
physical      mechanical, electrical

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 684
OSI layers
• Physical: connectors, bit formats, etc.
• Data link: error detection and control
across a single link (single hop).
• Network: end-to-end multi-hop data
communication.
• Transport: provides connections; may
optimize network resources.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 685
OSI layers, cont’d.
• Session: services for end-user
applications: data grouping,
checkpointing, etc.
• Presentation: data formats,
transformation services.
• Application: interface between network
and end-user programs.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 686
Hardware architectures
• Many different types of networks:
• topology;
• scheduling of communication;
• routing.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 687
Point-to-point networks
• One source, one or more destinations, no
data switching (serial port):

PE 1 PE 2 PE 3
link 1 link 2

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 688
Bus networks
• Common physical connection:

PE 1 PE 2 PE 3 PE 4

header address data ECC packet format

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 689
Bus arbitration
• Fixed: same order of resolution every time.
• Fair: every PE has same access over long periods.
• Round-robin: rotate top priority among PEs.

Example, with A, B, C all requesting in each of two rounds:
fixed:        A B C | A B C
round-robin:  A B C | B C A
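
The difference between the two policies is only where the priority scan starts. A minimal sketch, assuming requests are held in a bit mask (bit i set when PE i wants the bus) and that top priority rotates by one PE per arbitration round; the function name is hypothetical.

```c
#include <assert.h>

/* Serve every pending request once, scanning from PE `top`.
 * Writes the grant order into order[] and returns the number granted.
 * Fixed priority: always call with top = 0.
 * Round-robin:    call with top = round % n, so top priority rotates. */
int serve_round(unsigned requests, int n, int top, int order[])
{
    assert(n > 0 && top >= 0 && top < n);
    int count = 0;
    for (int k = 0; k < n; k++) {
        int pe = (top + k) % n;
        if (requests & (1u << pe))
            order[count++] = pe;
    }
    return count;
}
```

With three PEs all requesting, `top = 0` yields the order 0, 1, 2 (A B C) and `top = 1` yields 1, 2, 0 (B C A).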
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 690
Crossbar

out4

out3

out2

out1
in1 in2 in3 in4
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 691
Crossbar characteristics
• Non-blocking.
• Can handle arbitrary multi-cast
combinations.
• Size proportional to $n^2$.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 692
Multi-stage networks
• Use several stages of switching elements.
• Often blocking.
• Often smaller than crossbar.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 693
Message-based programming
• Transport layer provides message-based
programming interface:
send_msg(adrs,data1);
• Data must be broken into packets at
source, reassembled at destination.
• Data-push programming: make things
happen in network based on data
transfers.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 694
I2C bus
• Designed for low-cost, medium data rate
applications.
• Characteristics:
• serial;
• multiple-master;
• fixed-priority arbitration.
• Several microcontrollers come with built-in
I2C controllers.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 695
I2C physical layer

[Figure: master 1, master 2, slave 1, and slave 2 all attach to a shared data line (SDL) and clock line (SCL). The I2C specification names the data line SDA.]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 696
I2C data format

SCL ... ...

SDL ...

start MSB ack

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 697
I2C electrical interface

• Open collector interface: +

SDL
+

SCL

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 698
I2C signaling
• Sender pulls down bus for 0.
• Sender listens to bus---if it tried to send a
1 and heard a 0, someone else is
simultaneously transmitting.
• Transmissions occur in 8-bit bytes.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 699
I2C data link layer
• Every device has an address (7 bits in
standard, 10 bits in extension).
• Eighth bit of the address byte signals read (1) or write (0).
• General call address allows broadcast.
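
For the standard 7-bit case, the byte sent after the start condition carries the address in its upper seven bits and the direction in its lowest bit. A minimal sketch; the enum and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

enum i2c_dir { I2C_WRITE = 0, I2C_READ = 1 };

/* Build the first byte of a transfer: 7-bit address, then the R/W bit. */
uint8_t i2c_address_byte(uint8_t addr7, enum i2c_dir dir)
{
    assert(addr7 < 0x80);   /* only 7-bit addresses are handled here */
    return (uint8_t)((addr7 << 1) | (uint8_t)dir);
}
```

A device at address 0x50 is therefore addressed with 0xA0 for a write and 0xA1 for a read.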

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 700
I2C bus arbitration
• Sender listens while sending address.
• When sender hears a conflict, if its
address is higher, it stops signaling.
• Low-priority senders relinquish control
early enough in clock cycle to allow bit to
be transmitted reliably.
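
Arbitration on a wired-AND bus can be simulated bit by bit. The sketch below assumes up to eight masters start at the same time and each sends one byte, most significant bit first; a master that sends 1 but hears 0 drops out. Names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index of the master that keeps the bus.
 * If several masters send identical bytes, the lowest index is returned. */
int i2c_arbitrate(const uint8_t byte[], int n)
{
    assert(n > 0 && n <= 8);
    unsigned active = (1u << n) - 1u;          /* all masters start active */
    for (int bit = 7; bit >= 0; bit--) {
        unsigned bus = 1;                      /* line is pulled up by default */
        for (int m = 0; m < n; m++)
            if ((active & (1u << m)) && !((byte[m] >> bit) & 1u))
                bus = 0;                       /* any active 0 pulls the line low */
        for (int m = 0; m < n; m++)
            if ((active & (1u << m)) && ((byte[m] >> bit) & 1u) && bus == 0)
                active &= ~(1u << m);          /* sent 1, heard 0: stop signaling */
    }
    for (int m = 0; m < n; m++)
        if (active & (1u << m))
            return m;
    return -1;                                 /* not reached: someone always survives */
}
```

Because 0 dominates, the master sending the numerically lowest byte wins, and the winner's bits are never corrupted by the losers.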

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 701
I2C transmissions

multi-byte write

S adrs 0 data data P

read from slave

S adrs 1 data P

write, then read


S adrs 0 data S adrs 1 data P

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 702
Ethernet
• Dominant non-telephone LAN.
• Versions: 10 Mb/s, 100 Mb/s, 1 Gb/s
• Goal: reliable communication over an
unreliable medium.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 703
Ethernet topology
• Bus-based system, several possible
physical layers:

A B C

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 704
CSMA/CD
• Carrier sense multiple access with collision
detection:
• sense collisions;
• exponentially back off in time;
• retransmit.
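
The back-off step can be sketched as follows. The slide only says the back-off is exponential; the cap of 10 doublings used here follows the truncated binary exponential back-off of classic IEEE 802.3 and is stated as an assumption, as are the function names.

```c
#include <stdlib.h>

/* Largest back-off, in slot times, after the given number of
 * consecutive collisions: 2^min(collisions, 10) - 1. */
unsigned backoff_max_slots(unsigned collisions)
{
    unsigned exponent = collisions < 10 ? collisions : 10;
    return (1u << exponent) - 1u;
}

/* Pick a random back-off in [0, backoff_max_slots(collisions)].
 * rand() with modulo is slightly biased; adequate for illustration. */
unsigned backoff_slots(unsigned collisions)
{
    return (unsigned)rand() % (backoff_max_slots(collisions) + 1u);
}
```

The range of possible delays doubles after each collision, which spreads retransmissions out as the load grows.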

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 705
Exponential back-off times

time
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 706
Ethernet packet format

preamble | start frame | source adrs | dest adrs | length | data payload | padding | CRC

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 707
Ethernet performance
• Quality-of-service tends to non-linearly
decrease at high load levels.
• Can’t guarantee real-time deadlines.
However, may provide very good service
at proper load levels.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 708
Fieldbus
• Used for industrial control and
instrumentation---factories, etc.
• H1 standard based on 31.25 kb/s twisted
pair medium.
• High Speed Ethernet (HSE) standard
based on 100 Mb/s Ethernet.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 709
Networks
• Network-based design.
• Communication analysis.
• System performance analysis.
• Internet.
• Internet-enabled systems.
• Vehicles as networks.
• Sensor networks

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 710
Communication analysis
• First, understand delay for single
message.
• Delay for multiple messages depends on:
• network protocol;
• devices on network.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 711
Message delay
• Assume:
• single message;
• no contention.
• Delay:
• $t_m = t_x + t_n + t_r$
• = xmtr overhead + network xmit time +
rcvr overhead

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 712
Example: I2C message delay
• Network transmission time dominates.
• Assume 100 kbits/sec, one 8-bit byte.
• Number of bits in packet:
• $n_{packet}$ = start + address + data + stop
• = 1 + 8 + 8 + 1 = 18 bits
• Time required to transmit: 1.8 x 10^-4 sec.
• 20 instructions on 8 MHz controller adds 2.5 x
10^-6 sec delay on xmtr, rcvr.
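
The numbers in this example can be recomputed with a few lines of code. A sketch under the slide's own assumptions (bits counted as start + address + data + stop) plus one more that is ours: the controller executes one instruction per clock cycle. Function names are illustrative.

```c
#include <assert.h>

/* Network transmission time for one I2C packet, counting bits as
 * start (1) + address (8) + data (8 per byte) + stop (1). */
double i2c_network_time(int data_bytes, double bits_per_sec)
{
    assert(data_bytes >= 0 && bits_per_sec > 0.0);
    int bits = 1 + 8 + 8 * data_bytes + 1;
    return bits / bits_per_sec;
}

/* Software overhead, assuming one instruction per clock cycle. */
double instruction_overhead(int instructions, double clock_hz)
{
    assert(instructions >= 0 && clock_hz > 0.0);
    return instructions / clock_hz;
}

/* Message delay: t_m = t_x + t_n + t_r. */
double message_delay(double t_x, double t_n, double t_r)
{
    return t_x + t_n + t_r;
}
```

At 100 kbits/sec the network time is 1.8 x 10^-4 sec, while the two 2.5 x 10^-6 sec software overheads add under 3%, so the network transmission time dominates.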

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 713
Multiple messages
• If messages can interfere with each other,
analysis is more complex.
• Model total message delay:
• $t_y = t_d + t_m$
• = wait time for network + message delay

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 714
Arbitration and delay
• Fixed-priority arbitration introduces
unbounded delay for all but highest-
priority device.
• Unless higher-priority devices are known to
have limited rates that allow lower devices to
transmit.
• Round-robin arbitration introduces
bounded delay proportional to N.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 715
Further complications
• Acknowledgment time.
• Transmission errors.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 716
Priority inversion in networks
• In many networks, a packet cannot be
interrupted.
• Result is priority inversion:
• low-priority message holds up higher-priority
message.
• Doesn’t cause deadlock, but can slow
down important communications.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 717
Multihop networks

• In multihop networks, one node receives message, then retransmits to destination (or intermediate).
hop 1 hop 2

A B C
Network 1 Network 2

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 718
System performance analysis

• System analysis is difficult in general.
• multiprocessor performance analysis is hard;
• communication performance analysis is hard.
• Simple example: uncertainty in P1 finish time -> uncertainty in P2 start time.

P1 P2

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 719
Analysis challenges

• P2 and P3 can delay each other, even though they are in separate tasks.
• Delays in P1 propagate to P2, then P3, then to P4.

[Figure: task graphs containing P1, P2, P3, P4.]

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 720
Lower bounds on system

• Computational requirements:
• sum up process requirements over least-common multiple of periods, average over one period.
• Communication requirements:
• count all transmissions in one period.
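
The computational lower bound can be sketched as a utilization calculation over the least-common multiple of the periods. This assumes integer periods and execution times in the same unit; the struct and function names are hypothetical.

```c
#include <assert.h>

struct process {
    long period;      /* time between activations     */
    long exec_time;   /* CPU time needed per activation */
};

static long gcd(long a, long b)
{
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a;
}

/* Least-common multiple of all periods. */
long hyperperiod(const struct process p[], int n)
{
    assert(n > 0);
    long h = p[0].period;
    for (int i = 1; i < n; i++)
        h = h / gcd(h, p[i].period) * p[i].period;
    return h;
}

/* Total CPU time demanded by all processes over one hyperperiod. */
long total_demand(const struct process p[], int n)
{
    long h = hyperperiod(p, n);
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += (h / p[i].period) * p[i].exec_time;
    return sum;
}

/* Average demand per unit time; a value above 1.0 cannot fit on one PE. */
double required_utilization(const struct process p[], int n)
{
    return (double)total_demand(p, n) / (double)hyperperiod(p, n);
}
```

For processes with (period, execution time) of (4, 1), (6, 2), and (12, 3), the hyperperiod is 12, the demand is 10, and the required utilization is about 0.83. This is only a lower bound: scheduling and communication overhead can push the real requirement higher.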

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 721
Hardware platform design
• Need to choose:
• number and types of PEs;
• number and types of networks.
• Evaluate a platform by allocating
processes, scheduling processes and
communication.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 722
I/O-intensive systems
• Start with I/O devices, then consider
computation:
• inventory required devices;
• identify critical deadlines;
• choose devices that can share PEs;
• analyze communication times;
• choose PEs to go with devices.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 723
Computation-intensive systems
• Start with shortest-deadline tasks:
• Put shortest-deadline tasks on separate PEs.
• Check for interference on critical
communications.
• Allocate low-priority tasks to common PEs
wherever possible.
• Balance loads wherever possible.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 724
Internet Protocol
• Internet Protocol (IP) is basis for Internet.
• Provides an internetworking standard:
between two Ethernets, Ethernet and
token ring, etc.
• Higher-level services are built on top of
IP.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 725
IP in communication

node A         router       node B
application                 application
presentation                presentation
session                     session
transport                   transport
network        network      network       (IP)
data link      data link    data link
physical       physical     physical


Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 726
IP packet
• Includes:
• version, service type, length
• time to live, protocol
• source and destination address
• data payload
• Maximum packet length is 65,535 bytes, header included.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 727
IP addresses
• 32 bits in IPv4, 128 bits in IPv6.
• IPv4 addresses typically written in dotted-decimal form xxx.xxx.xxx.xxx.
• Names (foo.baz.com) translated to IP address by the Domain Name System (DNS).

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 728
Internet routing
• Best effort routing:
• doesn’t guarantee data delivery at IP layer.
• Routing can vary:
• session to session;
• packet to packet.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 729
Higher-level Internet services
• Transmission Control Protocol (TCP)
provides connection-oriented service.
• Quality-of-service (QoS) guaranteed
services are under development.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 730
The Internet service stack

Applications:  FTP, HTTP, SMTP, telnet, SNMP
Transport:     TCP, UDP (User Datagram Protocol)
Network:       IP

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 731
Internet-enabled embedded
system
• Internet-enabled embedded system: any
embedded system that includes an Internet
interface (e.g., refrigerator).
• Internet appliance: embedded system designed
for a particular Internet task (e.g. email).
• Examples:
• Cell phone.
• Laser printer.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 732
Example: Javacam
• Hardware platform:
• parallel-port camera;
• National Semi NS486SXF;
• 1.5 Mbytes memory.
• Uses memory-efficient Java Nanokernel.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 733
Javacam architecture

QuickCam Web browser


applet

Quickcam HTTP
server QuickCam
Java VM

Java nanokernel

486

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 734
Vehicles as networks
• 1/3 of cost of car/airplane is
electronics/avionics.
• Dozens of microprocessors are used throughout
the vehicle.
• Network applications:
• Vehicle control.
• Instrumentation.
• Communication.
• Passenger entertainment systems.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 735
CAN bus

• First used in 1991.
• Serial bus, 1 Mb/sec up to 40 m.
• Synchronous bus.
• Logic 0 dominates logic 1 on bus.
• Arbitrated with CSMA/AMP:
• Arbitration on message priority.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 736
CAN data frame

• 11 bit destination
address.
• RTR bit determines
read/write from/to
destination.
• Any node can detect
bus error, interrupt
packet for
retransmission.
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 737
CAN controller

• Controller implements
physical and data link
layers.
• No network layer
needed---bus
provides end-to-end
connections.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 738
Other vehicle busses
• FlexRay is next generation:
• Time triggered protocol.
• 10 Mb/s.
• Local Interconnect Network (LIN) connects
devices in a small area (e.g., door).
• Passenger entertainment networks:
• Bluetooth.
• Media Oriented Systems Transport (MOST).

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 739
Avionics
• Anything permanently attached to the aircraft
must be certified by FAA/national agency.
• Traditional architecture uses separate
electronics for each instrument/device.
• Line replaceable unit (LRU) can be physically
removed and replaced.
• Federated architecture shares processors across
a subsystem (nav/comm, etc.)

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 740
Sensor networks
• Wireless networks, small nodes.
• Ad hoc networks---organizes itself without
system administrator:
• Must be able to declare membership in
network, find other networks.
• Must be able to determine routes for data.
• Must update configuration as nodes
enter/leave.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 741
Node capabilities
• Must be able to turn radio on/off quickly
with low power overhead.
• Communication consumes about 100x the power of computation.
• Radios should operate at several different
power levels to avoid interference with
other nodes.
• Must buffer, route network traffic.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 742
Networks
• Example: elevator controller.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 743
Terminology
• Elevator car: holds passengers.
• Hoistway: elevator shaft.
• Car control panel: buttons in each car.
• Floor control panel: elevator request, etc.
per floor.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 744
Elevator system

floor

floor

floor

floor

floor

Hoistway 1 Hoistway 2
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 745
Theory of operation
• Each floor has control panel, display.
• Each car has control panel:
• one button per floor;
• emergency stop.
• Controlled by a single controller.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 746
Elevator position sensing

sensor

fine

coarse

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 747
Elevator control
• Elevator control has up and down.
• To stop, disable both.
• Master controller:
• reads elevator positions;
• reads requests;
• schedules elevators;
• controls movement;
• controls doors.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 748
Elevator system requirements

name                  elevator system
inputs                F floor control, N position, N car control, 1 master
outputs               F displays, N motor controllers
functions             responds to requests, operates safely
performance           elevator control is time-critical
manufacturing cost    electronics is small part of total
power                 electronics consumes small fraction of total
physical size/weight  cabling is important

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 749
Elevator system class diagram

1
Coarse-sensor*
Master-control-panel*
1 1
1 N 1
Fine-sensor* Car 1
1 1
1
1 Controller
Car-control-panel* 1
1
1 Floor F N
Floor-control-panel* 1 Motor*

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 750
Physical interfaces

Sensor*: hit: boolean (specialized by Coarse-sensor* and Fine-sensor*)
Car-control-panel*: Floors[1..F]: boolean; emergency-stop: boolean; open-door, close-door: boolean
Motor*: speed: {o,s,f}
Floor-control-panel*: up, down: boolean
Master-control-panel...
Overheads for Computers as
© 2008 Wayne Wolf Components 2nd ed. 751
Car and Floor classes

Car: request-lights[1..F]: boolean; current-floor: integer
Floor: up-light, down-light: boolean

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 752
Controller class

Controller
car-floor[1..H]: integer
emergency-stop[1..H]:
integer
scan-cars()
scan-floors()
scan-master-panel()
operate()

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 753
Architecture
• Computation and I/O occur at:
• floor control panels/displays;
• elevator cars;
• system controller.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 754
Panels and cab controller
• Panels are straightforward---no real-time
requirements.
• Cab controller:
• read buttons and send events to system
controller;
• read sensor inputs and send to system
controller.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 755
System controller
• Must take inputs from many sources:
• car controllers;
• floors.
• Must control cars to hard real-time
deadlines.
• User interface, scheduling are soft
deadlines.

Overheads for Computers as


© 2008 Wayne Wolf Components 2nd ed. 756
Testing
• Build an elevator simulator using an
FPGA:
• simulate multiple elevators;
• simulate real-time control demands.

System design techniques
• Design methodologies.
• Requirements and specification.

Design methodologies
• Process for creating a system.
• Many systems are complex:
• large specifications;
• multiple designers;
• interface to manufacturing.
• Proper processes improve:
• quality;
• cost of design and manufacture.

Product metrics
• Time-to-market:
• beat competitors to market;
• meet marketing window (back-to-school).
• Design cost.
• Manufacturing cost.
• Quality.

Mars Climate Orbiter
• Lost at Mars in September 1999.
• Requirements problem:
• Requirements did not specify units.
• Lockheed Martin used English units; JPL
expected metric.
• Not caught by manual inspections.
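
One way to keep units from being left implicit is to carry them in the interface itself. The sketch below assumes a quantity such as thruster impulse exchanged between two teams; the `Impulse` class and its constructor names are hypothetical and are not taken from the mission software.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Impulse stored internally in newton seconds (metric)."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # The caller must name the unit, so the conversion cannot be skipped silently.
        return cls(float(value) * LBF_S_TO_N_S)


# The same number means different things depending on the declared unit.
metric = Impulse.from_newton_seconds(10.0)
english = Impulse.from_pound_force_seconds(10.0)
```

Because every value enters through a constructor that names its unit, a requirement that omits units becomes visible as soon as someone tries to write the call.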

Design flow
• Design flow: sequence of steps in a
design methodology.
• May be partially or fully automated.
• Use tools to transform, verify design.
• Design flow is one component of
methodology. Methodology also includes
management organization, etc.

Waterfall model

• Early model for software development:

requirements → architecture → coding → testing → maintenance

Waterfall model steps
• Requirements: determine basic
characteristics.
• Architecture: decompose into basic
modules.
• Coding: implement and integrate.
• Testing: exercise and uncover bugs.
• Maintenance: deploy, fix bugs, upgrade.

Waterfall model critique
• Only local feedback: may need iterations
between coding and requirements, for
example.
• Doesn’t integrate top-down and bottom-
up design.
• Assumes hardware is given.

Spiral model

[Figure: spiral that repeatedly cycles through requirements, design, and test, passing in turn through system feasibility, specification, prototype, initial system, and enhanced system.]

Spiral model critique
• Successive refinement of system.
• Start with mock-ups, move through simple
systems to full-scale systems.
• Provides bottom-up feedback from
previous stages.
• Working through stages may take too
much time.

Successive refinement model

• initial system: specify → architect → design → build → test
• refined system: specify → architect → design → build → test

Hardware/software design flow

requirements and specification → architecture → hardware design and software design (side by side) → integration → testing

Co-design methodology
• Must architect hardware and software
together:
• provide sufficient resources;
• avoid software bottlenecks.
• Can build pieces somewhat
independently, but integration is major
step.
• Also requires bottom-up feedback.

Hierarchical design flow
• Embedded systems must be designed
across multiple levels of abstraction:
• system architecture;
• hardware and software systems;
• hardware and software components.
• Often need design flows within design
flows.

Hierarchical HW/SW flow

• system: spec → architecture → HW, SW → integrate → test
• hardware: spec → architecture → detailed design → integration → test
• software: spec → architecture → detailed design → integration → test

Concurrent engineering
• Large projects use many people from
multiple disciplines.
• Work on several tasks at once to reduce
design time.
• Feedback between tasks helps improve
quality, reduce number of later design
problems.

Concurrent engineering
techniques
• Cross-functional teams.
• Concurrent product realization.
• Incremental information sharing.
• Integrated product management.
• Supplier involvement.
• Customer focus.

AT&T PBX concurrent
engineering
• Benchmark against competitors.
• Identify breakthrough improvements.
• Characterize current process.
• Create new process.
• Verify new process.
• Implement.
• Measure and improve.

Requirements analysis
• Requirements: informal description of
what customer wants.
• Specification: precise description of what
design team should deliver.
• Requirements phase links customers with
designers.

Types of requirements
• Functional: input/output relationships.
• Non-functional:
• timing;
• power consumption;
• manufacturing cost;
• physical size;
• time-to-market;
• reliability.

Good requirements
• Correct.
• Unambiguous.
• Complete.
• Verifiable: is each requirement satisfied in
the final system?
• Consistent: requirements do not
contradict each other.

Good requirements, cont’d.
• Modifiable: can update requirements
easily.
• Traceable:
• know why each requirement exists;
• go from source documents to requirements;
• go from requirement to implementation;
• back from implementation to requirement.
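
Traceability in both directions can be kept as a small cross-reference structure. This is a sketch only; the `TraceMatrix` class, its method names, and the requirement and module identifiers in the usage lines are all hypothetical.

```python
from collections import defaultdict


class TraceMatrix:
    """Links source documents, requirements, and implementation modules."""

    def __init__(self) -> None:
        self.rationale = {}                      # requirement -> why it exists
        self.source = {}                         # requirement -> source document
        self._req_to_impl = defaultdict(set)     # requirement -> modules
        self._impl_to_req = defaultdict(set)     # module -> requirements

    def add_requirement(self, req_id: str, rationale: str, source: str) -> None:
        self.rationale[req_id] = rationale
        self.source[req_id] = source

    def link(self, req_id: str, module: str) -> None:
        if req_id not in self.rationale:
            raise KeyError(f"unknown requirement: {req_id}")
        self._req_to_impl[req_id].add(module)
        self._impl_to_req[module].add(req_id)

    def implementations(self, req_id: str) -> list:
        # Forward trace: requirement to implementation.
        return sorted(self._req_to_impl.get(req_id, ()))

    def requirements(self, module: str) -> list:
        # Backward trace: implementation to requirement.
        return sorted(self._impl_to_req.get(module, ()))

    def untraced(self) -> list:
        # Requirements that no module claims to implement yet.
        return sorted(r for r in self.rationale if not self._req_to_impl.get(r))


trace = TraceMatrix()
trace.add_requirement("R1", "doors must not open between floors", "customer interview")
trace.add_requirement("R2", "respond to call button promptly", "sales feedback")
trace.link("R1", "door_control")
```

The `untraced()` query is the kind of check that makes gaps visible: here it would report R2, which has a rationale and a source but no implementation yet.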

Setting requirements
• Customer interviews.
• Comparison with competitors.
• Sales feedback.
• Mock-ups, prototypes.
• Next-bench syndrome (HP): design a
product for someone like you.

Specifications
• Capture functional and non-functional
properties:
• verify correctness of spec;
• compare spec to implementation.
• Many specification styles:
• control-oriented vs. data-oriented;
• textual vs. graphical.
• UML is one specification/design language.

SDL

• Used in telecommunications
protocol design.
• Event-oriented state
machine model.

[Figure: SDL diagram with, from top to bottom: telephone on-hook; caller goes off-hook; dial tone; caller gets dial tone.]
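
An event-oriented state machine of this kind can be sketched as a transition table. The reading of the figure used here is an assumption: "telephone on-hook" and "dial tone" are treated as states, "caller goes off-hook" as an input event, and "caller gets dial tone" as an output. The `step` function is hypothetical and is not SDL syntax.

```python
# (state, event) -> (next state, output)
TRANSITIONS = {
    ("telephone on-hook", "caller goes off-hook"): ("dial tone", "caller gets dial tone"),
}


def step(state: str, event: str):
    """Consume one event; unknown events leave the state unchanged with no output."""
    return TRANSITIONS.get((state, event), (state, None))


state, output = step("telephone on-hook", "caller goes off-hook")
```

Nothing happens until an event arrives, which is the sense in which the model is event-oriented.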

Statecharts
• Ancestor of UML state diagrams.
• Provided composite states:
• OR states;
• AND states.
• Composite states reduce the size of the
state transition graph.
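
As a rough sketch of why an OR state shrinks the transition graph, the code below expands a transition drawn once on a composite state into the per-state transitions a flat machine would need. The state and event names (s123, S1 to S4, i2) are illustrative, and `flatten` is a hypothetical helper.

```python
def flatten(or_states: dict, transitions: list) -> list:
    """Expand each transition leaving an OR state into one transition per substate."""
    flat = []
    for source, event, target in transitions:
        # A source that is not an OR state expands to itself.
        for substate in or_states.get(source, [source]):
            flat.append((substate, event, target))
    return flat


or_states = {"s123": ["S1", "S2", "S3"]}
hierarchical = [("s123", "i2", "S4")]      # one arrow on the composite state
flat = flatten(or_states, hierarchical)    # three arrows in the flat machine
```

One transition on the OR state stands for three in the traditional diagram, and the saving grows with the number of substates.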

Statechart OR state
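[Figure: left, a traditional state machine over S1, S2, S3, and S4 with transitions on i1 and i2, where the i2 transition is repeated for each of S1, S2, and S3; right, the same machine with S1, S2, and S3 grouped into OR state s123, which needs only a single i2 transition.]

Statechart AND state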
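[Figure: left, a traditional state machine with product states S1-3, S1-4, S2-3, and S2-4 plus S5, with transitions labeled a, b, c, d, and r; right, the equivalent AND state sab with two concurrent components (S1, S2 and S3, S4), transitions a, b, c, and d, and an r transition involving S5.]

AND-OR tables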
• Alternate way of specifying complex
conditions:
cond1 or (cond2 and !cond3)

                 OR
       cond1   T    -
AND    cond2   -    T
       cond3   -    F
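
The table can be read mechanically: entries within a column are ANDed, the columns are ORed, and "-" means don't care. A minimal sketch of an evaluator under that reading follows; the function name and the dictionary layout are assumptions.

```python
from itertools import product


def eval_and_or(table: dict, values: dict) -> bool:
    """True if any column has every non-'-' entry matching the condition's value."""
    n_cols = len(next(iter(table.values())))
    for col in range(n_cols):
        if all(
            entries[col] == "-" or (entries[col] == "T") == values[name]
            for name, entries in table.items()
        ):
            return True
    return False


# Each row lists the entries for one condition, one entry per column.
TABLE = {
    "cond1": ["T", "-"],
    "cond2": ["-", "T"],
    "cond3": ["-", "F"],
}

# Check the table against the Boolean expression for all eight input combinations.
agrees = all(
    eval_and_or(TABLE, {"cond1": c1, "cond2": c2, "cond3": c3})
    == (c1 or (c2 and not c3))
    for c1, c2, c3 in product([False, True], repeat=3)
)
```

The exhaustive comparison at the end is practical only for small tables, but it shows that the table and the expression describe the same condition.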

TCAS II specification
• TCAS II: aircraft collision avoidance
system.
• Monitors aircraft and air traffic info.
• Provides audio warnings and directives to
avoid collisions.
• Leveson et al. used the RSML language to
capture the TCAS specification.

RSML

• State description: a box for state1 labeled with its inputs, a state description, and its outputs.
• Transition bus for transitions between many states: states a, b, c, and d attached to a shared bus.

TCAS top-level description
[Figure: top-level state CAS. Labels in the diagram:]
• power-off, power-on
• Inputs: TCAS-operational-status {operational,not-operational}
• fully-operational, standby
• own-aircraft C
• other-aircraft i:[1..30]
• mode-s-ground-station i:[1..15]

Own-Aircraft AND state
[Figure: AND state within CAS.]
• Inputs: own-alt-radio: integer; standby-discrete-input: {true,false}; own-alt-barometric: integer, etc.
• Concurrent components: Effective-SL (1, 2, ..., 7), Alt-SL (1, 2, ..., 7), Alt-layer, Climb-inhibit, Descend-inhibit, Increase-climb-inhibit, Increase-Descend-inhibit, Advisory-Status.
• Outputs: sound-aural-alarm: {true,false}; aural-alarm-inhibit: {true,false}; combined-control-out: enumerated, etc.

CRC cards
• Well-known method for analyzing a
system and developing an architecture.
• CRC:
• classes;
• responsibilities of each class;
• collaborators are other classes that work with
a class.
• Team-oriented methodology.

CRC card format

• Front: Class name; Superclasses; Subclasses; Responsibilities and Collaborators (side by side).
• Back: Class name; Class’s function; Attributes.

CRC methodology
• Develop an initial list of classes.
• Simple description is OK.
• Team members should discuss their choices.
• Write initial responsibilities/collaborators.
• Helps to define the classes.
• Create some usage scenarios.
• Major uses of system and classes.

CRC methodology, cont’d.
• Walk through scenarios.
• See what works and doesn’t work.
• Refine the classes, responsibilities, and
collaborators.
• Add class relationships:
• superclass, subclass.

CRC cards for elevator
• Real-world classes:
• elevator car, passenger, floor control, car
control, car sensor.
• Architectural classes: car state, floor
control reader, car control reader, car
control sender, scheduler.

Elevator responsibilities and
collaborators

class            responsibilities                 collaborators
Elevator car*    Move up and down                 Car control, car sensor, car control sender
Car control*     Transmits car requests           Passenger, floor control reader
Car state        Reads current position of car    Scheduler, car sensor
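
The cards in this table can be held as simple records, which makes one walk-through check easy to automate: finding collaborators that do not yet have a card of their own. The `CRCCard` type and the `missing_cards` helper are hypothetical; the card contents are copied from the table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CRCCard:
    name: str
    responsibilities: List[str]
    collaborators: List[str]


def normalize(name: str) -> str:
    # Ignore case and the trailing '*' marker when comparing names.
    return name.rstrip("*").strip().lower()


def missing_cards(cards: List[CRCCard]) -> List[str]:
    """Collaborators named on some card that have no card of their own."""
    known = {normalize(card.name) for card in cards}
    named = {normalize(c) for card in cards for c in card.collaborators}
    return sorted(named - known)


cards = [
    CRCCard("Elevator car*", ["Move up and down"],
            ["Car control", "car sensor", "car control sender"]),
    CRCCard("Car control*", ["Transmits car requests"],
            ["Passenger", "floor control reader"]),
    CRCCard("Car state", ["Reads current position of car"],
            ["Scheduler", "car sensor"]),
]
todo = missing_cards(cards)
```

With only the three cards shown on this slide, `todo` lists the classes that still need cards written before the scenarios can be walked through.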

System design techniques
• Quality assurance.

Quality assurance
• Quality judged by how well product
satisfies its intended function.
• May be measured in different ways for
different kinds of products.
• Quality assurance (QA) makes sure that
all stages of the design process help to
deliver a quality product.

Therac-25 radiation therapy machine
(Leveson and Turner)
• Six known accidents: radiation overdoses
leading to death and serious injury.
• Radiation gun controlled by PDP-11.
• Four major software components:
• stored data;
• scheduler;
• set of tasks;
• interrupt services.

Therac-25 tasks
• Treatment monitor controlled and
monitored setup and delivery of treatment
in eight phases.
• Servo task controlled radiation gun.
• Housekeeper task took care of status
interlocks and limit checks.

Treatment monitor task
• Treat was main monitor task.
• Eight subroutines.
• Treat rescheduled itself after every
subroutine.

Software timing race
• Timing-dependent use of mode and
energy:
• if keyboard handler sets completion behavior
before operator changes mode/energy data,
Datent task will not detect the change, but
Hand task will.
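
The general shape of such a race can be shown with a deliberately simplified, deterministic simulation. This is not the Therac-25 code: the variable names, the mode values, and the one-time snapshot are assumptions made for illustration, and the task names are reused from the slide only as labels.

```python
class SharedState:
    """Data shared between tasks without any locking or change notification."""

    def __init__(self) -> None:
        self.mode = None              # value typed by the operator
        self.entry_complete = False   # set by the keyboard handler


def run(edit_after_completion: bool):
    shared = SharedState()
    shared.mode = "x-ray"             # operator's first entry
    shared.entry_complete = True      # keyboard handler marks entry complete
    datent_view = shared.mode         # one task takes its snapshot now
    if edit_after_completion:
        shared.mode = "electron"      # operator edits after the flag was set
    hand_view = shared.mode           # another task reads the live value later
    return datent_view, hand_view


consistent = run(edit_after_completion=False)
inconsistent = run(edit_after_completion=True)
```

When the edit arrives after the completion flag, the two tasks act on different values of the same parameter, and nothing in the shared state reveals the disagreement.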

Software timing errors
• Changes to parameters made by operator
may show on screen but not be sensed by
Datent task.
• One accident caused by entering
mode/energy, changing mode/energy, and
returning to command line within 8 seconds.
• Skilled operators typed faster, more likely
to exercise bug.

Leveson and Turner
observations
• Performed limited safety analysis:
guessed at error probabilities, etc.
• Did not use mechanical backups to check
machine operation.
• Used overly complex programs written in
unreliable styles.

ISO 9000
• Developed by the International Organization
for Standardization (ISO).
• Applies to a broad range of industries.
• Concentrates on process.
• Validation based on extensive
documentation of organization’s process.

CMU Capability Maturity Model
• Five levels of organizational maturity:
• Initial: poorly organized process, depends on
individuals.
• Repeatable: basic tracking mechanisms.
• Defined: processes documented and
standardized.
• Managed: makes detailed measurements.
• Optimizing: measurements used for
improvement.

Verification

• Verification and testing are important
throughout the design flow.
• Bugs introduced early are more expensive to fix:

[Figure: cost to fix versus time, with one curve for a requirements bug and one for a coding bug.]

Verifying requirements and
specification
• Requirements:
• prototypes;
• prototyping languages;
• pre-existing systems.
• Specifications:
• usage scenarios;
• formal techniques.

Design review
• Uses meetings to catch design flaws.
• Simple, low-cost.
• Proven by experiments to be effective.
• Use other people in the project/company
to help spot design problems.

Design review players
• Designers: present design to rest of team,
make changes.
• Review leader: coordinates process.
• Review scribe: takes notes of meetings.
• Review audience: looks for bugs.

Before the design review
• Design team prepares documents used to
describe the design.
• Leader recruits audience, coordinates
meetings, distributes handouts, etc.
• Audience members familiarize themselves
with the documents before they go to the
meeting.

Design review meeting
• Leader keeps meeting moving; scribe
takes notes.
• Designers present the design:
• use handouts;
• explain what is going on;
• go through details.

Design review audience
• Look for any problems:
• Is the design consistent with the
specification?
• Is the interface correct?
• How well is the component’s internal
architecture designed?
• Did they use good design/coding practices?
• Is the testing strategy adequate?

Follow-up
• Designers make suggested changes.
• Document changes.
• Leader checks on results of changes, may
distribute to audience for further review
or additional reviews.

Measurements
• Measurements help ground our beliefs:
• Do our practices really work?
• Do they work where we think they work?
• Types of measurements:
• bugs found at different stages of design;
• bugs as a function of time;
• bugs in different types of components;
• how bugs are found.
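
These measurements amount to tallying bug reports along different fields. A minimal sketch follows, assuming each report is a dictionary; the field names and the three records are made up for illustration and are not real data.

```python
from collections import Counter


def tally(bug_reports: list, key: str) -> Counter:
    """Count bug reports by one field, e.g. the stage at which each bug was found."""
    return Counter(report[key] for report in bug_reports)


# Made-up records, for illustration only.
reports = [
    {"found_in": "design review", "component": "scheduler", "found_by": "review"},
    {"found_in": "testing", "component": "scheduler", "found_by": "unit test"},
    {"found_in": "testing", "component": "driver", "found_by": "system test"},
]

by_stage = tally(reports, "found_in")        # bugs found at different stages
by_component = tally(reports, "component")   # bugs in different components
by_method = tally(reports, "found_by")       # how bugs are found
```

Adding a date field to each record would support the remaining measurement, bugs as a function of time, with the same function.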
