
University of Manchester School of Computer Science

Register Transfer Level (RTL)


In order to build whole systems from gates successfully we need some
sort of hierarchy.
The next ‘level’ above gates is usually referred to as “RTL”.
❏ “Register” is the name given to a number of flip-flops acting
together to hold a single coherent value, such as …
❍ a numerical value
❍ a computer instruction
❍ a colour

❏ “Transfer” refers to the need to move values between
registers, possibly …
❍ applying some operation – such as add, compare, etc.
– in the process
❍ maybe predicating operations on some condition

[Figure: three D-type flip-flops, each with D, Q and CE connections, acting together as a register]

COMP10211 – The Underlying Machine RTL Slide 1

Registers

A typical register will comprise a number of D-type flip-flops which each have
their own input and output bit but all have a common clock input. This enforces
synchronization – all the bits switch at the same time.
A clock is normally free-running but it may not be desirable for the register to
change every cycle. Usually, therefore, there will be a clock enable (CE) input
which is also common to the flip-flops in one register. Different clock enables
control different registers whereas the clock will typically be the same signal in
all the registers in a system.
More rarely other inputs are also present. For example there may be a ‘clear’
signal which sets all the flip-flops to a known value; this may be used for
initialising the system. (It has not been shown here.)
It is typical for a register to be drawn a bit like a ‘fat’ flip-flop, with buses
showing the input and output values and single wires for the clock and control inputs.

[Figure: a register drawn as a single ‘fat’ flip-flop with a CE input]

Because the clock is common to all registers, sometimes it may be omitted on
‘informal’ drawings. (We will try to keep it visible in this course!)

Methods for designing logic blocks

Although we have described an adder, we have not yet looked at how blocks of
combinatorial logic can be designed and reduced to gates. In this course we are
not going to go into detailed methods (there are some keywords below if you
wish to find out); instead we will use CAD tools to do a lot of the work.
However, there are some things you must still do.
First, represent the function. For a combinatorial block this will probably be a
truth table. It is quite normal to deal with ‘don’t care’ values both in inputs and
outputs to reduce the size of a truth table.
Two (contrived) examples:

Invited  Going  Buy beer
False    X      False
True     False  False
True     True   True

Coffee  Cocoa  Sleep
False   False  True
False   True   True
True    False  False
True    True   X

In the first, if you’re not invited you can’t go so the second input is irrelevant.
In the second you can’t have both at the same time so the outcome can be
anything convenient.
The resultant logic may then be simple enough to apply an ad-hoc approach (i.e.
it’s ‘obvious’) as we did with the adder previously. If not some form of logic
reduction is needed.
Some people can do this with Boolean algebra. Another way is to use a graphical
approach such as Karnaugh maps which work well for up to four or five
inputs. Beyond that, CAD tools are very helpful at implementing (e.g.) the
Quine-McCluskey algorithm.
Espresso is a logic reduction programme that has been around for many years
(since at least 1986) which uses a different algorithm. It is a useful, but
somewhat user-unfriendly, tool.
In later lab. exercises we will express the problem in a Hardware Design
Language (HDL) and leave it to the compiler to minimise the logic to the actual
gates.

Finite State Machines

A Finite State Machine (FSM) is a sequential machine that moves
from one ‘state’ to another according to some well-defined rules.

[Figure: block diagram with Inputs, Combinatorial Logic, Register, Outputs and a Feedback path]

The next state is influenced by the current one:
❏ The function is defined by the combinatorial logic.
❏ The state is held in some form of register.
❏ Inputs may influence the behaviour (possibly according to the current state)
❏ Outputs may be state bits or derived separately

Definition
A register is simply a collection of flip-flops which are acting together.

COMP10211 – The Underlying Machine RTL Slide 2

Finite State Machines

In a state machine ‘what happens next’ is determined by a set of rules and is
influenced by ‘where we are now’.

Example
Rule: counting in threes.
Current state = 0 ⇒ next state = 3
Current state = 3 ⇒ next state = 6
Current state = 6 ⇒ next state = 9
Note: this is not a finite state machine!

A Finite State Machine – usually abbreviated to FSM – works on this principle,
but the number of states is finite; for most real machines the number of states is
also quite small.

Implementation
❏ To remember its history the machine must contain some state
holding devices (i.e. flip-flops).
❏ To represent each state unambiguously there must be at least
log2(states) flip-flops. (N flip-flops allow the representation of 2^N
states.)

Number of states    Number of flip-flops
1                   0
2                   1
3-4                 2
5-8                 3
9-16                4
2^(N-1)+1 to 2^N    N

Exercise – Mealy machines and Moore machines

These are two often described forms of FSM. Produce a generalised sketch of
each and note their differences.

Mealy machine

Moore machine

Differences
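Returning to the table of states against flip-flops above, the rule can be checked with a few lines of Python. This is only a sketch: the function name `flip_flops_needed` is made up for illustration, and it relies on Python's integer `bit_length` method to round log2 up to a whole number.

```python
def flip_flops_needed(states: int) -> int:
    """Smallest N such that 2**N >= states (log2(states), rounded up)."""
    if states < 1:
        raise ValueError("a machine needs at least one state")
    # (states - 1).bit_length() is an exact integer form of ceil(log2(states)).
    return (states - 1).bit_length()


# Reproduce the rows of the table, plus the seven-state case used later.
for states in (1, 2, 3, 4, 5, 8, 9, 16, 7):
    print(states, flip_flops_needed(states))
```

Using integer arithmetic avoids any rounding trouble that a floating-point logarithm might introduce for large state counts.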

State Transition Diagrams


❏ It is often useful to have a diagrammatic notation for indicating the sequence of states.
❏ Use a directed graph of nodes (one representing each state) and links (or arcs)
corresponding to state transitions (the next state):

[Figure: state transition diagram; a ring of states numbered from 0 (the initial state) to 6]

❏ This represents a counter which cycles through seven states.


❏ This is called a modulo-7 counter
Note: ‘traditionally’ numbering starts at zero

COMP10211 – The Underlying Machine RTL Slide 3

Design Process

The complete design process for simple sequential systems usually works
something like this:
❏ First, decide on the inputs and outputs, and determine the different
internal states required. Draw a state transition diagram¹.
❏ Perform state assignment (i.e. label each state with a unique No.).
❍ State assignment is arbitrary but the particular state assignment
chosen may influence the complexity of the finished
design; as you gain experience optimal state assignment
becomes easier.
❍ Often some of the state bits can be used as output bits directly.

In a manual process, proceed as follows:
❏ Complete a state transition table.
❏ Extract (and simplify) all the logic equations – Karnaugh maps may
help.
❏ Draw the logic diagram as gates and connect to the appropriate
flip-flops.
❏ Test!

If using an HDL:
❏ Usually the FSM can be translated directly into HDL code
❏ Compile
❏ Test!

State Transition Table

            Current state    Next state
State No.   C  B  A          C′ B′ A′
0           0  0  0          0  0  1
1           0  0  1          0  1  0
2           0  1  0          0  1  1
3           0  1  1          1  0  0
4           1  0  0          1  0  1
5           1  0  1          1  1  0
6           1  1  0          0  0  0
7           1  1  1          δ  δ  δ

This summarises the state transition diagram, showing what happens next in
each state. Note that (in this case) seven states were required but the
implementation needs a whole number of bits. Three bits is the minimum which can be
used (2^3 = 8) – the minimum number of state bits is usually preferred – so there
are eight states in the state transition table.
The undefined state should never be reached, so it doesn’t really matter what its
next state is. However in this example it would be preferable if it did not remain
in state #7 (as it would if the ‘don’t cares’ all ended up as “1”); this would
ensure that the circuit would recover if it reached this ‘illegal’ state (e.g. at
switch on).
A state transition table is not, strictly speaking, a truth table, but it can be
reduced to logic equations using the same methods.

1. Sometimes just called a “state diagram”



The Synchronous Paradigm

When does a state machine move from one state to the next?
❏ In principle this can happen as a result of any input change.
❏ In practice it is normal to change only in response to a clock transition

This is synchronous design – a very useful design simplification:


❏ The state machine is easier to design: ‘all’ you have to do is look at:
❍ where you are now
❍ what your inputs are
❏ … and then write down where you want to be next.
❏ Define all state transitions (correctly!) and your system will work.

COMP10211 – The Underlying Machine RTL Slide 4

Synchronous design benefits


The slide above focuses on design benefit because that is the most immediately
important. However there are several other benefits to this paradigm:
❏ Combinatorial logic can produce output ‘glitches’ as signals race
down different paths. These can be allowed to settle before the clock
transition occurs.
❏ Changes from different parts of the system which arrive in the same
clock cycle can be regarded as ‘simultaneous’.
❏ Outputs change ‘simultaneously’ (in the same cycle) if desired.
❏ The timing of the system is easy to predict by counting the number
of clock cycles needed in the state diagram.
❏ The design tools are geared towards synchronous machines, that
being ‘normal’ practice.

Example FSM – a Counter


E.g. our modulo-7 counter

[Figure: a register, driven by Clock, holds the current state bits A, B and C; combinatorial logic produces the next state bits A′, B′ and C′]

Transition table

            Current state    Next state
State no.   C  B  A          C′ B′ A′
0           0  0  0          0  0  1
1           0  0  1          0  1  0
2           0  1  0          0  1  1
3           0  1  1          1  0  0
4           1  0  0          1  0  1
5           1  0  1          1  1  0
6           1  1  0          0  0  0
7           1  1  1          δ  δ  δ
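As a cross-check, the transition table can be generated rather than written by hand. The following Python sketch does this for the modulo-7 counter; the helper names `next_state` and `bits` are illustrative only, and the unused state 7 is represented by `None` to stand for the ‘don’t care’ entries.

```python
MODULUS = 7          # number of states used by the counter
STATE_BITS = 3       # smallest N with 2**N >= 7


def next_state(state: int):
    """Next state of the modulo-7 counter, or None for the unused state 7."""
    if state >= MODULUS:
        return None                 # 'don't care' (shown as δ in the table)
    return (state + 1) % MODULUS


def bits(value) -> str:
    """Format a state as 'C B A' (most significant bit first)."""
    if value is None:
        return "δ δ δ"
    return " ".join(format(value, f"0{STATE_BITS}b"))


# One row per possible register value, including the unused one.
for state in range(2 ** STATE_BITS):
    print(state, " ", bits(state), "  ", bits(next_state(state)))
```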

COMP10211 – The Underlying Machine RTL Slide 5

Simple FSMs

❏ Most sequential circuits have a single clock, which determines the
times at which a state change may take place. All changes happen
synchronously – i.e. at the same time.
❏ The binary digits which determine the current state are stored in a
register; these are sometimes called the state vector, or just the state.
❏ In the counter example (opposite) the state is the output. This is not
generally true; sometimes some or all of the state bits are hidden
(i.e. used purely internally).
❏ This counter has no Boolean inputs; it is free-running. There will
always be a state change for a counter.
❏ The next state is determined from the current state by a combinatorial
logic network.

Design process
❏ The state transition diagram is given on slide 3.
❏ State assignment for a counter is self-selecting; the numbers represent
binary codes in an ascending sequence. (Note that a descending
modulo-7 counter is also possible, as would be a number of bizarre
variations.)
❏ The state transition table is shown on slide 5.
❏ For output A′, we select the entries where A′ is true (1), and write
down the Boolean equation:
A′ = (not A and not B and not C) or (not A and B and not C) or (not A and not B and C)
A Karnaugh map will reduce this to:
A′ = (not A and not B) or (not A and not C)
Boolean algebra can take this a bit further:
A′ = not A and (not B or not C)
A′ = not A and not (B and C)
❏ Draw the schematic diagram.

[Figure: schematic of the gates producing A′ from A, B and C]

❏ B′ and C′ can be determined similarly; again, this is left as an
exercise.
❏ Test!
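One way to carry out the ‘Test!’ step for A′ is to compare a candidate equation against the transition table. Reading the table, A′ is 1 only in states 0, 2 and 4, which is exactly when A is 0 and B and C are not both 1. The Python sketch below checks that form for states 0 to 6; the function name `a_next` is made up for this illustration.

```python
def a_next(c: int, b: int, a: int) -> int:
    """Candidate equation: A' = (not A) and not (B and C)."""
    return int((not a) and not (b and c))


for state in range(7):                      # state 7 is a 'don't care'
    c, b, a = (state >> 2) & 1, (state >> 1) & 1, state & 1
    expected = ((state + 1) % 7) & 1        # bit A of the next state
    assert a_next(c, b, a) == expected, state

print("A' matches the transition table for states 0 to 6")
```

With this equation the unused state 7 (C = B = A = 1) gives A′ = 0, so whatever B′ and C′ turn out to be, the counter cannot stay in state 7.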

External Inputs
❏ A typical state machine will allow some external control
❏ Some or all of the transitions will be conditional
e.g. a modulo-7 counter with a (synchronous) reset

[Figure: state transition diagram of the modulo-7 counter with reset; from each state, reset = 0 leads to the next state in sequence and reset = 1 leads to state 0. Note: no condition is needed on the arc from state 6 to state 0.]

The synchronous reset forces the counter to state #0 when the next active clock transition occurs.
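The conditional transitions of this counter can be sketched as a next-state function in Python. This is an illustration only: the name `next_state_with_reset` and the input sequence are invented for the example, and each loop iteration stands for one active clock transition.

```python
def next_state_with_reset(state: int, reset: int) -> int:
    """Modulo-7 counter: reset = 1 forces state 0 at the next clock edge."""
    if reset:
        return 0
    return (state + 1) % 7


state = 0
trace = []
for reset in (0, 0, 0, 1, 0, 0):        # one reset value sampled per clock edge
    state = next_state_with_reset(state, reset)
    trace.append(state)

print(trace)    # [1, 2, 3, 0, 1, 2]
```

Because the reset is only examined inside the next-state function, it has no effect between clock edges, which is what makes it synchronous.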

COMP10211 – The Underlying Machine RTL Slide 6

Conditional Transitions

If an FSM has any inputs these must be able to alter its behaviour. This means
that transitions may be conditional.

[Figure: synchronous up-down counter; states 0 to 6 with arcs labelled up=1 and up=0]

A finite state machine will always behave in some way (even if that behaviour is
‘do nothing’) each time it has an opportunity to change state (usually
corresponding to a clock pulse).
A state diagram is supposed to represent the behaviour of the state machine. It is
therefore essential that it shows the behaviour under all conditions. Note that
this may show that the FSM remains in the current state (see example below).
Not all arcs need be conditional. In most cases some (or even all) of the inputs
will not influence the behaviour in certain states.

[Figure: triggerable counter; states 0 to 6, with arcs labelled Go=1 and Go=0 at state 0]

In the example above the counter is started by a particular input (Go=1) after
which the input is ignored until the FSM has completed its counting cycle.

Set-up and hold times
Note that external inputs should not violate the set-up and hold times of the
FSM’s registers. This usually requires that they are synchronized to the same
clock.

Exercise
Complete the state transition table for a modulo-4 up/down counter:

[Figure: modulo-4 up/down counter; states 0 to 3 with arcs labelled up=1 and up=0]

State  Up  Next
0 0    0
0 0    1
0 1    0
0 1    1
1 0    0
1 0    1
1 1    0
1 1    1

Subdividing the State Machine


Many sequential systems have a very large number of state bits, e.g.:
❏ a 32-bit RISC processor may have 32 registers.
The register bank therefore has 2^1024 possible states (32 registers × 32 bits = 1024 state bits)
❏ the behaviour of the processor is largely independent of the state of the register bank

The FSM model represents the behaviour of the processor quite well …

[Figure: bubble diagram with states fetch, dec., ALU, addr, mem. and branch]

… but duplicating each state 2^1024 times, once for each possible value
of the register bank state, would make the bubble diagram unreadable!

COMP10211 – The Underlying Machine RTL Slide 7

Splitting a System into Data and Control Elements

A typical digital system can be divided into data and control parts. The data
elements collectively form what is often known as the datapath (the term “data” is
usually reserved for the values contained therein) – control is usually referred to
as just control. Naturally these two parts of the system interact. In general the
control tells the datapath what to do – and when to do it; the datapath has less
influence on the control but will occasionally supply values which influence its
behaviour.

For example a processor fetches, decodes and executes instructions:
❏ Fetch: output an address, input an instruction
❏ Decode: decide what the instruction should do
❏ Execute: change the processor’s register (or control) state
❍ the details of execution depend on the instruction fetched

The fetch will normally involve some control sequencing but is totally
independent of the value (instruction) fetched.
The decode may be different for different instructions, but there will be fewer
than 2^32 categories of operation¹ so there will still be a large amount of
commonality (e.g. the control probably doesn’t care which registers are being used).
Execution may have a (small) number of different behaviours. However the
controller may order an ADD instruction but will not care what numbers (from
the 2^64 possible combinations) the datapath adds.

Register Transfer Level (RTL) is a means of exploiting this separation of data
and control in order to simplify the design process. RTL effectively ignores the
different values (states) of the data, instead treating them as individual variables.
RTL is therefore a hierarchical ‘level’ of abstraction ‘higher’ than a gate level
design.

Of course, the subsystems are not completely separate. The datapath will
influence the control at the decode stage because it supplies the instruction. In a few
cases there is also influence during execution: an example would be a
conditional branch; an instruction (BNE) could specify “Branch if Not Equal” (to
zero) – IF the specified data value is not zero a jump would occur, ELSE the
instruction would be ignored.

However the two subsystems separate at well defined boundaries and this is
important in keeping the complexity manageable.

Further subdivision of the datapath may be done: an obvious subdivision here
would be to build separate units for each instruction processing stage. This
approach is common in high-performance processors.

In the MU0 processor we will look at later this approach is not taken in order
to keep the size of the design down. Instead a single arithmetic unit is made
to serve the needs of fetch and decode at different times.

The control may also be subdivided as appropriate to keep its complexity under
control. A common example application could be a traffic light controller.
Traffic lights follow a fixed sequence but typically spend different amounts of
time in each phase. One approach is to have several states which all appear
the same. Another approach is to have an FSM which cycles the phases and
another which handles the timing. When the first enters a new state it signals
this to the timing FSM together with the number of cycles it should remain there; it
then waits for the timing FSM to signal that that time has elapsed.
All the second machine does is input a time, count off that number of cycles
and output a ‘finished’ signal.
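The traffic light arrangement, with one FSM cycling the phases and a second FSM handling the timing, might be sketched in Python as follows. Everything specific here is an assumption made for illustration: the phase names, the durations in `PHASES`, and the `Timer` and `run` names do not come from the notes.

```python
# Hypothetical phase order and durations (in clock cycles); not from the notes.
PHASES = [("red", 4), ("red+amber", 1), ("green", 3), ("amber", 1)]


class Timer:
    """Timing FSM: loads a cycle count, counts it off, then reports 'finished'."""

    def __init__(self):
        self.remaining = 0

    def load(self, cycles: int):
        self.remaining = cycles

    def tick(self) -> bool:
        self.remaining -= 1
        return self.remaining == 0      # the 'finished' signal


def run(cycles: int):
    """Phase FSM: steps through PHASES, waiting for the timer in each one."""
    timer, phase, shown = Timer(), 0, []
    timer.load(PHASES[phase][1])
    for _ in range(cycles):             # one iteration per clock cycle
        shown.append(PHASES[phase][0])
        if timer.tick():                # timer says this phase has elapsed
            phase = (phase + 1) % len(PHASES)
            timer.load(PHASES[phase][1])
    return shown


print(run(10))
```

The phase machine needs only four states however long each phase lasts, because all the counting is delegated to the timer.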

1. Tacitly assuming a 32-bit RISC processor.



The RTL Model


In the register transfer level model we split the complete system state up into registers and
consider the flow of information ‘in bulk’ from one register to the next on each clock tick:
[Figure: reg1 → logic → reg2 → logic → reg3, with all three registers driven by clk]

Here the system state changes on the rising edge of every clock cycle, and the functionality can be
described by:
f1{reg2} ⇒ reg3; f2{reg1} ⇒ reg2
Note: synchronous operation means changes occur in parallel. This corresponds to pipelined
data processing; in general we want more flexible control of the register behaviour, so:
❏ all registers are updated at the same point in the clock cycle
❏ some registers may not change in particular clock cycles
It is assumed that the clock cycle is long enough for all logic between registers to stabilise before
the next active clock edge
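The parallel update described above can be imitated in software by computing every next value from the old register contents before any register changes. In this Python sketch the functions `f1` and `f2` are arbitrary stand-ins for the two blocks of logic, and the value fed into reg1 on each tick is an assumption, since the notes do not say where reg1 gets its input.

```python
def f1(x: int) -> int:      # hypothetical logic between reg2 and reg3
    return x + 1


def f2(x: int) -> int:      # hypothetical logic between reg1 and reg2
    return x * 2


def clock_tick(regs: dict, new_input: int) -> dict:
    """Compute every next value from the OLD register contents, then update all."""
    return {
        "reg1": new_input,
        "reg2": f2(regs["reg1"]),
        "reg3": f1(regs["reg2"]),
    }


regs = {"reg1": 0, "reg2": 0, "reg3": 0}
for value in (1, 2, 3, 4):
    regs = clock_tick(regs, value)
    print(regs)
```

Building a fresh dictionary on each tick is what stops a new reg2 value from leaking into reg3 within the same cycle; updating the registers one after another in place would model a different (and incorrect) circuit.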

COMP10211 – The Underlying Machine RTL Slide 8

Register Transfer Level (RTL)


As the name might suggest RTL design concentrates on design at the register
level, and the blocks which join them.
As the datapath is normally many (e.g. 32) bits wide whereas the control is com-
mon to all the data bits the datapath normally accounts for most of the gates in a
finished design. This means that a surplus ‘gate’ in the datapath design results in
(maybe) 32 surplus gates on the chip. For this reason it is usually designed and
optimised first.
The RTL model handles data movement well but doesn’t give a good representa-
tion of the control structure.
The FSM model (already encountered) copes with the control structure well but
can’t handle data movement.
Therefore a typical sequential system is best handled with a combination of the
two approaches:
❏ the part of the system which handles data where the behaviour is
largely independent of the data value is modelled at RTL
❏ the part of the system which controls the movement of data and
contains the different possible behaviours is modelled as an FSM
(or, in complex cases, several FSMs)

Note a ‘pipeline’ is not the only possible structure; feedback of data values is
typical in most computing machines.
[Figure: two registers and two blocks of logic, with feedback of data values]

ARM Microprocessor Subdivisions

[Figure: block diagram of an ARM microprocessor; labelled blocks are Memory, Control, and a Datapath containing Fetch, Decode, Registers, Shift and ALU]

COMP10211 – The Underlying Machine RTL Slide 9

The slide shows a possible implementation of an ARM microprocessor, similar
to the simpler versions available. The processor occupies the right-hand three
quarters of the figure; a memory has been shown on the left hand side.

The ARM is a 32-bit microprocessor, so the thick lines normally show 32-bit
buses. The thinner lines show control paths.

The datapath has been shown subdivided into its major blocks such as
instruction fetch, instruction decode, the register bank and the Arithmetic Logic Unit
(ALU) which performs the actual calculations. Multiplexers allow the selection
of inputs at various places.

The control logic is less easy to draw at this scale, so has been concealed.

We will produce a similar, if simpler, picture for a smaller microprocessor later.

Don’t try and memorise this picture! However, later in the course or at revision
time, you might like to consider how some of the ARM instructions you have
used in COMP10031 use the various buses shown here. Here are a few to try:
❏ SUB R0, R1, R2
❏ LDR R5, [R6, R7]
❏ STR R8, [R9], #&20



Tools and Methodology
(so far)

Schematics
Timing diagrams
State diagrams, e.g. [Figure: bubble diagram with states fetch, dec., ALU, addr, mem. and branch]
Truth tables, e.g.

S  B  A  Output
0  X  0  0
0  X  1  1
1  0  X  0
1  1  X  1
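The truth table on this slide uses ‘don’t care’ inputs (X), and it can be useful to confirm that such a collapsed table still covers every input combination exactly once. The Python sketch below expands the X entries and checks this; the names `ROWS` and `expand` are illustrative, and the final check reflects the observation that the table behaves as a two-input multiplexer selected by S.

```python
from itertools import product

# Rows of the table as (S, B, A, Output); 'X' marks a don't-care input.
ROWS = [(0, "X", 0, 0), (0, "X", 1, 1), (1, 0, "X", 0), (1, 1, "X", 1)]


def expand(rows):
    """Expand don't-care inputs into every concrete input combination."""
    table = {}
    for *inputs, output in rows:
        choices = [(0, 1) if v == "X" else (v,) for v in inputs]
        for combo in product(*choices):
            assert combo not in table, "overlapping rows"
            table[combo] = output
    return table


table = expand(ROWS)
print(len(table))   # 8: all 2**3 input combinations are covered

for (s, b, a), out in sorted(table.items()):
    assert out == (b if s else a)   # output follows A when S=0, B when S=1
```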

COMP10211 – The Underlying Machine RTL Slide 10

Revision
To summarise the major classes of design aids to date:

Schematics

Schematics can be used to show basic gate functions and how they are
interconnected. Gates are drawn using standard symbols.

A hierarchy is typical so that subcircuits can be ‘hidden’ to keep complexity
under control. This also promotes the reuse of components.

Truth tables

Truth tables can be used to document the output(s) of a block of combinatorial
logic in a compact form. They show the output state for any possible input state.

Truth tables can be a useful design aid in specifying what you want to happen.
They are especially good at forcing all possible input combinations to be
considered.

Timing Diagrams

Timing diagrams can be used as a design aid to depict how circuits should
behave as they evolve through time. They may show individual signals or values
if they depict buses. They complement state diagrams in showing the evolution
of a system under a particular set of conditions.

Timing diagrams are also used as simulation outputs to show what a circuit
would really do under a set of test circumstances. This can be compared with
what was wanted. When there are errors these can be traced back in time to find
the cause.

State diagrams

State diagrams are useful for depicting what a machine should do under
particular circumstances. Like a truth table, they can ensure that all possible
combinations of internal and input state are considered.

State diagrams are frequently used to document the behaviour of finite state
machines. Note that FSMs are frequently implemented in software too; this is
not just for hardware.

Design Example: Car Park Controller


[Figure: a car park with In and Out barriers, managed by a black box controller marked ‘?’]

❏ Allow cars in, one at a time, unless car park full


❏ Allow cars out, one at a time
❏ Illuminate a sign (and disable the entry barrier) when the car park is full

COMP10211 – The Underlying Machine RTL Slide 11

Car Park Controller: Specification

Before commencing the design it is important (nay, essential!) to specify what
the system should do. Even though the slide looks simple it contains more
elements than we are going to build. For example we are going to assume that the
barrier operation is implicit and the traffic lights will be omitted completely.

Specification

A controller for access to a car park.
❏ Cars may leave at any time.
❍ Cars attempting to leave will be sensed, and counted out

❏ Cars may enter at any time, unless the car park is full
❍ Cars attempting to enter will be sensed
if the car park is not full they will be admitted and counted in
else the car park is full so they will not be admitted

❏ If the car park is full, switch on a sign

Interface
❏ Sensors
❍ an input sensor
❍ an output sensor
❏ Actuators
❍ a ‘full’ light
❍ an input barrier control
❍ an output barrier control

(The following design neglects the barrier controls. You are encouraged to add
outputs to control them.)

Deductions

❏ The occupancy of the car park must be deduced from the number of
cars entering and leaving

❏ Some form of controllable up/down counter will be needed

❏ The data path width must be great enough to accommodate the
largest count values – equivalent to the capacity of the car park
❍ e.g. for 800 parking spaces, (at least) 10 bits are needed

From the foregoing we can sketch a preliminary state diagram.

Design Example: Car Park Controller


Supposition:
❏ keep a count of the number of cars in the car park
Here the ‘count’ can be treated as a data item and operations on it considered by the RTL model.
An informal control bubble diagram could be:

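[Figure (marked ‘sketch only’): bubble diagram with states ready, inc tot, cmp and full, plus two dec tot states; in leads from ready to inc tot and then cmp, which leads to ready (not full) or to full (full); out leads from ready, and from full, to a dec tot state]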

where in indicates a car entering, out a car exiting, and full is true when the total equals the
maximum capacity of the car park.
❏ need a constant to represent the ‘maximum capacity’
The ready and full states are exited only when out or in is true.

COMP10211 – The Underlying Machine RTL Slide 12

Design Process

To design a digital system (such as a computer processor) it is first necessary to
specify its behaviour. The specification should be sufficiently rigorous to
determine the behaviour in any combination of internal and input states. This should
also help determine the interfaces, i.e. the inputs and outputs (usually
abbreviated to I/O).
When this is done it should be relatively straightforward to determine the
requirements for the datapath. The number of variables which must be stored
will indicate the minimum number of registers required and the processing
operations on these variables will be enumerated. Note that while it is impossible to
produce a working system with fewer registers than variables it is perfectly
possible to produce one with extra registers; although this is sometimes valuable
when attempting to increase performance (techniques such as pipelining will be
discussed later) it increases the control complexity and is usually undesirable.
A good rule is:
“Things should be made as simple as possible – but no simpler.”
– A. Einstein
The bubble diagram on the slide is ‘informal’; it assumes:
❏ that the process in cannot happen in the full state
❏ that the process out never decrements the total below zero
Are these assumptions reasonable?

Car Park Controller

In this example some of the complexity is neglected. For example the action out
may involve sensing a car’s arrival, waiting for payment, raising a barrier,
waiting for the car to clear the barrier and lowering the barrier again. However we
could defer all these actions to a separate state machine which would signal each
time it had completed this cycle.
This is another example of hierarchy in the design.

Question: the in process is similar but what other signal would it require?
(hint below)

In practice …
Nowadays, in practice, no one would build a logic circuit to control a car
park barrier. The preferred method would be to use an off-the-shelf
microcontroller (single chip computer) and use software to customise its
function.
Specific hardware (e.g. ASICs – Application Specific Integrated Circuits)
is only used when software would be too slow for the job. As a guide
hardware should be about 100x faster than a software solution; for a car park
barrier this hardly matters!
Note that software may implement an FSM, designed this way, though.

Datapath Design
The preceding diagram suggests that we need to be able to:
❏ keep a running total
❏ increment the total
❏ decrement the total
❏ compare the total with the maximum allowed capacity

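This suggests an RTL picture something like:

[Figure: datapath. A mux (control input sel) selects max (input 1) or the constant ‘1’ (input 0) as operand b of an adder/subtractor (control input add/sub); operand a comes from the total register (control input ce); a zero detector produces Z]

The data buses connecting the register, mux, adder and zero detector are all N bits wide
where 2^N > max.

The ‘datapath’ is controlled by 3 input wires
(ce, sel and add/sub) and produces one
control output (Z).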
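A behavioural model of this datapath can help when checking the control signals later. The Python sketch below is written under stated assumptions: the class and method names are invented, add = 1 is taken to mean ‘add’ and add = 0 ‘subtract’, sel = 1 is taken to select max, and the zero detector is assumed to examine the adder/subtractor output so that a comparison can be made without changing the total.

```python
class Datapath:
    """Total register, mux, adder/subtractor and zero detector (a sketch)."""

    def __init__(self, maximum: int):
        self.maximum = maximum
        self.width = maximum.bit_length()       # N bits, so that 2**N > max
        self.mask = (1 << self.width) - 1
        self.total = 0                          # assumed cleared at start-up

    def clock(self, ce: int, sel: int, add: int) -> int:
        """Apply one clock edge; return Z as seen just before that edge."""
        b = self.maximum if sel else 1          # mux: 1 selects max, 0 selects '1'
        result = (self.total + b if add else self.total - b) & self.mask
        z = int(result == 0)                    # assumed: zero detect on adder output
        if ce:                                  # register only loads when enabled
            self.total = result
        return z


dp = Datapath(maximum=2)
dp.clock(ce=1, sel=0, add=1)            # increment: total becomes 1
print(dp.clock(ce=0, sel=1, add=0))     # compare: 1 - 2 is not zero, so Z = 0
dp.clock(ce=1, sel=0, add=1)            # increment: total becomes 2
print(dp.clock(ce=0, sel=1, add=0))     # compare: 2 - 2 is zero, so Z = 1
```

Holding ce at 0 during the comparison is what lets the subtraction be used purely as a test: the result reaches the zero detector but never reaches the register.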

COMP10211 – The Underlying Machine RTL Slide 13

Datapath Design

The controller clearly needs only a single variable (register) which counts the
number of vehicles in the car park. We do not need to worry much about the size
of the register although it clearly must be able to count at least to the maximum
number of spaces available.
The system clock runs continuously. The clock enable (ce) signal is used to
‘freeze’ the register when no activity is required.

Evaluation functions required include incrementing (+1) and decrementing (-1)
this register. To do this the number “1” may be fed to the second port of the
adder/subtractor.

subtraction is done by: a - b = a + (not b) + 1
(i.e. invert b and put a carry into the LSB)

A slightly less obvious function is the ability to compare the count with the car
park’s capacity.

“MUX” is a common abbreviation of multiplexer.

Comparison

There are various ways of comparing values. If a comparison alone is required a
specialist unit can be built. However in this design we already require the ability
to add and subtract and a comparison can be done by subtracting.
To compare A and B, subtract one from the other

Result (A - B)         Conclusion
zero                   A=B
positive (not zero)    A>B
negative               A<B

(The positive/negative result can be determined by testing the ‘sign’ bit (the
most significant bit in two’s complement notation); a check for zero can be done
with a NOR of all the bits.)
To compare with the value “max”, the value must be made available to the
adder/subtractor.

Exercises
Think how increment and decrement could be performed without using the
explicit constant ‘1’?

Can you simplify the datapath to avoid (most of) the comparison?
(Hint: the car park occupancy data may be stored in a different way.)
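Subtraction by inverting b and adding a carry, and comparison by inspecting the result, can be tried out in Python. The sketch below makes one assumption worth stating loudly: the word is 11 bits wide, one more than the 10 bits needed to count to 800, so that the sign bit of the difference is meaningful for any two counts in that range. The names `WIDTH`, `subtract` and `compare` are illustrative.

```python
WIDTH = 11                      # assumed: one bit more than the 10-bit count
MASK = (1 << WIDTH) - 1


def subtract(a: int, b: int) -> int:
    """a - b computed as a + (not b) + 1, truncated to WIDTH bits."""
    return (a + (~b & MASK) + 1) & MASK


def compare(a: int, b: int) -> str:
    """Compare two counts, each assumed to be below 2**(WIDTH - 1)."""
    result = subtract(a, b)
    if result == 0:                       # in hardware: NOR of all the result bits
        return "A=B"
    sign = (result >> (WIDTH - 1)) & 1    # most significant (sign) bit
    return "A<B" if sign else "A>B"


print(compare(800, 800), compare(5, 800), compare(800, 5))
```

With only 10 bits the difference 5 - 800 would overflow the two's complement range and the sign bit would give the wrong answer, which is why the extra bit is assumed here.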



Control Specification
The control FSM can now be defined more specifically:

         inputs         next     outputs
state    in  out  Z     state    ce  sel  add  full
ready    1   0    x     inc      0   x    x    0
ready    x   1    x     dec1     0   x    x    0
ready    0   0    x     ready    0   x    x    0
inc      x   x    x     cmp      1   0    1    0
cmp      x   x    0     ready    0   1    0    0
cmp      x   x    1     full     0   1    0    0
full     x   0    x     full     0   x    x    1
full     x   1    x     dec2     0   x    x    1
dec1     x   x    x     ready    1   0    0    0
dec2     x   x    x     ready    1   0    0    0

Note:
❏ there is no initialisation procedure to ensure this system starts in a defined state
❏ each ‘in’ and ‘out’ event must be seen exactly once – this needs more thought
❏ we must assign a ‘number’ to each state before we can complete the logic design
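Before assigning state numbers, the control table can be captured as data and exercised in software. In the Python sketch below the table rows are copied from the slide, with `None` standing for each ‘x’; the names `TABLE` and `lookup` are invented for the illustration, and the assertion inside `lookup` simply insists that exactly one row applies to any given situation.

```python
# Each row: state, in, out, Z, next state, (ce, sel, add, full).
# None in an input is a "don't care"; None in an output is an unassigned value.
TABLE = [
    ("ready", 1, 0, None, "inc", (0, None, None, 0)),
    ("ready", None, 1, None, "dec1", (0, None, None, 0)),
    ("ready", 0, 0, None, "ready", (0, None, None, 0)),
    ("inc", None, None, None, "cmp", (1, 0, 1, 0)),
    ("cmp", None, None, 0, "ready", (0, 1, 0, 0)),
    ("cmp", None, None, 1, "full", (0, 1, 0, 0)),
    ("full", None, 0, None, "full", (0, None, None, 1)),
    ("full", None, 1, None, "dec2", (0, None, None, 1)),
    ("dec1", None, None, None, "ready", (1, 0, 0, 0)),
    ("dec2", None, None, None, "ready", (1, 0, 0, 0)),
]


def lookup(state, car_in, car_out, z):
    """Return (next state, outputs) for the single row matching the inputs."""
    matches = [
        (nxt, outs)
        for s, i, o, zz, nxt, outs in TABLE
        if s == state
        and i in (None, car_in) and o in (None, car_out) and zz in (None, z)
    ]
    assert len(matches) == 1, "table must give exactly one behaviour"
    return matches[0]


print(lookup("ready", 1, 0, 0))     # a car arrives: go to 'inc'
print(lookup("ready", 1, 1, 0))     # 'out' takes priority over 'in'
print(lookup("cmp", 0, 0, 1))       # Z = 1 after the comparison: go to 'full'
```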

COMP10211 – The Underlying Machine RTL Slide 14

Control Specification

The ‘first cut’ of the control specification takes the names of the various states
from the earlier ‘bubble’ diagram. Some states – for example inc – occur only
once because there is only one possible succeeding state; others – e.g. ready –
have several possible successors.

Input “don’t cares”

Writing the table out in full would occupy a great deal of space (and be
confusing!). As some or all of the inputs can always be ignored the table can be
collapsed by inserting don’t care (“x”) in some places. Thus the table can be written
in ten lines rather than forty-eight (6 states × 2^3 input conditions).
For example in the ready state the Z input never has any influence. There are
four possible combinations of in and out, but only three possible behaviours: let
a car in, let a car out and do nothing. Here it has been defined that if out is active
(“1”) then in is not considered.

Exercise
Verify that all possible input conditions are covered in the table.

Output “don’t cares”

The “don’t cares” in the output set have a different meaning. As the control state
must generate these signals we must eventually define what they will be.
Note that ce is always defined; this is because it is essential to know if the total
register is to change or not on any given clock edge. If it is changing (ce = 1)
then sel and add control what it is changing to; however if it is not changing
(ce = 0) then it doesn’t matter what value is presented to the register – it will
ignore it.
Allowing latitude at this time gives more freedom in the logic reduction, simpler
equations and thus smaller (& faster) circuits.

Exercise
Check that the value assigned to any ‘x’ in the output side of the table cannot
affect the behaviour of the system.

System Initialisation
We must add control to ensure that the system starts from a known state; this must put the FSM
into a known state, but it must also reset the total count to zero.
We can also observe from the state transition table that the dec1 and dec2 states are identical and
can be merged, so we can now draw a bubble diagram with initialisation:

[Figure: bubble diagram with initialisation. States: reset, ready, inc tot, cmp, dec tot, full.
Transition labels: reset, zero, in, out, full, not full. Annotation: ‘Note: decision’.]

This system loops in the ‘reset’ state (not shown!) decrementing the total (subtracting 1 from it) until
the total is zero, then it enters the ‘ready’ state and begins normal operation.
(An alternative would be to add a reset input to the total register.)
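
The initialisation loop can be sketched in Python to see how long it may take. This is a reduced model under stated assumptions: only the ‘reset’ and ‘ready’ behaviour is modelled, the register width (WIDTH) is an arbitrary small value, and the zero test is applied to the stored total before each decrement, which may differ in detail from the real datapath.

```python
WIDTH = 4  # assumed register width in bits, kept small so every value can be tried

STATES = ["ready", "inc", "cmp", "full", "dec", "reset"]

def step(state, total, reset):
    """One clock edge of a reduced model of the controller plus total register."""
    if reset:
        return "reset", total        # the reset input forces the 'reset' state
    if state == "reset":
        if total == 0:
            return "ready", total    # total is zero: begin normal operation
        return "reset", total - 1    # otherwise keep decrementing
    return state, total              # other states are not modelled here

def cycles_until_ready(power_up_state, power_up_total):
    """Assert reset for one cycle, release it, then count cycles until 'ready'."""
    state, total = step(power_up_state, power_up_total, reset=1)
    cycles = 0
    while state != "ready":
        state, total = step(state, total, reset=0)
        cycles += 1
    return cycles, total

# Try every power-up combination of state and total.
results = [cycles_until_ready(s, t) for s in STATES for t in range(2 ** WIDTH)]
worst_case = max(cycles for cycles, _ in results)
print("worst-case cycles after reset is released:", worst_case)
```

In this model the worst case grows as 2^WIDTH, which is one reason the alternative of a reset input on the total register may be attractive.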


Initialisation

Reset
On the slide this (somewhat informal) notation has been used to show that the FSM remains in
the reset state until the zero input becomes active. The assumption is made that the state is
unaltered unless otherwise shown.

[Figure: the ‘reset’ state with its ‘zero’ transition, drawn without and with the explicit self-loop.]

Strictly speaking the behaviour when the zero input is inactive should also be shown
explicitly. This tends to clutter the diagram, but it does give an added check that all
behaviours have been specified.

An electronic system may power up in any state, including a state that is
unreachable in normal operation – each flip-flop has two stable states and it may
come up in either state when power is first applied.
The designer must ensure that the system can be brought into a valid initial state
however it powers up.

Physical input and output devices
Handling the physical input and output devices can often be a problem. There
may be a wide variety of different sensors, actuators, etc. which are not necessarily
digital.
We will look a bit more at I/O later in the course. Here we will look at some
machinery to keep the problem separate from our control machine.
How can we sense the presence of a car at the car park entrance or exit?
How can we control the barriers safely?
How can we keep the system complexity under control?

Sensors: input devices that measure some real-world property, such as the
presence of a car. A sensor will often produce a signal in a form that is
totally unsuitable for direct connection to a digital circuit. It may carry a
mains electricity voltage, or it may be a high-frequency carrier, or it may be
too small to be detected by a logic gate, or...
So in addition to the digital input conditioning logic we discuss here there
will probably be a need for analogue electronics to convert the sensor output
into a form suitable for digital processing.
Actuators: output devices that change some real-world property, such as
the position of a barrier or whether the ‘FULL’ light is on or off.
Again, the ‘FULL’ light may be a 240V AC light bulb that cannot be driven
directly from our digital output and additional driver electronics is needed.
The non-digital interface electronics is beyond the scope of this course.

System Partitioning
So far we have assumed that a car arriving (or leaving) does so in one state; this is unlikely!
However this is a convenient abstraction – we would like to keep this model.

Let’s have a ‘barrier’ machine which handles the details of sensing cars and raising and
lowering the barrier.

Copies of this machine could be made for each barrier.

This could interact with the original FSM so that the ‘count’ state requested ‘in’ or ‘out’ (as
appropriate) and ‘done’ acknowledged that the count has been registered.

[Figure: state diagram of the barrier machine. States: waiting, raise, count, clear, lower.
Transition labels: car, up, done, down.]


There is a functional bug in the complete system as described: can you spot it?


System Partitioning
[Figure: block diagram of the complete system: control and datapath, with signals reset, full,
in, out and done, and two barrier machines each with car, raise, up, lower and down.]

This machine does not need a datapath as all its state is in the figure above.
The five states function as follows:
❏ waiting remain idle until a car is sensed, then go to …
❏ raise raise barrier until sensor indicates it’s up, then go to …
❏ count signal main FSM that car is entering/leaving;
only when the main FSM acknowledges, go to …
❏ clear wait until the car is not sensed before going to …
❏ lower lower barrier until sensor indicates it’s down
and return to waiting state

This machine deals with the motors, actuators, sensors etc. and simplifies the
overall design by appropriate partitioning. The ability to reuse the same design
for both barriers is a bonus.

Exercises
Can you add a payment system to the state diagram?
Can you find the bug mentioned in the slide? (Not trivial!)
Can you suggest a solution?

Its I/O is:
❏ car sense presence of a vehicle
❏ raise move barrier up (during ‘raise’ state)
❏ up barrier is fully raised
❏ lower move barrier down (during ‘lower’ state)
❏ down barrier is fully lowered
❏ in/out ready to count
❏ done count has been done

Note: in this case ‘up’ and ‘down’ are separate because there is also a range of
intermediate positions.
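
The five-state barrier machine described above can be sketched as a Python function that is called once per clock edge. The function name, the dictionary of outputs and the output name ‘count’ (standing for the in/out request) are assumptions made for illustration; the states and transitions follow the list above. The sketch reproduces the machine as described, so it does not attempt to fix the bug mentioned on the slide.

```python
def barrier_step(state, car, up, down, done):
    """Return (next_state, outputs) for one clock edge of the barrier machine.

    Outputs depend only on the current state (a Moore machine).
    """
    if state == "waiting":
        next_state = "raise" if car else "waiting"      # idle until a car is sensed
    elif state == "raise":
        next_state = "count" if up else "raise"         # raise until fully up
    elif state == "count":
        next_state = "clear" if done else "count"       # wait for the main FSM
    elif state == "clear":
        next_state = "lower" if not car else "clear"    # wait for the car to go
    elif state == "lower":
        next_state = "waiting" if down else "lower"     # lower until fully down
    else:
        raise ValueError(f"unknown state: {state}")
    outputs = {
        "raise": state == "raise",   # drive the barrier up
        "lower": state == "lower",   # drive the barrier down
        "count": state == "count",   # request 'in' or 'out' from the main FSM
    }
    return next_state, outputs

# One car passing through: inputs are (car, up, down, done) on successive clock edges.
stimulus = [
    (1, 0, 1, 0),  # car arrives, barrier down
    (1, 1, 0, 0),  # barrier reaches the top
    (1, 1, 0, 1),  # main FSM acknowledges the count
    (0, 1, 0, 0),  # car has driven away
    (0, 0, 1, 0),  # barrier reaches the bottom
]
state = "waiting"
visited = [state]
for car, up, down, done in stimulus:
    state, _ = barrier_step(state, car, up, down, done)
    visited.append(state)
print(visited)
```

Because the function holds no state of its own, a second barrier is simply a second state variable stepped with its own sensor inputs.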

Handshaking
The different state machines need to communicate.
❏ Assume a synchronous model: their clock is the same
❏ Sometimes need to synchronize states.

[Figure: timing diagram showing clk, the Control state (ready, inc, tot max, ready),
the Barrier state (raise, count, clear) and the signals done, in and up.]

This is a possible timing scenario from the barrier being raised to the car being counted.
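
A rough Python sketch of this scenario is given below, assuming both machines share one clock and that each output is a function of the current state only. The reduced step functions are hypothetical: they model just the states involved in the exchange, and the control machine is assumed to find the car park not full.

```python
def control_step(state, in_req):
    """Reduced control FSM: ready -> inc -> cmp -> ready (car park assumed not full)."""
    if state == "ready":
        return "inc" if in_req else "ready"
    if state == "inc":
        return "cmp"
    return "ready"

def barrier_step(state, done):
    """Reduced barrier FSM: stay in 'count' until 'done' is seen, then 'clear'."""
    if state == "count":
        return "clear" if done else "count"
    return state  # later barrier states are omitted from this sketch

control, barrier = "ready", "count"
trace = []
for _ in range(4):
    in_req = barrier == "count"   # 'in' is asserted throughout the 'count' state
    done = control == "inc"       # 'done' is pulsed during the 'inc' state
    trace.append((control, barrier, int(in_req), int(done)))
    # Both machines sample the other's output on the same clock edge.
    control, barrier = control_step(control, in_req), barrier_step(barrier, done)

for row in trace:
    print(row)
```

In this model ‘in’ stays high for two cycles: one while the control machine is still ‘ready’ and one while ‘done’ is being returned.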


Handshaking

In computer terms a “handshake” is a protocol which ensures that a transmitted
signal has been ‘seen’ by the receiver.
In the slide the protocol runs as follows:
❏ the ‘in’ signal is transmitted when appropriate
❏ ‘in’ remains active until ‘done’ is returned
❍ here this happens on the next cycle, but the control machine
may not be ‘ready’, so this could be delayed
❏ ‘in’ is maintained until ‘done’ has been seen
❏ (in this model) ‘done’ can be pulsed because it is known that ‘in’
will be removed as soon as ‘done’ is seen to be active
❍ another protocol could have required ‘done’ to remain until
‘in’ was inactivated
❏ both machines can proceed independently again

Handshaking
Handshaking is a means of communicating between subsystems which
may be asynchronous, i.e. not always sharing the same clock.
The principle of handshaking is that one partner initiates a communication
(in this case asserting req) and then waits for some sort of an acknowledgement.
In this example req remains asserted until acknowledged by ack.

[Figure: req and ack waveforms.]

Physical Interfaces

In this example it has been assumed that the external inputs (such as ‘car’, ‘up’)
are synchronised with the internal clock.
In practice the world is not so cooperative. It is therefore necessary to synchronize
these to the clock to ensure that they do not change at the wrong time, possibly
violating the flip-flops’ set-up or hold timing requirements.

Switch Bounce
A mechanical component such as a switch does not switch “cleanly” like a logic
gate. A switch has a tendency to bounce as contact is made/broken. Thus the
logic state of the output signal as a key is pressed and released will look something
like:

[Figure: waveform of a bouncing switch signal as a key is pressed and released.]

Clearly it is important to register only a single ‘clean’ signal here, so the signal
must be “debounced”.
Several methods of debouncing are possible; perhaps the easiest way is to sample
the signal periodically and wait until several sequential samples have the
same value. The time/number of samples depends on the mechanical properties
of the switch; ~1ms is typical but for a specific application you should consult
the switch manufacturer’s data.
For simple systems this sampling can be done in hardware, using flip-flops. In
systems with a large number of switches this becomes prohibitively expensive
and a processor is usually used with the debounce being done in software. The
‘classic’ example of this is a computer keyboard.
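
The sampling approach to debouncing can be sketched in software as follows. This is an illustrative sketch only: the function name, the choice of four matching samples (n_stable) and the sample sequence are assumptions, and a real design would take the sample period and count from the switch manufacturer’s data.

```python
def debounce(samples, n_stable=4, initial=0):
    """Return the debounced output after each periodic sample.

    The output changes only once n_stable consecutive samples agree.
    """
    output = initial
    candidate = initial   # value of the current run of equal samples
    run = 0               # length of that run
    result = []
    for sample in samples:
        if sample == candidate:
            run += 1
        else:
            candidate = sample
            run = 1
        if run >= n_stable:
            output = candidate
        result.append(output)
    return result

# A key press and release with bounce on both edges (made-up sample values).
raw = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
clean = debounce(raw)
print(clean)
```

The cleaned signal contains a single rising edge and a single falling edge, each delayed by the time needed to collect the matching samples.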

State Assignment
Now, examine the full specification again

       |     inputs     |            outputs             |
state  | reset in out Z | ce sel add full donein doneout | next state
-------+----------------+--------------------------------+-----------
ready  |   0    0  0  x |  0  x   x   0     0       0    | ready
ready  |   0    x  1  x |  0  x   x   0     0       0    | dec
ready  |   0    1  0  x |  0  x   x   0     0       0    | inc
inc    |   0    x  x  x |  1  0   1   0     1       0    | cmp
cmp    |   0    x  x  0 |  0  1   0   0     0       0    | ready
cmp    |   0    x  x  1 |  0  1   0   0     0       0    | full
full   |   0    x  0  x |  0  x   x   1     0       0    | full
full   |   0    x  1  x |  0  x   x   1     0       0    | dec
dec    |   0    x  x  x |  1  0   0   0     0       1    | ready
reset  |   x    x  x  0 |  1  0   0   0     1       1    | reset
reset  |   0    x  x  1 |  1  0   0   0     1       1    | ready
xxxx   |   1    x  x  x |  0  x   x   0     0       0    | reset

(xxxx is used as shorthand for ‘any state’)

❏ There are six states: need (at least) 3 state bits
❏ States are assigned ‘numbers’ (binary codes)
❍ e.g. 000 for ‘ready’
❍ choice is arbitrary, but …
❏ Choice may make output derivation easier
❏ Some legitimate codes may be unused
❏ Outputs may be derived from current state
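
One way to check a collapsed table like this is to expand the don’t cares mechanically. The Python sketch below encodes the state, input and next-state columns as read from the slide (the column order reset, in, out, Z is an interpretation of the slide layout) and confirms that every state and input combination selects exactly one next state. The names ROWS, matches and next_states are hypothetical.

```python
from itertools import product

# (state, reset, in, out, Z, next state); 'x' is a don't care, '*' means any state.
ROWS = [
    ("ready", "0", "0", "0", "x", "ready"),
    ("ready", "0", "x", "1", "x", "dec"),
    ("ready", "0", "1", "0", "x", "inc"),
    ("inc",   "0", "x", "x", "x", "cmp"),
    ("cmp",   "0", "x", "x", "0", "ready"),
    ("cmp",   "0", "x", "x", "1", "full"),
    ("full",  "0", "x", "0", "x", "full"),
    ("full",  "0", "x", "1", "x", "dec"),
    ("dec",   "0", "x", "x", "x", "ready"),
    ("reset", "x", "x", "x", "0", "reset"),
    ("reset", "0", "x", "x", "1", "ready"),
    ("*",     "1", "x", "x", "x", "reset"),
]
STATES = ["ready", "inc", "cmp", "full", "dec", "reset"]

def matches(row, state, inputs):
    """True if a table row applies to this state and (reset, in, out, Z) tuple."""
    row_state, *patterns, _next = row
    if row_state not in ("*", state):
        return False
    return all(p in ("x", bit) for p, bit in zip(patterns, inputs))

def next_states(state, inputs):
    """Set of next states selected by all matching rows."""
    return {row[-1] for row in ROWS if matches(row, state, inputs)}

# Collect any combination that is uncovered (0 matches) or ambiguous (2 or more).
problems = []
for state in STATES:
    for inputs in product("01", repeat=4):
        found = next_states(state, inputs)
        if len(found) != 1:
            problems.append((state, inputs, found))

print("problems found:", problems)
```

The same expansion could be extended to the output columns, where two rows may overlap with different ‘ce’ values, but that is left out of this sketch.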


State Assignment
Assigning binary codes (numbers) to states is an arbitrary process, but sensible
choice can make the logic simpler. Even if the gate-level logic is generated
automatically using CAD tools (more on this later) a sensible choice can make the
logic simpler and, probably, therefore smaller, faster and lower power.
The first and only essential rule is that there must be enough bits to code for all
the states. Remember ‘N’ bits can code for up to 2^N states: with this six state
machine 2 bits are insufficient, 3 bits will suffice. (More bits could be used;
sometimes this is justifiable for later simplifications but we will not investigate
that here.)

Unused states
In an example like this, three bits specify one of six states. Two states are
unused and should never occur.
It is normally good practice to ensure that, if these states are encountered,
they return to a defined state as soon as feasible.
More on this topic later.

Note that there is some freedom in deciding what some outputs are. If we don’t
care about the datapath output in some state it doesn’t matter what it evaluates
to. We only care if we are latching the output (‘ce’ is ‘true’) or the ‘Z’ flag needs
to be evaluated.
Can any of the state bits also be used as outputs? i.e. are any of the outputs ‘true’
for half of the states (or about half if not all the states are used)?
In this case, yes; there is enough freedom in outputs like ‘sel’ to choose patterns
which obey this for the ‘don’t care’ (x) states.
‘ce’ is another ‘obvious’ candidate although be careful reading the slide
because, for example, ‘ce’ can be ‘0’ in the ‘inc’ state if the ‘reset’ input is
asserted. However this can still help a bit; we’ll ignore ‘reset’ for a moment.
We can repeat this process of subdivision into halves if other outputs have the
same property and are orthogonal – i.e. they divide our previous ‘halves’
independently.

This needs an example. Consider the two outputs ‘ce’ and ‘add’:

state   ce  add  state coding
ready   0   x    000
inc     1   1    111
cmp     0   0    001
full    0   x    010
dec     1   0    100
reset   1   0    101

The state codings have been chosen so that their first two bits can be used as the
two chosen outputs (except for the reset case we were ignoring). Note that the
state codings are all unique.
Let’s call the state bits ‘state<2:0>’ (below, ‘.’ is AND and ‘¬’ is NOT):

ce      = state<2> . ¬reset
add     = state<1>

The other outputs are not difficult to derive, now:

sel     = ¬state<2>
full    = ¬state<2> . state<1> . ¬reset
donein  = state<2> . state<0> . ¬reset
doneout = state<2> . ¬state<1> . ¬reset

[We remembered the reset input again!]

The next state bits can also be derived in a similar fashion. However this is an
illustration – let’s not get too deep into hand-designed logic here.
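
As a cross-check, output equations of this kind can be evaluated for every state code and compared with the specification table. In the Python sketch below the placement of the complemented (NOT) terms is inferred from the state coding and the table, so it should be treated as an assumption rather than as the definitive equations; the names CODING, SPEC and outputs are hypothetical.

```python
# State codings from the notes: state<2:0>.
CODING = {"ready": 0b000, "inc": 0b111, "cmp": 0b001,
          "full": 0b010, "dec": 0b100, "reset": 0b101}

# Specified outputs with reset = 0: (ce, sel, add, full, donein, doneout).
# None marks a don't care.
SPEC = {
    "ready": (0, None, None, 0, 0, 0),
    "inc":   (1, 0, 1, 0, 1, 0),
    "cmp":   (0, 1, 0, 0, 0, 0),
    "full":  (0, None, None, 1, 0, 0),
    "dec":   (1, 0, 0, 0, 0, 1),
    "reset": (1, 0, 0, 0, 1, 1),
}

def outputs(code, reset):
    """Evaluate the output equations for one state code and reset input."""
    s2, s1, s0 = (code >> 2) & 1, (code >> 1) & 1, code & 1
    not_reset = 1 - reset
    ce = s2 & not_reset
    sel = 1 - s2                       # assumed: NOT state<2>
    add = s1
    full = (1 - s2) & s1 & not_reset   # assumed: NOT state<2> . state<1>
    donein = s2 & s0 & not_reset
    doneout = s2 & (1 - s1) & not_reset  # assumed: state<2> . NOT state<1>
    return (ce, sel, add, full, donein, doneout)

# Compare against the table, skipping don't cares.
mismatches = []
for name, code in CODING.items():
    got = outputs(code, reset=0)
    for want, value in zip(SPEC[name], got):
        if want is not None and want != value:
            mismatches.append((name, SPEC[name], got))

print("mismatches:", mismatches)
```

With the reset input asserted, ce, full, donein and doneout all evaluate to 0 in every state, which matches the ‘any state’ line of the table.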

The Final Implementation

(e.g. 10-bit datapath)


❏ 6 x 5 x 5 = 150 states
❏ 7 inputs, 5 outputs
❏ 19 flip-flops
❏ ~150 gates


The RTL Design Process

RTL Design Summary
❏ Understand the problem!
❏ Produce a specification; maybe sketch a state diagram;
determine the interfaces.
❏ If necessary and appropriate, partition the design to
simplify the problem.
❏ Identify the flows of data; sketch an RTL datapath;
list the control signals.
❏ If it helps, draw timing diagrams.
❏ Formalise the state diagram;
verify that the specification is fulfilled.
❏ Design the FSM that drives the control signals
(process described earlier).
❏ Test!

Testing

How do we go about testing the design fully?
Especially tricky (and a common source of design errors): how do we check that
the system will reset from every possible state after power up?
Can we test the behaviour with every possible combination of values of the total
register, the state register and the external inputs? Is this necessary?

Clocking
In this example the clock has been neglected. For such a system the clock
speed is of little relevance, as long as it is sufficiently fast that the users
notice no lag (“latency”). Thus even a clock speed of, say, 1kHz would be
far faster than would be required; the system will spend almost all its time
idling in the ‘ready’ state.
In a ‘real’ system (such as a microprocessor) it is usually desirable to be
able to clock the system as fast as possible. The maximum clock speed is
determined by the propagation delay of the circuit. Of particular significance
is the critical path – the longest (in time) propagation delay from
one clock to the next. It is sometimes hard to identify the critical path, but
CAD tools exist to help. A possibility in this design would be the time from
the state register, through the multiplexer select, arithmetic unit, zero detector
and back to the state register via the ‘Z’ bit and the FSM logic.
Microprocessor designers spend considerable effort identifying and shortening
such paths in order to make faster machines. Fortunately this is not
part of this course.
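
The relationship between the critical path and the maximum clock speed can be shown with a small calculation. Every delay figure below is hypothetical and chosen only to make the arithmetic visible; the path elements follow the candidate path named in the clocking note, with the register’s own output delay and set-up time added as an assumption.

```python
# Hypothetical delays in nanoseconds along the candidate critical path.
PATH_DELAYS_NS = {
    "state register clock-to-output": 1.0,
    "multiplexer select": 1.5,
    "arithmetic unit": 6.0,
    "zero detector": 2.0,
    "FSM next-state logic": 2.5,
    "state register set-up": 0.5,
}

# The clock period must be at least the total delay around the path.
critical_path_ns = sum(PATH_DELAYS_NS.values())

# Frequency in MHz is 1000 divided by the period in ns.
max_clock_mhz = 1000.0 / critical_path_ns

print(f"critical path: {critical_path_ns:.1f} ns")
print(f"maximum clock: {max_clock_mhz:.1f} MHz")
```

Shortening the largest term (here the arithmetic unit) gives the greatest gain, which is why designers concentrate on the critical path.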

RTL Conclusions
We have seen how digital systems can be designed by splitting the functionality up into appropriate
areas and using different approaches for the different areas:
❏ Controller
❍ FSMs give a good representation of the flow of control in, and the behaviour of, a system
❍ state assignment can be arbitrary, but intelligent choice can simplify logic design
❍ the FSM can be combined with the state assignment in a state transition table
❍ logic equations can be read directly from the state transition table
❏ Datapath
❍ RTL modelling represents the movement of data around a system where the data has a
limited influence on the flow of control
❏ System
❍ the two approaches blend conveniently by connecting the RTL register and functional
control lines as FSM outputs, and using the results of data operations (e.g. compares)
as FSM inputs
This will be applied later in COMP10211 to the design of a general purpose processor for a
computer.


RTL Conclusions

Although it would be very unusual to design a real car park controller in this
way (it is cheaper and more flexible to use a microcontroller) this design has all
the elements of a real processing machine such as a computer processor.
Although a processor is a more complex problem – as will be seen shortly – the
design flow is the same.

Exercise
Estimate the number of logic gates in the car park controller as a function of the
width of the datapath part of the design.
❏ which part of the design (RTL or FSM) dominates the gate count?
❏ which part of the design (RTL or FSM) dominates the design effort?

Computer Processor Architecture

This example design contains all the elements used when producing a computer
processor. A processor will be more complicated than this but, as we shall see,
need not be much more complex to be able to run real programmes.

A processor will include:
❏ A datapath, with
❍ the user’s registers
❍ some processing logic – e.g. an adder/subtractor or ALU
(Arithmetic/Logic Unit)
❍ interfaces to memory for loading and storing
❏ A controller, which
❍ causes the datapath to fetch instructions
❍ receives and decodes these instructions
❍ causes the datapath to obey the instructions
❏ Some input and output, such as
❍ reset – to get things started in a defined place
❍ interrupts – which are not discussed in this course

Differences
❏ One (major) difference is the addition of a memory for storing more
values outside the Central Processing Unit (CPU).
❏ One (minor) difference here is that most of the Input and Output (I/O)
is dealt with as part of the memory; this allows the programmer access
in software.

More later …
