Undergrad Instruction Set

COE5320 Computer Architecture
Instruction Set Architecture [1]
Angel E. Gonzalez-Lizardo, Ph.D.
Polytechnic University of Puerto Rico
September 18, 2014
Ph.D. 1
Instructions: Language of the Computer
Introduction
The words of a computer's language are called instructions.
Its vocabulary is called the instruction set.
Computer designers have a common goal:
To find a language that makes it easy to build the hardware
and the compiler while maximizing performance and
minimizing cost.
Simplicity of the equipment is a valuable consideration.
The secret of computing: The stored program
Both instructions and data are stored as numbers in the computer.

Arithmetic Instructions
Every computer must be able to perform arithmetic.
The MIPS instruction
add a, b, c   # a = b + c
adds b and c and places the result in a.
Each MIPS arithmetic instruction performs only ONE operation.

The statement
a = b + c + d + e
translates into
add a, b, c   # The sum of b and c is placed in a.
add a, a, d   # The sum of b, c, and d is now in a.
add a, a, e   # The sum of b, c, d, and e is now in a.
Three additions take three instructions.

First Design Principle
Simplicity Favors Regularity
The simpler the instructions, the simpler the hardware to execute
them.
Each line contains only one instruction.
MIPS Assembly
The text after the # is a comment.
Comments always terminate at the end of the line.
Each arithmetic instruction has exactly three operands, no more and no fewer.

Example: Compiling two C assignments into MIPS.
This segment of C language program contains five variables.
a = b + c;
d = a - e;
Translating these instructions into MIPS assembly language is performed by a compiler
and yields:
add a , b , c
sub d , a , e
Two simple C statements compile into two assembly language instructions.

A more complex statement
f = ( g + h ) - ( i + j );
The compiler must break this statement into several instructions, for example
add t0, g, h    # temporary variable t0 becomes g + h
add t1, i, j    # temporary variable t1 becomes i + j
sub f, t0, t1   # f gets t0 - t1, the final result

The table below shows the portions of MIPS assembly language
described so far.
Table: MIPS Assembly Language
Category    Instruction  Example      Meaning    Comments
Arithmetic  add          add a, b, c  a = b + c  Always three operands
Arithmetic  subtract     sub a, b, c  a = b - c  Always three operands
Operands in a Computer
Registers
Operands of arithmetic instructions are always registers.
Registers in the MIPS-32 architecture are 32 bits wide.
A group of 32 bits is called a word.
The MIPS-32 architecture has only 32 registers.
The reason for that is the Second Design Principle:
Smaller is faster
Example: Same as before but using registers
f = (g + h) - (i + j);
Assuming registers $s0, $s1, $s2, $s3, and $s4 are assigned to f, g, h, i, and j,
respectively,
add $t0, $s1, $s2   # register $t0 becomes g + h
add $t1, $s3, $s4   # register $t1 becomes i + j
sub $s0, $t0, $t1   # $s0 gets the final result
Registers named $sx hold variables, and registers named $tx
hold temporary values.
The processor can keep only a limited number of data elements in registers.
Data structures and arrays must be kept in memory.
Data transfer instructions move data from memory to the processor
and vice versa.
To access a word in memory, the instruction must provide a
memory address.
Memory is just a large one-dimensional array, with the address
acting as the index.
Data Transfer Instructions

The main instruction to move
data from memory to the
processor is load word (lw).
The main instruction to move
data from the processor to
memory is store word (sw).
Figure: Memory addresses and contents.
Example: Memory operation
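The statement
g = h + A[8]
will be compiled into
lw $t0, 32($s3)     # Temporary register $t0 gets A[8]
add $s1, $s2, $t0   # g = h + A[8]
assuming g and h are in $s1 and $s2 and $s3 holds the base address of A.
The constant (32) in the data transfer instruction is called the offset, while the
register ($s3) is called the base register. The offset is 32 because A[8] lies
8 words, or 8 × 4 = 32 bytes, past the start of A.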
Effective Address = Offset + Base Register
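The address arithmetic above can be sketched in C. This is only an illustration: the names effective_address and word_offset are hypothetical, and any base address passed in is an arbitrary assumed value, not one taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of a lw/sw: contents of the base register plus the
   sign-extended 16-bit offset carried in the instruction. */
uint32_t effective_address(uint32_t base, int16_t offset)
{
    return base + (uint32_t)(int32_t)offset;
}

/* Byte offset of element `index` in an array of 4-byte words. */
int16_t word_offset(int index)
{
    assert(index >= -8192 && index <= 8191); /* result must fit in 16 bits */
    return (int16_t)(index * 4);
}
```

For A[8], word_offset(8) yields 32, the constant that appears in lw $t0, 32($s3).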
Compiler
The compiler:
Associates registers with variables.
Allocates memory locations for arrays and structures.
Places the right data address into the data transfer instructions.
In MIPS, a word address must be a multiple of 4.
This is called the alignment restriction.
Immediate Instructions
A constant in the instruction is called an immediate operand.
More than half of MIPS arithmetic instructions use immediate
operands.
Since immediate operands are so frequent, immediate versions of
instructions are included.
The instruction
addi $s3, $s3, 4   # $s3 = $s3 + 4
illustrates an arithmetic immediate instruction.
Third Design Principle

Make the common case fast
Immediate instructions illustrate the third design principle.
Constant operands occur frequently.
Immediate operands are much faster than constants loaded from memory.
Tables in Figure 2 show a summary of the MIPS instruction set so far.
Figure: MIPS Architecture revealed so far
Instructions Formats
R-type Instructions
How do machines represent numbers? In binary format.
For example, \(123_{10} = 1111011_2\).
MIPS Fields.
The instruction is divided into fields
Each field specifies part of the information needed for
execution.
The R-type (for Register) instruction format is:
op      rs      rt      rd      shamt   function
6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
op. Basic operation of the instruction, called opcode.
rs. The first register source operand.
rt. The second register source operand.
rd. The destination register where the result is stored.
shamt. Shift amount, used by the shift instructions to specify how many
bit positions to shift.
function. Function, used for selecting a variant of the operation
specified by the op field.
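As a rough sketch of how the six fields fit into one 32-bit word, the following C function packs them in the order listed above. The name encode_r_type is hypothetical. The usage note assumes the standard MIPS register numbers ($t0 = 8, $s1 = 17, $s2 = 18) and funct value 32 for add, which are not stated on this slide.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the R-type fields into a 32-bit word:
   op[31:26] rs[25:21] rt[20:16] rd[15:11] shamt[10:6] funct[5:0] */
uint32_t encode_r_type(uint32_t op, uint32_t rs, uint32_t rt,
                       uint32_t rd, uint32_t shamt, uint32_t funct)
{
    assert(op < 64 && funct < 64);                       /* 6-bit fields */
    assert(rs < 32 && rt < 32 && rd < 32 && shamt < 32); /* 5-bit fields */
    return (op << 26) | (rs << 21) | (rt << 16) |
           (rd << 11) | (shamt << 6) | funct;
}
```

Under those assumptions, add $t0, $s1, $s2 is encode_r_type(0, 17, 18, 8, 0, 32), which packs to 0x02324020.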
Fourth design principle

Good design demands good compromises
Fixed-length instructions are easier to decode.
Keeping every instruction the same length means that different kinds of
instructions need different formats.
I-type Instructions
The I-type (for immediate) instruction format, used for data transfer, is:
op      rs      rt      immediate
6 bits  5 bits  5 bits  16 bits
The 16-bit address means a load word instruction can load any
word within a region of \(\pm 2^{15}\) or 32,768 bytes (\(\pm 2^{13}\) or 8,192 words)
of the address in the base register.
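A minimal C sketch of packing the I-type fields follows. The name encode_i_type is hypothetical, and the usage note assumes opcode 35 for lw and the standard register numbers $t0 = 8 and $s3 = 19, none of which appear on this slide.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the I-type fields into a 32-bit word:
   op[31:26] rs[25:21] rt[20:16] immediate[15:0] */
uint32_t encode_i_type(uint32_t op, uint32_t rs, uint32_t rt, int16_t imm)
{
    assert(op < 64);            /* 6-bit field  */
    assert(rs < 32 && rt < 32); /* 5-bit fields */
    /* Keep only the low 16 bits of the (possibly negative) immediate. */
    return (op << 26) | (rs << 21) | (rt << 16) | (uint32_t)(uint16_t)imm;
}
```

Under those assumptions, lw $t0, 32($s3) is encode_i_type(35, 19, 8, 32), which packs to 0x8E680020.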
Instruction Types so far
Multiple formats complicate the hardware.
To reduce complexity, the formats are kept similar.
The opcode identifies the format, telling the hardware which
fields to look at.
R-type arithmetic instructions have opcode 0, while load and store
have distinct opcodes.
Assembly to Machine Language
Let's translate the statement
A[300] = h + A[300]
which is compiled into
lw $t0, 1200($t1)   # Temporary reg $t0 gets A[300]
add $t0, $s2, $t0   # Temporary reg $t0 gets h + A[300]
sw $t0, 1200($t1)   # Stores h + A[300] back into A[300]
into machine language, assuming $t1 holds the base address of A and $s2 holds h.
Figure: Instruction fields in decimal representation
In MIPS, registers $s0 to $s7 map onto registers 16 to 23.
Registers $t0 to $t7 map onto registers 8 to 15.
Thus, $s2 is register 18, $t0 is register 8, and $t1 is register 9.
The binary representation of the instructions is in Figure 4.
Figure: Instruction fields in binary representation
Two Key Principles

1. Instructions are represented as numbers.
2. Programs are stored in memory, just like numbers.
One consequence of the stored program is what we call
"intelligence in a can".
Another is binary compatibility, the inheritance of ready-made
software.
Figure: Programs in memory
Summary
Other Instructions
Logic Instructions
Bitwise operations are useful in a computer.
Logical operations in Java or C translate directly to MIPS assembly.
Table 2 shows a summary of MIPS logical operations.
Table: C or Java to MIPS
Logical Operations  C         Java      MIPS
Shift Left          <<        <<        sll
Shift Right         >>        >>>       srl
Bit-by-Bit AND      &         &         and, andi
Bit-by-Bit OR       |         |         or, ori
Bit-by-Bit NOR      ~(x | y)  ~(x | y)  nor
The first two operations are called shifts.
They move all the bits in the word to the left or right,
filling the emptied bits with zeros.
For example, if the register $s0 contains
0000 0000 0000 0000 0000 0000 0000 1001 = 9,
executing the instruction sll $t2, $s0, 4 sets $t2 to:
0000 0000 0000 0000 0000 0000 1001 0000 = 144
The machine version of the instruction is:
op   rs   rt   rd   shamt   funct
0    0    16   10   4       0
Shifting left by i bits gives the same result as multiplying by \(2^i\).
In the previous pattern, \(9 \times 2^4 = 144\).
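The equivalence between shifting and multiplying by a power of two can be checked with a short C sketch. The function names sll and srl below simply mirror the MIPS mnemonics and operate on unsigned 32-bit values; they are illustrations, not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Shift left logical: emptied low-order bits are filled with zeros. */
uint32_t sll(uint32_t value, unsigned shamt)
{
    assert(shamt < 32); /* shamt is a 5-bit field */
    return value << shamt;
}

/* Shift right logical: emptied high-order bits are filled with zeros. */
uint32_t srl(uint32_t value, unsigned shamt)
{
    assert(shamt < 32);
    return value >> shamt;
}
```

Here sll(9, 4) gives 144, the same as 9 * 16, and srl(144, 4) recovers 9.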
Masking is a useful way of isolating fields.
For example, executing and $t0, $t1, $t2 with
$t2 = 0000 0000 0000 0000 0000 1101 0000 0000
$t1 = 0000 0000 0000 0000 0011 1100 0000 0000
gives
$t0 = 0000 0000 0000 0000 0000 1100 0000 0000
If the instruction executed was or $t0, $t1, $t2, the result would be:
$t0 = 0000 0000 0000 0000 0011 1101 0000 0000
Further, if we execute nor $t0, $t1, $t2, the result is
$t0 = 1111 1111 1111 1111 1100 0010 1111 1111
MIPS also provides
andi: AND immediate
ori: OR immediate
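The bit patterns in the AND, OR, and NOR examples can be reproduced in C, where $t1 = 0x00003C00 and $t2 = 0x00000D00. The function names below are hypothetical, and extract_field is an added illustration of masking to isolate a field.

```c
#include <assert.h>
#include <stdint.h>

uint32_t mips_and(uint32_t a, uint32_t b) { return a & b; }
uint32_t mips_or(uint32_t a, uint32_t b)  { return a | b; }
uint32_t mips_nor(uint32_t a, uint32_t b) { return ~(a | b); }

/* Isolate `width` bits of `word` starting at bit `low_bit`
   by shifting the field down and masking off everything else. */
uint32_t extract_field(uint32_t word, unsigned low_bit, unsigned width)
{
    assert(width >= 1 && width < 32 && low_bit + width <= 32);
    return (word >> low_bit) & ((1u << width) - 1u);
}
```

For instance, mips_nor(0x00003C00, 0x00000D00) gives 0xFFFFC2FF, matching the pattern shown for nor.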
Figure: MIPS ISA so far

Decision-Making Instructions
What distinguishes a computer from a calculator?
Its ability to make decisions.
MIPS assembly includes two decision-making instructions.
The instructions are branch if equal (beq) and branch if not
equal (bne):
beq reg1, reg2, L1   # go to label L1 if reg1 == reg2
bne reg1, reg2, L1   # go to label L1 if reg1 != reg2
Example. Compiling an if statement into a conditional branch. In the
following code segment, f, g, h, i, and j are variables.
if (i == j) f = g + h; else f = f - i;
Assuming that the five variables f through j correspond to the five registers $s0 to $s4,
what is the compiled MIPS code?
Answer
beq $s3, $s4, Else        # go to Else if i equals j
sub $s0, $s0, $s3         # f = f - i (reached only if i != j)
j Exit                    # go to Exit
Else: add $s0, $s1, $s2   # f = g + h
Exit:
Despite its name, the label Else here marks the code executed when i equals j.
The labels are assigned memory addresses pointing to the appropriate instructions
during the compilation process.
Compilers frequently create branches where they do
not appear in the original code.
Figure: Illustration of the options in the previous example
Avoiding the writing of explicit labels and branches is one of
the benefits of high-level language programming.
Basic Programming
Loops
Decision-making instructions are also used for loops.
Compiling a loop with a variable array index. Here is a C loop:
Loop: g = g + A[i];
i = i + j;
if (i != h) goto Loop;
Assume A is an array of 100 elements and that g, h, i, and j are associated with registers $s1 to $s4
by the compiler, and $s5 holds the base address of A.
Loop: add $t1, $s3, $s3   # Temp reg $t1 = 2 * i
add $t1, $t1, $t1         # Temp reg $t1 = 4 * i
add $t1, $t1, $s5         # $t1 = address of A[i]
lw $t0, 0($t1)            # Temp reg $t0 = A[i]
add $s1, $s1, $t0         # g = g + A[i]
add $s3, $s3, $s4         # i = i + j
bne $s3, $s2, Loop        # go to Loop if i is not equal to h
Basic Blocks
Sequences of instructions that end in a branch are so
fundamental to compiling that they have their own name:
basic block.
A basic block is a sequence of instructions with
No embedded branches (except at the end)
No branch targets (except at the beginning)
A compiler identifies basic blocks for optimization.
An advanced processor can accelerate execution of basic blocks.
Example: Compiling a while loop
Compile the C code:
while (save[i] == k)
    i = i + j;
Assuming i, j, and k are associated with registers $s3, $s4, and $s5 by the compiler, and $s6 holds
the base address of save. Answer:
Loop: add $t1, $s3, $s3   # Temp reg $t1 = 2 * i
add $t1, $t1, $t1         # Temp reg $t1 = 4 * i
add $t1, $t1, $s6         # $t1 = address of save[i]
lw $t0, 0($t1)            # Temp reg $t0 = save[i]
bne $t0, $s5, Exit        # go to Exit if save[i] does not equal k
add $s3, $s3, $s4         # i = i + j
j Loop                    # go to Loop
Exit:
Set if Less Than

Testing for equality or inequality is the most popular way of stopping a loop,
but sometimes it is useful to find out whether one variable is greater than another.
The MIPS slt (set if less than) instruction compares two registers and
sets a third to 1 if the first is less than the second.
For example, slt $t0, $s3, $s4 means that $t0 is set to 1 if $s3 < $s4.
Otherwise, $t0 is set to 0.
The compiler uses slt, bne, and beq and the register $zero to create all
relative conditions: =, ≠, <, ≤, >, and ≥.
$zero maps to register number 0.
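One way to see how slt combined with beq or bne covers the relative conditions is the following C sketch. The function names are hypothetical, and each comment shows an instruction pair a compiler could emit; this is an illustration, not a description of any particular compiler.

```c
#include <stdint.h>

/* slt rd, rs, rt: rd = 1 if rs < rt (signed comparison), else 0. */
int slt(int32_t rs, int32_t rt) { return rs < rt ? 1 : 0; }

/* Each function returns 1 when the corresponding branch would be taken. */

/* a < b:   slt $t0, a, b ; bne $t0, $zero, Label */
int less_than(int32_t a, int32_t b)        { return slt(a, b) != 0; }

/* a > b:   slt $t0, b, a ; bne $t0, $zero, Label */
int greater_than(int32_t a, int32_t b)     { return slt(b, a) != 0; }

/* a <= b:  slt $t0, b, a ; beq $t0, $zero, Label */
int less_or_equal(int32_t a, int32_t b)    { return slt(b, a) == 0; }

/* a >= b:  slt $t0, a, b ; beq $t0, $zero, Label */
int greater_or_equal(int32_t a, int32_t b) { return slt(a, b) == 0; }
```

Swapping the operands of slt and choosing between beq and bne is enough; equality and inequality use beq and bne directly.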
What is the code to test whether variable a, associated with $s0, is less than
variable b in $s1, and then branch to the label Less if the condition
holds?
Answer:
slt $t0, $s0, $s1      # $t0 = 1 if $s0 < $s1
bne $t0, $zero, Less   # go to Less if $t0 is not zero
The label Less marks the code to run when a < b and is defined elsewhere in the program.
Simple Fast Instructions
MIPS has no branch-on-less-than instruction because it would be too complicated:
it would either stretch the clock cycle time or take extra clock cycles.
Two fast instructions perform better than a powerful slow one.
Case/Switch Statement
Most programming languages have a case or switch statement.
One way of implementing a switch is through a sequence of
if-then-else.
Another way is using a table of addresses called a jump address
table.
To support this situation MIPS provides the instruction jump
register (jr), an unconditional jump to the address specified in a
register.
Example: Compiling a switch Statement using a Jump Address
Table.
Consider the code:
switch (k) {
    case 0: f = i + j; break; /* k = 0 */
    case 1: f = g + h; break; /* k = 1 */
    case 2: f = g - h; break; /* k = 2 */
    case 3: f = i - j; break; /* k = 3 */
}
Assume the six variables f through k are contained in registers $s0 to $s5 and that register $t2
contains 4.
Answer
The switch variable k is used to index a jump address table, and the program then jumps via the loaded value.
First we make sure k is in the valid range.
slt $t3, $s5, $zero    # $t3 = 1 if k < 0
bne $t3, $zero, Exit   # if k < 0 go to Exit
slt $t3, $s5, $t2      # $t3 = 1 if k < 4
beq $t3, $zero, Exit   # if k >= 4 go to Exit
Then we multiply the index k by 4 so we can use it as a byte offset into the table.
add $t1, $s5, $s5      # $t1 = 2 * k
add $t1, $t1, $t1      # $t1 = 4 * k
Assume that four sequential words in memory, starting at the address in $t4, contain the
addresses corresponding to the labels L0, L1, L2, and L3.
add $t1, $t1, $t4      # $t1 = address of JumpTable[k]
lw $t0, 0($t1)         # $t0 = JumpTable[k]
The instruction jr jumps to the address specified in a register.
jr $t0                  # jump to the address loaded from the table
Finally, the cases:
L0: add $s0, $s3, $s4   # f = i + j
j Exit
L1: add $s0, $s1, $s2   # f = g + h
j Exit
L2: sub $s0, $s1, $s2   # f = g - h
j Exit
L3: sub $s0, $s3, $s4   # f = i - j
Exit:
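The jump address table technique has a close C analogue: an array of function pointers indexed by k after a range check. The sketch below is an illustration under that assumption; the names case_fn, jump_table, and switch_via_table are hypothetical, and the functions stand in for the code at labels L0 to L3.

```c
#include <stddef.h>

typedef int (*case_fn)(int g, int h, int i, int j);

int case0(int g, int h, int i, int j) { (void)g; (void)h; return i + j; }
int case1(int g, int h, int i, int j) { (void)i; (void)j; return g + h; }
int case2(int g, int h, int i, int j) { (void)i; (void)j; return g - h; }
int case3(int g, int h, int i, int j) { (void)g; (void)h; return i - j; }

/* Plays the role of the four sequential words holding L0..L3. */
static const case_fn jump_table[4] = { case0, case1, case2, case3 };

/* Returns the new value of f; f is unchanged when k is out of range. */
int switch_via_table(int k, int f, int g, int h, int i, int j)
{
    if (k < 0 || k >= 4)              /* the two slt range tests */
        return f;                     /* go to Exit */
    return jump_table[k](g, h, i, j); /* lw from the table, then jr */
}
```

The indexed load replaces a chain of comparisons, so the cost of dispatch does not grow with the number of cases.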
Summary of MIPS Assembly
Figure: MIPS Instruction fields
Supporting Procedures
Procedures
Procedures serve two purposes:
1. Make the code easier to understand.
2. Make the code reusable.
A procedure is a piece of a program that concentrates on one portion of the task.
Parameters
Allow for separation between the procedure and the rest of the
program and data.
Allow values to be passed in and results to be returned.
To execute a procedure, the program must follow six steps:
1. Place the parameters where the procedure can access them.
2. Transfer control to the procedure.
3. Acquire the storage resources needed by the procedure.
4. Perform the desired task.
5. Place the results where the main program can access them.
6. Return control to the point of origin.
Register Allocation
As mentioned, registers are the fastest place to hold data.
They must be used as much as possible.
MIPS allocates the following registers for procedure calling:
$a0-$a3: four argument registers to pass parameters.
$v0-$v1: two value registers to return values.
$ra: one return address register to return to the point of origin.
MIPS includes an instruction just to call procedures: jal
ProcedureAddress
The jump-and-link instruction saves the return address in $ra and
jumps to the target address.
The execution of a procedure
The caller places the parameters in $a0 to $a3.
The caller uses jal X to jump to the procedure X.
The callee (the procedure) performs its calculations.
The callee returns control using jr $ra.
Stack
If a procedure needs more registers than the four argument registers
and two return value registers, a stack is used.
Any register needed by the caller must be restored to the value it held
before the procedure was invoked.
Saving registers to memory for this purpose is called spilling registers.
MIPS software allocates another register for the stack, called the
stack pointer, $sp.
MIPS stacks grow from higher addresses to lower addresses.
This convention means you push values onto the stack by
subtracting from the stack pointer.
Adding to the stack pointer shrinks the stack, popping values off the
stack.
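The push and pop convention can be modeled with a small C sketch. Everything here is an assumed illustration: the Stack type, its size, and the function names are hypothetical, and the array index sp counts words rather than bytes, so a step of 1 stands for the 4 bytes added to or subtracted from $sp.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

typedef struct {
    int32_t  mem[STACK_WORDS]; /* word-indexed stand-in for stack memory */
    unsigned sp;               /* index of the word on top of the stack  */
} Stack;

/* An empty stack starts at the highest address. */
void stack_init(Stack *s) { s->sp = STACK_WORDS; }

void push(Stack *s, int32_t value)
{
    assert(s->sp > 0);     /* room left below the current top */
    s->sp -= 1;            /* addi $sp, $sp, -4 : stack grows downward */
    s->mem[s->sp] = value; /* sw value, 0($sp) */
}

int32_t pop(Stack *s)
{
    assert(s->sp < STACK_WORDS);   /* stack is not empty */
    int32_t value = s->mem[s->sp]; /* lw value, 0($sp) */
    s->sp += 1;                    /* addi $sp, $sp, 4 : stack shrinks */
    return value;
}
```

Values come back in the reverse order of the pushes, which is why the restore sequence of a procedure mirrors its save sequence.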
Leaf Procedure
Example: Compiling a Procedure that does not call another procedure.
Consider the code:
int leaf_example ( int g , int h , int i , int j )
{
int f ;
f = ( g + h ) - ( i - j );
return f ;
}
The compiled program has three parts, saving the registers for the caller,
performing the computations, and restoring the registers:
The parameters g, h, i, and j correspond to the argument registers $a0 to $a3,
and f corresponds to $s0.
leaf_example:
addi $sp, $sp, -12    # make room on the stack for 3 items
sw $t1, 8($sp)        # save register $t1 for use afterwards
sw $t0, 4($sp)        # save register $t0 for use afterwards
sw $s0, 0($sp)        # save register $s0 for use afterwards
add $t0, $a0, $a1     # $t0 contains g + h
sub $t1, $a2, $a3     # $t1 contains i - j
sub $s0, $t0, $t1     # f = (g + h) - (i - j)
add $v0, $s0, $zero   # returns f
lw $s0, 0($sp)        # restore register $s0 for the caller
lw $t0, 4($sp)        # restore register $t0 for the caller
lw $t1, 8($sp)        # restore register $t1 for the caller
addi $sp, $sp, 12     # adjust the stack pointer back
jr $ra                # jump back to the calling routine
Register Preservation Rules

To avoid saving and restoring registers that are never used, MIPS software separates
two classes of registers:
$t0-$t9: 10 temporary registers that are not preserved by the callee.
$s0-$s7: 8 saved registers that must be preserved. If the callee uses them,
it saves and restores them.
This simple convention reduces register spilling.
Nested Procedures
Procedures that do not call other procedures are called leaf
procedures.
Life would be simpler if all procedures were leaf procedures.
If a procedure A calls a procedure B, and both use $a0 to pass
parameters, the value A still needs in $a0 must be preserved across the call.
One solution is to push the registers that must be preserved onto the stack.
Let's consider the procedure that computes the factorial:
int fact(int n)
{
    if (n < 1) return (1);
    else return (n * fact(n - 1));
}
Assuming we can add or subtract constants, what is the MIPS code for
this procedure?
The parameter n corresponds to the argument register $a0. Hence, the value of $a0
must be pushed on the stack.
fact:
addi $sp, $sp, -8    # open two words on the stack
sw $ra, 4($sp)       # save the return address
sw $a0, 0($sp)       # save the argument
The next two instructions test whether n is less than 1.
slti $t0, $a0, 1     # test for n < 1
beq $t0, $zero, L1   # if n >= 1 go to L1
If n < 1, fact returns 1 by putting 1 into a value register.
addi $v0, $zero, 1      # returns 1
addi $sp, $sp, 8        # pops two items off the stack
jr $ra                  # return to the instruction after jal
If n is not less than 1, n is decremented and then fact is called again.
L1: addi $a0, $a0, -1   # decrement n
jal fact                # calls fact again
Then, when fact returns, the old return address and old argument are restored.
lw $a0, 0($sp)          # restore the argument
lw $ra, 4($sp)          # restore the return address
addi $sp, $sp, 8        # pops two items off the stack
Assuming a three-operand multiply instruction exists:
mul $v0, $a0, $v0       # n * fact(n - 1)
jr $ra                  # return to the caller
Registers $s0-$s7, the stack pointer $sp, and the return address $ra are preserved across a procedure call.
Registers $a0-$a3, $v0-$v1, and $t0-$t9 are not preserved.
The stack above the stack pointer is preserved.
The stack below the stack pointer is not preserved.
Variables local to a procedure are also stored on the stack.
This is done when variables do not fit in the registers.
The procedure frame or activation record is the stack segment
containing a procedure's saved registers and local variables.
Some MIPS software uses a frame pointer ($fp) to point to the first
word of a procedure's frame.
Hence, $fp points to the beginning of the procedure frame and
$sp points to its end.
The C language has two storage classes: automatic and static.
Automatic variables are local data discarded when the procedure
exits.
Static variables exist across exits from procedures.
To ease access to static data, MIPS reserves another register
called the global pointer, $gp.
The frame pointer points to the first word saved by the procedure.
Figure: Stack Allocation (a) before, (b) during and (c) after the procedure call.
Summary
Register Mapping
Figure: MIPS register convention. Register 1, called $at, is used by the assembler, and
registers 26 and 27, called $k0 and $k1, are reserved for the operating system.
Figure: Summary
Communicating with People
Communication
Computers were created to crunch numbers, but they are also used to process text.
Most computers today use the American Standard Code for
Information Interchange (ASCII) to represent characters.
ASCII characters are stored in 8-bit bytes.
MIPS provides special instructions to move bytes.
Load byte (lb) loads a byte from memory, placing it in the
rightmost 8 bits of a register.
Store byte (sb) takes a byte from the rightmost 8 bits of a
register and writes it to memory.

ASCII Code

Thus, to copy a byte:
lb $t0, 0($sp)   # read the byte from the source
sb $t0, 0($gp)   # write the byte to the destination
Three choices for representing a string:
1. The first position of the string is reserved to give the length of
the string.
2. An accompanying variable has the length of the string (as in a
structure).
3. The last position of the string is marked with a character that
indicates the end of the string.
C uses the null character to mark the end of strings.
Example: Compiling a string copy procedure (C style).
void strcpy ( char x [] , char y [])

{
int i ;
i =0;
while (( x [ i ] = y [ i ]) !=0) /* copy and test byte */
i = i + 1;
}
What is the MIPS assembly code?
Assuming the base addresses of the arrays x and y are in the registers $a0 and $a1, respectively,
and i is in $s0.
strcpy:
addi $sp, $sp, -4       # adjust the stack for 1 word
sw $s0, 0($sp)          # save $s0
add $s0, $zero, $zero   # initialize i = 0
L1: add $t1, $a1, $s0   # address of y[i] in $t1
lb $t2, 0($t1)          # $t2 = y[i]
add $t3, $a0, $s0       # address of x[i] in $t3
sb $t2, 0($t3)          # x[i] = y[i]
addi $s0, $s0, 1        # increment i
bne $t2, $zero, L1      # if the byte just copied != 0, go to L1
lw $s0, 0($sp)          # the byte was 0 (end of string): restore $s0
addi $sp, $sp, 4        # adjust the stack
jr $ra                  # return
Unicode
Unicode is a universal encoding; in its 16-bit form it uses 16 bits to represent
a character.
Java, for example, uses Unicode.
MIPS has a set of instructions to load and store halfwords, or
16-bit quantities.
These instructions are not covered at the moment, but will be revisited
later.
Constants and Immediate Operands
Why Immediates
A constant can be included in the instruction via the I-type
instructions.
52% of the arithmetic instructions in the gcc compiler use an
immediate operand
69% of the instructions of spice use an immediate operand.
Without immediates, the constant would have to be loaded from memory:
lw $t0, AddrConstant4($zero)   # $t0 = constant 4
add $sp, $sp, $t0              # $sp = $sp + $t0
With the immediate instruction we avoid accessing the memory
address AddrConstant4 to get the constant 4.
Example: Translating assembly constants into machine language
The add immediate instruction addi adds a constant to a register:
addi $sp, $sp, 4   # $sp = $sp + 4
The op field for addi is 8. Try to guess the rest of the fields in the
corresponding machine instruction.
We know that register $sp maps to register 29 of the register file, so fields rs and rt of
the instruction must be 29. The immediate field contains the constant in the instruction.
op      rs      rt      immediate
8       29      29      4
In binary format:
op      rs      rt      immediate
001000  11101   11101   0000 0000 0000 0100
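Going the other way, the fields can be recovered from the 32-bit word by shifting and masking. The following C sketch is an illustration only; the IType structure and decode_i_type name are hypothetical. The word 0x23BD0004 used in the note is the binary pattern above written in hexadecimal.

```c
#include <stdint.h>

typedef struct {
    uint32_t op;  /* bits 31..26 */
    uint32_t rs;  /* bits 25..21 */
    uint32_t rt;  /* bits 20..16 */
    int32_t  imm; /* bits 15..0, sign-extended */
} IType;

IType decode_i_type(uint32_t word)
{
    IType f;
    f.op  = (word >> 26) & 0x3Fu;
    f.rs  = (word >> 21) & 0x1Fu;
    f.rt  = (word >> 16) & 0x1Fu;
    f.imm = (int32_t)(word & 0xFFFFu);
    if (f.imm >= 0x8000)  /* bit 15 set: the immediate is negative */
        f.imm -= 0x10000;
    return f;
}
```

Decoding 0x23BD0004 yields op = 8, rs = 29, rt = 29, and immediate = 4, the fields of addi $sp, $sp, 4.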
Immediate operands are also popular for comparisons.
For this, MIPS has the instruction:
slti $t0, $t2, 10   # $t0 = 1 if $t2 < 10
Immediate instructions keep the constant inside the instruction itself, which
avoids setting aside separate memory locations for constants,
avoids spending memory accesses to fetch those constants, and
spares the compiler from having to place constants in memory and address them.
Make the common case fast
Constant operands are frequent in arithmetic operations.
Making the operand part of the instruction is much faster than
accessing memory to get it.
Immediate addressing is therefore implemented to make the common
case faster.
Target Address Computations
Branches and Jumps

The simplest addressing in MIPS is used by the jump instructions.
They use the third MIPS instruction format, the J-type instruction.
Consider the instruction j 10000, assembled into
2        10000
6 bits   26 bits
where the opcode of the jump is 2 and the jump address is 10000.

The conditional branch instruction needs two register operands in addition to the branch address.
For example:
bne $s0, $s1, Exit   # go to Exit if $s0 != $s1
This is assembled as
5        16       17       Exit
6 bits   5 bits   5 bits   16 bits
The new PC is obtained relative to the current PC (PC-relative):
\[ \mathrm{PC}_{\mathrm{new}} = (\mathrm{PC} + 4) + \text{Branch Immediate} \times 4 \tag{1} \]
or, in other words,
\[ \text{Branch Target Address} = (\mathrm{PC} + 4) + \text{Branch Immediate} \times 4 \tag{2} \]
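The PC-relative computation in equations (1) and (2) can be sketched in C as follows. The function name branch_target is hypothetical, and the addresses in the usage note are arbitrary example values rather than numbers from the slides.

```c
#include <stdint.h>

/* Branch Target Address = (PC + 4) + Branch Immediate * 4,
   where the 16-bit immediate is a signed count of words. */
uint32_t branch_target(uint32_t pc, int16_t branch_immediate)
{
    uint32_t word_offset = (uint32_t)(int32_t)branch_immediate; /* sign-extend */
    return (pc + 4u) + (word_offset << 2);                      /* times 4 */
}
```

For example, a branch located at address 80012 with an immediate of 2 targets 80012 + 4 + 8 = 80024, and a negative immediate branches backward.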
Since PC + 4 is the address of the next instruction, we can branch within
\(\pm 2^{15}\) words of the instruction that follows the branch.
This is called PC-relative addressing mode.
PC-relative addressing is used for all conditional branches because the
target address is likely to be close to the branch.
Jump and link (jal) calls a procedure that has no reason to be close to the
call, so it uses the long addressing mode provided by the J-type instructions.

Branching Far Away: Offsets of More Than 16 Bits
Nearly every conditional branch is to a nearby location.
A far-away branch is one whose offset requires more than 16 bits.
In such a case, the assembler inverts the test condition and inserts an
unconditional jump.
For example, the instruction
beq $s0, $s1, L1
is replaced by
bne $s0, $s1, L2
j L1
L2:
Jump Instruction
opcode   26-bit address
The 26-bit field is a word address, equivalent to a 28-bit byte address.
The MIPS jump instruction replaces the lower 28 bits of the PC:
\[ \mathrm{PC}_{\mathrm{new}} = \mathrm{PC}[31{:}28] \;\Vert\; \text{26-bit field} \;\Vert\; 00_2 \]
where \(\Vert\) denotes bit concatenation.
If the jump target lies outside the current 256 MB region, the jump instruction must
be replaced with a jr instruction, which allows for a full 32-bit address.
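The concatenation that forms the jump target can be sketched in C. The names jump_target and jump_reaches are hypothetical, and the sketch follows the slide in taking the upper four bits from the PC itself.

```c
#include <assert.h>
#include <stdint.h>

/* New PC = upper 4 bits of the PC, then the 26-bit field, then 00. */
uint32_t jump_target(uint32_t pc, uint32_t field26)
{
    assert(field26 < (1u << 26));                /* must fit in 26 bits */
    return (pc & 0xF0000000u) | (field26 << 2);
}

/* A plain jump can reach `target` only if it lies in the same
   256 MB region as the PC, that is, if the upper 4 bits match. */
int jump_reaches(uint32_t pc, uint32_t target)
{
    return (pc & 0xF0000000u) == (target & 0xF0000000u);
}
```

When jump_reaches reports 0, the target needs the full 32-bit address that jr provides.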
Addressing Modes
The five MIPS addressing modes
Register Mode: Where the operand is a register.
Base or displacement addressing: Where the operand is at the
memory location whose address is the sum of a register and a
constant in the instruction.
Immediate addressing: Where the operand is a constant within
the instruction.
PC-relative addressing: Where the address is the sum of the PC
and a constant in the instruction.
Pseudo-direct addressing: Where the jump address is the 26 bits
of the instruction concatenated with the upper bits of the PC.
Instruction Formats Summary
Figure: MIPS Instruction Formats
How Compilers Work

Steps to start a program
Compilers
Translate a C language program into an assembly language
program.
High-level language programs take fewer lines than assembly.
In the 1970s many operating systems were written in assembly
because of small memories and inefficient compilers.
As memory capacity increased and compilers improved, assembly
programming was no longer indispensable.
Assembler
The assembler deals with the pseudoinstructions.
Pseudoinstructions exist in the assembly language but do not
have a hardware implementation.
For example,
move $t0, $t1
is in fact executed as
add $t0, $zero, $t1
The assembler also accepts numbers in a variety of bases (hexadecimal,
binary, etc.) and converts them to binary.
The assembler turns the program into an object file.
The object file is a combination of
Machine language instructions.
Data.
Information needed to place the program in memory.
The assembler keeps track of the labels used by the program in a
symbol table containing symbol-address pairs.
An object file for UNIX systems typically contains six pieces:
The object file header describing the size and position of the other
pieces.
The text segment containing the machine language code.
The data segment containing any data that comes with the program.
The relocation information identifying instructions and data words that
depend on absolute addresses when the program is loaded into memory.
The symbol table containing the remaining labels that are not defined,
such as external references.
The debugging information with a description of how the modules were
compiled.
Linker: Putting It Together
Each procedure is compiled and assembled separately.
A one-line change requires recompiling and reassembling only one
procedure.
The linker stitches together all the independently compiled
procedures.
Three steps for linking:
1. Place code and data modules symbolically in memory.
2. Determine the addresses of data and instruction labels.
3. Patch both the internal and external references.
The linker will
Use the relocation information and the symbol table to resolve
all undefined labels (jumps, branches, and data addresses).
If all external references are resolved, determine the
memory location for each module.
When placing the modules in memory, relocate all absolute
references (memory addresses that are not relative to a register)
to their true locations.
The linker produces an executable file that can be run on a
computer.
Usually the executable has the same format as the object file but
without unresolved references.
Loader
To start a program, the UNIX loader follows these steps:
1. Reads the file header to determine the size of the text and data segments.
2. Creates an address space large enough for the text and data.
3. Copies the parameters (if any) to the main program onto the stack.
4. Initializes the machine registers and sets the stack pointer to the first free
location.
5. Jumps to a start-up routine that copies the parameters into the argument
registers and calls the main routine of the program.
6. When the main routine returns, the start-up routine terminates the
program with an exit system call.
Examples
The Swap Procedure

Let's derive the MIPS code from a procedure written in C:
the swap procedure.
void swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}
When translating from C to assembly language, we follow these
steps:
1. Allocate registers to program variables.
2. Produce code for the body of the procedure.
3. Preserve registers across the procedure invocation.
Examples
Register Allocation
$a0-$a3 are the registers used to pass parameters to procedures.
swap has only two parameters, v and k, and one additional variable,
temp.
So $a0 and $a1 are associated with v and k, while temp is
associated with $t0.
We can use the temporary register $t0 without saving it, since swap is a leaf procedure.
Examples
Produce code
First, multiply the index by 4 and add it to the base address of v:
swap: add $t1, $a1, $a1    # $t1 = k * 2
      add $t1, $t1, $t1    # $t1 = k * 4
      add $t1, $a0, $t1    # $t1 = v + (k * 4), the address of v[k]
Next, load v[k] and v[k+1]:
      lw  $t0, 0($t1)      # $t0 (temp) = v[k]
      lw  $t2, 4($t1)      # $t2 = v[k+1]
Then, store the swapped values and return:
      sw  $t2, 0($t1)      # v[k] = $t2
      sw  $t0, 4($t1)      # v[k+1] = $t0 (temp)
      jr  $ra              # return to the calling routine
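The address arithmetic in the swap code can be checked with a C sketch that mirrors the MIPS sequence one instruction at a time. The function name swap_like_mips and the byte-pointer casts are illustrative assumptions, and int32_t is used so that each element is exactly 4 bytes, as on MIPS32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the MIPS swap: forms the byte address of v[k] as v + 4*k using
   two doublings, then exchanges the words at byte offsets 0 and 4. */
void swap_like_mips(int32_t v[], int32_t k)
{
    assert(v != NULL && k >= 0);

    int32_t t1 = k + k;                             /* add $t1, $a1, $a1 : k * 2 */
    t1 = t1 + t1;                                   /* add $t1, $t1, $t1 : k * 4 */
    unsigned char *addr = (unsigned char *)v + t1;  /* add $t1, $a0, $t1 */

    int32_t t0, t2;
    memcpy(&t0, addr, 4);                           /* lw $t0, 0($t1) */
    memcpy(&t2, addr + 4, 4);                       /* lw $t2, 4($t1) */
    memcpy(addr, &t2, 4);                           /* sw $t2, 0($t1) */
    memcpy(addr + 4, &t0, 4);                       /* sw $t0, 4($t1) */
}
```

Calling swap_like_mips(v, 1) on the array {10, 20, 30, 40} exchanges the two middle elements, which is the same effect as the C swap procedure with k = 1.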
Examples
sort Procedure
void sort(int v[], int n)
{
    int i, j;
    for (i = 0; i < n; i = i + 1) {
        for (j = i - 1; j >= 0 && v[j] > v[j+1]; j = j - 1) {
            swap(v, j);
        }
    }
}
Assume that i is in $s0, j is in $s1, the base address of v is in $s2, and n is in $s3.
Examples
sort Procedure
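Saving registers:
sort:    addi $sp, $sp, -20      # make room on the stack for 5 registers
         sw   $ra, 16($sp)       # save $ra
         sw   $s3, 12($sp)       # save $s3
         sw   $s2, 8($sp)        # save $s2
         sw   $s1, 4($sp)        # save $s1
         sw   $s0, 0($sp)        # save $s0
Parameter saving:
         move $s2, $a0           # copy parameter v into $s2
         move $s3, $a1           # copy parameter n into $s3
Outer loop:
         move $s0, $zero         # i = 0
for1tst: slt  $t0, $s0, $s3      # $t0 = 1 if i < n
         beq  $t0, $zero, exit1  # leave the outer loop if i >= n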
Examples
sort Procedure
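Inner loop:
         addi $s1, $s0, -1       # j = i - 1
for2tst: slti $t0, $s1, 0        # $t0 = 1 if j < 0
         bne  $t0, $zero, exit2  # leave the inner loop if j < 0
         add  $t1, $s1, $s1      # $t1 = j * 2
         add  $t1, $t1, $t1      # $t1 = j * 4
         add  $t2, $s2, $t1      # $t2 = v + (j * 4), the address of v[j]
         lw   $t3, 0($t2)        # $t3 = v[j]
         lw   $t4, 4($t2)        # $t4 = v[j+1]
         slt  $t0, $t4, $t3      # $t0 = 1 if v[j+1] < v[j]
         beq  $t0, $zero, exit2  # leave the inner loop if v[j] <= v[j+1]
Pass parameters and call:
         move $a0, $s2           # first parameter of swap is v
         move $a1, $s1           # second parameter of swap is j
         jal  swap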
Examples
sort Procedure
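End of inner loop:
         addi $s1, $s1, -1       # j = j - 1
         j    for2tst            # repeat the inner loop test
End of outer loop:
exit2:   addi $s0, $s0, 1        # i = i + 1
         j    for1tst            # repeat the outer loop test
Restoring registers:
exit1:   lw   $s0, 0($sp)        # restore $s0
         lw   $s1, 4($sp)        # restore $s1
         lw   $s2, 8($sp)        # restore $s2
         lw   $s3, 12($sp)       # restore $s3
         lw   $ra, 16($sp)       # restore $ra
         addi $sp, $sp, 20       # release the stack space
         jr   $ra                # return to the calling routine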
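To check the assembly against its source, the C procedures can be run directly. The sketch below is the sort and swap code from the slides with explicit void return types added (an assumption about the intended signatures), and with comments naming the MIPS labels that correspond to each part of the loops.

```c
#include <assert.h>
#include <stddef.h>

/* Exchanges v[k] and v[k+1]. */
void swap(int v[], int k)
{
    int temp = v[k];
    v[k] = v[k + 1];
    v[k + 1] = temp;
}

/* Sorts v[0..n-1] into ascending order by repeatedly swapping
   out-of-order neighbors. */
void sort(int v[], int n)
{
    assert(v != NULL || n <= 0);
    int i, j;
    for (i = 0; i < n; i = i + 1) {                  /* test at for1tst, exit at exit1 */
        for (j = i - 1; j >= 0 && v[j] > v[j + 1];   /* test at for2tst, exit at exit2 */
             j = j - 1) {
            swap(v, j);                              /* jal swap */
        }
    }
}
```

Both inner-loop conditions (j >= 0 and v[j] > v[j+1]) branch to the same label, exit2, in the assembly, which is why the C condition can be written as a single && expression.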
Examples
Arrays vs. Pointers
Modern optimizing compilers can produce equally good code for the array and pointer versions of a procedure.
Consider the code:
void clear1(int array[], int size)
{
    int i;
    for (i = 0; i < size; i = i + 1)
        array[i] = 0;
}
void clear2(int *array, int size)
{
    int *p;
    for (p = &array[0]; p < &array[size]; p = p + 1)
        *p = 0;
}
Examples
Arrays vs. Pointers

clear1 uses indices, while clear2 uses pointers.
The second procedure deserves some explanation:
The address of a variable is denoted by &.
The object pointed to by a pointer is denoted by *.
The declarations int *array and int *p declare array and p as pointers to
integers.
Let us look at the assembly code.
Examples
Array version of clear

Assume $a0 and $a1 hold array and size, respectively. i is allocated in
register $t0.
       move $t0, $zero         # i = 0
loop1: add  $t1, $t0, $t0      # $t1 = i * 2
       add  $t1, $t1, $t1      # $t1 = i * 4
       add  $t2, $a0, $t1      # $t2 = address of array[i]
       sw   $zero, 0($t2)      # array[i] = 0
       addi $t0, $t0, 1        # i = i + 1
       slt  $t3, $t0, $a1      # $t3 = (i < size)
       bne  $t3, $zero, loop1  # if (i < size) go to loop1
Examples
Pointer version of clear
Assume $a0 and $a1 hold array and size, respectively. p is allocated in
register $t0.
       move $t0, $a0           # p = address of array[0]
loop2: sw   $zero, 0($t0)      # Memory[p] = 0
       addi $t0, $t0, 4        # p = p + 4
       add  $t1, $a1, $a1      # $t1 = size * 2
       add  $t1, $t1, $t1      # $t1 = size * 4
       add  $t2, $a0, $t1      # $t2 = address of array[size]
       slt  $t3, $t0, $t2      # $t3 = (p < &array[size])
       bne  $t3, $zero, loop2  # if (p < &array[size]) go to loop2
Both programs assume size > 0.
Examples
Improved Pointer version of clear

This version moves the address calculation out of the loop.
       move $t0, $a0           # p = address of array[0]
       add  $t1, $a1, $a1      # $t1 = size * 2
       add  $t1, $t1, $t1      # $t1 = size * 4
       add  $t2, $a0, $t1      # $t2 = address of array[size]
loop2: sw   $zero, 0($t0)      # Memory[p] = 0
       addi $t0, $t0, 4        # p = p + 4
       slt  $t3, $t0, $t2      # $t3 = (p < &array[size])
       bne  $t3, $zero, loop2  # if (p < &array[size]) go to loop2
Examples
Comparing
Array version:
       move $t0, $zero
loop1: add  $t1, $t0, $t0
       add  $t1, $t1, $t1
       add  $t2, $a0, $t1
       sw   $zero, 0($t2)
       addi $t0, $t0, 1
       slt  $t3, $t0, $a1
       bne  $t3, $zero, loop1
Pointer version:
       move $t0, $a0
       add  $t1, $a1, $a1
       add  $t1, $t1, $t1
       add  $t2, $a0, $t1
loop2: sw   $zero, 0($t0)
       addi $t0, $t0, 4
       slt  $t3, $t0, $t2
       bne  $t3, $zero, loop2
The array version has to multiply the index by 4 on every iteration.
The pointer version updates the pointer directly, which is more efficient.
Instructions per iteration: 7 for the array version and 4 for the pointer version.
An optimizing compiler will translate the array version into the pointer version.
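The per-iteration counts can be turned into total instruction counts with a small C sketch. The functions clear_array_count and clear_pointer_count are hypothetical helpers, not part of the original code: each clears the array the way the corresponding MIPS loop does (test at the bottom, so size > 0 is required) and adds up the instructions that loop would execute.

```c
#include <assert.h>
#include <stddef.h>

/* Array version: 1 instruction before the loop, 7 per iteration. */
long clear_array_count(int array[], int size)
{
    assert(array != NULL && size > 0);
    long executed = 1;            /* move $t0, $zero */
    int i = 0;
    do {
        array[i] = 0;             /* add, add, add (address of array[i]), then sw */
        i = i + 1;                /* addi */
        executed += 7;            /* the five above plus slt and bne */
    } while (i < size);
    return executed;
}

/* Improved pointer version: 4 instructions before the loop, 4 per iteration. */
long clear_pointer_count(int *array, int size)
{
    assert(array != NULL && size > 0);
    long executed = 4;            /* move plus the three adds for &array[size] */
    int *p = array;
    int *end = array + size;
    do {
        *p = 0;                   /* sw */
        p = p + 1;                /* addi (advances 4 bytes on MIPS) */
        executed += 4;            /* sw, addi, slt, bne */
    } while (p < end);
    return executed;
}
```

For size = 100 this gives 1 + 7 * 100 = 701 instructions for the array version and 4 + 4 * 100 = 404 for the pointer version, so the hoisted address calculation pays for itself after the first iteration.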
Real Stuff
IA-32 Instructions
Designers sometimes provide more powerful instructions than
those found in MIPS.
Their goal is to reduce the number of instructions in a program.
The danger is that the added hardware complexity makes instructions slower,
increasing the time a program takes to execute.
MIPS was the vision of a single small group in 1985.
This is not the case for the Intel IA-32, which was developed by several independent
groups that evolved the architecture over 20 years.
Real Stuff
IA-32 Milestones
1978: The Intel 8086 was announced as an architecture with dedicated-purpose registers.
1980: The Intel 8087 floating-point (FP) coprocessor was announced. It used a stack instead of
registers and extended the 8086 architecture by 60 instructions.
1982: The 80286 extended the 8086 architecture by increasing the address space
to 24 bits, creating an elaborate memory-mapping and protection model, and
adding a few instructions to handle the protection.
1985: The 80386 extended the 80286 architecture to 32 bits. It also added new instructions,
turning the 386 into a nearly general-purpose register machine. Paging support
was also added.
1989-95: The 80486, Pentium, and Pentium Pro aimed for higher performance,
adding only 4 new user-visible instructions.
Real Stuff
IA-32 Milestones
1997: MMX (Multi Media Extensions) expanded the Pentium and Pentium Pro
architectures with 57 new instructions that use the FP stack to accelerate multimedia and
communications applications.
1999: Intel added another 70 instructions, labeled SSE (Streaming SIMD
Extensions), as part of the Pentium III.
2001: Intel added another 144 instructions, including double-precision arithmetic. FP
registers can be used for FP operations instead of the stack.
AMD enhanced the IA-32 architecture by increasing the address space from 32 to 64
bits. The result, AMD64, provides a legacy mode identical to IA-32 and a compatibility mode
in which user programs are IA-32 and the operating system is AMD64.
Intel capitulated and embraced AMD64, enhancing it with a 128-bit compare-and-swap
instruction. It also added SSE3, which supports complex arithmetic. AMD will offer
SSE3 in subsequent chips.
Real Stuff
The 80386 Register Set
Figure: 80386 register set
The 80386 extended all 16-bit registers of the 80286, except the segment registers, to
32 bits.
The prefix E was added to each register name
to denote the 32-bit version.
The 80386 has only 8 GPRs, as
opposed to the 32 GPRs of MIPS.
Real Stuff
Operand types for arithmetic, logical, and data transfer instructions
Source/destination operand | Second source operand
Register | Register
Register | Immediate
Register | Memory
Memory | Register
Memory | Immediate
The IA-32 arithmetic and logical instructions use one operand as both a source and the destination.
Real Stuff
Addressing Modes
Mode | Description | Register restrictions | MIPS equivalent
Register indirect | Address is in a register | Not ESP or EBP | lw $s0, 0($s1)
Based mode with 8- or 32-bit displacement | Address is the contents of a base register plus a displacement | Not ESP or EBP | lw $s0, 100($s1)
Base plus scaled index | Address is Base + (2^scale × Index), where scale is 0, 1, 2, or 3 | Base: any GPR; Index: not ESP | mul $t0, $s2, 4; add $t0, $t0, $s1; lw $s0, 0($t0)
Base plus scaled index with 8- or 32-bit displacement | Address is Base + (2^scale × Index) + displacement, where scale is 0, 1, 2, or 3 | Base: any GPR; Index: not ESP | mul $t0, $s2, 4; add $t0, $t0, $s1; lw $s0, 100($t0)
Displacements within the instruction come in two sizes: 8 or 32 bits wide.
Memory operands can be used in any instruction.
There are restrictions on which registers can be used with each mode.
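All four addressing modes are special cases of one address calculation, which the following C sketch makes explicit. The function effective_address and its parameter names are illustrative assumptions, and the sketch ignores segmentation and the register restrictions listed above.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address = base + (index * 2^scale) + displacement.
   Register indirect:            index = 0, disp = 0
   Based with displacement:      index = 0
   Base plus scaled index:       disp = 0
   Arithmetic wraps modulo 2^32, as 32-bit address arithmetic does. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           unsigned scale, int32_t disp)
{
    assert(scale <= 3);           /* scale is 0, 1, 2, or 3 */
    return base + (index << scale) + (uint32_t)disp;
}
```

With scale = 2 the index is multiplied by 4, which matches the mul $t0, $s2, 4 in the MIPS equivalents: effective_address(0x1000, 5, 2, 100) is 0x1000 + 20 + 100 = 0x1078. MIPS needs three instructions for what IA-32 encodes in a single memory operand.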
Real Stuff
Classes Of Instructions
There are four major classes of instructions:
1. Data movement instructions: move, push, pop, etc.
2. Arithmetic and logic instructions: test, integer, and decimal arithmetic operations.
3. Control flow: conditional branches, unconditional jumps, calls, and returns.
4. String instructions: string move and string compare.
Real Stuff
Comparing IA-32 and MIPS

IA-32 | MIPS
Arithmetic and logic instructions can have one operand in memory. | Only data transfer instructions access memory.
Conditional branches are based on condition codes or flags. Comparison with 0 requires extra instructions. | Conditional branches are based on an arithmetic comparison, which speeds up comparison with zero.
Branch address is specified in bytes. | Branch address is specified in words, favoring simplicity.
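The difference between byte and word branch addresses can be made concrete with a short C sketch. The function names are hypothetical. The MIPS formula (offset counted in words from the instruction after the branch) and the IA-32 formula (displacement counted in bytes from the next instruction) are the standard definitions for relative branches, stated here as background rather than taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* MIPS: the 16-bit branch field counts words, relative to the
   instruction that follows the branch (PC + 4). */
uint32_t mips_branch_target(uint32_t pc, int16_t word_offset)
{
    assert((pc & 3u) == 0);       /* MIPS instructions are word aligned */
    return pc + 4u + (uint32_t)((int32_t)word_offset * 4);
}

/* IA-32: a relative displacement counts bytes, relative to the
   address of the next instruction. */
uint32_t ia32_branch_target(uint32_t next_ip, int32_t byte_disp)
{
    return next_ip + (uint32_t)byte_disp;
}
```

Because every MIPS instruction is 4 bytes long, counting in words wastes no bits: a 16-bit field counted in words reaches four times as far as the same field counted in bytes. IA-32 instructions vary in length, so its displacements have to be in bytes.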
Real Stuff
The 80386 Instruction Format
One opcode bit says whether the
offset is 8 or 32 bits wide.
An opcode postbyte specifies
the addressing mode.
A second postbyte is used for the
base plus scaled index
modes.
Fallacies and Pitfalls
Fallacy: More powerful instructions mean higher performance.
Data transfers performed with the 80x86 repeat-prefix instructions
yield 40 MB/sec, while data transfers using ordinary loads and stores yield 60 MB/sec.
Fallacy: Write code in assembly language for higher performance.
With the level of optimization in today's compilers, code
written in a high-level language is often faster than code written in
assembly, especially for long programs.
Concluding Remarks
The 4 Design Principles
1. Simplicity favors regularity. The regularity of the fields in the
MIPS instruction formats allows all instruction types to be processed
by almost the same hardware, keeping the machine simple.
2. Smaller is faster. Speed is the reason MIPS has 32 registers instead of
more.
3. Make the common case fast. Arithmetic immediate instructions
are an example of this principle.
4. Good design demands good compromises. For example, MIPS
does not provide 32 bits for immediates and addresses, so that all
instructions keep the same length.
References
[1] David A. Patterson and John L. Hennessy.
Computer Organization and Design: The Hardware/Software Interface, volume 1.
Morgan Kaufmann, San Mateo, CA, 2007.