
Computer Architecture and Organization
(SWE 203)
Dr. R. Bhargavi
SCSE

Syllabus
Unit I - FUNDAMENTALS OF COMPUTER ARCHITECTURE
Organization of the von Neumann machine; Instruction formats;
The fetch/execute cycle, instruction decoding and execution;
Registers and register files; Instruction types and addressing
modes; Subroutine call and return mechanisms; Programming in
assembly language; I/O techniques and interrupts; Other design
issues
Unit II - COMPUTER ARITHMETIC
Data Representation, Hardware and software implementation of
arithmetic unit for common arithmetic operations: addition,
subtraction, multiplication, division (fixed point and floating
point); Conversion between integer and real numbers;
Representation of non-numeric data (character codes, graphical
data)

Syllabus (cont )
Unit III - MEMORY SYSTEM ORGANIZATION & ARCHITECTURE
Memory systems hierarchy; Coding, data compression, and data
integrity; Electronic, magnetic and optical technologies; Main
memory organization, Types of Main memories, and its
characteristics and performance; Latency, cycle time, bandwidth, and
interleaving; Cache memories (address mapping, line size,
replacement and write-back policies); Virtual memory systems;
Reliability of memory systems; error detecting and error correcting
systems
Unit IV - INTERFACING AND COMMUNICATION
I/O fundamentals: handshaking, buffering; I/O techniques:
programmed I/O, interrupt-driven I/O, DMA; Interrupt structures:
vectored and prioritized, interrupt overhead, interrupts and
reentrant code; Buses: bus protocols, local and geographic
arbitration.

Syllabus (cont )
Unit V - DEVICE SUBSYSTEMS
External storage systems; organization and structure of disk drives
and optical memory ; Basic I/O controllers such as a keyboard and a
mouse; RAID architectures; I/O Performance; Processor to network
interfaces.

Reference Books
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Computer Organization, McGraw-Hill, Fifth edition, Reprint 2011.
2. David A. Patterson and John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 5th edition, Morgan Kaufmann, 2011.
3. W. Stallings, Computer Organization and Architecture, Prentice-Hall, 8th edition, 2009.

Introduction
Basic functional components of a computer

[Figure: Input and Output (I/O), Memory, and a Processor made up of the ALU and the Control Unit]

Von Neumann Computer


Stored Program concept
Main memory storing programs and data
ALU operating on binary data
Control unit interpreting instructions from
memory and executing
Input and output equipment operated by control
unit
Built at the Institute for Advanced Study (IAS), Princeton
Completed 1952

Structure of von Neumann machine

IAS - details
Memory of 1000 words, 40 bits each
Each word holds a binary number
or two 20-bit instructions
Set of registers (storage in CPU)
Memory Buffer Register
Memory Address Register
Instruction Register
Instruction Buffer Register
Program Counter
Accumulator
Multiplier Quotient

IAS Structure

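IAS Memory Formats
Number word (40 bits): bit 0 is the sign bit, bits 1-39 hold the value
Instruction word (40 bits): two 20-bit instructions
Left instruction: opcode in bits 0-7, address in bits 8-19
Right instruction: opcode in bits 20-27, address in bits 28-39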

Instruction Format
Instruction Fields
Opcode: Specifies the operation.
Address: Specifies the memory address or
register.
Mode: Specifies the way the operand or
effective address is determined.

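Von Neumann vs. Harvard architectures
Von Neumann Model: a single Memory Unit holds both instructions and data; the CPU (Control Unit and ALU) connects to it and to Input and Output.
Harvard Model: separate Instruction Memory and Data Memory connect to the Control Unit and ALU, along with I/O.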

Computer Components:
Top Level View

Instruction Cycle
Two steps:
Fetch: read instructions from memory, one at a time
Execute: carry out each fetched instruction

Fetch Cycle
Program Counter (PC) holds address of next
instruction to fetch
Processor fetches instruction from memory
location pointed to by PC
Increment PC
Unless told otherwise
Instruction loaded into Instruction Register (IR)
Processor interprets instruction and performs
required actions

Execute Cycle
Processor-memory
data transfer between CPU and main memory
Processor - I/O
Data transfer between CPU and I/O module
Data processing
Some arithmetic or logical operation on data
Control
Alteration of sequence of operations
e.g. jump
Combination of above

Example of Program Execution


0001 - Load AC from memory
0010 - Store AC to memory
0101 - Add to AC from memory

Assumptions:
Instruction/data 16 bits
4 bits opcode
12 bits address
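
The fetch and execute steps for this hypothetical machine can be sketched as a short simulation. This is only an illustration under stated assumptions: the memory image (a three-instruction program at 0x300 and data at 0x940 and 0x941) and the function name `run` are invented for the example; only the three opcodes and the 4-bit opcode / 12-bit address split come from the slide.

```python
def run(memory, pc, steps):
    """Run `steps` instructions of the 16-bit accumulator machine."""
    ac = 0
    for _ in range(steps):
        ir = memory[pc]        # fetch: instruction at the address in PC goes to IR
        pc += 1                # increment PC (no jumps in this machine)
        opcode = ir >> 12      # upper 4 bits
        addr = ir & 0x0FFF     # lower 12 bits
        if opcode == 0b0001:   # load AC from memory
            ac = memory[addr]
        elif opcode == 0b0010: # store AC to memory
            memory[addr] = ac
        elif opcode == 0b0101: # add to AC from memory
            ac = (ac + memory[addr]) & 0xFFFF
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
    return ac, pc

# Assumed memory image; addresses and values are illustrative only.
memory = {
    0x300: 0x1940,  # load AC from 0x940
    0x301: 0x5941,  # add contents of 0x941 to AC
    0x302: 0x2941,  # store AC to 0x941
    0x940: 0x0003,
    0x941: 0x0002,
}
ac, pc = run(memory, 0x300, 3)
```

After the three instructions, the accumulator and location 0x941 both hold the sum of the two data words.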

Instruction Cycle State Diagram

Addressing Modes
Addressing modes: The different ways in which the
location of an operand is specified in an instruction.
Immediate
Direct or Absolute
Indirect
Register
Register Indirect
Displacement (Indexed)
Memory Indirect
Autoincrement
Autodecrement
Base Index

Immediate Addressing
Operand is part of instruction
Operand = address field
e.g. ADD 5
Add 5 to contents of accumulator
5 is operand
No memory reference to fetch data
Fast
Limited range
Instruction: Opcode | Operand

Direct Addressing
Address field contains address of operand
Effective address (EA) = address field (A)
e.g. ADD A
Add contents of cell A to accumulator
Look in memory at address A for operand
Single memory reference to access data
No additional calculations to work out effective
address.
Limited address space.

Direct Addressing (cont)
[Figure: the instruction holds an opcode and address A; the operand is in memory at address A]

Indirect Addressing
Memory cell pointed to by address field contains
the address of (pointer to) the operand
EA = (A)
Look in A, find address (A) and look there for
operand
e.g. ADD (A)
Add contents of cell pointed to by contents of A
to accumulator
Large address space
Multiple memory accesses to find operand.
Hence slower

Indirect Addressing (cont)
[Figure: the instruction holds an opcode and address A; memory location A holds a pointer to the operand, which is elsewhere in memory]

Register Addressing
Operand is held in register named in address field
EA = R
Limited number of registers
Very small address field needed
Shorter instructions
Faster instruction fetch
No memory access
Very fast execution
Very limited address space

Register Addressing (cont)
[Figure: the instruction holds an opcode and register address R; the operand is in register R]

Register Indirect Addressing
[Figure: the instruction holds an opcode and register address R; register R holds a pointer to the operand, which is in memory]

Displacement Addressing
EA = A + (R)
Address field holds two values
A = base value
R = register that holds displacement
or vice versa

Displacement Addressing (cont.)
[Figure: the instruction holds an opcode, register R and address A; A is added to the contents of R to form a pointer to the operand in memory]

Indexed Addressing
EA = A + (R)
A = base
R = register that holds the displacement

e.g. X(Ri)
EA = X + [Ri]

Autoincrement & Autodecrement
Good for accessing arrays
Autoincrement
EA = A + [R]
(or) EA = [R]
R++ (R is incremented after the access)
Autodecrement
--R (R is decremented before the access)
EA = A + [R]
(or) EA = [R]

Base with index


Example : (Ri,Rj)
EA = [Ri] + [Rj]

Base with index and offset


Example : X(Ri,Rj)
EA = X + [Ri] + [Rj]

Registers R1 and R2 of a computer contain decimal values
1200 and 4600. What is the effective address of the
memory operand in each of the following instructions?
1. Load 20(R1), R5
2. Move #3000, R5
3. Store R5, 30(R1, R2)
4. Add -(R2), R5
5. Subtract (R1)+, R5

Answers:
1. 1220
2. Part of the instruction (immediate operand)
3. 5830
4. 4599
5. 1200
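
The five answers can be checked with a small sketch. Assumptions, labeled here and in the code: instruction 4 is treated as the autodecrement form -(R2), which is what the listed answer 4599 implies, and the autoincrement/autodecrement step is taken as 1. The function name `effective_address` and its parameters are hypothetical.

```python
def effective_address(mode, regs, base=None, index=None, offset=0):
    """Return the effective address for a few addressing modes.

    `regs` is modified by the autoincrement and autodecrement modes.
    A step of 1 is assumed for those two modes.
    """
    if mode == "immediate":
        return None                       # operand is part of the instruction
    if mode == "indexed":                 # X(Ri): EA = X + [Ri]
        return offset + regs[base]
    if mode == "base_index_offset":       # X(Ri,Rj): EA = X + [Ri] + [Rj]
        return offset + regs[base] + regs[index]
    if mode == "autoincrement":           # (Ri)+: EA = [Ri], then increment Ri
        ea = regs[base]
        regs[base] += 1
        return ea
    if mode == "autodecrement":           # -(Ri): decrement Ri, then EA = [Ri]
        regs[base] -= 1
        return regs[base]
    raise ValueError(mode)

initial = {"R1": 1200, "R2": 4600}
# Each instruction is evaluated against a fresh copy of the registers.
answers = [
    effective_address("indexed", dict(initial), base="R1", offset=20),
    effective_address("immediate", dict(initial)),
    effective_address("base_index_offset", dict(initial),
                      base="R1", index="R2", offset=30),
    effective_address("autodecrement", dict(initial), base="R2"),
    effective_address("autoincrement", dict(initial), base="R1"),
]
```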

CPU organization
Single accumulator organization
One address field

General Register Organization


Three address fields
Two address fields

Stack Organization
Zero address fields for operation-type
instructions.

Hybrid Organization

Instruction Types
Three address instructions
operation Destination, source1, source2
Example : Add R, X, Y // R ← X + Y
Advantages: Results in writing short programs
Disadvantages: More bits are needed to specify three
addresses.
Two address instructions
operation Destination, source
Example: Add R, X // R ← R + X
Advantages: Results in writing medium size programs
Disadvantages: More bits are needed to specify two
addresses.

Instruction Types (cont)


One address instructions
Example: Add X // AC ← AC + X
Transfer between Accumulator and memory : LOAD,
STORE
Example: Load X
Add Y
Advantages: Fewer bits are needed to specify the address.
Disadvantages: Results in writing long programs.
Zero address instructions
Example : Push X
Push Y
Add
Advantages: No memory addresses needed during the operation.
Disadvantages: results in longer program codes.

Exercise
Consider the expression X = (A + B) * (C + D)
A, B, C, D, and X are the memory addresses.
Write the sequence of instructions to solve the above expression using
3 address, 2 address, 1 address and zero address instruction types
Arithmetic operations : ADD, SUB, MUL, DIV.
Data transfer operations : MOV
Transfer between Accumulator and memory : LOAD, STORE
At least one register operand must be used

Three Address Instructions


ADD R1, A, B // R1 ← M[A] + M[B]
ADD R2, C, D // R2 ← M[C] + M[D]
MUL X, R1, R2 // M[X] ← R1 * R2

Two Address Instructions


MOV R1, A // R1 ← M[A]
ADD R1, B // R1 ← R1 + M[B]
MOV R2, C // R2 ← M[C]
ADD R2, D // R2 ← R2 + M[D]
MUL R1, R2 // R1 ← R1 * R2
MOV X, R1 // M[X] ← R1

One Address Instructions


LOAD A // AC ← M[A]
ADD B // AC ← AC + M[B]
STORE T // M[T] ← AC
LOAD C // AC ← M[C]
ADD D // AC ← AC + M[D]
MUL T // AC ← AC * M[T]
STORE X // M[X] ← AC

ZERO Address Instructions


PUSH A
PUSH B
ADD
PUSH C
PUSH D
ADD
MUL
POP X
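
The zero-address sequence can be traced with a minimal stack-machine sketch. The interpreter `run_stack_program` and the sample values for A, B, C and D are assumptions made for illustration; only the PUSH/ADD/MUL/POP sequence comes from the slide.

```python
def run_stack_program(program, memory):
    """Interpret PUSH/POP/ADD/MUL on an operand stack; memory maps names to values."""
    stack = []
    for instr in program:
        op, *operand = instr.split()
        if op == "PUSH":                  # push M[operand] onto the stack
            stack.append(memory[operand[0]])
        elif op == "POP":                 # pop the top of the stack into M[operand]
            memory[operand[0]] = stack.pop()
        elif op == "ADD":                 # replace the top two items with their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":                 # replace the top two items with their product
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(op)
    return memory

program = ["PUSH A", "PUSH B", "ADD", "PUSH C", "PUSH D", "ADD", "MUL", "POP X"]
# Sample operand values, chosen only for the example.
memory = run_stack_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
```

With these values X ends up as (2 + 3) * (4 + 5). Note that ADD and MUL name no addresses at all; the operands are implied by the stack.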

Byte Addressability
Bit, Byte, Word
Bytes are always 8 bits.
Word length typically ranges from 16 to 64 bits.
Successive memory addressing assignments
refer to successive byte locations in memory
Memory is byte-addressable.
If the word length is 32 bits, successive words are
located at addresses 0, 4, 8, ...

Word length
Signed integer:
Most significant bit stores sign information
32 bits: b31 (MSB, sign bit) b30 ... b2 b1 b0 (LSB)

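Big endian and Little endian
[Figure: byte addresses within the last word of memory, at word address 2^k - 4]
(a) Big-endian assignment: bytes 2^k - 4, 2^k - 3, 2^k - 2, 2^k - 1, from most significant to least significant
(b) Little-endian assignment: bytes 2^k - 1, 2^k - 2, 2^k - 3, 2^k - 4, from most significant to least significant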

Big endian and Little endian (cont..)


Example : Hexadecimal number 12345678 stored in the word at address 100
Big endian :
Byte address   100  101  102  103
Contents        12   34   56   78

Little endian :
Byte address   100  101  102  103
Contents        78   56   34   12
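
The two layouts for 12345678 (hex) can be reproduced with Python's built-in `int.to_bytes`. The helper `layout` and the base address 100 are illustrative choices matching the example above.

```python
value = 0x12345678

# Serialize the 32-bit word in each byte order.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

def layout(data, base=100):
    """Map successive byte addresses, starting at `base`, to two-digit hex contents."""
    return {base + i: f"{b:02X}" for i, b in enumerate(data)}

big_layout = layout(big)        # most significant byte at the lowest address
little_layout = layout(little)  # least significant byte at the lowest address
```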

Translation Hierarchy
C Program
Compiler
Assembly Language Program
Assembler

Object: Machine language module

Object: Library routine(mach.lang)

Linker
Executable: Machine language program
Loader
Memory

Programming Languages - Levels


High Level Languages
Designed to eliminate the technicalities of a particular
computer.
Statements compiled in a high level language typically
generate many low-level instructions.
e.g. : C, C++ and Visual Basic

Assembly Language (Low Level Language)


Designed for a specific family of processors
Consists of symbolic instructions directly related to
machine language instructions.

Machine Language
Consists of individual instructions that will be executed by
the CPU one at a time

Assembly Language
Machine instructions are represented in 0s and 1s
Very difficult to read, interpret and write.
Mnemonics : Symbolic names for groups of bits (such as opcodes) in machine language.
Assembly Language : Set of mnemonics along with
the rules for using them.
Assembly language format
Label Operation Operand(s) Comments
Assembler : Translates Assembly language program
into machine language code.

Assembly Language (cont)


Assembler directives
Assembler directives give information to the
assembler to build the machine language module.
Assembler directives are used to
declare variables
declare constants
create storage space for results
and to label locations in the code to be used as
branch destinations.

Assembly Language (cont)


Equal to (EQU) Directive:
Declares a constant value.
LENGTH EQU 10 ; sets the value of LENGTH to 10

EQU does not cause anything to be stored in memory.
Symbol table: the value of a label declared with EQU is
the value, NOT THE MEMORY LOCATION.

Assembly Language (cont)


Code Label
Labels an instruction in the program so that it can be
used as a branch target.
LOOP ADD R1, R2 ; LOOP is the code label
Symbol table: The value of a code label is the memory
location of the instruction on that line.
END
Tells the assembler to stop looking for more code.
This should be the last line in any code file.

Assembler
Translates an assembly language program into
executable machine language object code. This is
done by
Replacing all symbols denoting operations and
addressing modes with the binary codes
Replacing all names and labels with their actual
values.
If the program contains a branch to a later location
in the code (a forward branch), the assembler won't
know where to find the branch target because it
will not have seen it yet.

Assembler (cont )
Most assemblers are two-pass assemblers.
On the first pass, any labels found in the
program are put into the symbol table along
with the corresponding numerical value.
Code labels: memory location of instruction
Variable labels: memory location of
reserved memory
Constants (EQU): value of constant
On the second pass, all labels in the code are
replaced with their values from the symbol
table.
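
The two passes can be sketched as follows. This is a simplified illustration, not a real assembler: every instruction is assumed to occupy 4 bytes, no binary encoding is produced, and the function `assemble`, the tuple format and the small sample program (with a forward branch to DONE) are all invented for the example.

```python
import re

def assemble(lines, origin=100, word=4):
    """lines: list of (label, operation, operand) tuples.

    Assumes each instruction takes `word` bytes, starting at `origin`.
    """
    # Pass 1: record the value of every label in the symbol table.
    symbols = {}
    loc = origin
    for label, op, operand in lines:
        if op == "EQU":
            symbols[label] = int(operand)   # constant: the value, no storage
            continue
        if label:
            symbols[label] = loc            # code label: memory location
        loc += word

    # Pass 2: replace labels in operands with their symbol-table values.
    listing = []
    loc = origin
    for label, op, operand in lines:
        if op == "EQU":
            continue
        resolved = re.sub(r"[A-Za-z_]\w*",
                          lambda m: str(symbols.get(m.group(), m.group())),
                          operand)
        listing.append((loc, op, resolved))
        loc += word
    return symbols, listing

source = [
    ("LEN",  "EQU",    "10"),
    ("",     "MOVE",   "#LEN, R1"),
    ("",     "BRANCH", "DONE"),    # forward branch: DONE is defined later
    ("",     "CLR",    "R1"),
    ("DONE", "RETURN", ""),
]
symbols, listing = assemble(source)
```

The forward branch resolves correctly because DONE is already in the symbol table by the time the second pass reaches the BRANCH line.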

Example
Label    Operation   Addressing/data information
SUM      EQU         200
         ORIGIN      204
N        DATAWORD    100
NUM1     RESERVE     400
         ORIGIN      100
START    MOVE        N, R1
         MOVE        #NUM1, R2
         CLR         R0
LOOP     ADD         (R2), R0
         ADD         #4, R2
         DEC         R1
         BGTZ        LOOP
         MOVE        R0, SUM
         RETURN
         END

Example (cont)
Label    Address   Contents
         100       MOVE N, R1
         104       MOVE #NUM1, R2
         108       CLR R0
LOOP     112       ADD (R2), R0
         116       ADD #4, R2
         120       DEC R1
         124       BGTZ LOOP
         128       MOVE R0, SUM
         ...
SUM      200
N        204       100
NUM1     208
NUM2     212
         ...
NUMn     604

Stack

Stack: Stores information in such a manner that the
item stored last is the first item retrieved.
Also called last-in first-out (LIFO) list. Useful for
compound arithmetic operations and nested
subroutine calls.
Stack pointer (SP): A register that holds the address
of the top item in the stack.
SP always points at the top item in the stack
Push: Operation to insert an item into the stack.
Pop: Operation to retrieve an item from the stack.

Stack (cont)
A stack can be organized as a collection of a finite
number of registers.

Stack (cont)
In a 64-word stack, the stack pointer contains 6
bits (2^6 = 64).
The one-bit register FULL is set to 1 when the
stack is full
EMPTY register is 1 when the stack is empty.
The data register DR holds the data to be written
into or read from the stack.

Stack - Micro operations


Initialization
SP ← 0, EMPTY ← 1, FULL ← 0
Push
SP ← SP + 1
M[SP] ← DR
If (SP = 0) then (FULL ← 1) // SP wraps to 0 after 63
EMPTY ← 0

Pop
DR ← M[SP]
SP ← SP - 1
If (SP = 0) then (EMPTY ← 1)
FULL ← 0
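
The micro-operations above can be mirrored in a short simulation of the 64-word register stack. The class name `RegisterStack` is hypothetical, and the checks that refuse a push when FULL is 1 or a pop when EMPTY is 1 are an added assumption; the slide lists only the register transfers.

```python
class RegisterStack:
    SIZE = 64  # 64 words, so SP is a 6-bit register

    def __init__(self):
        self.mem = [0] * self.SIZE
        self.sp = 0        # SP <- 0
        self.empty = 1     # EMPTY <- 1
        self.full = 0      # FULL <- 0

    def push(self, dr):
        if self.full:                         # guard added for the sketch
            raise OverflowError("stack full")
        self.sp = (self.sp + 1) % self.SIZE   # SP <- SP + 1 (wraps from 63 to 0)
        self.mem[self.sp] = dr                # M[SP] <- DR
        if self.sp == 0:
            self.full = 1                     # wrapped around: stack is full
        self.empty = 0

    def pop(self):
        if self.empty:                        # guard added for the sketch
            raise IndexError("stack empty")
        dr = self.mem[self.sp]                # DR <- M[SP]
        self.sp = (self.sp - 1) % self.SIZE   # SP <- SP - 1
        if self.sp == 0:
            self.empty = 1                    # back at the initial position
        self.full = 0
        return dr
```

Because SP starts at 0 and is incremented before the write, the first item goes to word 1 and the 64th item goes to word 0, which is why SP = 0 after a push signals a full stack.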

Stack (cont)
The stack normally grows backwards in memory.
In other words, the programmer defines the bottom
of the stack, and the stack grows toward lower
addresses.

Push
Sub #4, SP
Mov ITEM, (SP)
Pop
Mov (SP), ITEM
Add #4, SP

[Figure: stack in memory, with addresses running from 0 to 2^k - 1. The stack pointer register SP points to the top item (56); below it are 29 and further items, down to the bottom item (13).]

Subroutines
A subroutine is a group of instructions that will be
used repeatedly in different locations of the
program.
Instead of repeating the same instructions several
times, they can be grouped into a subroutine that
can be called from the different locations.
In Assembly language, a subroutine can exist
anywhere in the code.
However, it is customary to place subroutines
separately from the main program.

Subroutines (cont)
Can be called multiple times to perform task
By main program
By another subroutine
Call and Return instructions
Call a subroutine
Return from the subroutine

While the CALL instruction is being executed, the PC holds
the address from which the calling program has to resume.
Saving this return address is done by the subroutine linkage method.

Subroutines (cont)
Subroutine linkage using link register
CALL instruction performs the following operations
Store the contents of PC in the link register.
Branch to the target address specified by the instruction.

Return instruction performs the following operation
Branch to the address contained in the link register.

Subroutine nesting - can be achieved by pushing the
PC contents onto the processor stack while calling a
subroutine and popping the PC contents while
returning to the calling program.
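
Nesting with a stack of return addresses can be sketched as below. The toy program layout (addresses 0, 10 and 20), the NOP and HALT operations and the function `trace_calls` are assumptions made for the illustration.

```python
def trace_calls(program, start):
    """program maps address -> (operation, target). Returns the addresses executed."""
    stack = []            # processor stack holding return addresses
    pc = start
    trace = []
    while True:
        op, target = program[pc]
        trace.append(pc)
        pc += 1                    # PC now holds the return address
        if op == "CALL":
            stack.append(pc)       # push the return address
            pc = target            # branch to the subroutine
        elif op == "RETURN":
            pc = stack.pop()       # pop the return address into PC
        elif op == "HALT":
            break
    return trace

program = {
    0:  ("CALL", 10),      # main program calls SUB1
    1:  ("HALT", None),
    10: ("CALL", 20),      # SUB1 calls SUB2 (nested call)
    11: ("RETURN", None),
    20: ("NOP", None),     # SUB2 body
    21: ("RETURN", None),
}
trace = trace_calls(program, 0)
```

With a single link register, the nested CALL at address 10 would overwrite the return address 1; the stack keeps both return addresses and pops them in reverse order.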

Parameter Passing
Information exchanged between a calling program
and a subroutine
Call by Reference:
The data is stored in one of the registers by the
calling program and the subroutine uses the value
from the register. The register values get modified
within the subroutine. Then these modifications
will be transferred back to the calling program upon
returning from a subroutine

Parameter Passing (cont)


Calling program
         Move N, R1
         Move #NUM1, R2
         Call LISTADD
         Move R0, SUM
Subroutine
LISTADD  Clear R0
LOOP     Add (R2)+, R0
         Decrement R1
         Branch>0 LOOP
         Return

Parameter Passing (cont)


Call by Value:
The data is stored in one of the registers, but the
subroutine first PUSHES register values in the stack
and after using the registers, it POPS the previous
values of the registers from the stack while exiting
the subroutine. i.e. the original values are restored
before execution returns to the calling program.
Calling program
Move #NUM1, -(SP)
Move N, -(SP)
Call LISTADD
Move 4(SP), SUM
Add #8, SP

Parameter Passing (cont)


Subroutine
LISTADD  MoveMultiple R0-R2, -(SP)
         Move 16(SP), R1
         Move 20(SP), R2
         Clear R0
LOOP     Add (R2)+, R0
         Decrement R1
         Branch>0 LOOP
         Move R0, 20(SP)
         MoveMultiple (SP)+, R0-R2
         Return

Stack contents, from the top of the stack toward higher addresses:
Level 3: [R2]
         [R1]
         [R0]
Level 2: Return Address
Level 1: n
         NUM1

Stack frame & Frame pointer


[Figure: stack frame for the called subroutine, listed from the top of the stack downward]
SP (stack pointer) → Saved [R1]
Saved [R0]
Local Var 2
Local Var 1
FP (frame pointer) → Saved [FP]
Return Address
Parameter 1
Parameter 2
Parameter 3
Old TOS (top of stack before the call)

Register
Parallel load register

X[31..0] : Input
Z[31..0] : Output
C : Control
clk : Clock

Register File (MIPS)

Input/Output Problems
Wide variety of peripherals
Delivering different amounts of data
At different speeds
In different formats
All slower than CPU and RAM
Need I/O modules

I/O Mapping
Memory mapped I/O
Devices and memory share an address space
I/O access looks just like memory access
No special commands for I/O
Large selection of memory access commands
available
Isolated I/O
Separate address spaces
Need I/O or memory select lines
Special commands for I/O
Limited set

Input/ Output Techniques


Programmed I/O
Program uses a busy-wait loop
Anticipated transfer
Interrupt-driven I/O
Interrupts are used to initiate and/or terminate
data transfers
Powerful technique
Handles unanticipated transfers
Direct memory access (DMA)
Special controller (DMA controller) handles data
transfers
Typically used for bulk data transfer

I/O Data Transfer Methods


Programmed I/O (PIO): Polling (For low-speed
I/O)
The I/O device puts its status information in a
status register.
The processor must periodically check the
status register.
The processor is totally in control and does all
the work.
Waste of processor time.
Used for low-speed I/O devices (mice,
keyboards etc.)

Bus Connection for Processor & I/O


System components are interconnected by buses
Bus: a bunch of parallel wires
[Figure: the processor, a keyboard interface (DATAIN and SIN registers) and a display interface (DATAOUT and SOUT registers) connected to a common bus]

I/O Interface - Input Device


[Figure: the bus (address, data and control lines) connects to an I/O interface made up of an address decoder, control circuits, and data and status registers; the interface connects to the input device]

Reading a line from Input device


          Move      #LOC, R1
READWAIT  Testbit   #3, INSTATUS
          Branch=0  READWAIT
          MoveByte  DATAIN, (R1)
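
The busy-wait loop can be simulated in a few lines. Everything about the device model is an assumption for illustration: the `Keyboard` class, its `delay` of two status reads per character, and the use of a newline to end the line (the fragment above shows only the wait loop and a single byte transfer).

```python
class Keyboard:
    """Hypothetical input device: INSTATUS bit 3 is 1 when DATAIN holds a character."""

    def __init__(self, text, delay=2):
        self.pending = list(text)
        self.delay = delay
        self.countdown = delay

    def instatus(self):
        if self.countdown > 0:        # device still busy producing a character
            self.countdown -= 1
            return 0b0000
        return 0b1000 if self.pending else 0b0000

    def datain(self):
        self.countdown = self.delay   # reading DATAIN clears the ready condition
        return self.pending.pop(0)

def read_line(dev):
    buf, polls = [], 0
    while True:
        while not (dev.instatus() & 0b1000):   # Testbit #3, INSTATUS; Branch=0 READWAIT
            polls += 1                          # processor time spent waiting
        ch = dev.datain()                       # MoveByte DATAIN, (R1)
        if ch == "\n":
            return "".join(buf), polls
        buf.append(ch)

line, polls = read_line(Keyboard("hi\n"))
```

The `polls` counter makes the cost of programmed I/O visible: every failed status check is processor time that does no useful work.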

Interrupts
Mechanism by which other modules (e.g. I/O) may
interrupt normal sequence of processing
Program
e.g. overflow, division by zero
Timer
Generated by internal processor timer
Used in pre-emptive multi-tasking
I/O
from I/O controller
Hardware failure

Interrupts (cont)
Interrupt request signal
Interrupt-acknowledgement signal
Interrupt service routine
Interrupt latency: delay between the
time an interrupt request signal is received
and the start of the execution of the
interrupt service routine.

I/O Data Transfer Methods

Interrupt-Driven I/O (For medium-speed I/O)


An interrupt line from the I/O device to the CPU is used to
generate an I/O interrupt indicating that the I/O device
needs CPU attention.
The interrupting device places its identity in an interrupt
vector (vectored interrupts).
Interrupt vector used to store the starting address of the
interrupt service routine.
Once an I/O interrupt is detected the current instruction is
completed and an I/O interrupt handling routine (by OS)
is executed to service the device.
Used for moderate speed I/O (optical drives, storage,
networks ..)
Allows overlap of CPU processing time and I/O processing
time

Interrupt Cycle
Added to instruction cycle
Processor checks for interrupt
Indicated by an interrupt signal
If no interrupt, fetch next instruction
If interrupt pending:
Suspend execution of current program
Save context
Set PC to start address of interrupt handler
routine
Process interrupt
Restore context and continue interrupted
program
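
The interrupt cycle can be sketched as an extra check at the end of each instruction. This is a toy model under stated assumptions: memory holds instruction names rather than encodings, the saved context is only the PC, interrupts are disabled while the handler runs, and `run`, `IRET`, `HALT` and the `requests` set are names invented for the example.

```python
def run(memory, handler_addr, requests):
    """requests: set of instruction counts after which a device raises an interrupt."""
    pc, stack, trace = 0, [], []
    pending, enabled, executed = False, True, 0
    while memory[pc] != "HALT":
        instr = memory[pc]            # fetch
        pc += 1
        if instr == "IRET":           # restore context, continue interrupted program
            pc = stack.pop()
            enabled = True
        else:
            trace.append(instr)       # "execute" the instruction
        executed += 1
        if executed in requests:      # a device raises its interrupt signal
            pending = True
        # Interrupt cycle: checked only after the current instruction completes.
        if pending and enabled:
            stack.append(pc)          # save context (just the PC here)
            pc = handler_addr         # set PC to start of the handler routine
            enabled = False
            pending = False
    return trace

memory = {0: "I1", 1: "I2", 2: "I3", 3: "HALT",
          100: "H1", 101: "H2", 102: "IRET"}
trace = run(memory, handler_addr=100, requests={1})
```

The request arrives during I1, so the handler (H1, H2) runs before I2, and the interrupted program then resumes where it left off.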

Transfer of Control via Interrupts

Instruction Cycle with Interrupts

Interrupt Hardware

INTR = INTR1 + INTR2 + INTR3 + ... + INTRn

Interrupt Request
Single Line Interrupt Request: all interrupt resources share one interrupt request line to the CPU
Multiple Line Interrupt Request: each interrupt resource has its own request line to the CPU
Mixed Interrupt Request: several request lines to the CPU, each shared by a group of interrupt resources

Enabling and Disabling Interrupts


Since the interrupt request can come at any
time, it may alter the sequence of events
from that anticipated by the programmer.
Interrupts must be controlled.
A device keeps its interrupt request signal active
until it learns that the processor has
responded to its request.

Enabling and Disabling Interrupts (cont)
Method 1:
The processor hardware ignores interrupt
requests until the first instruction of the
interrupt service routine has been executed.
Execute an interrupt-disable instruction as the
first instruction in the interrupt service
routine.
Execute an interrupt-enable instruction as the
last instruction in the interrupt service
routine.

Enabling and Disabling Interrupts (cont)
Method 2
The processor automatically disables
interrupts before starting the execution of the
interrupt service routine.
Save PC and PS (processor status) register
contents on the stack.
Set interrupt enable bit in PS to 0.
On the Return-from-interrupt instruction, restore
the PS and PC contents from the stack.
Set interrupt enable bit in PS to 1.

Enabling and Disabling Interrupts (cont)
Method 3
Make interrupt request line edge triggered.

Handling Multiple Devices


How can the processor recognize the device
requesting an interrupt?
Poll the devices and check the IRQ bit of the
status register.
Easy to implement
Waste of time

Handling Multiple Devices (cont...)


Given that different devices are likely to require
different interrupt-service routines, how can the
processor obtain the starting address of the
appropriate routine in each case?
Vectored Interrupts
A device requesting an interrupt can identify
itself by sending a special code to the
processor over the bus.
Interrupt vector: starting address of the interrupt
service routine
Sequence: interrupt request → INTA (interrupt acknowledge) → interrupt vector.

Interrupt Nesting

Simultaneous Requests

Controlling Device Requests


Some I/O devices may not be allowed to
issue interrupt requests to the processor.
At device end, an interrupt-enable bit in a
control register determines whether the
device is allowed to generate an interrupt
request.
At processor end, either an interrupt
enable bit in the PS register or a priority
structure determines whether a given
interrupt request will be accepted.

Quiz 1
1. Consider the following possibilities for saving the return
address of a subroutine.
In a processor register
In a memory location associated with the call, so that a
different location is used when the subroutine is called
from different places.
On a stack.
Which of these possibilities supports subroutine nesting and
which supports subroutine recursion?
2. Why are most assemblers two-pass assemblers?
3. Consider 6 devices numbered from 1 through 6 connected to
the bus of a computer. The devices are divided into priority groups.
Devices 1 and 3 belong to priority group P1. Similarly
devices 2, 4 belong to priority group P2 and the devices 5 and 6
belong to priority group P3. Among each priority group, the
device with lowest ID has the highest priority. There are three
interrupt lines. With the help of a block diagram implement the
interrupt mechanism for the devices so that the devices can
raise the interrupt whenever required and get served.
