CISC Architecture
Examples
Intel x86, IBM Z-Series Mainframes, older
CPU architectures
Characteristics
Few general purpose registers
Many addressing modes
Large number of specialized, complex
instructions
Instructions are of varying sizes
Chapter 8: CPU and Memory:
RISC Features
Examples
PowerPC, Sun SPARC (the Motorola 68000 is generally classified as CISC)
VLIW Architecture
Transmeta Crusoe CPU
128-bit instruction bundle = molecule
Four 32-bit atoms (atom = instruction)
Parallel processing of 4 instructions
EPIC Architecture
Intel Itanium CPU
128-bit instruction bundle
Three 41-bit instructions
5 bits to identify type of instructions in bundle
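The bundle arithmetic above (3 x 41 + 5 = 128 bits) can be illustrated with a minimal sketch. The field placement used here (template in the lowest five bits, followed by slots 0 to 2) is an assumption for illustration, and the function names are hypothetical.

```python
# Sketch: pack and unpack a 128-bit EPIC-style instruction bundle.
# Assumption: the 5-bit template occupies the lowest bits and the three
# 41-bit instruction slots follow in order.

TEMPLATE_BITS = 5
SLOT_BITS = 41
SLOTS_PER_BUNDLE = 3

# The fields must account for every bit of the bundle.
assert TEMPLATE_BITS + SLOTS_PER_BUNDLE * SLOT_BITS == 128


def pack_bundle(template, slots):
    """Combine a template and three instruction slots into one integer."""
    bundle = template & ((1 << TEMPLATE_BITS) - 1)
    for i, slot in enumerate(slots):
        shift = TEMPLATE_BITS + i * SLOT_BITS
        bundle |= (slot & ((1 << SLOT_BITS) - 1)) << shift
    return bundle


def unpack_bundle(bundle):
    """Return (template, [slot0, slot1, slot2]) for a 128-bit bundle."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must fit in 128 bits")
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = []
    for i in range(SLOTS_PER_BUNDLE):
        shift = TEMPLATE_BITS + i * SLOT_BITS
        slots.append((bundle >> shift) & ((1 << SLOT_BITS) - 1))
    return template, slots


bundle = pack_bundle(0b10001, [1, 2, 3])
print(unpack_bundle(bundle))  # (17, [1, 2, 3])
```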
Paging
Managed by the operating system
Built into the hardware
Independent of application
Memory Enhancements
Memory is slow compared to CPU processing
speeds!
2 GHz CPU = 1 cycle in 1/2 of a billionth of a second (0.5 ns)
70 ns DRAM = 1 access in 70 billionths of a second
Memory Interleaving
Cache Memory
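The speed gap between CPU and memory can be checked with a few lines of arithmetic. This is a rough sketch using the 2 GHz and 70 ns figures from this slide; the function names are hypothetical.

```python
# Sketch: how many CPU clock cycles fit into one DRAM access.

def cycle_time_ns(clock_hz):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def cycles_per_access(access_ns, clock_hz):
    """Clock cycles that elapse during one memory access."""
    return access_ns / cycle_time_ns(clock_hz)


cpu_hz = 2e9        # 2 GHz CPU
dram_ns = 70.0      # 70 ns DRAM access

print(cycle_time_ns(cpu_hz))               # 0.5
print(cycles_per_access(dram_ns, cpu_hz))  # 140.0
```

Under these figures the CPU could run through about 140 clock cycles while waiting for a single DRAM access.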
Memory Interleaving
Why Cache?
Memory access wastes clock
cycles!
Cache access takes 2 nanoseconds: 1 access in
2 billionths of a second
Speed gain
Exercise:
Find the access times of these memory types:
DRAM, DDR RAM, SDRAM, SRAM
Cache Memory
Blocks: 8 or 16 bytes
Tags: location in main memory (physical
memory!)
Cache controller
hardware that checks tags
Cache Line
Unit of transfer between storage and cache memory
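The roles of blocks, tags, and the cache controller can be sketched with a toy model. The slide does not say how blocks are placed or replaced, so the fully associative layout and least-recently-used replacement here are assumptions, and `TinyCache` is a hypothetical name.

```python
# Sketch: a toy cache that stores 8-byte blocks keyed by tag.
# Assumptions: fully associative placement, LRU replacement.
from collections import OrderedDict

BLOCK_SIZE = 8  # bytes per block, the smaller size listed on the slide


class TinyCache:
    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # tag -> block data
        self.hits = 0
        self.misses = 0

    def read(self, memory, address):
        tag = address // BLOCK_SIZE    # which main-memory block
        offset = address % BLOCK_SIZE  # byte within that block
        if tag in self.lines:          # the "controller" checks the tags
            self.hits += 1
            self.lines.move_to_end(tag)
        else:
            self.misses += 1
            if len(self.lines) >= self.num_lines:
                self.lines.popitem(last=False)  # evict least recently used
            start = tag * BLOCK_SIZE
            # A whole block (cache line) is transferred, not a single byte.
            self.lines[tag] = memory[start:start + BLOCK_SIZE]
        return self.lines[tag][offset]


memory = bytes(range(256))
cache = TinyCache(num_lines=4)
values = [cache.read(memory, addr) for addr in range(32)]  # sequential scan
print(cache.hits, cache.misses)  # 28 4
```

Only the first byte of each block misses in the sequential scan; the other seven bytes are already in the cache when they are requested.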
Performance Advantages
Hit ratios of 90% common
50%+ improved execution speed
Locality of reference is why caching works
Most memory references confined to small region of
memory at any given time
Well-written program in small loop, procedure or
function
Data likely in array
Variables stored together
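The effect of a 90% hit ratio can be estimated with a simple weighted average. This is a sketch under a simplifying assumption: a hit costs the 2 ns cache time and a miss costs the 70 ns DRAM time quoted earlier in these slides, with no extra miss penalty. The function name is hypothetical.

```python
# Sketch: average memory access time for a given cache hit ratio.
# Assumption: a miss costs exactly one main-memory access.

def effective_access_ns(hit_ratio, cache_ns, memory_ns):
    """Weighted average of cache and main-memory access times."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * memory_ns


avg = effective_access_ns(hit_ratio=0.90, cache_ns=2.0, memory_ns=70.0)
print(f"{avg:.1f} ns")  # 8.8 ns
```

Under this model the average access drops from 70 ns to roughly 8.8 ns, which only holds when locality of reference keeps the hit ratio high.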
Two-level Caches
Why do the sizes of the caches have to be
different?
Timing Issues
Separate Fetch/Execute Units
Pipelining
Scalar Processing
Superscalar Processing
Timing Issues
Decode Unit
Determine opcode
Identify type of instruction and operands
Execute Unit
Receives instructions from the decode unit
Appropriate execution unit services the instruction
Instruction Pipelining
Assembly-line technique to allow overlapping between fetch-execute cycles of sequences of instructions
Only one instruction is being executed to completion at a time
Scalar processing
Average instruction execution rate is approximately one instruction per clock cycle, so throughput approaches the clock speed of the CPU
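The benefit of overlapping fetch-execute cycles can be illustrated by counting clock cycles. This is a minimal sketch that assumes every stage takes one cycle and the pipeline never stalls; the function names and the four-stage depth are hypothetical.

```python
# Sketch: cycle counts with and without pipelining.
# Assumptions: one cycle per stage, no stalls or hazards.

def unpipelined_cycles(instructions, stages):
    """Each instruction finishes all stages before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages):
    """The first instruction fills the pipeline; then one completes per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


n, k = 100, 4
print(unpipelined_cycles(n, k))  # 400
print(pipelined_cycles(n, k))    # 103
```

With 100 instructions and 4 stages, the pipelined count is close to one cycle per instruction, which is the scalar processing rate described above.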
Pipelining Example
Superscalar Processing
Process more than one instruction per
clock cycle
Separate fetch and execute cycles as
much as possible
Buffers for fetch and decode phases
Parallel execution units
Superscalar Issues
Out-of-order processing dependencies
(hazards)
Data dependencies
Branch (flow) dependencies and speculative
execution
Parallel speculative execution or branch
prediction
Branch History Table
Register access conflicts
Logical registers
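The difference between a true data dependency and a register access conflict can be sketched by comparing the registers two instructions read and write. The `Instr` record, the register names, and the RAW/WAR/WAW labels (read-after-write, write-after-read, write-after-write) are illustrative assumptions rather than anything defined on the slide.

```python
# Sketch: classify dependencies between two instructions in program order.
# RAW is a true data dependency; WAR and WAW are register access conflicts
# that renaming to a different logical register can remove.
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str
    dest: str     # register written
    srcs: tuple   # registers read


def hazards(earlier, later):
    """Return the sorted list of hazards if `later` ran before `earlier`."""
    found = set()
    if earlier.dest in later.srcs:
        found.add("RAW")  # later reads what earlier writes
    if later.dest in earlier.srcs:
        found.add("WAR")  # later overwrites what earlier still reads
    if later.dest == earlier.dest:
        found.add("WAW")  # both write the same register
    return sorted(found)


i1 = Instr("r1 = r2 + r3", "r1", ("r2", "r3"))
i2 = Instr("r4 = r1 * r5", "r4", ("r1", "r5"))
i3 = Instr("r2 = r6 - r7", "r2", ("r6", "r7"))
i3_renamed = Instr("p9 = r6 - r7", "p9", ("r6", "r7"))

print(hazards(i1, i2))          # ['RAW']
print(hazards(i1, i3))          # ['WAR']
print(hazards(i1, i3_renamed))  # []
```

The RAW case forces `i2` to wait for `i1`, while the WAR case disappears once `i3` writes to a different register, so the two instructions can then execute out of order.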
Hardware Implementation
Hardware operations are
implemented by logic gates
Advantages
Speed
RISC designs are simple and typically
implemented in hardware
Microprogrammed Implementation
Microcode consists of tiny programs stored in
ROM that implement CPU instructions
Advantages
More flexible
Easier to implement complex instructions
Can emulate other CPUs
Disadvantage
Requires more clock cycles
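The trade-off listed above can be sketched with a toy microprogrammed machine in which each instruction is looked up in a table of micro-operations. The instruction set, the register names (`acc`, `mar`, `mdr`), and the one-cycle-per-micro-operation cost are all assumptions made for illustration.

```python
# Sketch: a toy microprogrammed CPU. Each machine instruction expands into a
# sequence of micro-operations held in a table standing in for the ROM.
# Assumption: every micro-operation costs one clock cycle.

def set_mar(s, addr):    s["mar"] = addr
def read_mem(s, addr):   s["mdr"] = s["mem"][s["mar"]]
def write_mem(s, addr):  s["mem"][s["mar"]] = s["mdr"]
def mdr_to_acc(s, addr): s["acc"] = s["mdr"]
def acc_to_mdr(s, addr): s["mdr"] = s["acc"]
def add_mdr(s, addr):    s["acc"] += s["mdr"]


# Hypothetical microcode ROM: opcode -> micro-operation sequence.
MICROCODE_ROM = {
    "LOAD":  (set_mar, read_mem, mdr_to_acc),
    "ADD":   (set_mar, read_mem, add_mdr),
    "STORE": (set_mar, acc_to_mdr, write_mem),
}


def run(program, memory):
    """Execute (opcode, address) pairs and count the cycles consumed."""
    state = {"acc": 0, "mar": 0, "mdr": 0, "mem": memory, "cycles": 0}
    for opcode, addr in program:
        for micro_op in MICROCODE_ROM[opcode]:
            micro_op(state, addr)
            state["cycles"] += 1
    return state


result = run([("LOAD", 0), ("ADD", 1), ("STORE", 2)], [2, 3, 0])
print(result["mem"][2], result["cycles"])  # 5 9
```

Adding a new complex instruction only requires another table entry, which reflects the flexibility advantage, while the nine cycles spent on three instructions reflect the cost in clock cycles.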