von Neumann

Stored-program concept
Main memory stores both programs and data
ALU performs calculations on data
Control unit interprets instructions from memory and executes them
Input and output equipment is operated by the control unit

Memory of 1000 x 40-bit words
Each word holds a binary number or 2 x 20-bit instructions

Set of registers (storage in CPU):
Memory Buffer Register
Memory Address Register
Instruction Register
Instruction Buffer Register
Program Counter
Accumulator
Multiplier Quotient

Moore's Law
Increased density of components on a chip
Gordon Moore, cofounder of Intel, observed that the number of transistors on a chip would double every year
Since the 1970s development has slowed a little: the number of transistors now doubles every 18 months
Cost of a chip has remained almost unchanged
Higher packing density means shorter electrical paths, giving higher performance
Smaller size gives increased flexibility
Reduced power and cooling requirements
Fewer interconnections increase reliability

Speeding it up
Pipelining
On board cache
On board L1 & L2 cache
Branch prediction
Data flow analysis
Speculative execution

Intel 8086 & 8088


The 8086 is a 16-bit microprocessor chip designed by Intel and introduced on the market in 1978, which gave rise to the x86 architecture. Intel 8088, released in 1979, was essentially the same chip, but with an external 8-bit data bus (allowing the use of cheaper and fewer supporting logic chips), and is notable as the processor used in the original IBM PC.

Segmentation
Compilers for the 8086 commonly supported two types of pointer, "near" and "far". Near pointers were 16-bit addresses implicitly associated with the program's code or data segment (and so made sense only in programs small enough to fit in one segment). Far pointers were 32-bit segment:offset pairs.
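
To make the segment:offset idea concrete, here is a minimal C++ sketch of how a real-mode 8086 turns a far pointer into a 20-bit physical address (segment times 16 plus offset). The names `FarPointer`, `physical_address` and `same_location` are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// A far pointer as described above: a 16-bit segment plus a 16-bit offset.
struct FarPointer {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Real-mode 8086 address formation: physical = segment * 16 + offset,
// truncated to the 20 address lines of the 8086.
std::uint32_t physical_address(FarPointer p) {
    std::uint32_t linear = (static_cast<std::uint32_t>(p.segment) << 4) + p.offset;
    return linear & 0xFFFFFu;  // wrap around at 1 MiB
}

// Different segment:offset pairs can name the same byte,
// so far pointers are compared by physical address here.
bool same_location(FarPointer a, FarPointer b) {
    return physical_address(a) == physical_address(b);
}
```

For example, 1234h:0010h and 1235h:0000h both map to physical address 12350h, which is one reason comparing far pointers field by field can give misleading results.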

To avoid the need to specify "near" and "far" on every pointer and every function which took or returned a pointer, compilers also supported "memory models" which specified default pointer sizes. The "small", "compact", "medium", and "large" models covered every combination of near and far pointers for code and data. The "tiny" model was like "small" except that code and data shared one segment. The "huge" model was like "large" except that all pointers were huge instead of far by default. Precompiled libraries often came in several versions compiled for different memory models.

x86 Registers Category

Category                      Bits    Register Name
General Purpose Register      32/16   E/AX, E/BX, E/CX, E/DX
General Purpose Register      8       AH, AL, BH, BL, CH, CL, DH, DL
Pointer Register              32/16   E/SP (Stack Pointer), E/BP (Base Pointer)
Index Register                32/16   E/SI (Source Index), E/DI (Destination Index)
Segment Register              16      CS (Code Segment), DS (Data Segment), SS (Stack Segment), ES (Extra Segment)
Instruction Pointer Register  32/16   E/IP (Instruction Pointer)
Status Register (Flag)        32/16   E/FLAGS (Flag Register)

General Purpose Registers


These registers are used for general data manipulation.
Although the CPU can operate on data stored in memory, the same data can be processed much faster if it is in a register.

Register   Function
E/AX       Accumulator Register: for arithmetic, logic and data transfer operations
E/BX       Base Register: also used as an address register
E/CX       Count Register: used as a loop counter and for shifting and rotating bits
E/DX       Data Register: used in division and multiplication, and in I/O operations

8-bit Data Division from 16-bit


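A 16-bit register can be divided into two 8-bit registers (i.e., AX = AH and AL, BX = BH and BL, CX = CH and CL, DX = DH and DL), where the H register holds the high byte and the L register holds the low byte.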

Figure 1: 8-bit Data Division from 16-bit
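
The split of a 16-bit register into high and low bytes can be sketched in C++ with shifts and masks. `Reg16` and its member names are hypothetical; the sketch only models the arithmetic, not the processor itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one 16-bit general purpose register such as AX.
struct Reg16 {
    std::uint16_t x = 0;  // the full 16-bit value (AX)

    // High byte (AH) and low byte (AL) of the 16-bit value.
    std::uint8_t high() const { return static_cast<std::uint8_t>(x >> 8); }
    std::uint8_t low() const { return static_cast<std::uint8_t>(x & 0xFF); }

    // Writing one half leaves the other half unchanged.
    void set_high(std::uint8_t v) {
        x = static_cast<std::uint16_t>((v << 8) | (x & 0x00FF));
    }
    void set_low(std::uint8_t v) {
        x = static_cast<std::uint16_t>((x & 0xFF00) | v);
    }
};
```

With the register holding 1234h, `high()` yields 12h and `low()` yields 34h; calling `set_low(0xFF)` then changes the full value to 12FFh.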


16-bit Data form into 32-bit


Similarly, a 32-bit register is made up of a 16-bit register plus 16 upper bits that have no name of their own, i.e.
EAX = upper 16 bits + AX
EBX = upper 16 bits + BX
ECX = upper 16 bits + CX
EDX = upper 16 bits + DX

Segment Register
Main memory management in the 8086 uses the segment concept. The following table shows how each segment is used:

Segment      Usage
Code (CS)    Space to store the program that will be executed
Data (DS)    Space to store data that will be processed
Stack (SS)   Special space to store information needed by the microprocessor to execute a subroutine or interrupt service
Extra (ES)   Function is the same as DS

Instruction Pointer Register (IP)


This register stores the address of the next instruction to be executed.
Each time an instruction is fetched from memory for execution, the content of IP is incremented so that it always points to the next instruction.
For a branch instruction, IP is loaded with a new value, which is the branch address.
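
The way IP advances can be sketched as a single step of a toy fetch-execute loop in C++. The `Instr` record and the `next_ip` function are hypothetical simplifications: a real 8086 decodes the instruction length from the opcode bytes, and this sketch models only unconditional branches.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified instruction record: only what is needed to show IP updates.
struct Instr {
    std::uint16_t length;  // size of the instruction in bytes
    bool is_jump;          // true for an unconditional branch
    std::uint16_t target;  // branch address, used only when is_jump is true
};

// Returns the value of IP after fetching and executing `instr` located at `ip`.
std::uint16_t next_ip(std::uint16_t ip, const Instr& instr) {
    // Fetch: IP advances past the instruction, so it points to the next one.
    ip = static_cast<std::uint16_t>(ip + instr.length);
    // Execute: a branch overwrites IP with the branch address.
    if (instr.is_jump) ip = instr.target;
    return ip;
}
```

A 3-byte non-branch instruction at 0100h leaves IP at 0103h, while a jump located there with target 0100h sets IP back to 0100h.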

Index Register and Pointer


These registers store offset values (relative displacements) used to form memory addresses.
There are 2 pointer registers:
Stack Pointer (SP): points to the top of the stack
Base Pointer (BP): used to fetch data relative to a base address; by default it addresses the stack segment

There are 2 index registers:
Source Index (SI): contains the offset address of the source operand in the data segment
Destination Index (DI): contains the offset address of the destination operand in DS

Assembler
An assembler translates assembly language programs into machine code. It resolves symbolic names for memory locations and other entities.

There are two types of assemblers, based on how many passes through the source are needed to produce the executable program. One-pass assemblers go through the source code once and assume that all symbols will be defined before any instruction that references them. Two-pass assemblers (and multi-pass assemblers) create a table of all symbols in the first pass, then use the second pass to resolve their addresses. The advantage of the two-pass assembler is that symbols can be defined anywhere in the program source.

As a result, the program can be defined in a more logical and meaningful way. This makes two-pass assembler programs easier to read and maintain.
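
The two-pass idea can be sketched in C++ on a toy source format. Everything here is an assumption made for illustration: the `Line` record, the `assemble_two_pass` function, and the rule that every line occupies exactly one address unit. A real assembler computes instruction sizes and emits machine code as well.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy source line (hypothetical format): an optional label defined on this line
// and an optional symbol that the instruction on this line refers to.
struct Line {
    std::string label;      // e.g. "done" for "done:", empty if none
    std::string reference;  // e.g. "done" for "JMP done", empty if none
};

// Returns, for every line, the address its reference resolves to (-1 if it has none).
// Assumption: every line occupies exactly one address unit, starting at `origin`.
std::vector<int> assemble_two_pass(const std::vector<Line>& source, int origin) {
    // Pass 1: record the address of every label; nothing is resolved yet.
    std::map<std::string, int> symbols;
    int address = origin;
    for (const Line& line : source) {
        if (!line.label.empty()) symbols[line.label] = address;
        ++address;
    }
    // Pass 2: all symbols are known now, so forward references resolve too.
    std::vector<int> resolved;
    for (const Line& line : source) {
        if (line.reference.empty()) {
            resolved.push_back(-1);
            continue;
        }
        auto it = symbols.find(line.reference);
        if (it == symbols.end()) {
            throw std::runtime_error("undefined symbol: " + line.reference);
        }
        resolved.push_back(it->second);
    }
    return resolved;
}
```

In this sketch a jump on the first line to a label defined on the last line resolves correctly, which is exactly the case a one-pass assembler would have to reject or patch up later.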

Variable Declarations
Our compiler supports two types of variables: BYTE and WORD.
Syntax for a variable declaration:
    name DB value
    name DW value
DB stands for Define Byte.
DW stands for Define Word.
name can be any combination of letters and digits.
value can be any numeric value in any supported numbering system (hexadecimal, binary, or decimal), or the "?" symbol for variables that are not initialized.

Declare Variables
Use DB to declare variables with small values, 0 to 255.
Use DW to declare variables with larger values, 0 to 65535.
C++:
    int HT=47, WD=2415, SZ;
ASM:
    HT DB 47
    WD DW 2415
    SZ DW ?
An uninitialised value in ASM is denoted by ?
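
The unsigned ranges of a byte and a word follow from their widths (8 and 16 bits). The C++ sketch below picks the smallest directive that can hold a given unsigned value; `directive_for` is a hypothetical helper, not part of any assembler.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Smallest directive that can hold an unsigned value, or "" if neither can.
std::string directive_for(unsigned long value) {
    if (value <= std::numeric_limits<std::uint8_t>::max()) return "DB";   // 0..255
    if (value <= std::numeric_limits<std::uint16_t>::max()) return "DW";  // 0..65535
    return "";  // too large for a single byte or word
}
```

Under these limits 47 fits in a byte, 2415 needs a word, and 65536 fits in neither.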

comments / remarks
C++: comments are denoted by //; anything after // on the line is ignored.
ASM: comments are denoted by a semicolon (;); anything after it on the line is ignored.
C++ also has multi-line comments enclosed by /* and */. In ASM, each comment line must be individually preceded by ;

Simple Assignments
The easiest expressions to convert to assembly language are the simple assignments, which use the MOV instruction. Simple assignments copy a single value into a variable.

Eg:

variable := value
C++: P = 5;
ASM: MOV P, 5

This move immediate instruction copies the constant into the variable.

Small Sample Program


    ORG 100h
    MOV AL, var1
    MOV BX, var2
    RET            ; stops the program.
    var1 DB 7
    var2 DW 1234h

ORG
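Sets the program start address. A program normally starts at address 100H (hexadecimal), hence ORG 100H.

RET
Similar to the C++ return statement; here it returns control to the operating system and so stops the program.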

Simple Assignments (MOV)
Numbers can be entered in any supported number system. Hexadecimal numbers take an "h" suffix, binary numbers a "b" suffix, and octal numbers an "o" suffix; decimal numbers require no suffix.

    mov AX, 46H    ;hex
    mov BX, 1011B  ;binary
    mov AH, 251o   ;octal
    mov CH, 36     ;decimal
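
As a rough illustration of the suffix convention, the following C++ sketch converts such a literal to its numeric value. `parse_literal` is a hypothetical helper written for this example; it does not reproduce any particular assembler's parser and ignores details such as leading-zero rules for hexadecimal.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Parses a numeric literal written with the suffix convention described above:
// "h" = hexadecimal, "b" = binary, "o" = octal, no suffix = decimal.
unsigned long parse_literal(const std::string& text) {
    if (text.empty()) throw std::invalid_argument("empty literal");
    std::string digits = text;
    int base = 10;
    char last = static_cast<char>(std::tolower(static_cast<unsigned char>(text.back())));
    if (last == 'h') base = 16;
    else if (last == 'b') base = 2;
    else if (last == 'o') base = 8;
    if (base != 10) digits.pop_back();  // drop the suffix character
    std::size_t used = 0;
    unsigned long value = std::stoul(digits, &used, base);
    if (used != digits.size()) throw std::invalid_argument("bad digit in: " + text);
    return value;
}
```

Applied to the operands above, 46H is 70, 1011B is 11, 251o is 169 and 36 stays 36 in decimal.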

Assembler Directive/Pseudo-Ops
There are 2 types of ASM statements:

Program instructions
    Converted into the machine code of the program, e.g. ADD, SUB, MUL, MOV, DIV

Assembler directives (pseudo-ops)
    Not converted into the machine code of the program; they give information to the assembler, e.g.:
    variable declaration: DB, DW
    program start address: ORG
    data, code and stack segments: .DATA, .CODE, .STACK

Assignments (MOV)
This example assignment copies a variable into a variable

P := Q
The assignment above is somewhat complicated since the 80x86 doesn't provide a memory-to-memory mov instruction. Therefore, to copy one memory variable into another, you must move the data through a register.

Eg:

C++:
    P = Q;
ASM:
    MOV AX, Q    ;AX = Q
    MOV P, AX    ;P = AX = Q

Addition (ADD)
Examples of common simple expressions:

X := Y + Z
ASM:

    mov ax, y    ;ax=y
    add ax, z    ;ax=ax+z=y+z
    mov x, ax    ;x=ax=y+z

Arithmetic Expressions
Arithmetic expressions, in most high level languages, look similar to their algebraic equivalents:

X:=Y+Z;
In assembly language, you'll need several statements to accomplish this same task, e.g.,

    mov ax, y    ;ax=y
    add ax, z    ;ax=ax+z=y+z
    mov x, ax    ;x=ax=y+z

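Arithmetic Expressions
A math expression takes the form:
var := term1 op term2
var is a variable, term1 and term2 are variables or constants, and op is some arithmetic operator (+, -, *, /, etc.)
ASM: op term1, term2
where op = ADD, SUB, MUL, IMUL, DIV, IDIV, etc. (on the 8086, ADD and SUB take two operands, while MUL, IMUL, DIV and IDIV take one explicit operand and use the accumulator implicitly)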

SUBTRACT (SUB)
X := Y - Z;
ASM:

    mov ax, y    ;ax=y
    sub ax, z    ;ax=ax-z=y-z
    mov x, ax    ;x=ax=y-z

INCREMENT (INC)
X := X + 1; ASM: inc x

DECREMENT (DEC)
X := X - 1; ASM: dec x

Instruction Format
An instruction has an opcode plus operand(s):
    ADD AX, 2
    SUB BX, Y
    MOV Z, 1584
    INC Y

The opcodes are ADD, SUB, MOV, INC
The operands are AX, BX, Y, Z, 2, 1584

Complex Expressions
A complex expression that is easy to convert to assembly language is one that involves three terms and two operators, for example:
W := W - Y - Z;

Clearly the straightforward assembly language conversion of this statement will require two sub instructions. However, even with an expression as simple as this one, the conversion is not trivial. There are actually two ways to convert the statement above into assembly language; one of them is:

    mov ax, w
    sub ax, y
    sub ax, z
    mov w, ax
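
Why the conversion is not trivial can be seen by comparing two groupings of the subtraction. The C++ sketch below mirrors the sequence shown above with a 16-bit `ax` variable, and contrasts it with a grouping that subtracts Z from Y first. That second grouping is an assumption made for illustration, since the text does not show the other conversion it alludes to; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors: mov ax, w / sub ax, y / sub ax, z / mov w, ax
std::uint16_t left_to_right(std::uint16_t w, std::uint16_t y, std::uint16_t z) {
    std::uint16_t ax = w;
    ax = static_cast<std::uint16_t>(ax - y);
    ax = static_cast<std::uint16_t>(ax - z);
    return ax;  // (w - y) - z
}

// A tempting alternative grouping: compute y - z first, then subtract it from w.
std::uint16_t grouped_differently(std::uint16_t w, std::uint16_t y, std::uint16_t z) {
    std::uint16_t ax = y;
    ax = static_cast<std::uint16_t>(ax - z);
    return static_cast<std::uint16_t>(w - ax);  // w - (y - z), a different value in general
}
```

With W = 10, Y = 3 and Z = 2, the left-to-right version gives 5 while the regrouped version gives 9, because subtraction is not associative.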

Debuggers and Debugging


The final, and almost certainly the most painful, part of the assembly language development process is debugging. Debugging is simply the systematic process by which bugs are located and corrected. A debugger is a utility program designed specifically to help you locate and identify bugs.

One of the problems with debugging computer programs is that they operate so quickly. Thousands of machine instructions can be executed in a single second, and if one of those instructions isn't quite right, it's past and gone long before you can identify which one it is by staring at the screen. A debugger allows you to execute the machine instructions in a program one at a time, allowing you to pause indefinitely between each one to examine the effects of the last instruction on the screen. The debugger also lets you look at the contents of any location in memory, and the values stored in any register, during that pause between instructions.

Opcode mnemonics
Instructions (statements) in assembly language are generally very simple, unlike those in high-level languages. Generally, a mnemonic is a symbolic name for a single executable machine language instruction (an opcode), and there is at least one opcode mnemonic defined for each machine language instruction. Each instruction typically consists of an operation or opcode plus zero or more operands.

Assembly directives / pseudo-ops


Assembly Directives are instructions that are executed by the Assembler at assembly time, not by the CPU at run time.

Assembling the Source Code File


The text editor first creates a new text file, and later changes that same text file, as you extend, modify, and perfect your assembly language program. As a convention, most assembly language source code files are given a file extension of .ASM. In other words, for the program named FOO, the assembly language source code file would be named FOO.ASM. It is possible to use file extensions other than .ASM, but I feel that using the .ASM extension can eliminate some confusion by allowing you to tell at a glance what a file is for, just by looking at its name. All told, about nine different kinds of files can be involved during assembly language development, and more if you take the horrendous leap into Windows software development.

Each type of file will have its own standard file extension. Anything that will help you keep all that complexity in line will be worth the (admittedly) rigid confines of a standard naming convention. As you can see from the flow in the figure above, the editor produces a source code text file, which we show as having the .ASM extension. This file is then passed to the assembler program itself, for translation to a relocatable object module file with an extension of .OBJ. When you invoke the assembler, DOS will load the assembler from disk and run it.

The assembler will open the source code file you named after the name of the assembler and begin processing the file. Almost immediately afterward, it will create an object file with the same name as the source file, but with an .OBJ extension. As the assembler reads lines from the source code file, it will examine them, construct the binary machine instructions the source code lines represent, and then write those machine instructions to the object code file. When the assembler comes to the end of the source code file, it will close both source code file and object code file and return control to DOS.

Linking
In traditional assembly language work, what actually happens is that the assembler writes an intermediate object code file with an .OBJ extension to disk. You can't run this .OBJ file, even though it generally contains all the machine instructions that your assembly language source code file specified. The .OBJ file needs to be processed by another translator program, the linker. The linker performs a number of operations on the .OBJ file, most of which would be meaningless to you at this point. The most obvious task the linker does is to weave several .OBJ files into a single .
