You are on page 1of 52

CS2422 Assembly Language and System Programming

Assembly Language
Fundamentals

Department of Computer Science


National Tsing Hua University
Assembly Language for Intel-
CS2422 Assembly Language and System Programming
Based Computers, 5th Edition
Kip Irvine

Chapter 3: Assembly Language


Fundamentals
Slides prepared by the author
Revision date: June 4, 2006

(c) Pearson Education, 2006-2007. All rights reserved. You may modify and copy this slide show for your personal use,
or for use in the classroom, as long as this copyright statement, the author's name, and the title are not changed.
Chapter Overview
 Basic Elements of Assembly Language
 Example: Adding and Subtracting Integers
 Assembling, Linking, and Running Programs
 Defining Data
 Symbolic Constants
 Real-Address Mode Programming

2
Starting with an Example
TITLE Add and Subtract (AddSub.asm)
; Adds and subtracts three 32-bit integers
; (10000h + 40000h + 20000h)
INCLUDE Irvine32.inc
.code
Title/header
main PROC
mov eax,10000h ; EAX = 10000hInclude file
add eax,40000h ; EAX = 50000h
sub eax,20000h ; EAX = 30000h
call DumpRegs ; display registers
exit
main ENDP
END main Code section
3
Meanings of the Code
Assembly code Machine code
MOV EAX, 10000h B8 00010000
(Move 10000h into EAX)
Operand in instruction

ADD EAX, 40000h 05 00040000


(Add 40000h to EAX)

SUB EAX, 20000h 2D 00020000


(SUB 20000h from EAX)

4
Fetched MOV EAX, 10000h

Register Memory
EAX
EBX
… data B8
00
01 MOV EAX, 10000h
00
00
05
00
ALU 04 ADD EAX, 40000h
00
00
SUB EAX, 20000h
IR PC
B8 00010000 0000011 …

address 5
Execute MOV EAX, 10000h

Register Memory
EAX 00010000
EBX
… data B8
00
01 MOV EAX, 10000h
00
00
05
00
ALU 04 ADD EAX, 40000h
00
00
SUB EAX, 20000h
IR PC
B8 00010000 0000011 …

address 6
Fetched ADD EAX, 40000h

Register Memory
EAX 00010000
EBX
… data B8
00
01 MOV EAX, 10000h
00
00
05
00
ALU 04 ADD EAX, 40000h
00
00
SUB EAX, 20000h
IR PC
05 00040000 0001000 …

address 7
Execute ADD EAX, 40000h

Register Memory
EAX 00010000
00050000
EBX
… data B8
00
01 MOV EAX, 10000h
00
00
05
00
ALU 04 ADD EAX, 40000h
00
00
SUB EAX, 20000h
IR PC
05 00040000 0001000 …

address 8
Chapter Overview
 Basic Elements of Assembly Language
 Integer constants and expressions
 Character and string constants
 Reserved words and identifiers
 Directives and instructions
 Labels
 Mnemonics and Operands
 Comments
 Example: Adding and Subtracting Integers
 Assembling, Linking, and Running Programs
 Defining Data
 Symbolic Constants
 Real-Address Mode Programming
9
Reserved Words, Directives
 TITLE:
TITLE Add and …
 Define program listing
; Adds and subtracts title
; (10000h + …  Reserved word of
INCLUDE Irvine32.inc directive
.code  Reserved words
main PROC  Instruction mnemonics,
mov eax,10000h directives, type
add eax,40000h attributes, operators,
predefined symbols
sub eax,20000h
 See MASM reference
call DumpRegs in Appendix A
exit  Directives:
main ENDP  Commands for
END main assembler
10
Directive vs Instruction
 Directives: tell assembler what to do
 Commands that are recognized and acted upon
by the assembler, e.g. declare code, data areas,
select memory model, declare procedures, etc.
 Not part of the Intel instruction set
 Different assemblers have different directives
 Instructions: tell CPU what to do
 Assembled into machine code by assembler
 Executed at runtime by the CPU
 Member of the Intel IA-32 instruction set
 Format:
LABEL (option), Mnemonic, Operands, Comment

11
Comments

TITLE Add and …


 Single-line comments
 begin with semicolon (;)
; Adds and subtracts
; (10000h + …
 Multi-line comments
 begin with COMMENT
INCLUDE Irvine32.inc
directive and a
.code programmer-chosen
main PROC character, end with the
mov eax,10000h same character, e.g.
add eax,40000h
sub eax,20000h COMMENT !
call DumpRegs Comment line 1
exit Comment line 2
main ENDP !
END main
12
Include Files
TITLE Add and …  INCLUDE directive:
; Adds and subtracts  Copies necessary
definitions and setup
; (10000h + … information from a text
INCLUDE Irvine32.inc file named Irvine32.inc,
.code located in the
main PROC assembler’s INCLUDE
directory (see Chapt 5)
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
13
Code Segment

TITLE Add and …


 .code directive:
 Marks the beginning of
; Adds and subtracts
the code segment,
; (10000h + … where all executable
INCLUDE Irvine32.inc statements in a
.code program are located
main PROC
mov eax,10000h
add eax,40000h
sub eax,20000h
call DumpRegs
exit
main ENDP
END main
14
Procedure Definition
 Procedure defined by:
TITLE Add and …
 [label] PROC
; Adds and subtracts
 [label] ENDP
; (10000h + …
 Label:
INCLUDE Irvine32.inc  Place markers: marks
.code the address (offset) of
main PROC code and data
mov eax,10000h  Assigned a numeric
add eax,40000h address by assembler
sub eax,20000h  Follow identifier rules
 Data label: must be
call DumpRegs
unique, e.g. myArray
exit
 Code label: target of
main ENDP jump and loop
END main instructions, e.g. L1:
15
Identifiers
TITLE Add and …  Identifiers:
; Adds and subtracts  A programmer-chosen
name to identify a
; (10000h + … variable, a constant, a
INCLUDE Irvine32.inc procedure, or a code
.code label
main PROC  1-247 characters,
mov eax,10000h including digits
add eax,40000h  not case sensitive
sub eax,20000h  first character must be
a letter, _, @, ?, or $
call DumpRegs
exit
main ENDP
END main
16
Integer Constants
 Optional leading + or – sign
 Binary, decimal, hexadecimal, or octal digits
 Common radix characters:
 h – hexadecimal
 d – decimal
 b – binary
 r – encoded real

 Examples: 30d, 6Ah, 42, 1101b


 Hexadecimal beginning with letter: 0A5h

17
Instructions
[label:] mnemonic operand(s)
TITLE Add and … [;comment]
; Adds and subtracts  Instruction mnemonics:
; (10000h + …  help to memorize
INCLUDE Irvine32.inc  examples: MOV, ADD,
.code SUB, MUL, INC, DEC
main PROC  Operands:

mov eax,10000h  constant


add eax,40000h  constant expression

sub eax,20000h  register

call DumpRegs  memory (data label,

exit register)
main ENDP Destination Source immediate values
END main operand operand
18
Instruction Format Examples
 No operands
 stc ; set Carry flag
 nop ; no operation
 One operand
 inc eax ; register
 inc myByte ; memory
 Two operands
 add ebx,ecx ; register, register
 sub myByte,25 ; memory, constant
 add eax,36 * 25 ; register, constant-expr.

19
I/O

TITLE Add and …  Not easy, if program by


; Adds and subtracts ourselves
; (10000h + …  Will use the library
provided by the author
INCLUDE Irvine32.inc
 Two steps:
.code
 Include the library
main PROC (Irvine32.inc) in your
mov eax,10000h code
add eax,40000h  Call the subroutines
sub eax,20000h  call DumpRegs:
call DumpRegs  Calls the procedure to
exit displays current values
main ENDP of processor registers
END main
20
Remaining

TITLE Add and …  exit:


; Adds and subtracts  Halts the program
; (10000h + …  Not a MSAM keyword,
but a command
INCLUDE Irvine32.inc
defined in Irvine32.inc
.code
 END main:
main PROC
 Marks the last line of
mov eax,10000h the program to be
add eax,40000h assembled
sub eax,20000h  Identifies the name of
call DumpRegs the program’s startup
exit procedure
main ENDP
END main
21
Example Program Output
 Program output, showing registers and flags

EAX=00030000 EBX=7FFDF000 ECX=00000101 EDX=FFFFFFFF


ESI=00000000 EDI=00000000 EBP=0012FFF0 ESP=0012FFC4
EIP=00401024 EFL=00000206 CF=0 SF=0 ZF=0 OF=0

22
Alternative Version of AddSub
TITLE Add and Subtract (AddSubAlt.asm)
; adds and subtracts 32-bit integers
.386
.MODEL flat,stdcall
.STACK 4096
ExitProcess PROTO, dwExitCode:DWORD
DumpRegs PROTO
.code
main PROC
mov eax,10000h ; EAX = 10000h
add eax,40000h ; EAX = 50000h
sub eax,20000h ; EAX = 30000h
call DumpRegs
INVOKE ExitProcess,0
main ENDP
END main 23
Explanations
 .386 directive:
 Minimum processor required for this code
 .MODEL directive:
 Generate code for protected mode program
 Stdcall: enable calling of Windows functions
 PROTO directives:
 Prototypes for procedures
 ExitProcess: Windows function to halt process
 INVOKE directive:
 Calls a procedure or function
 Calls ExitProcess and passes it with a return code
of zero

24
Suggested Program Template
TITLE Program Template (Template.asm)
; Program Description:
; Author:
; Creation Date:
; Revisions:
; Date: Modified by:
INCLUDE Irvine32.inc
.data
; (insert variables here)
.code
main PROC
; (insert executable instructions here)
exit
main ENDP
; (insert additional procedures here)
END main 25
What's Next
 Basic Elements of Assembly Language
 Example: Adding and Subtracting Integers
 Assembling, Linking, and Running Programs
 Defining Data
 Symbolic Constants
 Real-Address Mode Programming

26
Assemble-Link-Execute Cycle
 Steps from creating a source program through
executing the compiled program

Link
Library
Step 2: Step 3: Step 4:
Source assembler Object linker Executable OS loader
Output
File File File

Listing Map
Step 1: text editor File File

http://kipirvine.com/asm/gettingStarted/index.htm

27
MASM History
 v6.11
 Independent product
 v6.15
 Visual C++ 6.0 Processor Pack
 v7.0
 Visual C++ .NET 2002
 v7.1
 Visual C++ .NET 2003
 v8.0
 Visual C++ .NET 2005

28
Download, Install, and Run
 MASM 6.15 (with all examples of textbook)
 Masm615.zip: download from the course web site
 Unzip the archive and run setup.exe
 Choose the installation directory
 Suggest using the default directory
 See index.htm in the archive for details
 Go to C:\Masm615 (if installed default)
 Write assembly source code
‒ TextPad, NotePad++, UltraEdit or …
 make32 xxx (where xxx is your file name)

29
Suggestion
 Study make32.bat and make16.bat
 To know where assembling stage and linking
stage are
 Think about linking with other language (ex: C or
C++ or …)
 Understand that MASM is only one of the
assemblers, and there are still many other
assemblers to use
 Try to use NASM or TASM
 Try to use high‐level compiler to generate
assembly codes
 gcc or visual c++ or turbo c or …

30
Listing File
 Use it to see how your program is compiled
 Contains
 source code
 addresses
 object code (machine language)
 segment names
 symbols (variables, procedures, and constants)
 Example: addSub.lst

31
Listing File
00000000 .code
00000000 main PROC

00000000 B8 00010000 mov eax,10000h


00000005 05 00040000 add eax,40000h
0000000A 2D 00020000 sub eax,20000h
0000000F E8 00000000E call DumpRegs

exit
00000014 6A 00 * push +000000000h
00000016 E8 00000000E * call ExitProcess
0000001B main ENDP
END main
address memory
content 32
What's Next
 Basic Elements of Assembly Language
 Example: Adding and Subtracting Integers
 Assembling, Linking, and Running Programs
 Defining Data
 Intrinsic Data Types
 Data Definition Statement
 Defining BYTE, SBYTE, WORD, SWORD,
DWORD, SDWORD, QWORD, TBYTE
 Defining Real Number Data
 Little Endian Order
 Symbolic Constants
 Real-Address Mode Programming
33
Intrinsic Data Types
BYTE 8-bit unsigned integer
SBYTE 8-bit signed integer
WORD 16-bit unsigned integer
SWORD 16-bit signed integer
DWORD 32-bit unsigned integer
SDWORD 32-bit signed integer
FWORD 48-bit integer (Far pointer in protected
mode)
QWORD 64-bit integer
TBYTE 80-bit (10-byte) integer
REAL4 32-bit (4-byte) IEEE short real
REAL8 64-bit (8-byte) IEEE long real
REAL10 80-bit (10-byte) IEEE extended real
34
Data Definition Statement
 A data definition statement sets aside storage in
memory for a variable
 May optionally assign a name (label) to the data
 Syntax:
[name] directive initializer [,initializer] . . .

value1 BYTE 10

 All initializers become binary data in memory


 Use ? if no initialization necessary
 Example: Var1 BYTE ?
35
Defining BYTE and SBYTE Data
 Each of following defines a single byte of storage:
value1 BYTE 'A' ; character constant
value2 BYTE 0 ; smallest unsigned byte
value3 BYTE 255 ; largest unsigned byte
value4 SBYTE -128 ; smallest signed byte
value5 SBYTE +127 ; largest signed byte
value6 BYTE ? ; uninitialized byte

• MASM does not prevent you from initializing a BYTE with a


negative value, but it is considered poor style
• If you declare a SBYTE variable, the Microsoft debugger will
automatically display its value in decimal with a leading sign

36
Defining Byte Arrays
 Examples that use multiple initializers:

list1 BYTE 10,20,30,40


list2 BYTE 10,20,30,40
BYTE 50,60,70,80
BYTE 81,82,83,84
list3 BYTE ?,32,41h,00100010b
list4 BYTE 0Ah,20h,‘A’,22h

37
Defining Strings
 An array of characters
 Usually enclosed in quotation marks
 Will often be null-terminated
 To continue a single string across multiple lines,
end each line with a comma
str1 BYTE "Enter your name",0
str2 BYTE 'Error: halting program',0
str3 BYTE 'A','E','I','O','U'
greeting BYTE "Welcome to the Encryption Demo program "
BYTE "created by Kip Irvine.",0
menu BYTE "Checking Account",0dh,0ah,0dh,0ah,
"1. Create a new account",0dh,0ah,
"2. Open an existing account",0dh,0ah,
"Choice> ",0
End-of-line sequence:
Is str1 an array? •0Dh = carriage return
38
•0Ah = line feed
Using the DUP Operator
 Use DUP to allocate (create space for) an array
or string
 Syntax:
counter DUP (argument)
 Counter and argument must be constants or
constant expressions

var1 BYTE 20 DUP(0) ; 20 bytes, all equal to zero


var2 BYTE 20 DUP(?) ; 20 bytes, uninitialized
var3 BYTE 4 DUP("STACK"); 20 bytes,
; "STACKSTACKSTACKSTACK"
var4 BYTE 10,3 DUP(0),20 ; 5 bytes

39
Defining WORD and SWORD
 Define storage for 16-bit integers
 or double characters
 single value or multiple values

word1 WORD 65535 ; largest unsigned value


word2 SWORD –32768 ; smallest signed value
word3 WORD ? ; uninitialized, unsigned
word4 WORD "AB" ; double characters
myList WORD 1,2,3,4,5 ; array of words
array WORD 5 DUP(?) ; uninitialized array

40
Defining Other Types of Data
 Storage definitions for 32-bit integers, quadwords,
tenbyte values, and real numbers:
val1 DWORD 12345678h ; unsigned
val2 SDWORD –2147483648 ; signed
val3 DWORD 20 DUP(?) ; unsigned array
val4 SDWORD –3,–2,–1,0,1 ; signed array
quad1 QWORD 1234567812345678h
val1 TBYTE 1000000000123456789Ah
rVal1 REAL4 -2.1
rVal2 REAL8 3.2E-260
rVal3 REAL10 4.6E+4096
ShortArray REAL4 20 DUP(0.0)

41
Adding Variables to AddSub
TITLE Add and Subtract, Version 2 (AddSub2.asm)
; This program adds and subtracts 32-bit unsigned
; integers and stores the sum in a variable.
INCLUDE Irvine32.inc
.data
val1 DWORD 10000h
val2 DWORD 40000h
val3 DWORD 20000h
finalVal DWORD ?
.code
main PROC
mov eax,val1 ; start with 10000h
add eax,val2 ; add 40000h
sub eax,val3 ; subtract 20000h
mov finalVal,eax ; store the result (30000h)
call DumpRegs ; display the registers
exit
main ENDP
END main 42
Listing File
00000000 .data
00000000 00010000 val1 DWORD 10000h
00000004 00040000 val2 DWORD 40000h
00000008 00020000 val3 DWORD 20000h
0000000C 00000000 finalVal DWORD ?

00000000 .code
00000000 main PROC
00000000 A1 00000000 R mov eax,val1 ; start with 10000h
00000005 03 05 00000004 R add eax,val2 ; add 40000h
0000000B 2B 05 00000008 R sub eax,val3 ; subtract 20000h
00000011 A3 0000000C R mov finalVal,eax; store result
00000016 E8 00000000 E call DumpRegs ; display registers
exit
00000022 main ENDP

43
C vs Assembly
main() .data
{ val1 DWORD 10000h
int val1=10000h; val2 DWORD 40000h
int val2=40000h; val3 DWORD 20000h
int val3=20000h; finalVal DWORD ?
int finalVal; .code
main PROC
finalVal = val1 mov eax,val1
+ val2 - val3; add eax,val2
} sub eax,val3
mov finalVal,eax
call DumpRegs
exit
main ENDP
44
What's Next
 Basic Elements of Assembly Language
 Example: Adding and Subtracting Integers
 Assembling, Linking, and Running Programs
 Defining Data
 Symbolic Constants
 Equal-Sign Directive
 Calculating the Sizes of Arrays and Strings
 EQU Directive
 TEXTEQU Directive
 Real-Address Mode Programming

45
Equal-Sign Directive
 name = expression
 expression is a 32-bit integer (expression or
constant)
 may be redefined
 name is called a symbolic constant
 Also OK to use EQU
 good programming style to use symbols

COUNT = 500

mov al,COUNT

46
Calculating the Size of Arrays
 Current location counter: $
 Size of a byte array
 Subtract address of list and difference is the
number of bytes
list BYTE 10,20,30,40
ListSize = ($ - list)

 Size of a word array


 Divide total number of bytes by 2 (size of a word)

list WORD 1000h,2000h,3000h,4000h


ListSize = ($ - list) / 2

47
EQU Directive
 Define a symbol as either an integer or text
expression
 Cannot be redefined
 OK to use expressions in EQU:
 Matrix1 EQU 10 * 10
 Matrix1 EQU <10 * 10>
 No expression evaluation if within < >
 EQU accepts texts too
PI EQU <3.1416>
pressKey EQU <"Press any key to continue",0>
.data
prompt BYTE pressKey
48
TEXTEQU Directive
 Define a symbol as either an integer or text
expression
 Called a text macro
 Can be redefined
continueMsg TEXTEQU <"Do you wish to continue (Y/N)?">
rowSize = 5
.data
prompt1 BYTE continueMsg
count TEXTEQU %(rowSize * 2) ; evaluates expression
setupAL TEXTEQU <mov al,count>
.code
setupAL ; generates: "mov al,10"

49
What's Next
 Basic Elements of Assembly Language
 Example: Adding and Subtracting Integers
 Assembling, Linking, and Running Programs
 Defining Data
 Symbolic Constants
 Real-Address Mode Programming (skipped)

50
Summary
 Integer expression, character constant
 Directive – interpreted by the assembler
 Instruction – executes at runtime
 Code, data, and stack segments
 Source, listing, object, map, executable files
 Data definition directives:
 BYTE, SBYTE, WORD, SWORD, DWORD,
SDWORD, QWORD, TBYTE, REAL4, REAL8,
and REAL10
 DUP operator, location counter ($)
 Symbolic constant
 EQU and TEXTEQU
51

You might also like