Professional Documents
Culture Documents
Module 7
Procedures
Macros
Overview: Procedures
(Read chapter 5: 5.4 - 5.6, chapter 8: 8.1, 8.2)
Whats next
Stack Operations
Defining and Using Procedures
Runtime Stack
top
bottom
Runtime Stack
The stack is managed by the CPU, using special registers
SS (stack segment): pointer to the stack segment. This address is set
by the OS when memory is allocated for the program
ESP (extended stack pointer): pointer to the top of the stack. This
address is the address of the last data being inserted into the stack, so
it points to the top of the stack. Every time data is inserted, the address
decrements by 4 (4 bytes per data value for protected mode)
EBP (extended base pointer): This register is used in procedure calls to
maintain local data for a procedure. EBP is used to point to the base of
the stack frame during procedure call.
Offset
00001000
00000FFC
00000FF8
00000FF4
00000FF0
00000006
ESP
Stack Segment
PUSH Operation
AFTER
00001000
00000006
00000FFC
00000FFC
000000A5
00000FF8
00000FF8
00000FF4
00000FF4
00000FF0
00000FF0
00001000
00000006
ESP
ESP
00000006
00000FFC
000000A5
00000FF8
00000001
00000FF4
00000002
ESP
00000FF0
POP Operation
AFTER
00001000
00000006
00001000
00000006
00000FFC
000000A5
00000FFC
000000A5
00000FF8
00000001
00000FF8
00000001
00000FF4
00000002
00000FF0
ESP
00000FF4
00000FF0
ESP
PUSH syntax
Data from the operand will be copied into the top of the stack.
There are 3 variations
PUSH r/m16 ; 16 bit register or memory
PUSH r/m32 ; 32 bit register or memory
PUSH imm32 ; 32 bit constant
POP syntax
Data will be removed from the top of the stack and copied
into the operand.
There are 2 variations
POP r/m16
; 16 bit register or memory
POP r/m32
; 32 bit register or memory
mov
mov
mov
call
esi,OFFSET dwordVal
ecx,LENGTHOF dwordVal
ebx,TYPE dwordVal
DumpRegs
pop
pop
pop
ebx
ecx
esi
;
;
;
;
;
;
loop L2
pop ecx
loop L1
Related Instructions
What's Next
Stack Operations
Defining and Using Procedures
Creating Procedures
Large problems can be divided into smaller tasks that are more
manageable and reusable
The following is the template of a procedure:
sample PROC
.
.
.
ret
sample ENDP
; start with:
; procedure_name and the directive PROC
;
;
;
;
;
;
; end with:
; procedure_name and the directive ENDP
Documenting Procedures
For this class the required documentation for each procedure is:
In this procedure, the input data and the return value are all
passed through registers
Before calling SumOf, the caller has to make sure that the 3 integers
are in EAX, EBX, ECX
After the call, the caller can expect EAX to contain the sum
CALL Instruction
Recall that EIP holds the address of the next instruction to be executed
CALL procedure_name
A CALL instruction:
RET Instruction
When RET runs, the value at the top of stack is popped off the stack
into EIP. This value should be the return address: the address of the
instruction in the calling procedure that execution returns to.
CALL-RET Example
At the start of the CALL instruction,
EIP = 0000 0025 (the address of the next
instruction)
When CALL runs, 0000 0025 is pushed on
the stack, then the address of MySub is
copied into EIP. EIP = 0000 0040.
This causes execution to jump to the
procedure MySub and start executing the
first instruction in MySub.
main PROC
00000020
00000025
. . .
main ENDP
MySub PROC
00000040
.
.
.
.
00000180
MySub ENDP
call MySub
mov eax,ebx
mov eax,edx
ret
ESP
Input Parameters
One of the main reasons for writing procedures is that they are independent
blocks of code that can be re-used. For example, the procedure WriteString
in the books library is re-used by all of us in our programs.
Input data to a procedure can only come from one of two sources:
registers, if input arguments are passed through registers
the stack, if input arguments are passed through the stack
Heres an example of why you cannot refer to specific variable names when
writing a procedure:
You write a procedure that uses the specific variable names input1 and
input2 in the code. This means input1 and input2 have to be defined in
the data segment of your program.
If other people use your procedure in their programs, input1 and input2
(which are addresses in your data segment) will not be available for their
program, and your procedure will no longer be able to run successfully.
This is where parameter passing comes in. When other people call your
procedure, they pass input data (such as their own values for input1 and
input2) to your procedure. This means your procedure must work with
input parameters, rather than with pre-determined variable names.
Parameter Passing (1 of 2)
There are 2 ways to pass input data to a procedure and receive the return
value from the procedure: through registers and through the stack
Parameter Passing (2 of 2)
call SumArray
Using the stack for parameter passing is more flexible than using
registers
Since the registers are not used to store input parameters, they are
available for use by the called procedure (the callee). This is
important when the callee is complex and needs to use many
registers to do its work.
There is no need to coordinate between the caller and callee
regarding which input parameter is in which register. All that needs
to be communicated is the order in which parameters are passed.
This is how HLL functions pass data to each other
To pass parameter through the stack
The caller pushes all input parameters onto the stack before the call
The callee accesses these data values in the stack when needed
The callee puts the return value (if any) on the stack
The caller pops the return value (if it exists) from the stack
One additional consideration when using the stack for parameter passing:
the stack must be completely cleared of all data by the time execution
finishes running the procedure call
If the stack is not cleared after every procedure call, and some data are left
behind, this causes a memory leak in the stack segment. If the program
runs for a long time or has a deep recursive call, it will lead to stack
overflow.
All data belonging to the procedure call (input parameters and possible
return value) must be popped off the stack at the end of a procedure call
The return value is popped off the stack by the caller to retrieve it and
this clears it from the stack
The input parameters are cleared off the stack by one of two ways:
By the called procedure: this is the STDCALL convention or the
Pascal convention. It is used by the Windows API.
By the caller: this is the C convention or the cdecl (C declaration)
convention. It is used by the C language. Since the caller clears the
stack, this convention allows for variable length argument list, such
as Cs printf, because the caller knows how many arguments are
passed in the stack at the time of the call
For this class we use the STDCALL convention, and the C convention is
used in Advanced Assembly when we interface with C code
push
push
push
call
Advantage:
printArray can easily access the
input parameters in esi, ecx, ebx
Advantage:
- All general purpose registers are
available for printArray to use
- A HLL function call produces the
same code, so this version of
printArray can work with HLL code
Disadvantage:
esi, ecx, ebx are used for input
data. printArray cannot use them
to do its work.
OFFSET array
LENGTHOF array
TYPE array
printArray
Disadvantage:
- Its more work for printArray to
access the input parameters
- The parameters must be cleared
from the stack at the end of the call
The CALL, RET instructions and the parameters being passed through the
stack all go into a stack frame on the stack
A stack frame is also known as an activation record
It is a part of the stack that is set aside for each called procedure
For a 3 level nested procedure call (A calls B, B calls C, C calls D), the
stack will have 3 stack frames, one stack frame for each of the procedures
B, C, and D
The stack frame can contain up to 5 logical parts, in the order that data are
pushed into the stack:
1. Input parameters: pushed on the stack first by the caller
2. The callers return address: pushed on the stack by the CALL
instruction
3. EBP value: pushed on the stack by the callee in order to set the base
pointer for its stack frame
4. Local variables: pushed on the stack by the callee when it needs
temporary data storage
5. Saved register data: pushed on the stack by the callee, before it uses
any register
The callers return address is always part of the stack frame, even if
registers are used for parameter passing. The other parts (input
parameters, EBP, saved register data, and local variables) can be optional
Input parameters
EBP +n
Return address
EBP
EBP
Local Variables
EBP -n
Saved
registers
ESP
Input
parameters
EBP +n
Return address
EBP
EBP
Local
Variables
EBP -n
Saved
registers
ESP
Input parameters
EBP +n
Return address
EBP
EBP
Local Variables
EBP -n
Saved
registers
ESP
33
Input parameters
EBP +n
Return address
EBP
EBP
Local Variables
EBP -n
Saved
registers
ESP
34
Input parameters
EBP +n
Return address
EBP
EBP
Local Variables
EBP -n
Saved
registers
ESP
Since the callee sets EBP to point to its stack frame as the first step in the
procedure, it means EBP points to the location immediately after the
Return Address that is on the stack.
The callee can access data in the stack frame by using constant offsets
from EBP, for example: [ebp + 8] or [ebp 4]
EBP does not change value during the execution of the called procedure,
making it a stable base pointer to rely on
.data
sum DWORD ?
.code
sub esp,4
push 6
push 5
call AddTwo
pop sum
AddTwo PROC
push ebp
mov ebp,esp
push eax
Stack frame
;
;
;
;
;
mov eax,[ebp+12]
add eax,[ebp+8]
mov [ebp+16],eax ; return value on stack
pop eax
pop ebp
ret 8
addTwo ENDP
; restore register
; pop input params
---
ebp+16
ebp+12
ebp+8
Return addr
Old ebp
Old eax
ebp+4
ebp
ebp-4
.data
sum DWORD ?
steps
5)
.code
sub esp,4
push 6
push 5
call AddTwo
pop sum
AddTwo PROC
push ebp
mov ebp,esp
push eax
Stack frame
;
;
;
;
;
1)
mov eax,[ebp+12]
add eax,[ebp+8]
mov [ebp+16],eax ; return value on stack
2)
3)
4)
pop eax
pop ebp
ret 8
addTwo ENDP
; restore registers
; pop return addr
; and input params
11
ebp+16
ebp+12
ebp+8
Return addr
ebp+4
Old ebp
ebp
Old eax
ebp-4
; save room
; 1st argument
; 2nd argument
Option 2:
push eax
push 6
push 5
call AddTwo
; save room
; 1st argument
; 2nd argument
Stack frame
; set ebp
; save all registers
; esi = addr of array
; ecx = count
offset(array)
[EBP + 12]
count
[EBP + 8]
return address
EBP
ESI points to the beginning of the array, we can use a loop and indirect
addressing to access each array element.
EBP
mov esi,[ebp+8]
mov edi,[ebp+12]
mov eax,[esi]
xchg eax,[edi]
mov [esi],eax
popad
pop epb
ret 8
Swap ENDP
; set EBP
; save all registers before using them
;
;
;
;
;
;
; restore registers
; return and clean up the 2 input args
Local Variables
When the registers dont give us enough temporary data storage for a
procedure, we can use the stack as extra storage space
Temporary data that are stored on the stack, not in the data segment, are
local variables
The simplest way to create local variables is just to push and pop the
temporary data onto the stack. But this only works if we access the saved
data in LIFO order. For example, if we push data1 and then data2 on the
stack, then using pop, we will get to data2 first, then data1
If we need to get to our local variables in more random order, then we need
to explicitly create space on the stack for local variables. This is done by
subtracting their total size from ESP.
Example: if we need 3 local variables, then their total size is 4 * 3 = 12 bytes.
We use the instruction: sub esp, 12 to make room for 12 bytes on the
stack
This step is done immediately after setting EBP because we will use an
offset from EBP to access the local variables and we dont want the offset to
be a large number (epb 56 is not as easy to work with as ebp 4)
; set ebp
;
;
;
;
save
save
locA
locB
LEA Instruction
The LEA (load effective address) instruction returns the offset (or address)
of operands, just like the OFFSET operator.
However, OFFSET can only return constant offsets, or offsets of memory
variables that are in the data segment
LEA is used to get the offset of a stack parameter or local variable
MyProc PROC
push ebp
; set ebp
mov ebp,esp
sub esp,8
; save room for 2 local vars
pushad
; save all registers
mov [ebp-4],123456h
; locA = 123456h
mov [ebp-8],0
; locB = 0
mov edi,OFFSET [ebp-4]
; invalid operand
mov esi,OFFSET [ebp-8]
; invalid operand
lea edi,[ebp-4]
; ok
lea esi,[ebp-8]
; ok
; edi and esi now have addresses of locA and locB,
; in case you need to call the next procedure
; and pass locA and locB by address
(1 of 2)
The registers are available for any procedure to use, and every procedure
uses the same registers because many instructions require the use of at
least one register
In a procedure call, it is important that the callee saves the data that are in
each register that it will use, before using the register
Take the example of procedure A calling procedure B:
At the start, instructions in A run and use all 4 general purpose
registers to keep As temporary data
When A calls B, execution jumps to procedure B, but all 4 general
purpose registers still hold As data
When B runs, it will also need to use the general purpose registers
If B does not push the values of the registers on the stack, thus saving
them, then Bs instructions will overwrite As data that are in the
registers
It is also important that at the end of the callee code, the saved register
data are restored back to each register, before the RET instruction
(2 of 2)
Saving register values should be done before the called procedure uses any
of the registers
For EBP, this occurs as the first step in the called procedure because
EBP needs to be set to the stack frame base immediately after entering
the procedure
For other registers, this step occurs after saving room for local variables,
and before the called procedure starts its actual task
If the callee only needs to use 2-3 registers, it is faster to push each register
value on the stack
If the callee will use most or all of the registers, it is faster to use the
PUSHAD instruction
When restoring saved register values, remember that the pop instructions
should be in reverse order of the push instructions
If PUSHAD is used, then use POPAD to restore register values
Saving register values is always an important step when working with procedures. It is
especially important in a nested procedure call, where there are multiple stack frames.
Since each stack frame is accessed by EBP, if EBP isnt saved and restored properly
from procedure to procedure, we would lose our place in the stack very quickly and not
able to get to the return address when RET runs. No programmer wants execution to
return to an unknown address and start executing whatever happens to be in memory
at that unknown address
Your turn . . .
Difference PROC
push ebp
mov ebp,esp
push eax
mov eax,[ebp + 8]
sub eax,[ebp + 12]
mov [ebp+16],eax
pop eax
pop ebp
ret 8
Difference ENDP
;
;
;
;
;
;
;
;
;
;
;
;
;
;
There are several MASM directives and operators that are introduced
in section 8.2 of the book. These make it easier for assembly
programmers to maintain the stack frame
The directives are: ENTER and LEAVE (sets up and restores EBP),
LOCAL (reserves space on the stack for local variables)
The operator is: USES (to save and restore registers)
For this class, we will not use these MASM directives and operator.
One goal of the class is to understand how the stack frame works to
support procedure calls. We get a much better understanding of the
stack frame if we code the assembly instructions, rather than
depending on the assembler to generate the equivalent code. In the
advanced assembly class, after everyone has a clear idea of how to
maintain the stack frame, then the MASM directives can be used.
50
The run time stack is used to support procedure calls, and is defined with
the directive .stack
The stack works in LIFO order and has several operations: push (add
data), pop (remove data), pushfd and popfd (save and restore flag
registers), pushad and popad (save and restore all registers)
Procedures are defined using the PROC and ENDP directives
Procedures run when the instruction CALL is used, and returns to the
calling procedure with the instruction RET. CALL pushes the return
address on the stack, and RET pops the return address so execution can
return to the saved memory location.
Procedures only work with the data that are passed in through registers or
through the run time stack, and they never use a specific variable name
(or specific data address)
Each procedure call has its own stack frame, which is set up with: input
parameters (by the caller), return address (by the caller), EBP (by the
callee), local variables (by the callee), and saved registers (by the callee)
The callee always save all current register values before using the
registers, in order not to clobber data that belong to the caller
Upon a return, the STDCALL convention dictates that the callee will clean
up input parameters from the stack
51
Overview: Macros
(Text book chapter 10: 10.2)
52
Whats Next
Difference between macros and procedures
Defining macros
Passing parameters to macros
53
Whats Next
55
Defining Macros
The directives for the beginning and end of the macro are: MACRO and
ENDM
Syntax:
macroname MACRO [parameter-1, parameter-2,...]
statement-list
ENDM
Macro Example
The mNewLine macro is defined and then invoked
mNewLine MACRO
call Crlf
ENDM
.data
.code
mNewLine
Whats Next
58
mPutChar Macro
Writes a single character to standard output
Definition:
Invocation:
.code
mPutchar 'A'
1
Expansion:
1
1
1
push eax
mov al,'A'
call WriteChar
pop eax
viewed in the
listing file
Invoking Macros
macro
invocation
statement
passes
text
consists
of
assembly
code
argument
generates
replaces
parameter
declared
inside
macro
mWriteStr Macro
Provides a convenient
way to display a string
by passing the string
name as an argument
.code
mWriteStr str1
1
1
1
1
push
mov
call
pop
edx
edx,OFFSET str1
WriteString
edx
Invalid Argument
push eax
mov al,1234h
call WriteChar
pop eax
; error
1
1
1
1
push eax
mov al,
call WriteChar
pop eax
; error
mReadStr Macro
The mReadStr macro provides a convenient wrapper around ReadString
procedure calls.
mReadStr MACRO varName
push ecx
push edx
mov edx,OFFSET varName
mov ecx,(SIZEOF varName) - 1
call ReadString
pop edx
pop ecx
ENDM
.data
firstName BYTE 30 DUP(?)
.code
mReadStr firstName
Labels in Macros
Recall that invoking a macro means the macro code is copied into the
source file. Each time the macro is invoked another copy of the macro is
copied into the source file.
If a macro has a code label and the macro is invoked multiple times, the
label will be duplicated multiple times when the macro code is copied in. This
results in an error because each code label needs to be unique.
To solve this problem, use the LOCAL directive to declare the label before
using it. This causes the assembler to generate a unique name for the label
each time it copies the macro code into the source file
printTen MACRO var
LOCAL loopTop
push ecx
push eax
mov eax, var
mov ecx, 10
loopTop: call writeDec
loop loopTop
pop edx
pop ecx
ENDM
65