INTRODUCTION
Software is usually divided into two types:
1. Application software (e.g. MS-Office, Windows Media Player).
2. System software (which includes the operating system).
In this module we will focus on the different components of system software.
An assembly language source code file consists of a collection of statements. These statements include:
- Instructions
- Directives
- Macros
The basic building blocks of an assembly language program are characters, identifiers, labels, constants and the location counter.
Characters
The characters in assembly language can be of any type. They can be alphanumeric characters, i.e. A to Z, a to z, and 0 to 9. They may also include other printable ASCII characters such as #, $, :, ., +, -, *, /, and |.
Identifiers
An identifier, also known as a symbol, is used as a label for an assembler statement. It may also be used as a location tag for data, or as the symbolic name of a constant. The rules for declaring an identifier are the same as in C and C++.
Labels
There is not much difference between an identifier and a label. A symbolic label is written as an identifier immediately followed by a colon (:). A label represents the current value of the location counter and can be used as an operand in an assembler instruction. A numeric label consists of a single digit between 0 and 9 followed by a colon. Numeric labels that you have created can be reused any number of times throughout the program.
Constants
There are four types of constants: numeric, character, string and floating-point. A numeric constant starts with a digit and can be decimal, hexadecimal or octal. Decimal constants contain only digits between 0 and 9. These constants are smaller than 32 bits. They cannot contain leading zeros or commas. Hexadecimal constants start with 0x (or 0X). Octal constants start with 0 and are followed by one to seven digits. A single character constant consists of a single quotation mark (') followed by an ASCII character. A string constant is a sequence of zero or more ASCII characters surrounded by quotation marks (").
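The rules for numeric constants can be sketched as a small classifier. This is only an illustration of the rules as stated above, not the behaviour of any particular assembler. The function name `classify_numeric_constant` is hypothetical, "one to seven digits" is read as one to seven octal digits, and treating a lone `0` as decimal is an assumption.

```python
import re

def classify_numeric_constant(text):
    """Return 'hexadecimal', 'octal', 'decimal', or None if the text is not
    a numeric constant under the rules described in the text."""
    # Hexadecimal constants start with 0x or 0X.
    if re.fullmatch(r"0[xX][0-9a-fA-F]+", text):
        return "hexadecimal"
    # Octal constants start with 0, followed by one to seven octal digits
    # (reading "one to seven digits" as octal digits is an assumption).
    if re.fullmatch(r"0[0-7]{1,7}", text):
        return "octal"
    # Decimal constants: digits only, no leading zeros, no commas.
    # A lone "0" is treated as decimal here (assumption).
    if re.fullmatch(r"0|[1-9][0-9]*", text):
        return "decimal"
    return None

for sample in ["42", "0x1F", "017", "1,000", "08"]:
    print(sample, "->", classify_numeric_constant(sample))
```

A constant such as `1,000` is rejected because of the comma, and `08` is rejected because it has a leading zero but is not a valid octal number.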
Memory
The basic unit of memory is a byte, which comprises eight bits of information. An address on the 360 may consist of three components: an offset, an index register and a base register. The value of an address is equal to the value of the offset + the contents of the index register + the contents of the base register.

Unit of memory   Bytes   Equivalent bits
Byte             1       8
Halfword         2       16
Word             4       32
Doubleword       8       64
Registers
The registers are used mainly for holding the operands and results of arithmetic and logical operations. There are 16 general-purpose registers in the IBM 360/370 processor, each consisting of 32 bits. In addition, there are 4 floating-point registers consisting of 64 bits each. The general-purpose registers are sometimes also used as base registers. For example, A 1,16(4,10) is an add instruction: 16 is the offset, 4 is the index register and 10 is the base register. This means that the address of the memory location whose contents we wish to add to the contents of register 1 = offset + contents of the index register + contents of the base register.
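The address calculation for an operand such as 16(4,10) can be sketched as follows. This is a minimal illustration; the function name `effective_address` and the register contents are made up for the example. It relies on the 360 convention that register 0 named as an index or base register contributes nothing to the address.

```python
def effective_address(offset, index_reg, base_reg, registers):
    """Address = offset + contents of index register + contents of base register.

    On the 360, naming register 0 as index or base means "no register",
    so it contributes 0 to the sum.
    """
    index_value = registers[index_reg] if index_reg != 0 else 0
    base_value = registers[base_reg] if base_reg != 0 else 0
    return offset + index_value + base_value

# Hypothetical register contents, chosen only for illustration.
registers = [0] * 16
registers[4] = 8       # index register
registers[10] = 1000   # base register

# A 1,16(4,10): offset 16, index register 4, base register 10
print(effective_address(16, 4, 10, registers))  # 1024
# Same operand without an index register: 16(0,10)
print(effective_address(16, 0, 10, registers))  # 1016
```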
Assembly Scheme
The assembly scheme used in the IBM 360/370 processor can be described in terms of the following components:

USING: This is a pseudo-op that is responsible for addressing. Since no special registers are reserved for this purpose, the programmer must tell the assembler which register(s) to use and how to use them. This pseudo-op indicates which of the general registers is to be used as a base register and what the contents of that register will be.

BALR: This is a machine instruction that loads a register with the address of the next instruction and then branches to the address in the second field. Note that whenever the second operand of BALR is zero, execution simply proceeds with the next instruction. The major difference between BALR and USING is that the BALR instruction loads the base register, whereas the USING pseudo-op does not load the register; it only informs the assembler about the contents of the base register. If the register does not contain the address specified by USING, a program error results.

START: This pseudo-op informs the assembler about the beginning of a program. It also allows the user to give a name to the program.

END: This pseudo-op informs the assembler that the last card of a program has been reached.

BR X: This instruction branches to the location whose address is in general register X.
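The role of USING can be illustrated with a short sketch. The assembler never loads the base register itself; it only uses the value promised by USING to work out the offset it must encode for each symbol. The function name `displacement` and the symbol address below are hypothetical. The 0 to 4095 range reflects the 12-bit displacement field of the 360, which the text above does not mention.

```python
def displacement(symbol_address, base_value):
    """Offset the assembler encodes for a symbol, given the base register
    contents promised by USING."""
    offset = symbol_address - base_value
    # The 360 displacement field is 12 bits wide, so it holds 0..4095.
    if not 0 <= offset <= 4095:
        raise ValueError("symbol is not addressable from this base register")
    return offset

# BALR 10,0 at relative address 0 is 2 bytes long, so the next instruction
# is at relative address 2 and a matching USING promises that register 10
# holds 2 at execution time.
base_value = 2
symbol_address = 30   # hypothetical relative address of a data item

print(displacement(symbol_address, base_value))  # 28
```

If BALR had not actually loaded register 10, the encoded offset of 28 would be added to whatever the register happened to hold, which is the program error described above.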
(Program listing: only the label column and the data definitions survive.)

LOOP       ...
NINE       ...
EIGHT      DC    F'8'
FORTYFIVE  DC    F'45'
DATA       DC    F'1,3,3,3,3,4,5,8,9,0'
           END
Each instruction in this program has a particular meaning. The function of the START pseudo-op has already been described. The next instruction, BALR 10,0, sets register 10 to the address of the next instruction, because the second operand in the instruction is 0. Next comes USING BEGIN+2,10. This pseudo-op tells the assembler that register 10 is a base register and that the content of register 10 is the address of the next instruction. SR 4,4 clears register 4 and sets the index to 0.
The next instruction, L 3,NINE, loads the number 9 into register 3. The instruction L 2,DATA(8) loads data (index) into register 2. Next comes A 2,FORTYFIVE, which adds 45 to register 2. The next two instructions, ST 2,DATA(8) and A 8,EIGHT, store the updated value of data (index) and add 8 to register 8. The next instruction, BCT 3,LOOP, decrements register 3 by 1. If the result is non-zero, it branches back to LOOP. The next instruction, BR 14, branches to the address held in register 14, returning control to the caller. The next instructions contain the DC pseudo-op, which indicates that these are the constants in the program. The last DC pseudo-op defines the words that are to be processed by the program.
The last instruction is the END instruction, which informs the assembler that the last card of the program has been reached.
Assembler
An assembler is a program that generates machine code instructions from a source code program. In the source program, each machine code instruction is represented by a mnemonic, an abbreviation that stands for the actual instruction. Mnemonics are widely used because they reduce the chances of making an error. An assembler can be a single pass assembler or a two pass assembler.
Features of an Assembler
- It allows the programmer to use mnemonics when writing source code programs.
- It provides variables that are represented by symbolic names and not as memory locations.
- It allows error-checking procedures.
- It makes it easy to make changes and incorporate them with a reassembly.
- It provides a symbolic code that is easy to read and follow.
Alternatively, in a two pass assembler, the source code is passed through the assembler twice. The first pass exists specifically to assign an address to every label. Once all the labels have been stored in a table with their addresses, the second pass translates the source code into machine code. It is the most popular type of assembler currently in use.
In pass 1 of the two pass assembler, the assembly source program is read. Each instruction in the program is processed and translated, and these translations are accumulated in the appropriate tables. In addition, an intermediate form of each statement is generated. Once these translations have been made, pass 2 starts. Pass 2 examines each statement that has been saved in the file containing the intermediate program. It synthesizes the intermediate program and converts it into the machine language program.
The tables that store the translations include the operation table and the symbol table. The operation table is denoted as OPTAB. The assembler designer is responsible for creating these tables. The tables contain the mnemonics and their translations. There are four types of mnemonics in an assembly language:
- Machine operations, such as ADD, SUB and DIV.
- Pseudo-operations or directives: pseudo-ops are the data definitions.
- Macro-operation definitions, which include DMACRO and EMACRO.
- Macro-operation calls, which include calls such as PUSH and LOAD.
Another type of table used in the two pass assembler is the symbol table, denoted as SYMTAB. This table stores the user-defined symbols of the assembly program. These symbols may be identifiers, constants or labels. Symbol tables are dynamic, so their length cannot be predicted in advance. SYMTAB can be implemented using arrays, linked lists or a hash table.
Second Pass

Relative address   Mnemonic instruction
0                  L    1,16(0,10)
4                  A    1,12(0,10)
8                  ST   1,20(0,10)
12                 9
16                 8
20                 -
Notice from the code shown that JACK is the name of the program. We start with the START instruction, which is a pseudo-op that gives instructions to the assembler. The next instruction is the USING pseudo-op, which informs the assembler that register 10 is the base register and that, at execution time, it will contain the address of the first instruction of the program. The next instruction is a load instruction: L 1,EIGHT. To translate this instruction, the assembler needs the address of EIGHT, which it cannot supply at this point. Since no index register is being used, we place a 0 in the index register field of the relative address.
So far we have only register 10 as a base register, but we still cannot calculate the offset. This is because base register 10 points to the beginning of the program, and the offset is the difference between the location of EIGHT and the location of the beginning of the program, which is still unknown. The next instruction is an ADD instruction. Its offset is still not known. The same is true of the following STORE instruction. Whenever an instruction is processed, the relative address is incremented. A location counter is maintained, which holds the relative address of the instruction being processed. Here, the counter is incremented by 4 for each instruction, because each of these instructions is 4 bytes long.
The DC instruction is a pseudo-op that asks for the definition of data. For the instruction DC F'9', the word 9 is stored at relative location 12, because the location counter currently has the value 12. Similarly, the next instruction, with label EIGHT, has relative address 16, and the label TEMP is associated with the counter value 20. This completes the description of column 2 of the code above.
To fill in the offsets of these instructions, let's go back to the beginning of the program. In the second column, the offsets were left blank because they were unknown at that time. Now they can be inserted easily, because the location counter value of each label is known. In the third column, the offsets are shown along with the relative addresses.

Conclusion: It is convenient to make two passes over the input. The first pass only defines the symbols, and the second pass generates the instructions and addresses.
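The two passes described in this walk-through can be sketched in a few lines. This is a simplified illustration, not the algorithm of any real assembler. The source listing for JACK below is an assumed reconstruction from the walk-through and the second-pass table; the names `SOURCE`, `MACHINE_OPS`, `pass1` and `pass2` are hypothetical, and only the handful of operations used by this program are handled.

```python
# Assumed reconstruction of the JACK program: (label, operation, operand).
SOURCE = [
    ("JACK",  "START", "0"),
    (None,    "USING", "*,10"),
    (None,    "L",     "1,EIGHT"),
    (None,    "A",     "1,NINE"),
    (None,    "ST",    "1,TEMP"),
    ("NINE",  "DC",    "F'9'"),
    ("EIGHT", "DC",    "F'8'"),
    ("TEMP",  "DS",    "1F"),
    (None,    "END",   ""),
]

# A tiny OPTAB: mnemonic -> instruction length in bytes.
MACHINE_OPS = {"L": 4, "A": 4, "ST": 4}
WORD = 4  # each DC/DS in this sketch reserves one fullword

def pass1(source):
    """Pass 1: assign a relative address to every label (build SYMTAB)."""
    symtab, location = {}, 0
    for label, op, _ in source:
        if label:
            symtab[label] = location
        if op in MACHINE_OPS:
            location += MACHINE_OPS[op]
        elif op in ("DC", "DS"):
            location += WORD
    return symtab

def pass2(source, symtab):
    """Pass 2: generate instructions, filling in offsets from SYMTAB."""
    listing, location = [], 0
    base_reg, base_value = None, 0
    for _, op, operand in source:
        if op == "USING":
            value, reg = operand.split(",")
            base_value = location if value == "*" else symtab[value]
            base_reg = int(reg)
        elif op in MACHINE_OPS:
            reg, symbol = operand.split(",")
            offset = symtab[symbol] - base_value
            # No index register is used, so the index field is 0.
            listing.append((location, f"{op} {reg},{offset}(0,{base_reg})"))
            location += MACHINE_OPS[op]
        elif op == "DC":
            listing.append((location, operand[2:-1]))  # F'9' -> 9
            location += WORD
        elif op == "DS":
            listing.append((location, "-"))
            location += WORD
    return listing

symtab = pass1(SOURCE)
for address, text in pass2(SOURCE, symtab):
    print(address, text)
```

Pass 1 finds NINE at 12, EIGHT at 16 and TEMP at 20, which is exactly the information that was missing when L 1,EIGHT was first encountered. Pass 2 then produces the same rows as the second-pass table.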
Software Tools
A software tool is a program that interfaces a program with the entity generating its input data. It can also interface the results of a program with the entity consuming those results.
Originator --(raw data)--> Software Tool --(transformed data)--> Consumer
Most system software designers use complex design strategies that involve a wide variety of design disciplines. The basic development package integrates components such as the assembler, linker, simulator and debugger. For example, if an error occurs while assembling a file, the software tools can immediately call the text editor, which shows the offending line of code. Computing includes two main activities, namely program development and the use of application software.
Introduction
Macros are single line abbreviations for a certain group of instructions. In employing a macro, the programmer essentially defines a single instruction to represent a block of code. Macro instructions are usually considered an extension of the basic assembler language, and the macro processor is viewed as an extension of the basic assembler algorithm.
Macro Definition
An assembly language programmer sometimes needs to repeat some blocks of code in the course of a program. A macro is useful here: instead of writing the entire block again and again, you simply write the name of the macro that you have already defined. An assembly language macro is an instruction that represents several other machine language instructions at once.
In the example program, the following sequence occurs twice:

A      1,DATA
A      2,DATA
A      3,DATA

A macro facility permits you to attach a name to a sequence that occurs several times in a program, and then to use this name wherever that sequence is needed. The following structure shows how to define a macro in a program:

MACRO             Start of definition
[macro name]      Macro name
   ...            Sequence to be abbreviated
MEND              End of definition
Macro Expansion
Once a macro has been created, the interpreter or compiler automatically replaces the pattern described in the macro whenever it is encountered. In compiled languages, macro expansion always happens at compile time. The tool that performs the macro expansion is known as a macro expander. Once a macro is defined, the macro name can be used instead of writing the entire instruction sequence again and again. The overhead associated with macros is very low.
Source

MACRO
INC
A      1,DATA
A      2,DATA
A      3,DATA
MEND
   :
INC
   :
INC
   :
DATA   DC    F'2'

Expanded Source

   :
A      1,DATA
A      2,DATA
A      3,DATA
   :
A      1,DATA
A      2,DATA
A      3,DATA
   :
DATA   DC    F'2'
In this example, the name INC has been assigned to the repeated sequence. INC is the name of the macro that corresponds to a particular sequence of instructions. Whenever this sequence of instructions is required in the program, the name of the macro can be used instead of writing the entire sequence repeatedly. The macro processor replaces each macro call with the following lines:

A      1,DATA
A      2,DATA
A      3,DATA

This replacement process is known as expanding the macro.
The macro definition does not appear in the expanded source code, because the macro processor saves the definition of the macro. Each occurrence of the macro name in the source program is a macro call. When the macro is called in the program, the sequence of instructions corresponding to the macro name is substituted for the call in the expanded source.
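The expansion of INC can be sketched as follows. This is a minimal illustration that handles only parameterless macros; the function name `expand_macros` and the one-statement-per-string representation are assumptions made for the example.

```python
def expand_macros(lines):
    """Save each MACRO ... MEND definition, then replace every macro call
    with the saved body. Definitions do not appear in the output."""
    definitions = {}
    expanded = []
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if line == "MACRO":
            name = lines[i + 1].strip()      # line after MACRO is the name
            body = []
            i += 2
            while lines[i].strip() != "MEND":
                body.append(lines[i].strip())
                i += 1
            definitions[name] = body         # saved, not emitted
        elif line in definitions:
            expanded.extend(definitions[line])   # macro call: expand it
        else:
            expanded.append(line)            # ordinary statement
        i += 1
    return expanded

SOURCE = [
    "MACRO",
    "INC",
    "A 1,DATA",
    "A 2,DATA",
    "A 3,DATA",
    "MEND",
    "INC",
    "INC",
    "DATA DC F'2'",
]

for statement in expand_macros(SOURCE):
    print(statement)
```

The output contains the three A instructions twice, once for each call of INC, followed by the DC statement.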
MACRO
SUB1   &PAR
L      1,&PAR
A      1,=F'2'
ST     1,&PAR
MEND
MACRO
SUBST  &PAR1,&PAR2,&PAR3
SUB1   &PAR1
SUB1   &PAR2
SUB1   &PAR3
MEND
It can be easily noticed from the example that the definition of the macro SUBST contains three separate calls to a previously defined macro SUB1. The definition of the macro SUB1 has shortened the length of the definition of the macro SUBST.
The following code shows a nested macro call:

Source

   :
MACRO
SUB1   &PAR
L      1,&PAR
A      1,=F'2'
ST     1,&PAR
MEND
MACRO
SUBST  &PAR1,&PAR2,&PAR3
SUB1   &PAR1
SUB1   &PAR2
SUB1   &PAR3
MEND
   :
SUBST  DATA1,DATA2,DATA3
   :
DATA1  DC    F'5'
DATA2  DC    F'10'
DATA3  DC    F'15'

Expanded Source (Level 1)

   :
SUB1   DATA1
SUB1   DATA2
SUB1   DATA3
   :
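Nested macro calls with parameters can be sketched by expanding recursively: each call is replaced by its body with the arguments substituted, and any macro call inside that body is expanded in turn. This is a simplified illustration, not a real macro processor. The names `parse_definitions`, `expand_line` and `expand_program` are hypothetical, and arguments are matched to parameters purely by position.

```python
def parse_definitions(lines):
    """Split the source into macro definitions and ordinary program lines."""
    definitions, program = {}, []
    it = iter(lines)
    for line in it:
        if line.strip() == "MACRO":
            name, _, params = next(it).strip().partition(" ")
            body = []
            for body_line in it:
                if body_line.strip() == "MEND":
                    break
                body.append(body_line.strip())
            param_list = [p.strip() for p in params.split(",") if p.strip()]
            definitions[name] = (param_list, body)
        else:
            program.append(line.strip())
    return definitions, program

def expand_line(line, definitions):
    """Expand one statement; macro calls inside a macro body are expanded too."""
    op, _, operands = line.partition(" ")
    if op not in definitions:
        return [line]
    params, body = definitions[op]
    args = [a.strip() for a in operands.split(",")]
    # Substitute longer parameter names first so &PAR1 is never hit by &PAR.
    pairs = sorted(zip(params, args), key=lambda pair: -len(pair[0]))
    result = []
    for body_line in body:
        for param, arg in pairs:
            body_line = body_line.replace(param, arg)
        result.extend(expand_line(body_line, definitions))
    return result

def expand_program(lines):
    definitions, program = parse_definitions(lines)
    return [out for line in program for out in expand_line(line, definitions)]

SOURCE = [
    "MACRO", "SUB1 &PAR", "L 1,&PAR", "A 1,=F'2'", "ST 1,&PAR", "MEND",
    "MACRO", "SUBST &PAR1,&PAR2,&PAR3",
    "SUB1 &PAR1", "SUB1 &PAR2", "SUB1 &PAR3", "MEND",
    "SUBST DATA1,DATA2,DATA3",
    "DATA1 DC F'5'", "DATA2 DC F'10'", "DATA3 DC F'15'",
]

for statement in expand_program(SOURCE):
    print(statement)
```

The single SUBST call first becomes three SUB1 calls (level 1), and each of those becomes an L, A, ST sequence on its own data item (level 2).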
   :
A      ...
A      ...
A      ...
   :
A      1,DATA3
A      2,DATA3
A      3,DATA3
   :
DATA1  DC    F'5'
DATA2  DC    F'10'
DATA3  DC    F'15'
In this example, the instruction sequences are very similar, but they are not identical.