
1.1 Computer Hardware
1.1.1 Computer Concepts

A very basic model of a computer would be:

The refined diagram shows the basic components of a computer system.

RAM

Random Access Memory. Stores loaded programs and data to be processed. There are
two types: volatile and non-volatile. Unlike non-volatile memory, the contents of
volatile memory are lost when the computer is switched off or reset.

ROM

Read Only Memory. Stores information about BIOS and startup routines.

1.1.2 Measuring Storage

Both the primary and auxiliary devices have capacity measured in bits, bytes, kilobytes,
megabytes and gigabytes.
A 1 or a 0 is a BInary digiT (Bit).

There are 8 bits in a byte


1024 bytes in a kilobyte
1024 kilobytes in a megabyte
1024 megabytes in a gigabyte
Therefore there are 1,048,576 bytes in a megabyte.
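The relationships above can be checked with a few lines of Python. This is only an illustrative sketch: the constant names are invented for this example, and the 1024-based definitions are the ones used in these notes.

```python
# Storage units as defined in these notes (each step up is a factor of 1024).
BITS_PER_BYTE = 8
BYTES_PER_KILOBYTE = 1024
BYTES_PER_MEGABYTE = 1024 * BYTES_PER_KILOBYTE
BYTES_PER_GIGABYTE = 1024 * BYTES_PER_MEGABYTE

print(BYTES_PER_MEGABYTE)                   # 1048576
print(BYTES_PER_GIGABYTE)                   # 1073741824
print(BYTES_PER_KILOBYTE * BITS_PER_BYTE)   # bits in one kilobyte: 8192
```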

A double density floppy disk holds 720 KB of data. A high density floppy disk holds
about 1.44 MB of data.

Many computers store one character as one byte.

1.1.3 Computer Categories

Microcomputers

A microcomputer uses a microprocessor as its CPU. This category includes PCs,
laptops etc. They typically have between 1 and 128 MB of memory (RAM). They process
data in anything from 8-bit to 64-bit chunks.

Minicomputers

These are systems designed for multi-user access from several terminals. Their
processing power varies from that of a very powerful micro to that of a small
mainframe.

Mainframes

These support hundreds of terminals for multi-user access. They have large amounts
of primary and auxiliary storage.

Supercomputers

These are the fastest and most expensive systems. Although they are not multi-user
machines, they are used when a vast amount of processing is to be done.

1.1.4 Computer Configuration

This term describes the collection of hardware in use. A large computer configuration
would consist of several hundred micros in a building linked together by cabling to form
a local area network (LAN).

1.1.5 Embedded Systems

An embedded system is a special purpose system dedicated to one specific task.

Embedded systems typically have their programs stored in ROM as opposed to auxiliary
storage and RAM. These programs are referred to as firmware - a combination of
software and hardware.

1.2 Computer Software


Software is a set of instructions that will make a computer perform a task.

There are four types of software:

• Applications
• Operating Systems
• Utility Programs
• Programming Languages

1.2.1 Applications Software

These packages enable people to carry out different tasks on a computer.

General Purpose Software

• Word Processing
• Spreadsheets
• Desktop Publishing
• Databases
• Graphics
• Computer Aided Design
• Telecommunications
• Multimedia Authoring
• Expert Systems

Special Purpose

Special purpose applications software is widely used, but only by a small number of
people in any given field. For example:

• Theatre Booking
• Stock Control
• Insurance Quotes
• EPOS (if not firmware)
• Dentist/GP Appointment Systems

1.2.2 Programming Languages

Before the 1950s there was no software to help programmers write programs.

All computers process instructions using machine code. This is called low-level
programming because it uses many simple calculations in binary.

A big development in computing was the introduction of assembly language where the
binary machine code was represented by two or three letters. Assembly language was the
first programming language.

Each programming language requires its own translation program. This can be an
interpreter or a compiler. An interpreter translates code instruction by instruction
and is therefore quite slow. A compiler translates a source code file into machine
code, saving it as a file that can be executed later.

1.2.3 Operating Systems

The operating system always runs in the background allowing the system to perform such
tasks as:

• loading and running applications


• managing the position of files on a storage device
• multitasking management

There are two types of operating systems:

• Graphical User Interface Systems which use WIMP (Windows, Icons, Menus &
Pointers) principles to manipulate files and control the system.
• Text User Interface Systems which require the user to type commands in.

1.2.4 Utility Programs

Utility programs have no distinct result - they merely help the user to achieve a result.
They are usually very closely linked to the operating system.

Examples of such programs:

• Screen Savers
• Disk Defragmenters
• Anti-virus Software
• Compression & Partitioning Software

1.3 The Human-Computer Interface


The term 'Human-Computer Interface' (HCI) describes the interaction between the user
and the computer.

All computers require some sort of HCI.

The following is a list of tasks for which special purpose interfaces are required:

• getting cash from an ATM


• a jet pilot checking an instrument panel
• a modern photocopier

1.3.1 Interface Design Principles

When designing an interface, we wish to ensure that users perform tasks:

• Safely
• Effectively
• Efficiently

and possibly:

• Enjoyably

The Considerations

• Who will use the system?


• What tasks is the computer performing?
• What environment will the computer be used in?
• What is technologically feasible?

1.3.2 Command Driven Interfaces

In order to instruct the computer, the user has to type in the command and enter it to be
processed.

Advantages

• Could be quicker to enter more complex commands that would normally have to
be accessed via a number of menus.

Disadvantages

• The user must learn and use the command syntax.


• Little or no help support. Vague error messages.

1.3.3 The WIMP Environment

WIMP stands for Windows, Icons, Menus, Pointers

Windows

Can be used to display software or files. Easy to manipulate. Often allow more than one
task to be viewed at once.

Icons
Represent a file or directory or a frequently accessed task.

Menus

Menus allow much of the complexity of the software to be hidden until needed.

Pointer

Controlled by mouse - used instead of keyboard.

1.3.4 Menus

Full Screen/Window Menu

This type of menu remains on screen until the user makes a choice. It is usually used
at the start of an application.

Pull Down Menus

Pull down menus are usually displayed at the top of an application window. When the
user clicks on an item a menu appears.

Pop-up Menus

These usually pop up in response to an action referring to a particular object.

1.3.5 Forms And Dialogue Boxes

When a user is required to enter data it is common to display a form on the screen for a
user to fill in.

In a WIMP environment forms are usually called dialogue boxes.

Forms should have:

• a title
• plenty of space
• an indication of how many characters should be present in each field
• default values where possible
• a facility to allow the user to go back and correct mistakes
• items displayed in a logical sequence
• exit and help facilities
• written messages mainly in lower case
• a sensible number of 'attention grabbing' devices

1.3.6 Speech Driven Interfaces


Speech/Sound Output

Whole messages or individual words are spoken and recorded digitally. Output that
would normally be displayed can be 'spoken' by the computer.

Uses

• Phone banking
• '192' Directory enquiries
• Document text speakers

Command And Control Systems

Such systems recognise a small vocabulary of technical terms.

Uses (In PCs):

• Run software
• Control printing

Uses (In Business):

• Automatic call handling

Large Vocabulary Dictation Systems

With these systems the computer is controlled by instructions spoken in whole sentences.

If the system is unsure what a particular word was it can examine the structure of the
sentence to predict what the word was.

Natural Language Dialogue Systems

These types of interfaces allow the user to instruct the computer without need for a
particular 'syntax'.

Advantages:

• It is a form of communication that is natural to people, requiring little or no training.


• There is no 'syntax' to follow - the computer will adapt to different ways of
speaking.

However:

• The computer may not recognise some extreme accents.


• Artificial languages may be more concise.
• Although there is some controversy on this point, some people believe that after a
while people may come to regard the computer as human.

1.4 Business Information Systems


1.4.1 Types Of Information System

There are three basic levels in business:

The Strategic Level

This group includes the founders and/or the directors of the organisation. They are
responsible for long term planning and policy.

The Tactical Level

These are people who are middle management. They are responsible for operations
within a particular department.

The Operational Level

These are the people responsible for the very basic needs of the company. They produce
the final output.

Each level requires a different type of information system (although some people in
the operational level do not require an information system).

These information systems can be classified as:

• Operational Systems
• Management Information Systems
• Decision Support Systems
• Expert Systems

Operational Systems

Such systems process data generated by day-to-day business transactions.


Examples:

• Accounting Systems
• Invoicing Systems
• Stock Control Systems
• Order Entry Systems

Management Information Systems

These systems often summarise information generated at the operational level to generate
management information.

Example:

Decision Support Systems

These systems are used by senior management in the strategic level.

A decision support system is designed to help someone reach a decision by summarising
all the available relevant information.

The information may come from:

• Internal company records


• Government statistics
• The stock market

Decision support systems usually include:

• Query languages
• Spreadsheet models
• Graphics

Expert Systems

An expert system combines the knowledge of human experts on a given subject to copy
human reasoning. The software follows a set of rules to draw its inferences.

Uses of Expert Systems:

• Complex fault diagnosis


• Geological prospecting
• Social security claims
• Medical diagnosis

1.4.2 Processing Techniques

All computers perform tasks in terms of input, output and process. We can break this
understanding further into three categories:

• Real-time processing
• On-line processing
• Batch processing

Real-time Processing

The computer must keep pace with the external operation and produce almost
instantaneous results.

Real-time systems are usually used in:

a. Process Control

This is the control of an industrial process or machinery by computer.

Examples:

o Nuclear power station


o Chemical engineering
o Life support systems

b. Interactive Processing

Data is processed upon entry and output is produced almost immediately.

If the user enters all of the data for one transaction and then it is processed this is
known as transaction processing.

Examples:
o Airline reservation system.
o Stock control systems where an invoice is printed straight away.

On-line Processing

An on-line system is one where the input device is connected to the computer. The
hardware and software must exist so that the information can be accessed and possibly
changed.

Batch Processing

Batch processing involves several steps:

1. Source documents (usually handwritten) are received at the centralised data
processing department.
2. The source documents are grouped into batches.
3. The source documents are keyed in as a batch and held in a transaction file.
4. The transaction file is processed when the computer is not busy with other
processing.

1.4.3 Combined Interactive And Batch Processing

Some banking applications may use a combination of interactive and batch processing.

Example (Using an ATM):

1. The customer inserts their card and types in their PIN number and the amount of
cash that they want.
2. The ATM computer retrieves the customer's record from the bank's central
customer file.
3. If the customer has enough money then the ATM computer sends the new balance
to the screen and issues the correct amount of cash.
4. The new balance is written to the customer file.
5. The ATM program then adds the record to the ATM transaction file which
contains a record for every transaction made that day.
6. At 2:00am the ATM is closed for a short while whilst its transaction file is
processed, producing a summary of ATM transactions.

1.4.4 Centralised And Distributed Processing

Centralised Processing

In the 1960s, when business systems were introduced, it was common to have a
centralised data processing department.

The data processing department had:


• A mainframe
• System Analysts
• Programmers
• Operators

All data to be processed would be sent to the data processing department.

Distributed Processing

With the introduction of minicomputers and later, microcomputers, the trend has been for
each department or individual to do their own processing.

1.5 Batch Processing


1.5.1 Batch Processing Steps

Particular care must be taken to ensure that all data is correctly transcribed and no
documents are lost or entered more than once.

The following stages usually apply:

1. The source documents are scrutinised.


2. The documents are grouped into batches.
3. Every batch is given a header slip or cover note.
4. The number of each batch is written on its header slip and recorded in the batch
register.
5. Control totals are calculated manually and written on the header slip.
6. The batch and its header slip are entered by data entry clerks into the system.
7. The source documents are verified (re-typed).
8. A variety of validation checks are carried out.
9. Any errors uncovered by the validation procedures are printed on a validation
report and the errors will be corrected later.
10. Valid data is stored on disk or tape until it can be processed.

1.5.2 Validation Checks

Validation checks are performed to see if data has been entered correctly by seeing if it
'makes sense'.

Presence Check    Has data been entered?
Character Check   Is the input the correct length?
Picture Check     Does the input follow the correct pattern/format?
Range Check       Does the data fall within the correct range?
Check Digits      An extra digit designed to confirm that the correct data has been entered.
File Lookup       Looks for the entered data in a list or database file.
Control Totals    Do the totals of certain fields across all the records equal the manually calculated totals?
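As a rough illustration, several of the checks listed above could be written as small Python functions. This is a sketch only: the function names, the example product-code pattern and the list of valid codes are all hypothetical, and the character check is implemented as the length test described in the list.

```python
import re

def presence_check(value: str) -> bool:
    """Presence check: has any data been entered?"""
    return value.strip() != ""

def length_check(value: str, length: int) -> bool:
    """Character check: is the input the correct length?"""
    return len(value) == length

def picture_check(value: str, pattern: str) -> bool:
    """Picture check: does the whole input follow the given pattern?"""
    return re.fullmatch(pattern, value) is not None

def range_check(value: int, low: int, high: int) -> bool:
    """Range check: does the value fall between low and high inclusive?"""
    return low <= value <= high

def file_lookup(value: str, valid_values: set) -> bool:
    """File lookup: is the value present in a list of known values?"""
    return value in valid_values

# Hypothetical product code format: two capital letters followed by four digits.
print(picture_check("AB1234", r"[A-Z]{2}\d{4}"))   # True
print(range_check(13, 1, 12))                      # False
```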

1.5.3 Check Digits

Lengthy numbers such as a product code can sometimes be entered incorrectly.

A code number can be made self-checking by adding an extra digit that follows the code.

The Modulus 11 System

This is currently the most popular system. It catches about 99% of errors.

Step 1: Each digit is assigned a weight.

    1   5   8   7
    ×   ×   ×   ×
    5   4   3   2

Step 2: Each digit is multiplied by its weight. These numbers are then added up.

    1   5   8   7
    ×   ×   ×   ×
    5   4   3   2
    =   =   =   =
    5 + 20 + 24 + 14 = 63

Step 3: This number is divided by 11 and a remainder obtained.

    63 ÷ 11 = 5 r 8

Step 4: The remainder is subtracted from 11 to give a check digit.

    11 - 8 = 3

Step 5: However, there are two exceptions. If this number is 10 then the check digit
is X. If this number is 11 then the check digit is zero.

These exceptions do not apply to this number so the check digit is just 3 and the
final code is:

    1 5 8 7 3
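The calculation can be sketched in Python as follows. The function name is invented for this example, and the code assumes the common ISBN-style convention in which a result of 10 is written as X and a result of 11 as 0.

```python
def modulus_11_check_digit(code: str) -> str:
    """Return the Modulus 11 check digit for a string of digits."""
    # Weights count down to 2 from left to right, e.g. 5, 4, 3, 2 for four digits.
    weights = range(len(code) + 1, 1, -1)
    total = sum(int(digit) * weight for digit, weight in zip(code, weights))
    result = 11 - (total % 11)
    if result == 10:      # assumed convention: 10 is written as X
        return "X"
    if result == 11:      # assumed convention: 11 is written as 0
        return "0"
    return str(result)

print(modulus_11_check_digit("1587"))   # 3
```

With the check digit appended, the weighted sum of the full code (the check digit has weight 1) is an exact multiple of 11, which is how an entered code can be checked later.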

4.1 Different Computer Codes


Over the years different computer programmers have used different sets of codes to
represent different characters.

Most computers use 8-bit codes to represent each character.

With a 7-bit character set such as ASCII, this is enough to allow a unique code for
each character with one bit spare.
This spare bit can be used in one of two ways:

• as a parity bit.
• to represent an extended character set.

Most micro-computers use ASCII, but some larger systems might use EBCDIC
(Extended Binary Coded Decimal Interchange Code).

Parity

Computers use either Odd Parity or Even Parity to detect errors.

In an even parity machine, the total number of 1s must be an even number.
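A minimal sketch of even parity in Python is shown below. It assumes a 7-bit character code with the parity bit stored as the leftmost (eighth) bit; the function names are made up for this example.

```python
def add_even_parity(bits7: str) -> str:
    """Prefix a parity bit so that the total number of 1s is even."""
    parity = "1" if bits7.count("1") % 2 == 1 else "0"
    return parity + bits7

def even_parity_ok(bits8: str) -> bool:
    """Check a received 8-bit pattern: an odd number of 1s signals an error."""
    return bits8.count("1") % 2 == 0

print(add_even_parity("1000001"))   # 01000001 (already an even number of 1s)
print(add_even_parity("1000011"))   # 11000011 (parity bit makes the count even)
print(even_parity_ok("11000010"))   # False: one bit was changed in transit
```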

4.2 Binary Integers


Most computer systems allow positive (or negative) integer numbers to be stored.

4.2.1 Introduction

For input and output, ASCII codes are used.

ASCII codes are fine for input and output but useless for arithmetic because:

• The numbers would occupy too much memory.


• Two numbers stored character by character are difficult to add.

In the binary system we move right to left, the value of each column being twice that
of the previous one.

Binary To Decimal

We can set out the binary number 1001 0101 under these column headings.

2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128   64    32    16    8     4     2     1
1     0     0     1     0     1     0     1

128 + 0 + 0 + 16 + 0 + 4 + 0 + 1 = 149

Decimal To Binary

Subtract the highest power of two possible from the denary number and place a '1' under
the column for that power of two. Take the remainder and repeat until there is no
remainder.
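Both conversions can be sketched in Python. The function names are invented for this example, and the code assumes an unsigned 8-bit number unless another width is given.

```python
def denary_to_binary(n: int, width: int = 8) -> str:
    """Subtract the highest possible power of two, column by column."""
    if not 0 <= n < 2 ** width:
        raise ValueError("number does not fit in the given width")
    bits = []
    for power in range(width - 1, -1, -1):
        column_value = 2 ** power
        if n >= column_value:
            bits.append("1")        # this power of two is used
            n -= column_value       # carry on with the remainder
        else:
            bits.append("0")
    return "".join(bits)

def binary_to_denary(bits: str) -> int:
    """Add up the column headings that have a 1 underneath them."""
    return sum(2 ** i for i, bit in enumerate(reversed(bits)) if bit == "1")

print(denary_to_binary(149))          # 10010101
print(binary_to_denary("10010101"))   # 149
```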
4.2.2 Binary Arithmetic (Addition)

To see how the carry system works in binary we examine the numbers 0 to 8:

Decimal Binary
0 0000
1 0001
2 0010
3 0011
4 0100
5 0101
6 0110
7 0111
8 1000

We can now try some binary arithmetic:

        2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
        128  64   32   16   8    4    2    1
  137   1    0    0    0    1    0    0    1
+  44   0    0    1    0    1    1    0    0
carry                  1

  181   1    0    1    1    0    1    0    1

4.2.3 Sign Bit

The most significant bit (msb) or leftmost bit can be used as a sign bit.

If the msb is 0 then the number is positive. If the msb is 1 then the number is
negative.

This is called the sign and magnitude representation of binary numbers.

4.2.4 Two's Complement

Two's complement can be thought of as working like the odometer in a car.

Decimal Binary
-4 1111 1100
-3 1111 1101
-2 1111 1110
-1 1111 1111
0 0000 0000
1 0000 0001
2 0000 0010
3 0000 0011
4 0000 0100

Converting From Negative Denary To Binary Two's Complement

Example: convert -6 to 8-bit two's complement.

Step 1: Find the binary value of the equivalent positive denary number.

    00000110

Step 2: Change 0s to 1s and 1s to 0s (complement).

    11111001

Step 3: Add 1 to the result.

    11111010

Converting From Binary Two's Complement To Denary

Example: convert 11111011 to denary.

Step 1: Complement the number.

    00000100

Step 2: Add one and prefix a minus sign.

    -00000101

Step 3: Convert binary to decimal.

    -5
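The two procedures above can be sketched in Python. The function names are hypothetical and an 8-bit width is assumed by default.

```python
def flip(bits: str) -> str:
    """Complement: change 0s to 1s and 1s to 0s."""
    return "".join("1" if bit == "0" else "0" for bit in bits)

def to_twos_complement(n: int, width: int = 8) -> str:
    """Denary to two's complement: complement the positive value, then add 1."""
    if not -(2 ** (width - 1)) <= n < 2 ** (width - 1):
        raise ValueError("number does not fit in the given width")
    if n >= 0:
        return format(n, f"0{width}b")
    positive = format(-n, f"0{width}b")
    return format(int(flip(positive), 2) + 1, f"0{width}b")

def from_twos_complement(bits: str) -> int:
    """Two's complement to denary: complement, add 1, prefix a minus sign."""
    if bits[0] == "0":
        return int(bits, 2)
    return -(int(flip(bits), 2) + 1)

print(to_twos_complement(-6))             # 11111010
print(from_twos_complement("11111011"))   # -5
```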

4.2.5 Binary Arithmetic (Subtraction)

To perform subtraction in binary, the number to be subtracted is made negative and
converted into Binary Two's Complement form. The two numbers are then added.

E.g. Subtract 12 from 15 in binary (1 byte).

   12     0  0  0  0  1  1  0  0
  -12     1  1  1  1  0  1  0  0
 + 15     0  0  0  0  1  1  1  1
carry     1  1  1  1  1

    3     0  0  0  0  0  0  1  1

The carry out of the leftmost bit is discarded.
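The same subtraction can be sketched in Python using bit operations. The function name is invented for this example; a one-byte result is assumed, and any carry out of the leftmost bit is discarded by the mask.

```python
def subtract_using_twos_complement(a: int, b: int, width: int = 8) -> str:
    """Compute a - b by adding the two's complement of b."""
    mask = (1 << width) - 1          # 11111111 for one byte
    negative_b = (~b + 1) & mask     # complement b and add 1
    result = (a + negative_b) & mask # the mask throws away the final carry
    return format(result, f"0{width}b")

print(format((~12 + 1) & 0xFF, "08b"))          # 11110100, i.e. -12
print(subtract_using_twos_complement(15, 12))   # 00000011, i.e. 3
```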

4.2.6 Binary Multiplication


Multiplication can be achieved by adding the first multiplier the number of times
specified by the second multiplier.

E.g. 6 × 3 = 6 + 6 + 6 = 18

4.3 Higher Number Bases


Sometimes higher number bases can be used as shorthand for binary.

4.3.1 Hexadecimal Numbers (Base 16)

In base 16 we have sixteen symbols to represent each digit.

Decimal Hexadecimal Binary (4-bit)


0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
10 A 1010
11 B 1011
12 C 1100
13 D 1101
14 E 1110
15 F 1111

It is easy to convert from binary to hex and hex is easier to read than a long string of 1s
and 0s.

Converting From Binary To Hexadecimal

Step 1: Divide the binary number into groups of four digits starting at the LSB.

    0111 0101

Step 2: Write down the hexadecimal equivalent for each group of digits.

    7 5

01110101₂ = 75₁₆

Converting From Hexadecimal To Denary

To perform this operation we use the same method as for converting a binary number to
decimal. However the column headings are in powers of sixteen not powers of two.
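Both hexadecimal conversions can be sketched in Python. The function names and the `HEX_DIGITS` constant are hypothetical names chosen for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"

def binary_to_hex(bits: str) -> str:
    """Group the bits in fours from the LSB and translate each group."""
    bits = bits.replace(" ", "")
    bits = bits.zfill((len(bits) + 3) // 4 * 4)   # pad on the left to a multiple of 4
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)

def hex_to_denary(hex_digits: str) -> int:
    """Column headings are powers of sixteen instead of powers of two."""
    total = 0
    for digit in hex_digits.upper():
        total = total * 16 + HEX_DIGITS.index(digit)
    return total

print(binary_to_hex("0111 0101"))   # 75
print(hex_to_denary("75"))          # 117, i.e. 7 × 16 + 5 × 1
```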

4.3.2 Octal Numbers (Base 8)

In the octal number system there are only eight different symbols.

Decimal Octal Binary (3-bit)

0 0 000
1 1 001
2 2 010
3 3 011
4 4 100
5 5 101
6 6 110
7 7 111
Converting Binary To Octal

Step 1: Divide the binary number into groups of three digits starting at the LSB.

    111 101

Step 2: Write down the octal equivalent for each group of digits.

    7 5

111101₂ = 75₈

Converting Octal To Denary

To perform this operation we use the same method as for converting a binary number to
decimal. However the column headings are in powers of eight, not two.

4.3.3 Binary Coded Decimal

This variation of binary allows denary digits to be encoded separately.

E.g. 2719₁₀ is represented by:

2    7    1    9
0010 0111 0001 1001

2719₁₀ = 0010 0111 0001 1001 (BCD)

Advantages

• Easy to convert between denary and BCD.


• No error due to round off.

Disadvantages

• Occupies more memory.


• Can be harder to perform arithmetic operations.

BCD Addition

Whenever the sum of any two BCD digits is greater than 1001₂ then 0110₂ has to be
added to the result to skip over the unused codes.
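The encoding and the correction step can be sketched in Python. The function names are made up for this example, and the addition is shown for a single pair of BCD digits with an optional carry in.

```python
def to_bcd(n: int) -> str:
    """Encode each denary digit separately as a 4-bit group."""
    return " ".join(format(int(digit), "04b") for digit in str(n))

def bcd_add_digit(a: int, b: int, carry_in: int = 0):
    """Add two BCD digits, returning (result digit, carry out)."""
    total = a + b + carry_in
    if total > 0b1001:       # result is not a valid BCD digit
        total += 0b0110      # add 6 to skip the six unused codes
    return total & 0b1111, total >> 4

print(to_bcd(2719))          # 0010 0111 0001 1001
print(bcd_add_digit(7, 5))   # (2, 1): 7 + 5 = 12, digit 2 with a carry of 1
```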

4.4 Graphics, Sounds And Other Interpretations


Binary patterns often represent text and numbers but they can also represent:

• Graphics
• Sounds
• Executable programs
• Boolean Values

4.4.1 Bitmapped Graphics

The PC's screen can be thought of as being divided up into a grid. Each square on the
grid is called a pixel or a picture element.

4.4.2 Digitised Sound

Sounds such as music or speech can be converted into binary form.

Sounds are analogue (continuously variable) and need to be converted into digital form.
This is done using an analogue to digital converter (ADC).

Advantages

• Data integrity.
• Easier to edit.

Disadvantages

• Any suggestions???

4.5 Fixed Point Binary Numbers

4.5.1 Positive Fixed Point Binary Numbers

In binary we can have fractional column headings.

2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0       2^-1   2^-2
128   64    32    16    8     4     2     1         0.5    0.25
1     0     1     0     0     1     0     0     .   1      1

= 164.75

Binary Fraction   Fraction   Decimal
0.1               1/2        0.5
0.01              1/4        0.25
0.001             1/8        0.125
0.0001            1/16       0.0625

4.5.2 Using Two's Complement With Fixed Point Binary Numbers

For negative numbers we use two's complement representation on the entire bit pattern.

Example

Represent -5.25₁₀ in 8-bit binary with the binary point after the fourth digit.

Step 1: Calculate the positive equivalent number in binary.

    5.25 = 0 1 0 1 . 0 1 0 0

Step 2: Change 0s to 1s and 1s to 0s (complement).

    1 0 1 0 . 1 0 1 1

Step 3: Add 1 to the result.

    1 0 1 0 . 1 1 0 0
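The example above can be reproduced with a short Python sketch. The function name is hypothetical; it assumes the value is exactly representable in the chosen number of fractional bits, and it uses a bit mask to produce the two's complement pattern instead of carrying out the complement-and-add-one steps by hand.

```python
def to_fixed_point(value: float, int_bits: int = 4, frac_bits: int = 4) -> str:
    """Return the two's complement fixed point pattern for value."""
    width = int_bits + frac_bits
    scaled = round(value * 2 ** frac_bits)    # e.g. -5.25 × 16 = -84
    pattern = format(scaled & ((1 << width) - 1), f"0{width}b")
    return pattern[:int_bits] + "." + pattern[int_bits:]

print(to_fixed_point(5.25))    # 0101.0100
print(to_fixed_point(-5.25))   # 1010.1100
```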

Advantages And Disadvantages Of The Fixed Point Binary System

Advantages

• Preserves accuracy as required.


• Easy to convert.

Disadvantages

• Limited degree of accuracy.


4.6 Floating Point Binary Numbers

With floating point numbers we store a mantissa and an exponent.

Example

7,800,000

becomes 0.78 × 10^7.

The mantissa holds the digits and the exponent determines where the fractional
point goes.

Binary

Positive Mantissa

0110 1000 0000 0011 = 0.1101 × 210112

Negative Mantissa

The mantissa is always written in Two's Complement form.

Negative Exponents

The exponent can also be negative as it is in Two's Complement form.

If the exponent's left hand bit is a 1 then the exponent is negative, so when the
number is converted to denary the mantissa's bit pattern is shifted right.

4.6.1 Normalised Form Of Positive Floating Point Binary Numbers

The precision of the floating representation depends on how many digits can be stored in
the mantissa.

In order to get the most accurate representation for a given number of digits in the
mantissa the number is written with no leading zeros.

Example

0.0101 × 2100000 would be written as 0.101 × 2101111.

For a positive mantissa in normalised form the first two bits will always be 0.1.

Example
Normalise the number: 0 000110101 000010 (10-bit mantissa, 6-bit exponent)

Step 1: Put in the assumed binary point and convert the exponent to denary.

    0 . 0 0 0 1 1 0 1 0 1
    Exponent = 000010₂ = 2₁₀

Step 2: Remembering that the overall value of the number must not change:

• Shift the mantissa bit pattern three places to the left to make it start 0.1.

    0 . 1 1 0 1 0 1 0 0 0

• Subtract three from the exponent. Then convert it to binary.

    2 - 3 = -1
    -1₁₀ = 111111₂

The answer is: 0 110101000 111111

4.6.2 Normalised Form Of Negative Floating Point Binary Numbers

The first two bits of such numbers are always 1.0.

To normalise a negative mantissa shift the mantissa left until the second bit is zero and
subtract as necessary from the exponent.

Example

Normalise the number: 1 111100100 000011 (10-bit mantissa, 6-bit exponent)

Step 1: Put in the assumed binary point and convert the exponent to denary.

    1 . 1 1 1 1 0 0 1 0 0
    Exponent = 000011₂ = 3₁₀

Step 2: Remembering that the overall value of the number must not change:

• Shift the mantissa bit pattern four places to the left to make it start 1.0.

    1 . 0 0 1 0 0 0 0 0 0

• Subtract four from the exponent. Then convert it to binary.

    3 - 4 = -1
    -1₁₀ = 111111₂

The answer is: 1 001000000 111111

With negative numbers with a magnitude greater than one it is better to normalise the
equivalent positive number and carry out Two's Complement.
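The shifting procedure used in both examples can be sketched in Python. The function name is invented for this example; the mantissa is passed as a string of bits with the sign bit first, and the exponent as an ordinary integer instead of a 6-bit pattern.

```python
def normalise(mantissa: str, exponent: int):
    """Shift the mantissa left until its first two bits differ (0.1 or 1.0)."""
    if "1" not in mantissa:
        return mantissa, exponent   # zero cannot be normalised
    while mantissa[0] == mantissa[1]:
        # The first two bits match, so dropping the first bit keeps the sign.
        mantissa = mantissa[1:] + "0"
        exponent -= 1               # keep the overall value unchanged
    return mantissa, exponent

print(normalise("0000110101", 2))   # ('0110101000', -1)
print(normalise("1111100100", 3))   # ('1001000000', -1)
```

An exponent of -1 is stored as 111111 in 6-bit two's complement, which matches the worked examples.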

4.7 Range And Accuracy


4.7.1 Integer Range

QuickBASIC uses two bytes to store integers. As it uses the Two's Complement system,
the integer range is:

Largest Positive Number = 0111 1111 1111 1111 = 32,767

Most Negative Number = 1000 0000 0000 0000 = -32,768

4.7.2 Normalised Floating Point Numbers

Using a 6-bit mantissa and a 4-bit exponent.

                     Positive               Negative
Largest Magnitude    0.11111 × 2^0111       1.00000 × 2^0111
                     = 124₁₀                = -128₁₀
Smallest Magnitude   0.10000 × 2^1000       1.01111 × 2^1000
                     = 0.001953125₁₀        = -0.0020751953125₁₀

Representation Of Zeros

The number 0.00000 × 2^0000 is not normalised and does not exist in this system.

Zero is usually represented by the smallest possible number, and this number is then
excluded from the range.

4.7.3 Excess Notation

Exponents of floating point numbers may be represented in excess notation rather than
Two's Complement.

A 7-bit Two's Complement exponent will go from:

-64₁₀ = 1000 000

to

63₁₀ = 0111 111


If we add 64 to this in binary it can be coded as a positive value (0 to 127). This is called
Excess 64 notation because a value of 64 has been added.

To interpret such a number we subtract 64 from it.

For an eight bit number, we would use Excess 128 notation.
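A minimal Python sketch of excess notation follows. The function names are hypothetical; the excess is taken to be half the range of the bit pattern, which gives Excess 64 for seven bits and Excess 128 for eight bits.

```python
def to_excess(value: int, bits: int = 7) -> str:
    """Add the excess (64 for 7 bits) and store the result as plain binary."""
    excess = 2 ** (bits - 1)
    return format(value + excess, f"0{bits}b")

def from_excess(pattern: str) -> int:
    """Read the pattern as plain binary, then subtract the excess."""
    excess = 2 ** (len(pattern) - 1)
    return int(pattern, 2) - excess

print(to_excess(-64))           # 0000000
print(to_excess(0))             # 1000000
print(to_excess(63))            # 1111111
print(from_excess("1000001"))   # 1
```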

4.7.4 Accuracy And Errors

Floating and fixed point numbers will be accurate to the smallest number they can
represent.

Round-Off Errors

Often we cannot represent a denary fraction exactly even if we allow many bits in
memory. Therefore the number stored is "rounded off" to the closest possible binary
equivalent.
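Round-off can be illustrated in Python with exact fractions. This is a sketch with an invented function name: it finds the closest value that can be stored with a given number of bits after the binary point, using 0.1 and eight fractional bits as an example.

```python
from fractions import Fraction

def nearest_binary_fraction(value: Fraction, frac_bits: int) -> Fraction:
    """Closest value representable with frac_bits bits after the binary point."""
    scale = 2 ** frac_bits
    return Fraction(round(value * scale), scale)

tenth = Fraction(1, 10)
stored = nearest_binary_fraction(tenth, 8)
print(stored, float(stored))   # 13/128 0.1015625
print(abs(stored - tenth))     # round-off error: 1/640

# A fraction built from powers of two is stored exactly.
print(nearest_binary_fraction(Fraction(3, 4), 8))   # 3/4
```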

Truncation Errors

Often, in either floating or fixed point systems, results are calculated with too many
places of accuracy to be represented. We get this type of error when trailing bits are
truncated to fit the result in the memory location available.

Overflow

A computational process produces a result so large that it cannot be represented.

Underflow

A result is produced that is smaller in magnitude than the smallest number that can be
represented.

6.1 File Concepts


6.1.1 File Structure

A file is an organised collection of data. For example, an employee file may contain a
collection of organised data relating to employees within a company.

Files are made up of records.

A record is a collection of data about one item or individual.

A record consists of a number of fields.

Each field holds one piece of data.

6.1.2 Data Storage Media

In a computer system data is usually stored on a magnetic or optical medium, for
example: hard disk, floppy disk, CD-RW, etc.

Such media are non-volatile.

Media for data storage are classified:

Direct Access Storage

Data can be stored or retrieved at any address directly.

Serial Access Storage

Data can only be stored or retrieved from one address after another.

6.1.3 Data File Types

Files which consist of records and fields are called data files. There are three main types:

Transaction Files

Such files contain details on all recorded events over the last period.

The period may be a day, a month, etc.

A sales transaction file contains data on all the sales made in the last day.

Transaction files have a very short life, although they may sometimes be archived.

Master Files

These are permanent files that are regularly updated by processing transaction files.

They contain:

• Fairly permanent data.


• Data that is updated with the transaction files.

Resource Files

These are permanent files that are not usually updated. Their contents are used for
reference.

6.1.4 Other File Types

Program Files

Here the data is the 1s and 0s that make up an executable program.

Document Files

This data is used by applications software.

6.2 Serial & Sequential Files


6.2.1 Serial File Organisation

Records are placed onto the disk or tape one after the other with no regard for sequence.

Transaction files are stored serially.

6.2.2 Sequential File Organisation

Records are stored one after another in a recognisable order.

The Primary Key

One field is chosen by which records are ordered.

6.2.3 Creating Serial And Sequential Files

The three QuickBASIC commands that we need to create serial or sequential files are:

The OPEN Command

OPEN Filename$ FOR OUTPUT AS #n%

This creates a file called Filename$ so data from the program can be output to it.

n% is an integer, representing the channel number, between 1 and 255.

The WRITE Statement

WRITE #n%, <variable>, <variable>, etc.

n% must be the same number used in the OPEN statement.

Each of the variable names after the WRITE statement creates a consecutive field in
the record.

The CLOSE Statement

CLOSE #n%

This closes the specified channel.

6.2.4 Reading From Serial And Sequential Files

In order to read data from such a file we need to:

OPEN Filename$ FOR INPUT AS #n%

INPUT #n%, <variable>, <variable>, <variable>, etc.

CLOSE #n%

Note that data is always input starting from the beginning of the file.

The variables and their data types should match with those used in the WRITE statement.
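For comparison, the same create-then-read cycle can be sketched in Python instead of QuickBASIC. The function names, the file name and the two-field record layout (a name and an age) are all assumptions made for this example; Python's `csv` module stands in for the comma-separated fields that WRITE # produces.

```python
import csv
import os
import tempfile

def write_records(filename, records):
    """Create the file and write one record per line (OPEN ... FOR OUTPUT, WRITE #)."""
    with open(filename, "w", newline="") as channel:
        csv.writer(channel).writerows(records)
    # Leaving the 'with' block closes the file (CLOSE #).

def read_records(filename):
    """Read every record back, starting from the beginning of the file (INPUT #)."""
    with open(filename, newline="") as channel:
        # Field types must match those written: a name string and an integer age.
        return [(name, int(age)) for name, age in csv.reader(channel)]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "staff.dat")   # hypothetical file name
    write_records(path, [("Jones", 34), ("Patel", 27)])
    print(read_records(path))   # [('Jones', 34), ('Patel', 27)]
```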

6.2.5 Appending Records To A Serial File

As records in a serial file are in no particular order to add a new record we can simply
append it to the end of an existing file.

In order to do this we need to:

OPEN Filename$ FOR APPEND AS #n%

WRITE #n%, <variable>, <variable>, etc.

CLOSE #n%

6.2.6 Deleting Records From A Serial Or Sequential File

In order to delete a record, a brand new file (of the same name) has to be created without
the record that was to be deleted.

There are two methods of doing this:

Physical Deletion

Open a channel for input from the file.

Open another channel for output to a new file name.


Input a record from the old file and check if it needs to be deleted.

If it is to be kept, write it to the new file. If it is not, then move on to the next record.

Close the channels.

Delete the old file and rename the new file as the old one.
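Physical deletion can be sketched in Python as shown below. The function name and the `.new` suffix for the temporary file are hypothetical, and the record to delete is identified here by its first field; `os.replace` performs the final delete-and-rename step in one call.

```python
import csv
import os
import tempfile

def delete_record(filename, key):
    """Copy every record except the one whose first field equals key."""
    new_name = filename + ".new"
    with open(filename, newline="") as old, open(new_name, "w", newline="") as new:
        writer = csv.writer(new)
        for record in csv.reader(old):
            if record[0] != key:        # keep this record
                writer.writerow(record)
            # otherwise skip it and move on to the next record
    os.replace(new_name, filename)      # the new file takes the old file's name

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "staff.dat")   # hypothetical file name
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows([["Jones", 34], ["Patel", 27], ["Smith", 51]])
    delete_record(path, "Patel")
    with open(path, newline="") as f:
        print(list(csv.reader(f)))   # [['Jones', '34'], ['Smith', '51']]
```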

Logical Deletion

Each record is given an extra field. This field is a flag field. If it is equal to 0 then the
record is shown as existing. If it is equal to 1 then, although the record exists, it is not
shown and can be considered to be logically deleted.

6.2.7 Using 2D Arrays

All data (textual or numeric) can be represented as string data.

It is a more compact way of programming to swap many 1D arrays for one 2D array.

For example:

DIM Name$(1 TO 4)
DIM Age%(1 TO 4)

could be stored in a 2D array dimensioned as:

DIM NameAge$(1 TO 2, 1 TO 4)

The elements of the array are then:

Names: NameAge$(1, 1) through to NameAge$(1, 4)
Ages: NameAge$(2, 1) through to NameAge$(2, 4)

In General: NameAge$(Field, Record)

6.2.8 Working With A Variable Number Of Records

There are two ways to handle a situation involving a variable number of records.

Use The End Of File (EOF) Function

At the end of serial and sequential files we can detect the end of file marker.

This can be done by examining the value of:

EOF(n)
where n is the channel number.

At the end of the file EOF(n) = -1 otherwise EOF(n) = 0.

Use A Reference File

When a serial file is created a separate reference file of the same name but with the
extension 'tot' is also created.

If records are added or deleted then the total in the reference file has to be altered.

This is done by:

1. Inputting the old total.


2. Adding or subtracting the appropriate number of records.
3. Writing out the new total over the old total.

6.2.9 Adding Records To A Sequential File

Unlike serial files we cannot append new records to the end of the file. New records are
added using a method known as Updating by Copying.

This is done by inputting records one-by-one and then writing each record one-by-one
making sure that the new record is inserted in the right place.

Pseudocode

INPUT the new details

OPEN old file FOR INPUT
OPEN new file FOR OUTPUT

added = 0

DO UNTIL end of old file
    INPUT a record from old file
    IF added = 0 AND key of new record is earlier than key of current record THEN
        write new record to new file
        added = 1
    END IF
    write out current record to new file
LOOP

IF added = 0 THEN write new record to new file

CLOSE files
rename files

END
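
The pseudocode above can be sketched in Python. This is an illustration only, assuming the records are held as (key, data) tuples in a list rather than in files; the name insert_in_sequence is hypothetical.

```python
def insert_in_sequence(old_records, new_record, key=lambda r: r[0]):
    """Copy old_records to a new list, inserting new_record in key order."""
    new_records = []
    inserted = False
    for current in old_records:
        # Write the new record just before the first record with a later key.
        if not inserted and key(new_record) < key(current):
            new_records.append(new_record)
            inserted = True
        new_records.append(current)
    # If every existing key was earlier, the new record goes at the end.
    if not inserted:
        new_records.append(new_record)
    return new_records
```

The inserted flag stops the new record from being written more than once, and the final check covers the case where the new key is the latest in the file.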
Updating A Master File

In the business world it is common to 'update' a file by changing more than one record at
a time. A file can be updated by:

• having records added


• having records modified

Example

An electricity company's master file contains records for customers. New customers have
to be added and customers who move house need to have their records modified.

In order to do this we need to have three files:

• Master file
• Transaction file
• New master file

The master file could (for example) contain names, addresses and account numbers for
customers. The transaction file contains records that need to be added and records that
need to be modified. The new master file will be produced by processing the transaction
file against the existing master file.

These are the steps that we would need to go through:

1. Make sure the transaction file is in the same sequence as the master file.
2. Read a transaction record into main memory.
3. Read a master record into main memory.
4. If the key of the transaction record is less than the key of the master record, write
the transaction record to the new master file, read in the next transaction record and
repeat step 4 (the step is repeated until the test fails).
5. Write the master record to the new file.
6. Go to 3.

If we want to modify existing records as well then this could be incorporated into step 4.

We may wish to keep the old master file as a record of previous situations.
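
A minimal Python sketch of the update steps follows. It assumes both files are already in key sequence, are held as lists of (key, data) tuples, and that there is at most one transaction per key; a transaction whose key matches a master record is treated as a modification. The name update_master is hypothetical.

```python
def update_master(master, transactions):
    """Merge sorted transactions into a sorted master, returning the new master."""
    new_master = []
    t = 0  # position of the current transaction record
    for m_key, m_data in master:
        # Transactions with an earlier key are new records: write them first.
        while t < len(transactions) and transactions[t][0] < m_key:
            new_master.append(transactions[t])
            t += 1
        if t < len(transactions) and transactions[t][0] == m_key:
            # Same key: the transaction replaces (modifies) the master record.
            new_master.append(transactions[t])
            t += 1
        else:
            new_master.append((m_key, m_data))
    # Any transactions left over have keys later than every master record.
    new_master.extend(transactions[t:])
    return new_master
```

The old master list is left untouched, so it can be kept as the previous generation.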

6.2.10 Working With An Unknown Number Of Fields

Information can be retrieved from a file one record at a time using:

LINE INPUT #n, rec$

This reads all the characters including commas to the next end of record marker and
assigns the string to rec$.
The string handling functions can then be used to extract the field data.

6.3 Indexed Sequential Files


Records are stored in sequence with an index.

The index enables individual records to be located directly.

The index is created and stored with the file when it is first created.

Indexed sequential files require direct access storage (DAS) media.

6.3.1 Disk Packs And Hard Disks

The way in which the index works depends on how the data is written onto the disk.

Each platter has two surfaces, although the outermost surfaces of the pack (the top of the
top platter and the bottom of the bottom platter) are not used.

Each surface is split up into a number of concentric circular tracks.

All the tracks of the same diameter together on different surfaces form a cylinder.

Tracks are split up into sectors.

6.3.2 Multi-Level Indexing

Indexed sequential files have more than one level of index.

A common method of indexing is:

CYLINDER - SURFACE - SECTOR

The cylinder index (the primary index) is read first. From this we can establish which
cylinder holds the data we want and the read/write heads are moved to that cylinder. This
is known as seeking.

At the right cylinder the surface index (the secondary index) is read. We can now switch
on the right head for the right surface. This is known as switching.

The read/write heads are now on the right track. We now read the sector index (the
tertiary index). This gives the sector at which the record should be found.

The sector is now searched serially. If the record is not found then either:

• The record does not exist.


• The record was placed in an overflow area in the disk pack.

Advantages Of Multi-Level Indexing

• Files can be processed randomly which is usually faster than serial processing.
• We have the flexibility to ignore the index and search the file sequentially.

Disadvantages Of Multi-Level Indexing

• The index takes time to create, access and also consumes space.

6.3.3 Overflow

Each sector accommodates a range of key values.

The sector which should accommodate a record is called its home sector.

If a sector is full, there are two circumstances when a record will not fit into its home
sector.

• A new record is being inserted.


• An existing variable length record becomes longer during updating.

In either case it is stored in the overflow area on the disk pack.

If this happens a tag is left in the home sector which gives the key field of the record and
the address of the sector in the overflow area where the record is to be found.

6.3.4 Blocks

The smallest amount of data that can be transferred between main memory and backing
store is a block. A block of data occupies one sector.

The number of records stored in one block is called the blocking factor of a file. The
choice of deciding how large to make the blocking factor is called the blocking strategy.

When deciding on the blocking strategy it should be remembered that:

• File access should be as quick as possible.


• Addition and deletion of records should be as quick as possible.
• Storage space should be used efficiently.

Two common ratios are:

Block Packing Density

Disk space allocated to records : Total space available in a block

Cylinder Packing Density

Tracks set aside for records in the cylinder : Total number of tracks in the cylinder

6.3.5 File Reorganisation

If a large number of records have had to be stored in overflow areas because their home
sectors were full then file processing will be slow.

The solution is to reorganise the file.

This involves:

• reading records in a logical sequence.


• writing them out in physical sequence to another file.

More free space is left in the home sector for additional records. The indexes are also
recreated.

6.4 Random Access Files


Records are written to and retrieved from disk in a direct or random way.

Random file organisation requires direct access storage (DAS) media.

The program that stores and retrieves the records has to specify the address of the record
first of all.

A field is selected to be the key for each record.

An algorithm (a set of instructions) turns the value of the record's key into an address
for the record.

6.4.1 Address Generation

The simplest method uses the value of the key field as the record address.

Example
Record Address   Customer Number   Customer Name
1                1                 <empty record>
2                2                 <empty record>
104              104               Davis
208              208               Peterson
405              405               Franks
408              408               Black

However, with this method, records are often too spaced out.

6.4.2 Hashing Algorithms

Hashing algorithms convert a record's key into an address for the record.

With numeric keys a common hashing algorithm is:

The Division-Remainder Method

1. Estimate the number of records to store.


2. Find the first prime number greater than the number of records.
3. The key of the record to be stored is divided by this prime number and the
remainder is used as the address.

For alpha-numeric key fields a common way of hashing the string to a record address is
to add up the ASCII values of the characters and find the remainder when the sum is
divided by the chosen prime number.

Pseudocode

INPUT record details

sum = 0

FOR letter = 1 TO length of record's key


extract character
find ASCII code
add it to sum
NEXT letter

address = sum MOD nearest prime
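
Both hashing methods can be sketched in Python. This is a minimal illustration, not a definitive implementation; the helper names next_prime_above, hash_numeric_key and hash_string_key are hypothetical.

```python
def next_prime_above(n):
    """Return the first prime number greater than n (simple trial division)."""
    candidate = n + 1
    while True:
        if candidate >= 2 and all(
            candidate % d for d in range(2, int(candidate ** 0.5) + 1)
        ):
            return candidate
        candidate += 1

def hash_numeric_key(key, prime):
    """Division-remainder method: the remainder is the record address."""
    return key % prime

def hash_string_key(key, prime):
    """Add up the character codes of the key, then take the remainder."""
    total = sum(ord(ch) for ch in key)
    return total % prime
```

For an estimated 100 records, next_prime_above(100) gives 101, so customer number 408 hashes to address 408 MOD 101 = 4.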

6.4.3 Synonyms

When two record keys are hashed to the same address, we say that they are synonyms.

Possible Solutions

1. Put the second record in the next available space.


2. Use a separate overflow area for such records.

As with indexed sequential files, at some point we may need to reorganise the file.
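
The first solution (using the next available space) can be sketched in Python as follows. The sketch assumes numeric keys, a table held as a list with None for empty addresses, and a search that wraps round to the start of the table; these details and the names store and fetch are assumptions for illustration.

```python
def store(table, key, record):
    """Store a record at its hashed address, or the next free address after it."""
    size = len(table)
    home = key % size
    for step in range(size):
        address = (home + step) % size   # wrap round at the end of the table
        if table[address] is None:
            table[address] = (key, record)
            return address
    raise RuntimeError("file is full and needs reorganising")

def fetch(table, key):
    """Look at the home address first, then follow the same path as store."""
    size = len(table)
    home = key % size
    for step in range(size):
        slot = table[(home + step) % size]
        if slot is None:
            return None                  # an empty space: the key is not stored
        if slot[0] == key:
            return slot[1]
    return None
```

With a table of 7 addresses, keys 10 and 17 are synonyms (both hash to 3), so the second is placed at address 4.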

6.4.4 Composite Data Types

As well as the standard data types in QuickBASIC we can also define our own data types
using the TYPE statement.

Example

TYPE MyRecord
Aname AS STRING * 12
Phone AS STRING * 12
Units AS INTEGER
Price AS SINGLE
Amount AS DOUBLE
END TYPE

This is a composite data type.

We can now dimension variables and arrays as this new data type.

DIM details AS MyRecord

We can now store several items of data in one variable.

details.Aname = "James Bond"


details.Amount = 0.07

Likewise, we can do the same for arrays.

Example

DIM detailsarray (1 TO 10) AS MyRecord

Each element of the 1D array would have the composite parts as defined in the TYPE
statement.

detailsarray(1).Aname = "Another Person"


detailsarray(1).Amount = 0.95

6.4.5 Data Storage In Random Access Files


Although it is possible to have variable length records with random access files it is
simpler to work with fixed length records.

Random access records are stored in a different way to sequential records.

Example

Field Name Data Type Example


Customer's Name $ Jones P.
Telephone Number $ 01503 123456
Phone Units Used % 428
Price Per Unit ! 8.0
Total To Pay # 3424.0

These data types are combined into a composite data type.

We need to decide how many characters to allow for each field.

The customer's name and telephone number are simple as they are strings. We decide that
no customers have names over 20 characters long. We allow 12 characters for the phone
number.

Storing Numeric Data

In sequential files numbers were stored as a series of ASCII characters. For example:
17,002 is stored using five bytes; one for each digit.

This is wasteful of storage space, so in random access files numbers are saved in a
compact binary format.

In general:

Integers take: 2 bytes


Long integers take: 4 bytes
Floating point (single precision): 4 bytes
Double precision floating point: 8 bytes

So we can now complete our example:

Field Name         Data Type   Bytes Required   Example

Customer's Name    $           20               Jones P.
Telephone Number   $           12               01503 123456
Phone Units Used   %           2                428
Price Per Unit     !           4                8.0
Total To Pay       #           8                3424.0
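
The record length implied by the table (20 + 12 + 2 + 4 + 8 = 46 bytes) can be checked with a short Python sketch using the standard struct module. The format string is an assumption chosen to mirror the field sizes above; it is not the exact on-disk layout that QuickBASIC uses.

```python
import struct

# "<" means no padding between fields.
# 20s and 12s are fixed-length strings, h is a 2-byte integer,
# f is a 4-byte single precision number, d is an 8-byte double.
RECORD_FORMAT = "<20s12shfd"
RECORD_LENGTH = struct.calcsize(RECORD_FORMAT)

def pack_record(name, phone, units, price, total):
    """Pack one customer record into a fixed-length block of bytes."""
    return struct.pack(
        RECORD_FORMAT,
        name.encode("ascii").ljust(20),   # pad the name with spaces to 20 bytes
        phone.encode("ascii").ljust(12),  # pad the phone number to 12 bytes
        units,
        price,
        total,
    )
```

Every packed record has the same length, whatever the length of the customer's name, which is what makes direct addressing possible.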

6.4.6 Inserting Data Into A Random Access File

Step 1

First, the field structure of each record is defined by means of the TYPE - END TYPE
statement.

TYPE MyRecord
Aname AS STRING * 20
Phone AS STRING * 12
Units AS INTEGER
Price AS SINGLE
Amount AS DOUBLE
END TYPE

Step 2

An array or variable is explicitly declared.

DIM Phonebill AS MyRecord

Step 3

Open our random file.

OPEN "G:\Raphone.dat" FOR RANDOM AS #n LEN = L%

where n is the channel number and L% is the length of each record in bytes.

Note that random files are opened for input and output simultaneously.

Step 4

Now we assign data to our variable, phonebill.

phonebill.Aname = "Adams M"


phonebill.Phone = "01802 123456"

Step 5

We can now store this data into our file by

PUT #n, m, phonebill

where n is the channel number and m is the record address.


Step 6

Finally the file is closed.

CLOSE #n

6.4.7 Retrieving Data From A Random Access File

As with inserting data we need to have declared a variable or array with a composite data
type that matches the file's field structure.

As before we open the file using

OPEN "G:\Raphone.dat" FOR RANDOM AS #n LEN = L%

where n is the channel number and L% is the length of each record in bytes.

Now we can retrieve the data using

GET #n, m, phonebill

where n is the channel number and m is the record address.

When we've finished retrieving data we close the channel.

CLOSE #n

If we wish to find out how many bytes are in a random file we can use

LOF(n)

If this is divided by the byte length of each record, the number of records can be
calculated.
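
The idea of storing and retrieving fixed-length records by record address, and of dividing the file length by the record length, can be sketched in Python. This is an illustration under stated assumptions: record addresses start at 1, the 46-byte layout is the one worked out earlier, and the names put_record, get_record and record_count are hypothetical. An in-memory io.BytesIO object stands in for the disk file.

```python
import io
import struct

RECORD_FORMAT = "<20s12shfd"                  # name, phone, units, price, total
RECORD_LENGTH = struct.calcsize(RECORD_FORMAT)

def put_record(f, address, name, phone, units, price, total):
    """Write one record at the given record address (addresses start at 1)."""
    f.seek((address - 1) * RECORD_LENGTH)
    f.write(struct.pack(RECORD_FORMAT, name.encode("ascii").ljust(20),
                        phone.encode("ascii").ljust(12), units, price, total))

def get_record(f, address):
    """Read the record stored at the given record address."""
    f.seek((address - 1) * RECORD_LENGTH)
    name, phone, units, price, total = struct.unpack(
        RECORD_FORMAT, f.read(RECORD_LENGTH))
    return (name.decode("ascii").rstrip(), phone.decode("ascii").rstrip(),
            units, price, total)

def record_count(f):
    """File length in bytes divided by the length of one record."""
    f.seek(0, io.SEEK_END)
    return f.tell() // RECORD_LENGTH

# Usage: an in-memory file opened for reading and writing at the same time.
phone_file = io.BytesIO()
put_record(phone_file, 1, "Jones P.", "01503 123456", 428, 8.0, 3424.0)
put_record(phone_file, 2, "Adams M", "01802 123456", 10, 8.0, 80.0)
second = get_record(phone_file, 2)
```

Because every record has the same length, the byte position of record m is simply (m - 1) multiplied by the record length, so no other records need to be read.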

6.4.8 Variable Length Records In A Random Access File

If the number of characters in a string field varies greatly or the number of fields in the
record varies then the use of variable length records is appropriate.

There are two ways of implementing this:

1. Pick your own end of field and/or record markers.


2. The first byte of each field is a character count.

6.5 Overview of File Processing


In this final part we shall consider operations on files and their uses.

The following terms are often used:

Interrogating/Referencing - Searching to find a particular key.

Maintenance - Updating various records plus adding and deleting records.

Sorting - Changing the sequence of records.

6.5.1 Updating Files

Updating By Overlay

Records in indexed sequential files and random files can be accessed directly, modified
and written back to their original locations.

Updating By Copying

This method involves copying the records one by one to a new file, making modifications
as needed.

The result is two versions (or generations) of the file.

6.5.2 File Backup And Generations

Each time a master file is updated another, out of date, generation is left.

It is common to keep three generations:

• Grandfather
• Father
• Son (Current version)

6.5.3 Choosing Between Serial And Direct Access Files

The choice of file organisation is a vital consideration. The following questions need to
be answered:

• What is the most suitable storage medium for the volume of data involved?
• Must the information always be up-to-date?
• Do users require immediate access to data?
• Can requests for information be grouped together and be batch processed?
• Are reports required in a particular sequence?
• What is the hit rate?
• How volatile is the file?
6.5.4 Hit Rate

This is the measure of how many records are accessed out of the total number, usually
expressed as a percentage.

Example

Updating a payroll master file.

During the process 190 out of 200 employee records are updated.

(190 / 200) × 100 = 95%

6.5.5 Volatility

This is the frequency at which records are added to or deleted from a file.

If this frequency is high, then the file is said to be volatile.

6.5.6 Uses Of Different File Organisations

Serial Files

Serial file organisation is mainly used for transaction files. As events in the real world
take place, relevant data records are written to a transaction file.

Mainly used in:

• Sales in a shop.
• Customers withdrawing money from an ATM.
• Postal orders arriving at a mail order company.

The transactions may be batched and the master file updated later. Alternatively, the
master file may be updated as soon as each event occurs (in real-time). The transaction
file is then kept as a record of what occurred, in case the master file is corrupted and
its father needs to be updated.

Sequential Files

Sequential file organisation is used for master files in high-hit rate applications.

May be used in:

• Payroll
• Direct mailing (a.k.a. Junk mailing)
Indexed Sequential Files

Indexed sequential files can be processed either sequentially or randomly. This is very
useful because when most of the records need to be processed then they can be
sequentially processed. When only a few need to be updated then they can be directly
accessed.

May be used in:

• Stock control

The stock file would be directly accessed when the customer makes a purchase. The
master file would be accessed using a multi-level index to find the relevant record. The
description and the price would then be printed on the receipt and the quantity in stock
would be updated right away.

The file would be sequentially processed if a report of all the stock or sales is needed in
stock code sequence. Processing the file this way is fast, but it is not as fast as processing
a sequential file.

Random Files

Random files are used when extremely fast access is required to individual records.
Because the hashing algorithm generates the record address when it is applied to the
record's key, no time is taken looking through various levels of index.

May be used in:

• Utility programs to validate user names and passwords (on a network).


• Airline booking systems.

If reports are needed containing all the records in key sequence, these will take a long
time to generate.

7.1 Linear Searches


If we have a table of data stored in an array (or file) the most straightforward way to find
a record is to examine each record until we find the one we want.

7.1.1 Serial Search

If our data items are stored in no particular order then a linear search of these items is
called a serial search.

7.1.2 Sequential Search


If the data items are in key sequence then a linear search is called a sequential search.

The search can be abandoned either when the record is found or when the search goes past
the item being sought.

Pseudocode

Input search key value

Open sequential file for input
Load records from file into memory as an array

matchflag = 0
counter = 1

DO
    IF current record matches search value THEN
        matchflag = 1
        EXIT DO
    ELSEIF current record > search value THEN
        EXIT DO
    END IF
    Increment counter
LOOP UNTIL end of file / end of array data

Display results
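
A minimal Python sketch of the sequential search, assuming the keys are held in a list in ascending order; the name sequential_search is hypothetical.

```python
def sequential_search(keys, search_key):
    """Return the position of search_key in the sorted list, or -1 if absent."""
    for position, key in enumerate(keys):
        if key == search_key:
            return position          # record found
        if key > search_key:
            break                    # gone past where the record would be
    return -1
```

The early exit is what distinguishes this from a serial search of unordered data, where every record would have to be examined before giving up.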

7.2 Binary Search


A binary search can be performed on an array of sequential data items. The middle item of
the array is examined first.

If the middle item matches the search item then we have found the item and we can stop
the search.

Else if the middle item is greater than the search item, then only the first half of the
table needs to be searched.
Else if the middle item is less than the search item then only the second half of the table
needs to be searched.

We repeat this process until:

• we find the search item


• we are left with one item which is not equal to the search item

Pseudocode

'Initialise
MatchFlag = 0
SearchFail = 0

DIM array

READ data items into array

INPUT SearchValue
startp = first position in array
endp = last position in array

DO
    IF startp > endp THEN
        SearchFail = 1
    ELSE
        midp = (startp + endp) \ 2
        IF array(midp) = SearchValue THEN
            MatchFlag = 1
        ELSEIF array(midp) > SearchValue THEN
            endp = midp - 1
        ELSE
            startp = midp + 1
        END IF
    END IF
LOOP UNTIL MatchFlag = 1 OR SearchFail = 1

'Output
Show Output
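
The binary search can be sketched in Python as follows. This is a minimal illustration, assuming the items are in a list sorted in ascending order; the name binary_search is hypothetical, while startp, endp and midp follow the pseudocode.

```python
def binary_search(items, search_value):
    """Return the position of search_value in the sorted list, or -1."""
    startp, endp = 0, len(items) - 1
    while startp <= endp:
        midp = (startp + endp) // 2
        if items[midp] == search_value:
            return midp                  # found
        if items[midp] > search_value:
            endp = midp - 1              # search the first half only
        else:
            startp = midp + 1            # search the second half only
    return -1                            # nothing left to search
```

Each comparison halves the part of the table still to be searched, so a table of 1000 items needs at most about 10 comparisons.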

7.3 Internal Sorting


When a file is small enough to be held in the computer's memory an internal sort can be
used.

7.3.1 Insertion Sort

In order to convert an unsorted list to a sequential list we can take an item out of the list,
shuffle the other items along until a gap appears at the appropriate place and insert our
item. This is repeated for all items.

Example

A dash marks an empty position. Position 0 is used as a holding space.

Position: 0 1 2 3 4 5
Value:    - 5 3 8 6 2

Start with the item in position 2. Take it out of the list to position 0.

Position: 0 1 2 3 4 5
Value:    3 5 - 8 6 2

Now start just to the left of the gap. Compare each item with the item in position 0 and
move it right if it is greater.

Position: 0 1 2 3 4 5
Value:    3 - 5 8 6 2

Finally insert the item in position 0 in the gap.

Position: 0 1 2 3 4 5
Value:    - 3 5 8 6 2

The third item is greater than the item in position 2 so nothing has to be done and the
number stays in the same position.

Next we move the fourth item into position 0.

Position: 0 1 2 3 4 5
Value:    6 3 5 8 - 2

We now move the appropriate items right...

Position: 0 1 2 3 4 5
Value:    6 3 5 - 8 2

and move the item in position zero into the gap.

Position: 0 1 2 3 4 5
Value:    - 3 5 6 8 2

Now we move the item in position 5 to position 0.

Position: 0 1 2 3 4 5
Value:    2 3 5 6 8 -

We now move the appropriate items right...

Position: 0 1 2 3 4 5
Value:    2 - 3 5 6 8

and move the item in position zero into the gap.

Position: 0 1 2 3 4 5
Value:    - 2 3 5 6 8

Pseudocode

DIM card%(0 TO 5)

READ in the data

FOR position% = 2 TO 5
    card%(0) = card%(position%)
    movpos% = position% - 1
    DO UNTIL card%(movpos%) <= card%(0)
        card%(movpos% + 1) = card%(movpos%)
        movpos% = movpos% - 1
    LOOP
    card%(movpos% + 1) = card%(0)
NEXT position%
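
The same insertion sort can be sketched in Python, keeping position 0 as the holding space. The name insertion_sort is hypothetical; card, position and movpos follow the pseudocode.

```python
def insertion_sort(values):
    """Sort a list using position 0 as the holding space, as in the example."""
    card = [None] + list(values)          # items occupy positions 1 onwards
    for position in range(2, len(card)):
        card[0] = card[position]          # take the item out of the list
        movpos = position - 1             # start just to the left of the gap
        while card[movpos] > card[0]:
            card[movpos + 1] = card[movpos]   # move the larger item right
            movpos -= 1
        card[movpos + 1] = card[0]        # insert the item in the gap
    return card[1:]
```

The copy held in position 0 also acts as a stop: when movpos reaches 0 the comparison fails, so the loop cannot run off the front of the list.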

7.3.2 Bubble Sort

In this internal sort, adjacent values in a list are compared and swapped if necessary.

Several passes are usually required to sort a list.

1st Pass 6 5 3 2 8
5 6 3 2 8
5 3 6 2 8
5 3 2 6 8

2nd Pass 5 3 2 6 8
3 5 2 6 8
3 2 5 6 8

3rd Pass 3 2 5 6 8
2 3 5 6 8

4th Pass 2 3 5 6 8
On the fourth pass no swaps are made so the sort is complete.

This sort is called a bubble sort because small (or light) numbers 'bubble' to the top.

Pseudocode

DIMension card(1 to n)

READ in data

DO
swapped = 0
FOR position = 1 TO n - 1
IF card(position) > card(position + 1) THEN
swap the cards
swapped = 1
END IF
NEXT position
LOOP UNTIL swapped = 0
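
A minimal Python sketch of the bubble sort, which also counts the passes so that it can be compared with the worked example; the name bubble_sort is hypothetical.

```python
def bubble_sort(values):
    """Return the sorted list and the number of passes that were needed."""
    card = list(values)
    passes = 0
    while True:
        swapped = False
        for position in range(len(card) - 1):
            if card[position] > card[position + 1]:
                # Swap the adjacent pair.
                card[position], card[position + 1] = card[position + 1], card[position]
                swapped = True
        passes += 1
        if not swapped:                   # a pass with no swaps: list is sorted
            return card, passes
```

For the list 6 5 3 2 8 this makes four passes, the last of which performs no swaps.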

7.3.3 Quick Sort

This algorithm is a fast sorting algorithm because it swaps items that are a very large
distance apart.

It works by picking a pivot item in the list and then moving every item that is greater
than the pivot to one side and every item that is less to the other side.

The two sub-divisions are then passed, one after another, to the Quick Sort algorithm
itself (this is recursion).

The recursion unwinds when the sub-divisions are sorted.
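
The idea can be sketched in Python. For clarity this version builds new lists either side of the pivot rather than swapping items in place, and it takes the middle item as the pivot; both choices are assumptions for illustration, as is the name quick_sort.

```python
def quick_sort(items):
    """Return a sorted copy of items using a pivot and recursion."""
    if len(items) <= 1:
        return list(items)                # nothing left to sub-divide
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Each sub-division is passed to the same algorithm in turn.
    return quick_sort(smaller) + equal + quick_sort(larger)
```

A sub-division of one item or none is already sorted, which is where the recursion stops.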

7.3.4 Extraction Sort

When lengthy records or variable length records need to be sorted it is often faster to sort
only the key field.

In order to retrieve the full record details a pointer is added to each key field value.

7.4 External Sorting


External sorts are used when the volume of data is so great that it cannot be held in
memory.
7.4.1 Merge Sort

This algorithm takes a serial file and produces a sequential file.

It uses four temporary files: file A, file B, file C and file D.

At any one time, two of these files are 'transmitting' records and the other two are
'receiving' records.

Example

The serial file contains records with the key values:

Serial
23 16 57 43 90 13 29 75 36 25 41 82 19
File

The records are written out alternately:

File A 23 57 90 29 36 41 19
File B 16 43 13 75 25 82

Files A & B 'transmit' records to files C & D.

Each pair of records are merged to be in sequence.

File C 16 23 13 90 25 36 19
File D 43 57 29 75 41 82

File C and File D hold sequences of records with a minimum length of two records.

A second pass starts and File C and File D now 'transmit' their records to File A and File
B.

File A (16 23 43 57) (25 36 41 82)


File B (13 29 75 90) (19)

On the third pass:

File C (13 16 23 29 43 57 75 90)


File D (19 25 36 41 82)

On the final pass:

File C 13 16 19 23 25 29 36 41 43 57 75 82 90
7.4.2 Classic Four Tape Merge Sort

This is a better version of the basic merge sort. The main difference lies in the
initialisation step.

Serial
23 16 57 43 90 13 29 75 36 25 41 82 19
File

The records are moved in groups of ascending key field values to File A and File B.

File A (23) (43 90) (36) (19)


File B (16 57) (13 29 75) (25 41 82)

These values are 'transmitted' to files C and D.

File C (16 23 57) (25 36 41 82)


File D (13 29 43 75 90) (19)

On the second pass:

File A (13 16 23 29 43 57 75 90)


File B (19 25 36 41 82)

On the final pass:

File C 13 16 19 23 25 29 36 41 43 57 75 82 90

This algorithm took 3 passes whereas the simple merge sort took four passes.
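
The two merge sorts can be compared with a short Python simulation. It is a sketch only: each 'file' is modelled as a list of runs held in memory, and all function names are hypothetical. The only difference between the two sorts is how the initial runs are formed.

```python
from heapq import merge

def distribute(runs):
    """Deal runs alternately onto two 'files'."""
    return runs[0::2], runs[1::2]

def single_record_runs(keys):
    """Simple merge sort: every record starts as a run of length one."""
    return [[k] for k in keys]

def natural_runs(keys):
    """Classic version: group records into runs of ascending keys."""
    runs = []
    for k in keys:
        if runs and runs[-1][-1] <= k:
            runs[-1].append(k)
        else:
            runs.append([k])
    return runs

def merge_pass(file1, file2):
    """Merge matching runs from two files and deal the results onto two more."""
    merged = []
    for i in range(max(len(file1), len(file2))):
        run1 = file1[i] if i < len(file1) else []
        run2 = file2[i] if i < len(file2) else []
        merged.append(list(merge(run1, run2)))
    return distribute(merged)

def four_file_sort(initial_runs):
    """Return the sorted keys and the number of merge passes needed."""
    if not initial_runs:
        return [], 0
    file1, file2 = distribute(initial_runs)
    passes = 0
    while file2:                          # more than one run still exists
        file1, file2 = merge_pass(file1, file2)
        passes += 1
    return file1[0], passes

keys = [23, 16, 57, 43, 90, 13, 29, 75, 36, 25, 41, 82, 19]
simple_result, simple_passes = four_file_sort(single_record_runs(keys))
classic_result, classic_passes = four_file_sort(natural_runs(keys))
```

Starting from longer runs means fewer runs to merge, which is why the classic version needs one pass fewer on this data.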

8.1 The Computer Misuse Act 1990


This act made it a criminal offence for anyone to access or to modify computer held data
or software without authority, or attempt to do so.

It created three specific offences:

1. Access is deliberate and unauthorised.
2. Access is without authority and with intent to commit a further offence (either
immediately or in the future).
3. A person does any deliberate act that causes an unauthorised modification of the
computer's contents.

Note that:

• Conspiring, attempting and inciting others are all offences.


• The prosecution does not have to have proof that actions were directed towards
particular items of data or programs.
• The accused need not be in the U.K. at the time of the offence.
• The computer data need not be in the U.K.

8.2 The Data Protection Act 1984


In the early eighties increasing public pressure and the possibility of lost European trade
led to the Data Protection Act being passed.

The act regulates the use of "automatically processed information relating to individuals
and the provision of services in respect of such information".

The act defined:

Data
Information in processable form.
Personal Data
Data relating to identifiable, living, individuals.
Data Subject
The individual concerned.

Principles Of The Act

The Data Protection Act contains eight principles.

1. Personal data, held for processing, must be obtained fairly for a lawful purpose.
2. Such data must be held for a specific purpose.
3. Personal data must only be used for the specific purpose and may only be
disclosed in accordance with the specific purpose.
4. Personal data must not be excessive for the purpose but merely adequate and
relevant.
5. Personal data must be accurate and, where necessary, kept up to date.
6. Personal data must not be kept for longer than is necessary for the purpose.
7. Personal data must be made available to the individual concerned and provision
made for corrections.
8. The personal data must be held securely against unauthorised access or loss.

The Data Protection Registrar

The act established the office of Registrar, who is responsible for maintaining a public
register of Data Users (those people and organisations who collect and process personal
data).

Exemptions From The Act

Personal data held for payroll, pensions and accounts is exempt, as are names and
addresses held for the purposes of distributing information (e.g. mail merge).
Also exempt is personal data held in connection with national security, crime prevention,
or the collection of tax or duty.

If personal data is collected for statistical or research purposes only, or is held simply for
backup then data subjects do not have the right to see such data.

Although personal data must be kept secure, it can be disclosed to the data subject's agent
(e.g. lawyer or accountant), to a person working for the data user and to anyone if there is
an urgent need to prevent injury or damage to health.

8.3 Computer Fraud


Fraud is criminal deception. Specifically, this is using false representations to obtain an
unjust pecuniary advantage or to injure the rights or interests of another.

It is difficult to estimate how widespread computer fraud is because:

• it is relatively difficult to detect.


• companies who are concerned about their image may be reluctant to publicise
cases of fraud.

Programmers and other computer workers have more opportunity to defraud their
employer than other workers.

8.4 Software Copyright


Computer software is covered by the 'Copyright, Designs and Patents Act 1988'.

Under the act, in the case of copyrighted software, it is illegal to:

• copy software (with exception made for backup).


• run illegally copied software.

Some software licences make provision for copies of the software to be used 'like a
book'.

8.5 Viruses And Trojans


8.5.1 Viruses

A computer virus is a portion of software that is able to copy itself and usually has an
undesirable effect on computer data.

Sources Of Infection
• Internet downloads
• Floppy disks
• Email attachments
• Local area networks

Virus Categories

Most viruses activate and replicate under certain conditions.

• Time Bombs - triggered by a particular date.


• Logic Bombs - triggered by a set of conditions.

8.5.2 Virus Protection And Precautions

Anti-virus software should be installed on all systems where there is a risk of infection.

The following precautions can be taken to slow down the spread of viruses.

• Internet downloads should only be made from trustworthy sites.


• No removable disks should be used in a computer unless first being virus
checked.
• Macros should be disabled where possible.
• Floppy disks can be inoculated against boot sector viruses.
• New software on floppy disks should remain write-protected.
• Hardware engineers who move from network to network should ensure that their
disks are write protected.

8.5.3 Trojans

Trojans are programs that appear to be desirable pieces of software but have the
capacity to harm data. Unlike viruses they do not copy themselves automatically.

8.6 Security Of Data


Keeping data secure means protecting data from various hazards.

Deliberate Destruction Of Data

• Disgruntled employees
• Terrorist acts
• Hackers gaining access

Accidental Data Destruction

• Hardware failure
• Program failure
• Operator error

Environmental Hazards

• Fires
• Floods
• Power Surges
• Hurricanes
• Earthquakes

8.6.1 Backup

There are four main methods of backing-up data.

Periodic Backup

Backups are made regularly and kept in a safe place. This is the least satisfactory method.

File Generations

Grandfather, father and son files can be stored along with their respective transactions.
This method is only used with sequential file processing.

Incremental Dumping

During a user's work session all the updated files are marked. When the user logs out
these files are copied or 'dumped' to another disk.

Transaction Logging

Every updating transaction/operation on a master file is recorded in a separate transaction
log.

8.6.2 Recovery

Recovery procedures are established so that an organisation may be able to continue
operating in the case of hardware or data loss.

Disaster Planning

There is a contingency plan that comes into action when an organisation's data is lost.

A separate site with offices, computers and an up-to-date copy of the organisation's data
is used.

Bypass Procedures

Bypass procedures are invoked if the central computer in an on-line system fails.
Intelligent terminals with their own backing store can record transactions temporarily
until the main computer is back on-line.

8.6.3 Malicious Damage

Data security is threatened by both hackers from outside and employees within a
company.

Such people may be trying to defraud the company or just be disgruntled.

Steps to Counteract Threats

• Use password protection (may be on encrypted files).


• Immediately remove access rights for employees who leave the company.
• Vet prospective employees.
• Restrict access to certain rooms or secure areas.
• Educate staff to watch out for security breaches.
• Separate staff duties.
• Appoint a security manager.

8.6.4 Password Protection

Usernames and passwords are usually stored in a table. The table is permanently stored in
a file on a disk.

Password tables are often stored with authorisation tables that contain a user's rights to
other files.

Password tables should be 'irreversibly' encrypted to prevent their contents from being
read.
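
One way to picture 'irreversible' encryption is a salted one-way hash. The Python sketch below is an assumption about how such a table entry might be built, using the standard hashlib and hmac modules; it is not a description of any particular system, and the names make_entry and check_password are hypothetical.

```python
import hashlib
import hmac
import os

def make_entry(password, iterations=100_000):
    """Return the values that would be stored in the password table."""
    salt = os.urandom(16)                 # random salt, different for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    return salt, iterations, digest       # the password itself is never stored

def check_password(password, entry):
    """Hash the attempt in the same way and compare it with the stored digest."""
    salt, iterations, stored = entry
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                  salt, iterations)
    return hmac.compare_digest(attempt, stored)
```

Anyone who reads the table sees only the salt and the digest, from which the original password cannot be worked back directly.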

8.6.5 Data Encryption

In cryptography a message is converted from plain text to ciphertext.

The encrypted message is sent along a communications link and the receiving computer
decrypts the message.

Transposition Ciphers

The characters of a message are rearranged in some way.

Example
"MEET_ME_TONIGHT_AT_8"

could be written, row by row, in a five character wide grid:

MEET_
ME_TO
NIGHT
_AT_8

The grid is then read column by column and sent out as:

"MMN_EEIAE_GTTTH__OT8"

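The grid method above can be sketched in Python. The function names are hypothetical, and the decryption function assumes that the message length is an exact multiple of the grid width, as it is in the example.

```python
def transpose_encrypt(message, width):
    """Write the message row by row in a grid, then read it column by column."""
    rows = [message[i:i + width] for i in range(0, len(message), width)]
    return "".join(row[col] for col in range(width)
                   for row in rows if col < len(row))

def transpose_decrypt(ciphertext, width):
    """Reverse the process (message length must be a multiple of width)."""
    height = len(ciphertext) // width
    columns = [ciphertext[i:i + height]
               for i in range(0, len(ciphertext), height)]
    return "".join(columns[col][row] for row in range(height)
                   for col in range(width))
```

The receiving computer needs to know the grid width (five in the example) in order to rebuild the grid.
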
Substitution Ciphers

Each character is replaced by another character.

Example

Decode: "X2MM F4P2"

where A E I O U > 1 2 3 4 5 and consonants > next consonant in alphabet.

gives:

"WELL DONE"

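The substitution rule in this example can be sketched in Python. The sketch assumes that Z wraps round to B when encoding and that the plain text contains no digits of its own; both are assumptions, since the example does not say. The function names are hypothetical.

```python
VOWELS = "AEIOU"
CONSONANTS = "BCDFGHJKLMNPQRSTVWXYZ"

def encode(plain):
    """Vowels become the digits 1 to 5; consonants become the next consonant."""
    out = []
    for ch in plain.upper():
        if ch in VOWELS:
            out.append(str(VOWELS.index(ch) + 1))
        elif ch in CONSONANTS:
            out.append(CONSONANTS[(CONSONANTS.index(ch) + 1) % len(CONSONANTS)])
        else:
            out.append(ch)                # spaces and punctuation are unchanged
    return "".join(out)

def decode(cipher):
    """Digits become vowels; consonants become the previous consonant."""
    out = []
    for ch in cipher.upper():
        if ch in "12345":
            out.append(VOWELS[int(ch) - 1])
        elif ch in CONSONANTS:
            out.append(CONSONANTS[(CONSONANTS.index(ch) - 1) % len(CONSONANTS)])
        else:
            out.append(ch)
    return "".join(out)
```

Decoding simply applies the rule backwards, so the same table of vowels and consonants serves both directions.
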
8.6.6 Other Security Measures

Other than passwords, an authorised user can be identified by:

• Iris recognition
• Fingerprint recognition
• Voice recognition
• Face recognition

8.7 Data Integrity


"Data Integrity" refers to the correctness of data.

Errors On Input

• Typing errors
• Transcription errors
• Lost batch sheets
Errors In Operation Procedures

• Master file update program run twice


• Father generation used instead of the son

Logic Errors In Software

• The program does not work the way it was planned


• Often occurs with new software

Errors In Data Transmission

• Electrical interference
• Loose connections

Errors Due To Outdated Information

• May be a particular problem in some organisations

The following steps can be taken to try and ensure data integrity:

• verification.
• validation (range check, presence check, picture check, character count, file
lookup, check digit).
• control totals in batch processing.
• parity bits for data transmission and RAM.
• Checksums for data transmission.
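
Two of the checks listed above, parity bits and checksums, can be sketched in Python. The sketch assumes even parity on a 7-bit value with the parity bit placed last, and a checksum formed by adding the bytes modulo 256; these are common textbook conventions rather than the only possibilities, and the function names are hypothetical.

```python
def even_parity_bit(value):
    """Return the bit that makes the total number of 1s even."""
    return bin(value).count("1") % 2

def add_parity(seven_bit_value):
    """Append an even parity bit to a 7-bit value, giving 8 bits."""
    return (seven_bit_value << 1) | even_parity_bit(seven_bit_value)

def parity_ok(eight_bit_value):
    """A received byte is accepted only if it contains an even number of 1s."""
    return bin(eight_bit_value).count("1") % 2 == 0

def checksum(data):
    """Add up all the bytes and keep the remainder on division by 256."""
    return sum(data) % 256
```

If a single bit is changed in transmission the number of 1s becomes odd and parity_ok returns False, although two changed bits would cancel out and go undetected.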

9.1 Flat-file Databases


In a flat-file database there is no way to link information stored in separate files. They
can be useful for storing lists of data.

9.1.1 Creating A Database File

Before data can be stored in the database, the structure must be created.

The following must be specified:

• the order of the fields


• the name of the fields
• the type of data in each field
• the primary key

9.2 Introduction To Relational Databases


Relational databases consist of a number of linked tables.

An entity is an occurrence in a table. An entity can be thought of as a record.

In database design the entities are identified first.

Each entity has several attributes. These can be thought of as fields.

9.2.1 Table Notation

The standard notation to describe the structure of a table in the database is based on the
attributes of one entity.

Example

Book(Library_Code, Title, Author, ISBN, Edition)

The entity name is written first and is conventionally a single word or singular noun.

The primary key is the first attribute written in the brackets and is underlined. Other
attributes follow, separated by commas.

9.2.2 Entity Relationship Modelling

The entities (and therefore the tables) in a relational database can be related to each other
in any one of three ways:

• One-to-one
• One-to-many
• Many-to-many

Example

• A blind person and a guide dog have a one-to-one relationship.


• A hospital and a patient have a one-to-many relationship.
• Films and actors have a many-to-many relationship.

9.2.3 Entity Relationship Diagrams

The entity relationships can be shown graphically.


9.2.4 The Conceptual Model

When designing a database system:

• Identify the entities.


• List the relationship pairs between the entities.
• Draw an ER diagram for the whole system.

Example

Entities:

• Customer
• Order
• Product

9.3 The Aims Of Database Normalisation


9.3.1 The Aims Of Database Normalisation

Normalisation ensures that the database is structured in the best possible way. Its aims
are:

To achieve control over data redundancy. There should be no unnecessary duplication of
data in different tables.

To ensure data consistency. Where duplication is necessary the data is the same.

To ensure tables have a flexible structure. E.g. number of classes taken or books
borrowed should not be limited.

To allow data in different tables to be used in complex queries.

9.3.2 The First Normal Form (1NF)

A table is in its first normal form if it contains no repeating attributes or groups of
attributes.

We start off with a table, Student, that has a repeating attribute, Class number.

Student(Student number, Student name, Date of birth, Sex, Class number)

Where the overlining represents a repeating attribute.

To put the table into 1NF the repeating attribute is turned into part of the primary key,
so the key becomes Student number together with Class number.

Student(Student number, Student name, Date of birth, Sex, Class number)

9.3.3 The Second Normal Form (2NF)

A table is in the second normal form if it is in the first normal form AND no column
that is not part of the primary key is dependent on only a portion of the primary key.

Consider Student

Student(Student number, Student name, Date of birth, Sex, Class number)

Student number is uniquely associated with Student name but Class number is not.

To put the table into second normal form, the part of the primary key that the other
attributes are not dependent upon is removed and a linking table is created.

This gives

Student(Student number, Student name, Date of birth, Sex)


Student_Takes(Student number, Class number)

Class(Class number, Class name, Lecturer number)
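
As a rough sketch, the three tables above could be created and queried with Python's built-in sqlite3 module. The column names are adapted from the notation above, and the sample rows are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE Student (
    student_number INTEGER PRIMARY KEY,
    student_name   TEXT,
    date_of_birth  TEXT,
    sex            TEXT
);
CREATE TABLE Class (
    class_number    INTEGER PRIMARY KEY,
    class_name      TEXT,
    lecturer_number INTEGER
);
-- Linking table: the primary key is the pair of foreign keys.
CREATE TABLE Student_Takes (
    student_number INTEGER REFERENCES Student(student_number),
    class_number   INTEGER REFERENCES Class(class_number),
    PRIMARY KEY (student_number, class_number)
);
""")

# Invented sample data: one student taking two classes.
conn.execute("INSERT INTO Student VALUES (1, 'A. Smith', '1985-03-02', 'F')")
conn.executemany("INSERT INTO Class VALUES (?, ?, ?)",
                 [(10, "Computing", 7), (11, "Maths", 8)])
conn.executemany("INSERT INTO Student_Takes VALUES (?, ?)", [(1, 10), (1, 11)])

rows = conn.execute("""
    SELECT s.student_name, c.class_name
    FROM Student s
    JOIN Student_Takes t ON t.student_number = s.student_number
    JOIN Class c ON c.class_number = t.class_number
    ORDER BY c.class_name
""").fetchall()
print(rows)
```

The student's details are stored once, however many classes are taken, and the number of classes per student is not limited by the table structure.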

9.3.4 The Third Normal Form (3NF)

A table is in the third normal form if it is in the second normal form and there are no
non-key columns dependent on other non-key columns that could not act as the
primary key.

In short this is the non-key dependence test.

If tutorgp were not a separate entity, then:

Staff(Lecturer no, Lecturer name, Tel no, Tutorgp name, Tutorgp room)

Tutorgp name and Tutorgp room are dependent on each other.

To put the table into 3NF the dependent column is removed from the table and a new
table containing both columns is created.

Staff(Lecturer no, Lecturer name, Tel no, Tutorgp name)

Tutorgp(Tutorgp name, Tutorgp room)

The third normal form helps prevent unintentional deletion of data.

9.3.5 Foreign Keys

A foreign key is the primary key of one table that appears in another table.

Example

In the Staff table:

Staff(Lecturer no, Lecturer name, Tel no, Tutorgp name)

Tutorgp name is a foreign key because it is the key field of Tutorgp.

9.4 Security And Integrity Issues


9.4.1 Multi-Access Databases

If the database is installed on a shared network drive then more than one person can work
with the data.

9.4.2 Database Integrity

The aim is to ensure that no data is accidentally lost or corrupted.

• Locking
o Operation in exclusive mode.
o Lock other users out of tables being modified.
o Lock out the record being edited.
o No locks. Users are informed of conflicts.
o Ordinary users can only open tables in read-only mode.
• Refreshing
• Resolving a deadly embrace

Deadly Embrace

If two users attempt to update two records at the same time then the following can
happen:

User 1 User 2
Accesses and locks record 1 Accesses and locks record 2
Attempts to access record 2 Attempts to access record 1
waits... waits...
and waits... and waits...
and waits... and waits...

The DBMS must resolve this conflict.
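
How the DBMS resolves the conflict is not specified here. One common approach, assumed for this sketch, is to look for a cycle of users who are each waiting for a lock held by the next, and then cancel one of them. The function name and data layout are hypothetical.

```python
def find_deadlock(waits_for: dict[str, str]) -> list[str]:
    """Return the users caught in a waiting cycle, or [] if there is none.

    waits_for maps each blocked user to the user holding the lock they need.
    """
    for start in waits_for:
        seen = []
        user = start
        # Follow the chain of "is waiting for" until it ends or repeats.
        while user in waits_for and user not in seen:
            seen.append(user)
            user = waits_for[user]
        if user == start:
            return seen  # the chain came back to where it began
    return []


# User 1 holds record 1 and wants record 2; User 2 holds record 2 and wants record 1.
print(find_deadlock({"User 1": "User 2", "User 2": "User 1"}))
```

Once a cycle is found, the DBMS could abandon one user's transaction and release its locks so that the other can proceed.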

9.4.3 Database Performance

When database queries are made it is important to give users a fast response time.
Furthermore, if users are locked out of records they will become frustrated and
disgruntled.

Improving Database Performance

• Use the least restrictive locking viable.


• Educate the users to: schedule disk intensive operations and update records
promptly.
• Index attributes that are frequently used in searches and queries.
• Install the database software onto local machines.
• Re-schedule other network jobs.
• Upgrade the hardware.

9.4.4 Security Issues

Database security can be promoted by:

• Regular backups
• Transaction logging
• Checkpoints
o All updates are stored in transaction log.
o When the database is permanently updated a checkpoint is placed in the
transaction log and the old version of the database is kept as a backup.
o If failure occurs then all the updates up to the last checkpoint can be
applied to the old database.
• Passwords
• The DBMS can assign privileges to users or groups of users.
• Encrypting the data.

9.5 Database Management


The pooling of data has many benefits. However, as there are potential problems, the
database needs to be managed.

9.5.1 The Database Administrator

The DBA may be in charge of a group of people known as the database administration.

Responsibilities

• Database design
• Informing users of structural changes
• Maintaining the data dictionary
• Assigning access privileges and passwords
• Training for users

A database is composed of the raw data and the software to organise it.

The software is known as the DBMS (Database Management System).

10.1 Introduction

The different methods of organising data are known as data structures.

10.1.1 Elementary Data Structures


These are built into programming languages and cannot be broken down into other data
structures.

Example

Integer - QuickBASIC
Character - Pascal

10.1.2 Composite Data Structures

These are made up of a number of elementary data structures.

Example

User Defined (TYPE - END TYPE) - QuickBASIC


Strings
Arrays

10.1.3 Static And Dynamic Data Structures

Static data structures always occupy a fixed number of bits in the computer's memory.
The number of bits in a dynamic data structure can change.

10.2 Linear Lists


A linear list is a dynamic data structure in which the data items are held one after another
in sequence.

In order to implement a linear list we need an array large enough to hold the maximum
sequence that can occur.

We also need:

• a variable to hold the size.
• a variable to hold the maximum size.

10.2.1 Inserting An Item

New items are inserted at the correct place. The variable holding the size must be
incremented.

10.2.2 Retrieving An Item From A List

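To retrieve an item we place the sought item in position 0 and search linearly back
through the list from the last item. The copy in position 0 guarantees that the search
stops even when the item is not in the list.

Example

train(0).dest = sought_value
pointer = size

' Work backwards; position 0 always matches, so the loop must end
DO UNTIL train(pointer).dest = train(0).dest
    pointer = pointer - 1
LOOP

' Stopping at position 0 means that only the copy was found
IF pointer = 0 THEN
    found = 0
ELSE
    found = 1
END IF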

10.2.3 Deleting An Item From A List

To delete an item from a list we first find the item using the code from the last section.

Example

Input item to delete

CALL the retrieve sub program to see if it is in the list and then get its location

IF found = 0 THEN
    display an error message
ELSE
    ' Close the gap by moving each later item down one place
    FOR current = pointer TO size - 1
        train(current) = train(current + 1)
    NEXT current

    size = size - 1
END IF
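
The three linear list operations described in this section can be sketched together in Python. This is a minimal illustration rather than a translation of the code above: the class and method names are made up, positions are numbered from 0, and a plain loop is used for the search.

```python
class LinearList:
    """Ordered list held in a fixed-size array, with a size and a maximum size."""

    def __init__(self, max_size: int):
        self.items = [None] * max_size
        self.size = 0
        self.max_size = max_size

    def insert(self, item) -> None:
        if self.size == self.max_size:
            raise OverflowError("list is full")
        pos = self.size
        # Move larger items up one place to open a gap at the correct position.
        while pos > 0 and self.items[pos - 1] > item:
            self.items[pos] = self.items[pos - 1]
            pos -= 1
        self.items[pos] = item
        self.size += 1

    def find(self, item) -> int:
        """Return the position of item, or -1 if it is not in the list."""
        for pos in range(self.size):
            if self.items[pos] == item:
                return pos
        return -1

    def delete(self, item) -> bool:
        pos = self.find(item)
        if pos == -1:
            return False
        # Close the gap by moving each later item down one place.
        for current in range(pos, self.size - 1):
            self.items[current] = self.items[current + 1]
        self.size -= 1
        self.items[self.size] = None
        return True


trains = LinearList(5)
for destination in ["Leeds", "Bath", "York"]:
    trains.insert(destination)
trains.delete("Leeds")
print(trains.items[:trains.size])
```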

10.3 Linked Lists


Linked lists are dynamic data structures that hold a sequence of items which are not
necessarily in contiguous data locations.

Each item in the list is called a node and is made up of an information field (which often
has sub fields) and a next address field called a pointer.

The pointer of the last item is given a value of zero to indicate that there are no more
items.
This data structure also includes a variable that points to the first item in the list.

10.3.1 Managing Freespace

In order to keep track of the free space two linked lists are kept.

When a new item is added to the data list a node is removed from the free space list.

When an item is deleted from the data list it is linked into the free space list.

Initialising The Table

At this stage the table just consists of a linked list of free space.

10.3.2 Inserting An Item Into A Linked List

1. Store the new name in the node pointed to by the next free pointer.
2. Change the next free pointer to point to the new next free node.
3. Follow the links to find out where the new item should be linked in.
4. Change the new item's pointer to point to the next item.
5. Change the previous item's pointer to point to the new item.

Pseudocode

node(nextfree).nam = newname
tempfree = nextfree
nextfree = node(nextfree).pt

follow = start
DO UNTIL node(node(follow).pt).nam > newname
    follow = node(follow).pt
LOOP

node(tempfree).pt = node(follow).pt
node(follow).pt = tempfree

Before we can work out the full pseudocode we need to identify the special cases.

They are:

• There is no free space.


• The data list is empty.
• The new item is to be the first item in the data list.
• The new item is to be the last item in the data list.

10.3.3 Deleting An Item From A Linked List


In order to delete an item from a linked list we:

• Follow the pointers to the item.
• Change the next free pointer to the deleted item.
• Change the previous item's pointer to the following item.
• Change the deleted item's pointer to point to the old next free item.
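
The insertion and deletion steps, including the special cases listed in 10.3.2, can be sketched in Python using two arrays and a free space list. The class and method names are illustrative, and index 0 is reserved to act as the zero (end of list) pointer.

```python
class LinkedList:
    """Array-based linked list; a pointer value of 0 means 'no more items'."""

    def __init__(self, capacity: int):
        self.name = [None] * (capacity + 1)
        # At first every node is chained into the free space list: 1 -> 2 -> ... -> 0.
        self.pt = [0] + [i + 1 for i in range(1, capacity)] + [0]
        self.start = 0
        self.nextfree = 1 if capacity > 0 else 0

    def insert(self, newname: str) -> None:
        if self.nextfree == 0:
            raise OverflowError("no free space")
        new = self.nextfree
        self.nextfree = self.pt[new]  # take the node off the free space list
        self.name[new] = newname
        if self.start == 0 or newname < self.name[self.start]:
            # The data list is empty, or the new item becomes the first item.
            self.pt[new] = self.start
            self.start = new
            return
        follow = self.start
        # Stop at the last item, or just before the first item that should come later.
        while self.pt[follow] != 0 and self.name[self.pt[follow]] < newname:
            follow = self.pt[follow]
        self.pt[new] = self.pt[follow]
        self.pt[follow] = new

    def delete(self, name: str) -> bool:
        prev, cur = 0, self.start
        while cur != 0 and self.name[cur] != name:
            prev, cur = cur, self.pt[cur]
        if cur == 0:
            return False  # not in the list
        if prev == 0:
            self.start = self.pt[cur]
        else:
            self.pt[prev] = self.pt[cur]
        self.pt[cur] = self.nextfree  # link the node back into the free space list
        self.nextfree = cur
        return True

    def to_list(self) -> list:
        out, cur = [], self.start
        while cur != 0:
            out.append(self.name[cur])
            cur = self.pt[cur]
        return out


names = LinkedList(4)
for n in ["Mike", "Anna", "Zoe", "Carl"]:
    names.insert(n)
names.delete("Mike")
names.insert("Bob")  # reuses the node that Mike occupied
print(names.to_list())
```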

10.4 Queues

In a queue, new items are added at the end.

Items are retrieved (or deleted) from the front of the queue.

This is a FIFO (First In - First Out) data structure.

The simplest way of implementing a queue in memory is to use an array with four
variables.

Example

At first the queue is empty.

Roy and James join the queue.

Now David, Debbie and Sam join the queue. Roy and James leave.

There is now no more space at the rear of the queue.

10.4.1 Circular Queues


The most common solution to the problem of queue overflow is to make the rear of the
queue 'wrap around' to the start.

Items leave from the front. Add one to front pointer. If front > limit then front = 1.

Items are added to the rear. Add one to rear pointer. If rear > limit then rear = 1.

Retrieving/Deleting An Item From A Circular Queue

First check that the queue is not empty. If it is not empty then the first item is placed into
a variable, the front pointer is incremented (cyclically) and the size is decreased by one.
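
A circular queue with the four variables front, rear, size and limit can be sketched in Python as below. Positions are numbered from 0 here rather than from 1, so the wrap around is done with the remainder operator; the class and method names are made up for this example.

```python
class CircularQueue:
    def __init__(self, limit: int):
        self.items = [None] * limit
        self.limit = limit
        self.front = 0   # position of the next item to leave
        self.rear = -1   # position of the item added most recently
        self.size = 0

    def enqueue(self, item) -> None:
        if self.size == self.limit:
            raise OverflowError("queue is full")
        self.rear = (self.rear + 1) % self.limit  # wrap around to the start
        self.items[self.rear] = item
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        item = self.items[self.front]
        self.front = (self.front + 1) % self.limit  # increment cyclically
        self.size -= 1
        return item


queue = CircularQueue(5)
for person in ["Roy", "James", "David", "Debbie", "Sam"]:
    queue.enqueue(person)
queue.dequeue()          # Roy leaves
queue.dequeue()          # James leaves
queue.enqueue("Ann")     # the rear wraps around into the space Roy used
print(queue.rear, queue.front, queue.size)
```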

10.5 Stacks

New items are added to a stack by placing them on top of the stack (called pushing).

Items are also removed from the top of the stack (called popping).

This is an example of a Last-In-First-Out (LIFO) data structure.

A stack can be implemented by means of an array and two variables.


10.5.1 Stacks In Program Translation

Stacks are used in calculations and translating from one computer language to another.

Example - Interpreting Scopes

For the purposes of this example:

() - These are parentheses
[] - These are brackets
{} - These are braces

Rule - Scope openers must match scope closers.

A stack is used to keep track of the scopes encountered. This works by going left to right
and:

• whenever an opening scope is encountered it is pushed onto the stack.


• whenever a closing scope is encountered the stack is popped and a comparison is
made.

So for p + (q × [x - y] - {s - t} / q), the comparisons were valid each time the stack
was popped and the stack ended up empty, so there were no errors.
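
The scope checking rule can be sketched in Python, using a list as the stack. The function name is made up for this example.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}  # each closer and the opener it must match


def scopes_match(expression: str) -> bool:
    stack = []
    for ch in expression:
        if ch in PAIRS.values():
            stack.append(ch)              # opening scope: push it
        elif ch in PAIRS:
            # Closing scope: pop the stack and compare with the expected opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack                      # the stack must end up empty


print(scopes_match("p + (q * [x - y] - {s - t} / q)"))  # True
print(scopes_match("p + (q * [x - y) - {s - t} / q]"))  # False
```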

Example

Each instruction in the program has an address. When a subprogram is called, the return
address (the next line) is placed on the stack. When the sub is exited the stack is popped
and program flow continues at the next line.

10.6 Binary Trees


A tree is a dynamic data structure with hierarchically organised nodes.

• There is a root node at the beginning of the tree structure.


• Every node except the root node has one parent.
• Childless nodes are called leaf nodes or terminal nodes.

Every tree has only one root, but each node in the tree can be regarded as the root of a
sub-tree.

A binary tree is a special type of tree. Each parent can have no more than two children.

Example - Decoding Morse Code

Starting from the root move right if "-" or left if it is a ".".
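
A small part of the Morse tree can be sketched in Python. Storing the tree level by level in a string is an assumption made for brevity, and only letters of up to three symbols are included.

```python
# Node i has its "." (left) child at 2*i and its "-" (right) child at 2*i + 1.
# Index 0 is unused and index 1 is the empty root.
MORSE_TREE = "  ETIANMSURWDKGO"


def decode_letter(symbols: str) -> str:
    node = 1  # start at the root
    for symbol in symbols:
        node = 2 * node if symbol == "." else 2 * node + 1
    return MORSE_TREE[node]


def decode_word(code: str) -> str:
    # Letters are assumed to be separated by spaces.
    return "".join(decode_letter(letter) for letter in code.split())


print(decode_word("... --- ..."))  # SOS
```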

10.6.1 Constructing A Binary Search Tree

A search tree is an application of a binary tree. It allows the tree to be searched.

We can store a list of names (or any other data) in a binary tree.

We are given a list of names in no particular order.


Example

Legg, Charlesworth, Illman, Hawthorne, Todd, Youngman, Jones, Ravage

Then to create a binary search tree that can be searched we:

• Place the first item in the root.
• Insert the items in the order that they are given.
• Starting at the root, follow the left branch if the item comes before the current
node in the alphabetic or numeric sequence.
• Otherwise follow the right branch, because the item comes after the current node in
the alphabetic or numeric sequence.
• Repeat until an empty branch is reached and add the item there.

10.6.2 Traversing A Binary Search Tree

One important use of binary search trees is to rapidly retrieve a single data item.

All the data items can be extracted in a number of different sequences.

The traversal algorithms (that extract data in these different sequences) are recursive.

Traversal Algorithms

Preorder Traversal

• Visit the root.
• Traverse the left-hand subtree.
• Traverse the right-hand subtree.

The nodes are visited in the order:

D-B-A-C-F-E-G

Inorder Traversal

• Traverse the left-hand subtree.


• Visit the root.
• Traverse the right-hand subtree.

The nodes are visited in the order:

A-B-C-D-E-F-G

Postorder Traversal

• Traverse the left-hand subtree.
• Traverse the right-hand subtree.
• Visit the root.

The nodes are visited in the order:

A-C-B-E-G-F-D
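
The three recursive traversals can be sketched in Python. The tree is assumed to have D at the root with B and F as its children, which is consistent with the orders listed above; it is built here by inserting the letters in the order D, B, A, C, F, E, G.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Add value to the search tree and return the (possibly new) root."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root


def preorder(node):
    return [] if node is None else [node.value] + preorder(node.left) + preorder(node.right)


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.value] + inorder(node.right)


def postorder(node):
    return [] if node is None else postorder(node.left) + postorder(node.right) + [node.value]


root = None
for letter in "DBACFEG":
    root = insert(root, letter)

print("-".join(preorder(root)))   # D-B-A-C-F-E-G
print("-".join(inorder(root)))    # A-B-C-D-E-F-G
print("-".join(postorder(root)))  # A-C-B-E-G-F-D
```

An inorder traversal of a binary search tree always returns the items in sorted order.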

10.6.3 Implementation Of Binary Search Trees Using Arrays

Binary trees can be implemented using left and right pointers at each node.

Node Index   Left Child Node Index   Value          Right Child Node Index

1            2                       Legg           5
2            0                       Charlesworth   3
3            4                       Illman         7
4            0                       Hawthorne      0
5            8                       Todd           6
6            0                       Youngman       0
7            0                       Jones          0
8            0                       Ravage         0

An index of 0 means that there is no child.
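
As a sketch, the pointer arrays for the list of names in 10.6.1 can be built and searched in Python. The function names are made up for this example; nodes are numbered from 1 in the order the names are given, and 0 is used to mean "no child".

```python
def build_table(names):
    """Return three parallel lists (left, value, right); position 0 is unused."""
    left, value, right = [0], [None], [0]
    for name in names:
        value.append(name)
        left.append(0)
        right.append(0)
        new = len(value) - 1
        if new == 1:
            continue  # the first item becomes the root
        node = 1
        while True:
            # Go left if the name comes before the current node, otherwise right.
            branch = left if name < value[node] else right
            if branch[node] == 0:
                branch[node] = new  # empty branch found: link the new node here
                break
            node = branch[node]
    return left, value, right


def search(left, value, right, target):
    """Return the index of target, or 0 if it is not in the tree."""
    node = 1 if len(value) > 1 else 0
    while node != 0 and value[node] != target:
        node = left[node] if target < value[node] else right[node]
    return node


names = ["Legg", "Charlesworth", "Illman", "Hawthorne",
         "Todd", "Youngman", "Jones", "Ravage"]
left, value, right = build_table(names)
for i in range(1, len(value)):
    print(i, left[i], value[i], right[i])
```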

11.1 Introduction
11.1.1 Project Selection

A business organisation may have more than one reason for the introduction of
computers. Usually a particular area of the business is selected for computerisation.

Reasons why an organisation may computerise an area of business:

• There is a large amount of data that requires repetitive processing.


• There is a need for better access for up-to-date information.
• Clerical costs could be reduced.

There may also be indirect benefits to customer service or cash flow.

For the project to be successful it must not be too complex.

11.1.2 The Systems Analyst

The job of the Systems Analyst:

• Analyse the data processing requirements of the organisation.


• Decide whether the current system could be computerised.
• Specify how the new system should work.
• Take the responsibility for implementing the new system.

In order to do this the analyst must:

• Have a very good understanding of the organisation and how computers can be
used.

11.1.3 The Systems Life Cycle

Commercial systems all share a common life cycle pattern.


Business systems may need to change because of:

• expansion
• change of business activities
• the economic situation
• technological advances
• or other factors

At this point a Systems Analyst is employed.

11.2 Analysis
11.2.1 Problem Definition
The first step is to write down the terms of reference.

The terms of reference may well include:

Objectives - What the system should achieve.

• Improved customer service.


• Better management information.
• Cope with larger volumes of business.

Constraints

• Maximum cost.
• Available equipment.
• Any business area not to be changed.

Timescale - A special constraint.

Reports - What output is required.

Suggested Solutions - By management.

11.2.2 Feasibility Study

Technological Feasibility

Is the hardware able to meet the demands put upon it?

Social Feasibility

What will be the effect on employees and customers if the new system is implemented?

Economic Feasibility

Will the new system be cost effective?

11.2.3 Further Analysis

Fact Recording And Investigation

The analyst is interested in:

• The procedures
• The data
• The future
• News reports
• Problems

The following are ways of investigating existing procedures and the existing problems:

• Observation
• Reading existing documentation
• Questionnaires
• Task counting - by clerical staff
• Interviews

11.3 Design
In designing a new system all aspects of the design must be documented.

11.3.1 Systems Specification

This involves:

• designing screen layouts and report formats.


• specifying file contents and organisation.

11.3.2 Prototyping

This is an aid to better, faster, systems design.

In a computing context, prototyping involves building a simple model of software under
development.

This could involve using special software to quickly design input screens and validate
data input.

This gives the user a chance to experience the look and feel of the input process.

Note that prototyping would also be used during analysis.

11.3.3 Computer Aided Software Engineering (CASE)

These are software tools that assist in the design or development of a system.

Examples of CASE software:

• Graphics Tools - to draw structure diagrams, data flow diagrams, etc.


• Interface Generator - for rapid prototyping.
• Project Management Tools - schedule the steps in analysis, design, coding and
testing.

11.3.4 Design Choices

Firstly decide upon:

Processing Method

The systems designer will need to choose between Batch Processing and On-Line
Processing. This decision will be dependent on:

• how often the data changes.


• the volume of data.
• the cost of hardware.
• how the data is collected.

Software Solution

In general there will be many different solutions to a problem.

The following affect this choice:

• Usability
• Performance
• Suitability
• Maintenance

Hardware Solution

This depends on many factors, for example:

• The software
• Volume of data
• Number of users and their locations
• Processing method
• Security considerations

11.4 Graphical System Representation


There are several methods for representing computer systems.

11.4.1 Data Flow Diagrams

This type of representation is concerned with how data moves through a system not with
such details as what type of data storage is used.

The symbols used represent:

• An external entity: a source or destination of data external to the system.
• A process: an operation on data which transforms it.
• A data store: any stored data, without reference to the physical method of storage.
• A data flow: the movement of data between any of the above.

Conventions

1. Data flows may not go directly between data stores and entities.
2. Data flows must be labelled to indicate what data is being transferred.

11.4.2 System Flow Charts

A systems flow chart shows an overview of the whole system. In particular it represents:

• Tasks to be carried out.


• The hardware devices.
• The inputs, outputs and storage media.
• The files.

The symbols used represent:

• Input/Output
• Keyboard Input
• Process
• Manual Operation
• Sort
• Off Page Connector
• Online Storage
• Magnetic Tape
• Disk Storage
• Visual Display
• Written or Printed Document

Example

Data validation of mail order forms.


11.5 Development
11.5.1 Program Development

Each program in the system must have a specification written for it which describes what
it will do (and how it will do it).

The program specification could be accompanied by:

• Jackson Structure Diagrams (JSD)


• Pseudocode
• Flowcharts

Applications Generator

These are software tools that generate complete systems. The user defines the input,
output, data, files and what the system needs to do. The applications generator then
produces the program code.

Report Generator

A report generator will produce reports from information supplied by the user. The way
that they work is that the user supplies the headings, the fields to be printed, what order
they are in, the space to allow for each field and what totals are required.

CASE Tools For Development

The CASE software tools that aid the programmer in the development stage are:

• A source code generator
• A data dictionary tool

Applications Packages

Programs may not need to be written for the new system. If an "off-the-shelf" package
may be suitable then it is the analyst's job to evaluate the package and make sure that it
will meet the requirements.

11.5.2 Equipment Acquisition

Following the hardware considerations in the design stage of the life cycle, suitable
equipment needs to be acquired.

11.6 Testing
11.6.1 Acceptance Testing

The tasks the finished system must perform in order to be accepted (by the user) can be
used as the basis for different tests.

11.6.2 Program Testing Strategies

Bottom-Up Testing

The individual modules are tested in a stand-alone fashion. The individual modules are
combined and tested. Finally, a system test is performed.

Top-Down Testing

The whole system (or at least a skeleton of it) is tested. Individual modules, yet to be
completed, are replaced by 'stubs'. Stubs often display a message on screen to show that
the module has been called.
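
A minimal sketch of top-down testing with stubs, in Python. The order processing system and all of the names used are hypothetical; the point is only that the top-level module can be run before the modules beneath it have been written.

```python
calls = []  # records which stubs were reached, so the test can check the flow


def validate_order(order):
    # Stub: the real module is not written yet, so just report the call.
    calls.append("validate_order")
    print("validate_order called")
    return True


def print_invoice(order):
    # Stub for a second unfinished module.
    calls.append("print_invoice")
    print("print_invoice called")


def process_order(order):
    """Top-level module under test."""
    if validate_order(order):
        print_invoice(order)
        return "processed"
    return "rejected"


result = process_order({"item": "widget", "quantity": 2})
print(result, calls)
```

As each real module is completed it replaces its stub and the same test is run again.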

11.6.3 The Test Plan

A test plan should be developed which will go through as many paths as possible in the
system.

For each test the following points should be included:

• the test's purposes.


• the test's location.
• a description of the test.
• testing procedure.

Test data should test the program to its limits.

It should include:

• data in the extremes


• invalid data
• commonly entered data
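
As an illustration, here is a hypothetical range check in Python together with test data from each of the three categories above. The rule being tested (a whole number from 0 to 120) is invented for this example.

```python
def valid_age(value: str) -> bool:
    """Hypothetical range check: a whole number from 0 to 120 inclusive."""
    return value.isdecimal() and 0 <= int(value) <= 120


# Each entry pairs a test input with the result expected from the specification.
test_data = [
    ("0", True),      # extreme: lowest valid value
    ("120", True),    # extreme: highest valid value
    ("121", False),   # just outside the range
    ("-1", False),    # invalid: negative
    ("abc", False),   # invalid: not a number
    ("", False),      # invalid: nothing entered
    ("35", True),     # commonly entered data
]

for value, expected in test_data:
    outcome = "pass" if valid_age(value) == expected else "FAIL"
    print(repr(value), expected, outcome)
```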

11.6.4 Objectives Of Testing

We need to ask:

Does the logic work properly?


- does it work as intended?
- any runtime errors?

Is all the logic present?


- any functions or sub-programs missing?

11.6.5 White And Black Box Testing

White Box Testing (or Logical Testing)

We test the program by examining the code and trying to test each possible path in the
program at least once.

Black Box Testing (or Functional Testing)

In black box testing we are not concerned with the program code. The program
specification is used as the basis for producing a set of test data that covers all the
inputs, outputs and program functions.

11.6.6 Other Types Of Testing

Performance Testing

The system is tested to see if it can handle the volume of data anticipated in the user
environment.

Recovery Testing

Here we need to ensure that the system can recover from various types of failure.

Such tests can be performed by simulating hardware or power failures.


11.7 Implementation
11.7.1 Hardware Installation

Before the new system can come into operation hardware will probably have to be
installed. This may include:

• New computers
• New peripherals
• New office layout and furniture

11.7.2 Staff Training

Prior to the new system going live, all the staff involved in the system will need hands on
training.

Realistic data should be used in briefing.

11.7.3 Creation Of Master Files

All the master files will have to be created before the system can be used.

Typically, there are two phases:

Phase 1

All the data that will not change can be typed in over a few days or weeks.

Phase 2

Data that is liable to change needs to be keyed in just before the changeover to the new
system.

11.7.4 Conversion Methods

Direct Changeover

The organisation stops using the old system one day and starts using the new
system the next.

Advantages

o Fast
o Efficient
o Little or no duplication of work
Disadvantages

o If the new system fails then there will be serious disruption

Parallel Conversion

The old system is kept running alongside the new system for a few weeks or
months.

Advantages

o The old system can be relied upon while any problems with the new
system are fixed
o Results from the new system can be checked against the old system

Disadvantages

o There's a heavy workload on staff

Phased Conversion (Direct or Parallel)

This is used with larger systems that can be broken down into several stages.

Each part of the system can be implemented at different times.

Pilot Conversion

Only a part of the organisation uses the new system at first.

11.8 Maintenance
11.8.1 Post Implementation Review

This is carried out several weeks or months after the system has been implemented.

Small changes may need to be made:

For example:
• Correct program errors
• Amend clerical procedures
• Modify screen and report formats

11.8.2 Perfective Maintenance

Here we are concerned with taking an acceptable system and making it better.

11.8.3 Adaptive Maintenance

As the system requirements change, the system needs updating.

11.8.4 Corrective Maintenance

There may still be problems with our system that need to be corrected.

11.9 System Documentation


11.9.1 The Aims Of Documentation

• To assist system design


• To ensure everyone understands how their aspect of the system should work
• To help greatly in system maintenance

11.9.2 The Documentation Contents

A system specification
System flow charts or data flow diagrams
Program descriptions
Structure diagrams, program flow charts and pseudocode
Files or database descriptions
Layouts for screen display and reports
Current program listings
Test data with expected results
Clerical procedures manual

This describes the activities that clerical staff undertake when preparing data for
input to the system.

It will also detail what action should be taken if an error occurs.

Operating instructions

This tells the computer operator how to run the program.


Data preparation instructions

This details how to enter data and what format the data should be in.

11.9.3 Documenting A Software Package

This may include:

• A user manual
• A technical manual/operations manual
• Tutorials

12.1 Input Devices


There are several methods of data capture:

• Data may be keyed in.


• Document readers.
• Direct capture methods.
• Graphical input device.

12.1.1 Key To Disk Systems

Such a system is used in a batch environment where several thousand documents need to
be keyed in every day.

Step 1

A key station operator runs the data entry program and keys in a batch of data.

Step 2

As the batch is entered, it is validated.

Step 3

The whole batch is stored on disk.

Step 4

The source documents are re-typed by another operator.

Step 5

The original batch is verified and discrepancies are checked and corrected.

Step 6

The completed batch is transferred to the mainframe for processing (either on tape or
electronically).
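
The verification in steps 4 and 5 amounts to comparing two keyings of the same batch. A minimal Python sketch, with a made-up function name and invented sample records:

```python
def verify_batch(first_keying, second_keying):
    """Return (record number, first version, second version) for every mismatch."""
    if len(first_keying) != len(second_keying):
        raise ValueError("the two keyings contain different numbers of records")
    return [
        (number, first, second)
        for number, (first, second) in enumerate(zip(first_keying, second_keying), start=1)
        if first != second
    ]


first = ["A123", "B456", "C789"]
second = ["A123", "B465", "C789"]   # the second operator transposed two digits
print(verify_batch(first, second))
```

Each discrepancy is then checked against the source document to find out which keying was wrong.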

12.1.2 ???????

12.1.3 ???????

12.1.4 Scanner Technology

The scanner shines a bright light onto the image being scanned while a mirror reflects a
strip of the image onto a bank of photosensors.

The photosensors convert the light into an electrical current.

An analogue to digital converter converts the current into a bit pattern.

12.1.5 Magnetic Ink Character Recognition (MICR)

Most banks use MICR for processing cheques.

The ink used to print the characters at the bottom of the cheque contains iron oxide
which can be magnetised during processing.

12.1.6 Optical Mark Recognition (OMR)

OMR devices work by scanning marks in certain positions on a form.

Uses of OMR

OMR forms can be used where answers are sought to 'closed' questions.

Examples:

• Scanning multiple choice tests.


• Market research questionnaires.
• Selecting courses at Havant College.

12.1.7 Magnetic Strip Card Readers

Cash machines (ATMs) use a keypad and a magnetic strip card as input devices.
The bank card is encoded with:

• The customer's account number and sort code.


• The PIN number which is encrypted.
• The withdrawal limit.
• The amount withdrawn so far (in the current day).

When a customer inserts the bank card and enters a PIN number, it is checked
against the PIN on the card and, if possible, the PIN number held in the customer's record
on the bank computer.

If the customer then wishes to withdraw cash the following checks are made:

• The customer's account balance on the bank's central computer.


• The withdrawal details on the card.
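
The two checks can be sketched in Python. The function and parameter names are hypothetical, and a real cash machine system would involve more than this.

```python
def withdrawal_allowed(amount, balance, daily_limit, withdrawn_today):
    """Apply both checks: the account balance and the withdrawal details on the card."""
    within_balance = amount <= balance                       # held on the bank's computer
    within_limit = withdrawn_today + amount <= daily_limit   # held on the card
    return within_balance and within_limit


print(withdrawal_allowed(50, balance=200, daily_limit=250, withdrawn_today=100))   # True
print(withdrawal_allowed(200, balance=200, daily_limit=250, withdrawn_today=100))  # False
```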

12.1.8 Smart Cards

Smart cards incorporate a microchip (with memory) instead of a magnetic strip.

Typically a smart card can store in excess of 10Kb.

Uses of Smart Cards

• Telephone cards (makes fraud more difficult)


• Mondex (a system designed to replace conventional money)
• As a door key

12.1.9 Touch Screens

A touch screen allows the user to provide input by touching an area of the screen.

Uses of Touch Screens

• Industrial environments
• Retail order processing
• Public information systems

In general they are useful when the operator is moving but not so useful when the
operator is at a desk.

12.1.10 Other Pointing Devices

Graphics Tablets

A graphics tablet is a flat rectangular plate on which a stylus is placed.


Uses of Graphics Tablets

• Tracing a drawing
• Computer aided design

Pen Stylus Input

Palmtop computers use a pen-stylus in conjunction with a touch sensitive screen as a
pointing device.

Mice And Trackballs

These are the most commonly used pointing devices.

Trackballs may be used where a robust or compact pointing device is required.

12.1.11 Electronic Point Of Sales (EPOS)

Most EPOS systems use barcodes.

The Barcode Reader

A laser beam passes rapidly across the barcode on the item being purchased.

The reflected light is picked up by a photosensitive cell and the numeric code is
calculated.

The price and description of the item can then be looked up in a master file.

12.1.12 Electronic Funds Transfer At The Point Of Sales (EFTPOS)

Most Point of Sales (POS) terminals can automatically deduct money from the customer's
bank account. Such systems are known as EFTPOS systems.

At the same time the retailer's account is electronically credited.

12.2 Output Devices


12.2.1 Impact Printers

Daisywheel Printers

The printing characters are embossed on the spokes of a wheel-shaped head.

The head is rotated so that the desired character is foremost when the head strikes the
paper through the ribbon.

Line Printers (often used with mainframes)

Line printers appear to print one line at a time, although this is not quite true.

A drum with rows of the same letter embossed upon it rotates at high speed. All the As
are printed for a given line, then the Bs, etc.

This is done by hammers that strike the paper on the other side to the embossed
characters.

12.2.2 Ink Jet Printers

Ink jet printers work by directing a fine jet of ink at the paper.

Advantages:

• Low set up cost


• Quiet
• Good print resolution
• Full colour capacity

Disadvantages:

• Slow print speed


• Printing consumables can be expensive
• Large areas of colour can warp the paper

12.2.3 Laser Printers

Laser printers fuse powdered ink/toner onto the paper by heat and pressure.

Advantages:

• Very high quality output


• Typically they are fairly fast
• Quiet

Disadvantages:

• Budget and mid-range models are only available in monochrome
• Running costs are higher than those of impact printers

12.2.4 Plotters

Plotters are used to produce high quality line drawings.

Most plotters are vector plotters. They move a pen using point to point data.

Colour vector plotters have a rack with several pens.

Advantages:

• Can handle large paper sizes


• Ink pens are relatively inexpensive

Disadvantages

• Detailed diagrams can be slow to print


• The line thickness may be too great for some applications

12.3 Storage Devices


Most computer systems require a backing store for programs and data.

We are interested in:

• data integrity
• storage capacity
• access time
• data transfer rate
• low cost

12.3.1 Floppy Disks

The surface of the disk can be magnetised.

Formatting a disk marks out concentric tracks and sectors on both of the surfaces of the
disk.

12.3.2 Hard Disks

The hard disks inside PCs consist of one or more disk platters sealed inside a casing.

They work in a similar way to disk packs, but on a smaller scale.

The read/write heads float on a cushion of air about one millionth of an inch off the
surface of the disk.

12.3.3 Optical Storage Media


CD-ROM

Advantages:

• Good storage capacity


• Modest transfer rate
• Fairly robust

Disadvantages:

• Only the manufacturer can store data


• Relatively large for the portable market

CD-R (WORM)

Allows data to be written once to the CD.

Advantages:

• Inexpensive method for data storage


• Can be read by CD-ROM drives
• Excellent for archiving

Disadvantages:

• Cannot be reused (although in some cases this may be an advantage)


• Capacity too small for entire HD backup

CD-RW

Can write data out several thousand times.

Advantages:

• Could be a replacement for a floppy disk

Disadvantages:

• Cannot be read by some older CD-ROM drives

DVD-ROM

Stores more using two layers on each side.

Advantages:
• Increased storage capacity
• Marginally faster data transfer than CD-ROM drives

Disadvantages:

• Cost of manufacture is higher (currently)


Recordable & Rewritable DVD Drives

There are currently many incompatible standards including

• DVD-R
• DVD-RW
• DVD-RAM (likely to become standard)

12.3.4 Magnetic Tape

Magnetic tape is used for:

• Backup
• Archiving
• Batch Processing

It is inexpensive, robust and has a large data storage capacity with a very fast data
transfer rate.

Mainframe Environment

A few systems still use tape reels.

Many use Terabyte Disk Farms.

Minicomputer Drives

There are a number of different formats available.

• QIC (Quarter Inch) Tape Streamers (35GB)


• DAT Tape Streamers (26GB)
• VHS Tape (4GB)

12.3.5 Non-volatile Memory

NVRAM is currently only used for storage in specialist embedded systems.

Uses:
• Black box flight recorders
• Satellites
• Fighter aircraft weapon systems

13.1 The Processor And Memory

13.1.1 Memory Addressing

A computer's memory can be thought of as a series of boxes each containing one byte (or
more) of data.

Each memory location has its own unique address counting from zero up.

13.1.2 The Data And Address Bus

The CPU is connected to the main memory by two sets of wires (or buses).

• Address Bus
When the CPU wishes to access a particular memory location, it sends its address
down this bus
• Data Bus
The contents of the memory location requested are sent down this bus.

13.1.3 Word Size

This is the number of bits that the CPU can process simultaneously.

Other things being equal, the larger the word size, the faster the computer.

13.1.4 Processor Speed (Hertz, Hz)


The CPU carries out an action every time it receives a clock pulse. Its speed is measured
in cycles per second.

The clock pulses are a series of 1s and 0s.

13.1.5 Memory

Different sorts of memory chips can be used.

Memory Mapping
Some RAM is reserved for special functions (e.g. the OS, or the addresses and parameters held on
the stack).

Memory allocated to different running applications must also be kept separate.

13.1.6 Cache Memory

Cache is a small amount of fast memory that acts as an intermediate store between the
CPU and memory.

It is fast because:

• It is close to the CPU


• It usually has high speed components

Two principles are used in deciding what data to copy from conventional RAM to cache.
• Temporal Locality
An instruction that has been accessed once is likely to be accessed again in the
near future.
• Spatial Locality
It is likely that data items near previously accessed data items will be accessed in
the near future.
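
A minimal Python sketch of how the two locality principles pay off. It assumes a hypothetical cache that loads a whole line of neighbouring bytes on a miss and evicts the least recently used line when full; the function name, line size and replacement policy are illustrative choices, not a description of any particular CPU.

```python
from collections import OrderedDict


def count_hits(addresses, cache_lines=4, line_size=4):
    """Count cache hits for a sequence of byte addresses."""
    cache = OrderedDict()  # line number -> True, ordered oldest to newest
    hits = 0
    for address in addresses:
        line = address // line_size        # neighbours share a line (spatial locality)
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # recently used lines are kept (temporal locality)
        else:
            if len(cache) >= cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
            cache[line] = True
    return hits


print(count_hits(list(range(16))))  # 12: only the first byte of each line misses
```

Accessing sixteen neighbouring addresses costs only four misses because each miss brings in three further bytes that are about to be used.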

13.2 The Fetch-Execute Cycle


13.2.1 CPU Architecture

The processor consists of:

• The Arithmetic Logic Unit (ALU)


- arithmetic and logic operations take place here.
• The Control Unit
- coordinates all the activities taking place in the CPU, memory and peripherals.
• Circuitry to control the interpretation and execution of instructions
(including registers).

Registers

Registers are special storage locations that hold data while it is being decoded or
manipulated.

1. The Program Counter (PC)


- this holds the address of the next instruction to be executed.
2. The General Purpose Registers
- used in performing arithmetic functions.
3. The Current Instruction Register (CIR)
- holds the instruction being processed.
4. The Memory Address Register (MAR)
- holds the address of the memory location which data is read from/written to.
5. The Memory Data Register (MDR)
- all transfers from memory to the CPU go here.
6. The Status Register
- certain bits that are set or cleared based upon an instruction.

13.2.2 The Instruction Cycle

Before instructions can be executed they must be fetched.

Additionally, the CPU has to control all data transfer between peripherals and main
memory.
When there is a need to transfer data to an input/output (I/O) device an interrupt is
generated and the CPU deals with the transfer.
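
The fetch and execute steps can be sketched in Python as follows. The three-instruction machine (LOAD, ADD, HALT) and its single accumulator are invented for illustration; only the register names (PC, MAR, MDR, CIR) come from the register list in 13.2.1.

```python
def run(memory):
    """Run a tiny made-up instruction set until HALT; return the accumulator."""
    pc = 0    # Program Counter: address of the next instruction
    acc = 0   # a single general purpose register
    while True:
        mar = pc                 # fetch: the address is placed in the MAR
        mdr = memory[mar]        # the contents of that location arrive in the MDR
        cir = mdr                # the instruction is copied to the CIR
        pc += 1                  # the PC now points at the next instruction
        opcode, operand = cir    # decode
        if opcode == "LOAD":     # execute
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


program = [("LOAD", 100), ("ADD", 12), ("HALT", 0)]
print(run(program))  # 112
```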

13.2.3 The Stack Pointer


Most computer systems have an extra register that points to the top of a set of memory
locations known as the stack. This is the stack pointer.

13.3 Data Buses


13.3.1 Internal And External Buses

Internal buses connect the various registers and components of the CPU.

External buses connect the CPU to the main memory and the peripherals.

13.3.2 Single And Shared Buses

Single Bus

A single bus is a dedicated bus that connects two units unidirectionally.

With multiple units the number of single buses required grows rapidly.

Advantages

• Permits simultaneous data transmission

Disadvantages

• Costly in terms of wires


• Requires a high number of pins on the units

Shared Bus

Often a shared bus system is preferred.

Advantages

• Extra units can be easily added

Disadvantages

• Does not permit simultaneous transmission of data

Shared buses are usually bi-directional

13.3.3 Interfacing With Peripherals


Input/Output devices and data storage devices communicate with the CPU via external
buses.

The peripherals require an interface unit.

13.4 Processing Architectures


13.4.1 The Von Neumann Machine

Instructions are fetched and executed one at a time.

These instructions are stored in
main memory just like any other data. This is known as the stored program concept.

13.4.2 Pipelining

This is a technique to speed up processing by introducing a degree of parallelism: whilst
one instruction is being executed, the next instruction can be fetched.

Both the Von Neumann architecture and its pipelined version are classified as single
instruction stream, single data stream (SISD).

13.4.3 Vector Array Processing

In this CPU architecture a single instruction can operate on an array of data.

This is possible because the CPU has:

• one control unit


• several processing elements (ALUs and registers)

This architecture is known as single instruction stream, multiple data stream (SIMD).

13.4.4 Multiple Processor Architecture

In this system, multiple instruction streams operate on multiple data streams (MIMD).
The CPU consists of several control units and several ALUs. These processing elements
have to be co-ordinated.

13.5 Assembly Language


Each processor type has its own binary (machine code) language. Assembly language is
short-hand for this.
13.5.1 Data Transfer Instructions

These move (copy) a value from one register or memory location to another.

MOVE <source>, <destination>

Example Program

MOVE #100, R0
MOVE #112, R1
RTS

13.5.2 Memory Mapped Output

Certain memory locations map directly to an output device.

In the ASM Tutor memory locations 5 to 14 map to the mapped memory window.

MOVE #'H', 5

13.5.3 Address Mode

These modes specify how the data in an instruction is to be acquired.

Immediate Mode

In this mode the operand is a literal value.

MOVE #35, R0 - Move the value 35 (decimal) into R0

Direct (Absolute) Mode

Here the operand holds the address of the data to be used in the instruction.

MOVE 129, R0 - Move the data in location 129 (decimal) into R0

Direct addressing is slower than immediate addressing because:

• The data must subsequently be fetched from its location


• High memory locations (e.g. addresses above 64K in a 16-bit machine) will
need to be loaded into the CPU in several FE cycles.

Indirect Mode

In this mode the operand holds a register/memory location that in turn holds the address
of the data to be used.
MOVE (R1), R0 - Moves the contents of the memory location that is
referenced by register R1 into register R0.

Indexed Mode

The operand address is calculated by adding a base address to a value in a register (often
called the index register).

MOVE 200(R0), 129 - Move the value from the address stored in R0 + 200
to address 129.

Relative Mode

This mode is used in branch instructions.
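
How a source operand is resolved in the first four modes can be sketched in Python. This is an illustrative model only: the function name, the dictionary of registers and the sample memory contents are assumptions, and relative mode is left out because it applies to branch targets rather than data.

```python
def fetch_operand(mode, operand, memory, registers):
    """Return the data an instruction would use, according to its address mode."""
    if mode == "immediate":          # MOVE #35, R0: the operand is the data
        return operand
    if mode == "direct":             # MOVE 129, R0: the operand is the data's address
        return memory[operand]
    if mode == "indirect":           # MOVE (R1), R0: a register holds the address
        return memory[registers[operand]]
    if mode == "indexed":            # MOVE 200(R0), ...: base address plus index register
        base, register = operand
        return memory[base + registers[register]]
    raise ValueError(f"unknown address mode {mode!r}")


memory = [0] * 256
memory[129] = 7
memory[205] = 9
registers = {"R0": 5, "R1": 129}

print(fetch_operand("immediate", 35, memory, registers))          # 35
print(fetch_operand("direct", 129, memory, registers))            # 7
print(fetch_operand("indirect", "R1", memory, registers))         # 7
print(fetch_operand("indexed", (200, "R0"), memory, registers))   # 9
```

Immediate mode needs no memory access for the data, direct mode needs one, and indirect mode needs one more whenever the address is itself held in memory rather than in a register.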

13.5.4 The Keyboard Buffer

In the ASM tutor, memory address 4 is reserved as the keyboard buffer.

The contents are changed by clicking on the appropriate key in the interrupt window.

13.5.5 Unconditional Flow Control

Normally instructions execute in sequence as the program counter is incremented.

Jumping

Jumping causes the program to branch to a specified address.

JMP <address>

For example

JMP 128 - Jumps to address 128


JMP (R3) - Jumps to address in R3

Sub-routines

Unconditional flow control can also be achieved by the use of a sub-routine.

JSR <address> - Jump to an effective address

RTS - Return to the next instruction after the JSR that invoked the sub-routine.

13.5.6 Input And Output Using System Subroutines

Most OSs provide a number of sub-routines.


E.g.

• to access a file
• to display a window
• to get input from the keyboard

ASM Tutor emulates this with addresses 86 to 89.

To Output A String

i. Make sure that the string is terminated with a 0.


ii. Place the start address of the string into R0.
iii. JSR 86

To Input A String

i. Place the start address of the buffer to receive a string into R0.
ii. Place the size of the buffer into R0.
iii. JSR 88

13.5.7 Arithmetic Instructions

Addition

ADD <source>, <destination>

The source is added to the destination. The sum is placed in the destination.

Subtraction

SUB <source>, <destination>

The source is subtracted from the destination and the result is left in the
destination.

13.5.8 Branch Instructions

Conditional flow control can be achieved by comparing two values and branching,
depending on the result.

First use:

CMP - to compare two values and set the status register

Then a BXX instruction:


BEQ - Branch if equal
BNE - Branch if not equal
BGE - Branch if greater than or equal
BGT - Branch if greater than
BLE - Branch if less than or equal to
BLT - Branch if less than

13.5.9 The Status Register

After an arithmetic operation the result may be negative or an overflow may occur.

Normally the status register contains the following bits:

Bit value   8 (2^3)   4 (2^2)    2 (2^1)   1 (2^0)
Flag        Carry     Overflow   Zero      Negative
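
A short Python sketch of how these four bits might be set after an addition. It assumes an 8-bit word, operands in the range 0 to 255, and the bit values shown above; the function name is illustrative.

```python
CARRY, OVERFLOW, ZERO, NEGATIVE = 8, 4, 2, 1  # bit values of the four flags


def add8(a, b):
    """Add two 8-bit values and return (result, status register)."""
    total = a + b
    result = total & 0xFF                 # keep the lower 8 bits
    status = 0
    if total > 0xFF:
        status |= CARRY                   # a bit was carried out of the m.s.b.
    # Signed overflow: both operands share a sign bit that the result lacks.
    if (~(a ^ b) & (a ^ result)) & 0x80:
        status |= OVERFLOW
    if result == 0:
        status |= ZERO
    if result & 0x80:
        status |= NEGATIVE                # m.s.b. set means negative in two's complement
    return result, status


print(add8(0x7F, 1))   # (128, 5): overflow and negative
print(add8(0xFF, 1))   # (0, 10): carry and zero
```

A branch instruction such as BEQ would then simply test the relevant bit, for example `status & ZERO`.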

13.5.10 Logical Operators

These operations are used to manipulate the individual bits in memory locations.

The AND Operator

The AND operator takes two inputs and gives one output.

The output will only be 1 if both the first and the second inputs are 1.

Input 1 Input 2 Output


0 0 0
0 1 0
1 0 0
1 1 1

When several logical operators are applied to values represented by several bits the
answer is calculated in a bitwise manner.

E.g. 1010 AND 1001 gives 1000

An Application of AND - Masking

E.g.

1010 1110 1010 0001


AND 0000 0000 1111 1111
gives 0000 0000 1010 0001
In order to find the lower byte in a 16-bit word the word can be masked with 0000 0000
1111 1111.

The OR Operator

Input 1 Input 2 Output


0 0 0
0 1 1
1 0 1
1 1 1

The output will be 1 if either the first, second or both inputs are 1.

E.g. 1010 OR 1001 gives 1011

An Application of OR - Converting Binary Numbers To ASCII

The ASCII code for character '3' is:

0011 0011

This can be generated from the binary code for 3 (decimal), 0000 0011, by ORing it with:

0011 0000

The NOT Operator

NOT only takes one input.

Input Output
0 1
1 0

An Application of NOT - Two's Complement

In an 8 bit system:

3 (decimal) = 0000 0011 (binary)

-3 (decimal) = NOT 0000 0011 ADD 1
-3 (decimal) = 1111 1100 ADD 1
-3 (decimal) = 1111 1101
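
The three applications can be checked with Python's bitwise operators (`&` for AND, `|` for OR, `~` for NOT). This is only a sketch using the same bit patterns as the worked examples; the variable names are illustrative, and the final `& 0xFF` is needed because Python integers are not limited to 8 bits.

```python
word = 0b1010_1110_1010_0001

low_byte = word & 0b0000_0000_1111_1111    # AND: mask off the upper byte
ascii_three = 0b0000_0011 | 0b0011_0000    # OR: turn the number 3 into the character '3'
minus_three = (~0b0000_0011 + 1) & 0xFF    # NOT then add 1: 8-bit two's complement

print(f"{low_byte:016b}")      # 0000000010100001
print(chr(ascii_three))        # 3
print(f"{minus_three:08b}")    # 11111101
```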

ASM Syntax
AND and OR take a source and a destination which are treated as input 1 and input 2.
The results are then replaced in the destination.

AND <source>, <destination>


OR <source>, <destination>

Immediate mode binary numbers are prefixed with a '%'.

E.g. MOVE %10110111, 123

NOT takes a single address or register and swaps the 1s and 0s.

13.5.11 Shift Instructions

Most assemblers support three types of shift instruction:

• Logical
• Arithmetic
• Rotate

Logical Shift

A logical shift left causes the m.s.b. to be shifted into the carry bit (in the status register) and
a zero to be moved into the l.s.b. A logical shift right does the reverse.

ASM Syntax

LSL #1, <address> - Logical shift left (one place)


LSR #1, <address> - Logical shift right (one place)

Arithmetic Shift

This results in multiplying or dividing the value by 2. Arithmetic shifting takes account
of two's complement negative numbers.

ASM Syntax

ASL #1, <address> - Arithmetic shift left (one place)


ASR #1, <address> - Arithmetic shift right (one place)

Rotational Shift

Here the bits in an address are physically rotated left or right. The bit that jumps from
one end to the other is replicated in the carry bit.

ASM Syntax
ROL #1, <address> - Rotate left (one place)
ROR #1, <address> - Rotate right (one place)
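
The effect of each shift on an 8-bit value can be sketched in Python. Each hypothetical function below shifts by one place and returns the new value together with the carry bit; an arithmetic shift left gives the same bit pattern as a logical shift left, so it is not shown separately.

```python
def lsl(value):
    """Logical shift left: m.s.b. goes to carry, 0 enters the l.s.b."""
    return (value << 1) & 0xFF, (value >> 7) & 1


def lsr(value):
    """Logical shift right: l.s.b. goes to carry, 0 enters the m.s.b."""
    return value >> 1, value & 1


def asr(value):
    """Arithmetic shift right: as LSR, but the sign bit is preserved."""
    return (value >> 1) | (value & 0x80), value & 1


def rol(value):
    """Rotate left: the m.s.b. re-enters at the l.s.b. and is copied to carry."""
    carry = (value >> 7) & 1
    return ((value << 1) & 0xFF) | carry, carry


def ror(value):
    """Rotate right: the l.s.b. re-enters at the m.s.b. and is copied to carry."""
    carry = value & 1
    return (value >> 1) | (carry << 7), carry


print(lsl(0b1000_0001))   # (2, 1)
print(asr(0b1111_1010))   # (253, 0): -6 becomes -3 in two's complement
```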

13.5.12 Iteration In Assembly Language

It is useful to be able to work out the equivalent assembly language code for high-level
constructs.

FOR K = 1 TO 10

DO N to N + 1

END FOR

14.1 Interpreters
An interpreter analyses the high level source code, statement by statement.

Advantages

1. Convenient for program development


Because:
a. Programs can be run straight away
b. Programs run up to the next point where an error occurs. Then the
programmer can fix the error.
2. Usually simpler to write and therefore cheaper

14.2 Compilers
A compiler translates an entire high level program into machine readable object code
prior to execution.

Advantages

1. Compiled programs run faster than interpreted ones.


2. The object code may be saved and used independently of the high level language
software.
3. Compilers will check the syntax of the whole program.
14.3 Compilation Phases
There are three stages:

1. Lexical analysis
2. Syntax and semantic analysis
3. Code generation

14.3.1 Lexical Analysis

1. Extra spaces removed

Print      A$
becomes...
Print A$

2. Comments are removed

REM My lovely program


Print A$
becomes...
Print A$

3. Simple error check


E.g. Illegal variable names are flagged
4. Keywords, variables, constants and operators are converted to tokens.
Tokens are unique codes that are stored in the symbol table.

14.3.2 Syntax And Semantic Analysis

This is the process of checking to see that the sequence of input characters is a valid
sentence.

Examples

• The use of stacks to check that brackets are correctly paired


• Checking to see that string data is not being assigned to an integer variable
• By the use of meta-language such as Backus-Naur form
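
The first of these checks, pairing brackets with a stack, might look like this in Python. It is a minimal sketch: the function name is illustrative, and a real syntax analyser would work on tokens rather than raw characters.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}  # closing bracket -> matching opening bracket


def brackets_balanced(text):
    """Return True if every bracket in text is correctly paired and nested."""
    stack = []
    for char in text:
        if char in PAIRS.values():
            stack.append(char)                    # push each opening bracket
        elif char in PAIRS:
            if not stack or stack.pop() != PAIRS[char]:
                return False                      # wrong or missing opening bracket
    return not stack                              # leftovers mean an unclosed bracket


print(brackets_balanced("Area = Pi * (Radius * Radius)"))   # True
print(brackets_balanced("Area = (Pi * Radius))"))           # False
```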

14.3.3 The Symbol Table

This table contains an entry for every keyword, variable, constant and operator (called
identifiers)

Generating the Symbol Table


Example Program

Dim Radius As Single


Dim Area As Single
Const Pi = 3.1415926536 As Single

Input Radius

Area = Pi * Radius * Radius

Token  Item Name  Kind of Item  Data Type  Run Time Value
1
2      Input      Keyword
3      Pi         Constant      Single     3.1415926536
4      Radius     Variable      Single     ?
5      =          Operator
6      Area       Variable      Single     ?
7
8      *          Operator
Using this table, the program can be tokenised as the lexical string:
2 4
6 5 3 8 4 8 4

Note that the syntax analyser fills in the columns "Kind of Item" and "Data Type". The
lexical analyser only adds the "Item Name" and "Run Time Value".

The most common way of organising the symbol table is to use a hash table. The
identifier is hashed to a memory location.
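
A small Python sketch of tokenising with a hashed symbol table. Python's `dict` is itself a hash table, so it stands in for the symbol table here; the token numbers are the ones from the example table, and splitting on spaces is a simplifying assumption that a real lexical analyser would not make.

```python
# Item name -> token, taken from the example symbol table.
SYMBOL_TABLE = {"Input": 2, "Pi": 3, "Radius": 4, "=": 5, "Area": 6, "*": 8}


def tokenise(statement):
    """Replace each space-separated item in a statement with its token."""
    return [SYMBOL_TABLE[item] for item in statement.split()]


print(tokenise("Input Radius"))                   # [2, 4]
print(tokenise("Area = Pi * Radius * Radius"))    # [6, 5, 3, 8, 4, 8, 4]
```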

14.3.4 Code Generation

In this final phase the machine code (or object code) is produced.

Code Optimisation

Some compilers attempt to make the object code run quickly by removing redundant
instructions and by spotting better ways to produce the same effect as the source
program.

Disadvantages

• compilation time is increased


• unwanted results may be produced

14.4 Assemblers
The assembler program converts assembly language into machine code.

14.4.1 Assembler Directives

These are instructions to the assembly program that do not have a machine code
equivalent.

Action is taken at translation time.

E.g. .START - Start storing object code in memory location 200

The most common use of directives is with symbolic addresses

14.4.2 Symbolic Addresses

These are an alternative to using physical addresses.

E.g. LDA 200 can be replaced with

RADIUS: .RB 1 - Reserve one byte for radius


- - -
- - -
- - -
LDA RADIUS

14.4.3 Macro Instructions

A macro is a single instruction that represents a group of instructions.

E.g. A macro to add two numbers

.MACRO ADDUP NUM1, NUM2, RES


LDA NUM1
ADD NUM2
STA RES
.END MACRO

The above would be used thus:

- - -
- - -
- - -
ADDUP 300, 301, 302
- - -
- - -

14.4.4 Two-Pass Assemblers


Such assemblers go through the assembly language twice when producing machine code.

The First Pass

• Comments are removed


• Entries are made in the symbol table - these are symbol names and memory
addresses
• Directives are carried out
• Macro instructions are replaced by their full set of instructions
• Errors found and reported

The Second Pass

• Mnemonics are replaced by their machine code equivalents


• Memory addresses (listed in the symbol table) are replaced by actual machine
code addresses
• Decimal, hexadecimal and character items are replaced by machine code
equivalents
• Further errors are reported
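
The two passes can be sketched in Python for a toy assembly language. Everything specific here is an assumption made for illustration: the numeric opcodes, the `;` comment marker, the `LABEL:` syntax, and the rule that each instruction occupies one memory location starting at address 0. Directives, macros and error reporting are left out.

```python
OPCODES = {"LDA": 1, "ADD": 2, "STA": 3, "JMP": 4}  # made-up machine codes


def assemble(source):
    """Translate a toy assembly program into (opcode, address) pairs."""
    symbols = {}        # symbol table: label -> address
    instructions = []   # (mnemonic, operand) pairs awaiting translation

    # First pass: remove comments and record the address of every label.
    for line in source.splitlines():
        line = line.split(";")[0].strip()
        if ":" in line:
            label, line = line.split(":", 1)
            symbols[label.strip()] = len(instructions)
        if line.strip():
            mnemonic, operand = line.split()
            instructions.append((mnemonic, operand))

    # Second pass: replace mnemonics and symbolic addresses with numbers.
    machine_code = []
    for mnemonic, operand in instructions:
        address = symbols[operand] if operand in symbols else int(operand)
        machine_code.append((OPCODES[mnemonic], address))
    return machine_code


program = """
START: LDA 10   ; load the first number
       ADD 11
       JMP END  ; forward reference, only resolvable after pass one
END:   STA 12
       JMP START
"""
print(assemble(program))  # [(1, 10), (2, 11), (4, 3), (3, 12), (4, 0)]
```

The forward reference to END is the reason two passes are needed: its address is not known when JMP END is first read.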

15.1 Operating System Functions


The part of the operating system that is used most frequently is kept in RAM whilst
applications software is running.

User Interaction

Operating System Functions

1. Memory Management
2. CPU Sharing
3. Input/Output (I/O) Control
4. Backing Store Management
5. Interrupt Handling
6. Operator Interface
7. Utilities (eg. Defragmenter)
8. Security Facilities
9. Accounting Facilities
15.2 Different OS Modes
These modes decide the method of program execution.

1. Single program operation - one program running


2. Multiprocessing - OS controls more than one CPU
3. Multiprogramming - Programs run concurrently
a. Multiuser - different users, different programs
b. Batch - several jobs can be run at once
c. Multitasking - different users, one program

15.3 Job Control Language


JCL is a language used to instruct the OS.

JCL has sequence, selection and iteration constructs.

It is often used with batch processing where tasks can be assigned different priorities and
actions can be taken on errors.

15.4 The Scheduler And Dispatcher


In a multiprogramming environment the CPU's time is shared out amongst the various
processes that are running.

It is the job of the scheduler to decide which process to run next.

High-Level Scheduling

The scheduler decides which jobs are allowed into the queue for resources and what their
priorities are.

The Dispatcher (Low-Level Scheduling)

The dispatcher decides which process is allocated CPU time when the CPU becomes
available.

15.4.1 Scheduling Policy Objectives

The aim is to make the most efficient use of the computer's resources.

Objectives

1. Maximise throughput
2. Maximise the number of users with acceptable response time
3. Balance resource use
4. Enforce priorities
5. Avoid repeatedly sending low priority jobs to the back of the queue

Not all of these objectives can be fulfilled at once. For example: we need to achieve a
balance between response time and resource utilisation.

Low-Level Scheduling Policy Objectives

The dispatcher will consider each process in terms of:

• How much CPU time is required
• How much I/O is required
• Whether it is a batch or interactive job
• Whether a response is urgent
• Its priority
• How long it has been waiting

15.4.2 Round Robin Scheduling

Processes are scheduled on a FIFO basis.

The CPU's time is divided into time slices (or quanta).
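
A minimal Python simulation of round robin scheduling. The process names, run times and quantum are invented for illustration, and the sketch ignores I/O, priorities and the cost of switching between processes.

```python
from collections import deque


def round_robin(run_times, quantum):
    """Return the order in which processes finish under round robin."""
    queue = deque(run_times.items())      # FIFO queue of (name, time still needed)
    finished = []
    while queue:
        name, remaining = queue.popleft()
        remaining -= quantum              # the process runs for one time slice
        if remaining > 0:
            queue.append((name, remaining))   # not done: rejoin the back of the queue
        else:
            finished.append(name)
    return finished


print(round_robin({"A": 5, "B": 2, "C": 3}, quantum=2))  # ['B', 'C', 'A']
```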

15.4.3 Alternative Shortest-Job-First Scheduling

The process with the shortest estimated run time is run next.

Advantages

• Reduces the number of waiting jobs


• Reduces the number of small jobs waiting behind large jobs

Disadvantages

• The user has to estimate how long the jobs will take
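
A short Python sketch of shortest-job-first, assuming all jobs are already waiting and that the estimated run times (invented here) are accurate. It returns the run order and how long each job waits before starting.

```python
def shortest_job_first(estimates):
    """Return (run order, waiting time per job) for the given estimates."""
    order = sorted(estimates, key=estimates.get)   # shortest estimate first
    waits, clock = {}, 0
    for job in order:
        waits[job] = clock          # time spent waiting before this job starts
        clock += estimates[job]
    return order, waits


order, waits = shortest_job_first({"A": 8, "B": 2, "C": 4})
print(order)   # ['B', 'C', 'A']
print(waits)   # {'B': 0, 'C': 2, 'A': 6}
```

Run in the order A, B, C the same jobs would wait 0, 8 and 10 units, so running the short jobs first cuts the total waiting time from 18 to 8.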

15.5 Memory Management


15.5.1 Fixed Partitions

In early multi-programming systems the main memory was divided up into a number of
fixed sized partitions.
Each partition held a single job. A job can only be in one partition.

15.5.2 Virtual Memory (Virtual Storage)

Processes to be run are held on disk. When another process is given CPU time, the OS transfers
the currently running process to disk and the next process to memory.

Paging

Data is transferred between memory and disk. Paging means that a fixed number of bytes
is swapped between disk and main memory.

The operating system has to keep track of where the parts of a program are at any given
moment.

For each running process, the OS maintains a map showing which blocks of virtual
storage are currently in real storage and their locations.
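
The map can be pictured as a page table. The Python sketch below assumes a hypothetical 4096-byte page and a table that lists only the pages currently in real storage; the names and numbers are illustrative.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(virtual_address, page_table):
    """Convert a virtual address to a real address using the page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(page)      # where the page sits in real storage
    if frame is None:
        # The page is on disk: the OS would have to load it before continuing.
        raise LookupError(f"page fault: page {page} is not in real storage")
    return frame * PAGE_SIZE + offset


page_table = {0: 5, 1: 2}             # virtual page -> real storage frame
print(translate(4100, page_table))    # 8196: page 1, offset 4, held in frame 2
```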

Segmentation

Programs are divided up into different sized segments which may be loaded into
non-contiguous areas of memory.

15.6 Peripheral Control


15.6.1 Device Drivers

Device drivers are small programs that contain details on how to communicate with a
particular peripheral. They form part of the OS.

15.6.2 Buffering

A buffer is an area of memory or a memory module used for holding data during I/O
transfer.

Once the CPU issues the 'start' instruction the I/O channel can place data in a buffer while
the CPU is busy.

Because different peripherals have different buffers and I/O channels, autonomous
operation of peripherals is permitted.

15.6.3 Spooling
This technique allows efficient communication between devices that operate at different
speeds. The high speed device spools or writes its data to disk and the slow speed device
can read it later.

15.6.4 Polling

One unit (often the CPU) checks the status of another (often a peripheral) at frequent
intervals.

Polling takes up large amounts of processor time and an interrupt system is usually
preferable.

15.6.5 Interrupt For Peripherals

An interrupt is a signal generated by an event that alters the sequence in which the CPU
executes instructions.

The hardware generates the interrupts but the interrupts may be related to current
processes.

The interrupt register inside the CPU contains one bit for every different interrupt.

At the beginning of the FE cycle the interrupt register is checked and if a bit is set then
the state of the current process is saved and the OS gives control to the appropriate
interrupt handler.

The interrupt handlers are programs which carry out the action required by an interrupt.

Interrupt Masking

This term refers to the ability to enable and disable interrupts from peripherals by software instruction.

At least one interrupt is non-maskable.

15.7 Backing Store Management


This is a task of the OS. In particular the OS must manage:

• Access methods
• Files
• Free Space

In deciding where to place a new file there are two strategies.

• First fit
• Best fit
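
The two strategies can be compared with a short Python sketch. Free space is modelled as a simple list of block sizes, which is an assumption made for illustration; each function returns the index of the chosen block, or None if the file does not fit anywhere.

```python
def first_fit(free_blocks, size):
    """Choose the first free block that is large enough."""
    for index, block in enumerate(free_blocks):
        if block >= size:
            return index
    return None


def best_fit(free_blocks, size):
    """Choose the smallest free block that is large enough."""
    candidates = [(block, index) for index, block in enumerate(free_blocks)
                  if block >= size]
    return min(candidates)[1] if candidates else None


free_blocks = [50, 200, 70, 120]
print(first_fit(free_blocks, 60))   # 1: the 200 block is the first that fits
print(best_fit(free_blocks, 60))    # 2: the 70 block wastes the least space
```

First fit is quicker to compute, while best fit leaves the large blocks intact for large files.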

15.7.1 Directory Structure

When backing store media are capable of holding more than a few files, it is common to
organise the files into directories.

These directories can then be organised into a hierarchical (or tree) structure.

Some operating systems (not DOS) give access rights to all the levels above a particular
user.

16.1 High And Low Level Languages


16.1.1 Introduction

All computer languages can be classified as high or low level.

High Level    Low Level

BASIC         Assembly Language
COBOL
Pascal
Java
C
Fortran
Delphi

16.1.2 Low Level Languages

Their characteristics are:

• They are machine orientated.


• Each statement translates into one machine code instruction.

Uses

• When there is a requirement to manipulate individual bits and bytes.


• When there is a need for very fast executable code.
• When there is a need for code that occupies very little memory.

16.1.3 High Level Languages

Their characteristics are:

• They are not machine orientated (i.e. they are portable).


• They are problem orientated.
• They more closely resemble English and mathematical expressions than low-level
code.

High Level Languages usually contain:

• Selection and iteration structures.


• Built in commands to simplify I/O.
• Built in functions.
• Data structures.

16.2 Language Classification


16.2.1 Special Purpose Languages

There are many languages designed to be used in solving particular problems.

16.2.2 Procedural Languages

They consist of a series of statements which act on a given set of data.
Programs are usually designed in a top-down fashion.

There are two problems:

• If the structure of the data changes all the programs that use the data must change.
• If the way the system functions changes then program alterations will be
widespread.

16.2.3 Object Oriented Languages

Objects

An object is a data item with characteristics in common with other objects in its class.

Inheritance

Classes of objects can inherit characteristics from a base class.

Such classes have their own extra characteristics and are known as derived classes.

Encapsulation

Each object has its own functions and data. The data can only be accessed through
functions that belong to the object.

16.2.4 Event Driven Languages

In an event driven language a block of code is executed in response to an event.

Examples of events:

• User clicks on an icon


• A key is pressed
• The pointer is moved over a picture

Event driven languages usually work with GUI

16.2.5 Declarative Languages

Declarative languages express problems in terms of structured objects and the
relationships between them.

Example:
male (joe_bloggs)
car (ferrari)
owns (joe_bloggs, ferrari)

A query can be entered:

? male (joe_bloggs)
output: Yes

This is a "what to do", not a "how to do" approach.

16.3 Language Generations


Each successive generation of computer language represents a significant development.

1st Generation (late 1940s)                Machine code
2nd Generation (late 1950s - late 1970s)   Unstructured Procedural Languages (E.g. FORTRAN)
3rd Generation (late 1970s - mid 1980s)    Structured Procedural Languages
4th Generation (mid 1980s...)              Database Integration & Object Oriented Programming
5th Generation (1980s...)                  Declarative Languages (E.g. Prolog & AI)

16.3.1 Fourth Generation Languages

In the 4GL the programmer sometimes specifies what is to be done, rather than how to do
it.

A 4GL is an integrated development environment. Typically this provides:

• DBMS (Database Management System)


• Data Dictionary
• Query Language (SQL)
• Facilities for selection and sorting
• Report Generation
• Screen Painting
• Text Editor
• Security And Password Facilities
• Backup And Recovery Facilities
• Links to other DBMSs

It is the close integration of the DBMS with a programming language that leads to it
being classified as a 4GL.

16.3.2 Advantages Of 4GLs


• Prototyping is readily achieved. Either through:
a. Throw away prototyping OR
b. Evolutionary prototyping
• Programmer productivity is increased
• Program maintenance is easier
• 4GLs are more portable than 5GLs

16.4 Logic Programming


Expert systems and artificial intelligence (AI) systems often make use of logic
programming.

16.4.1 Expert Systems

These are computer programs that attempt to replicate the performance of a human expert
on some specialised reasoning task.

Expert systems have:

A Knowledge Base

E.g. A family tree.

As well as containing facts on who is male and female and who is a parent of whom, the
knowledge base also consists of rules.

An Inference Engine And A HCI

In Prolog this can be done by entering queries.

For example: Does Liz have a brother?

?- brother(X, liz).

Output: X = bob

Or: Who is Bob's grandchild?

?- grandparent(bob, Who).

Output: Who = jim

16.4.2 Artificial Intelligence


All systems are concerned with knowledge. Declarative programming languages are a
means of solving problems to do with knowledge.

The Turing Test is a functional test of AI.

Applications

• Expert Systems Control


• Problem Solving
• Decision Support Systems
• Pattern Recognition (E.g. Image Recognition)

16.4.3 Processing Natural Language

Syntax Rules

Both computer languages and natural languages have well defined syntax rules which
determine whether a statement is correct or not.

Vocabulary

Only programming languages have a small vocabulary.

Ambiguity

Statements in natural languages can have more than one meaning.

Prolog has been used to process and translate natural languages.

This is because it can parse the phrase-structure grammar.

16.5 Choosing A Programming Language


Different programming languages suit different problems. In order to make programming
easier and faster to complete, as well as making programs easier to maintain, it is
important to make a good choice.

Factors to consider:

• The facilities in the language compared with the requirements.


• The availability of a suitable compiler/interpreter.
• The expertise of the programmers.

17.1 Data Transmission


17.1.1 Comms Channels

Coaxial Cable

Coaxial cable (similar to that used in television systems) has an outer mesh which is
'earthed' and shields the signal in the inner wire. The inner wire carries the signal.

Used in: LAN

Pros

• Inexpensive
• Accurate data transfer

Cons

• Signal degrades over distance


• Not very easy to lay

Twisted Pair Cable

This uses sets of twisted copper wire pairs.

Used in: LAN

Pros

• Easy to wire in situ


• Fast data transfer

Cons

• Susceptible to interference

Fibre Optic Cable

Each of many transparent fibres can relay pulses of light.

Used in: MAN and WAN (internet backbone)

Pros

• Very fast data transfer rate


• No electromagnetic interference
• Physically secure
• Less signal degradation over distance
Cons

• Cost
• Cable cannot bend around tight corners
• Difficult to interface with computer

Microwave Transmission

Data is encoded on a microwave signal.

Used in: MAN

Pros

• No cost of wiring and associated disruption


• Reasonable bandwidth

Cons

• Limited to line of sight

Satellite Transmission

A one or two-way link can be established by using a satellite (in geosynchronous orbit)
and a dish.

Used in: WAN

Pros

• Good bandwidth
• No wiring or boosters needed

Cons

• Expensive
• Possibly insecure

17.1.2 Transmission Directions

Simplex

Data can only be transmitted in one direction.


Half-Duplex

Data can be sent in both directions but not at the same time.


Full-Duplex

Data can be sent in both directions at the same time.

17.1.3 Transmission Rate

The speed at which data is transmitted is measured in bits per second (bps).

The bandwidth is the maximum speed at which data can be sent along a communication
channel.

17.1.4 Error Detection

A parity bit can be transmitted with the code for each character.

An element of redundancy can be introduced with 2D parity. This allows corrupted data
to be recovered.
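
Both ideas can be sketched in Python. The example assumes even parity and represents each character as a list of bits; the function names are illustrative. With 2D parity a parity bit is added to every row and a parity row is added underneath, so a single corrupted bit shows up as exactly one bad row and one bad column.

```python
def parity_bit(bits):
    """Even parity: the bit that makes the total number of 1s even."""
    return sum(bits) % 2


def add_2d_parity(rows):
    """Append a parity bit to each row, then a parity row for the columns."""
    block = [row + [parity_bit(row)] for row in rows]
    block.append([parity_bit(column) for column in zip(*block)])
    return block


def correct_single_error(block):
    """Find the one row and one column with odd parity and flip that bit."""
    bad_rows = [r for r, row in enumerate(block) if parity_bit(row)]
    bad_columns = [c for c, column in enumerate(zip(*block)) if parity_bit(column)]
    if len(bad_rows) == 1 and len(bad_columns) == 1:
        block[bad_rows[0]][bad_columns[0]] ^= 1
    return block


block = add_2d_parity([[1, 0, 1], [0, 1, 1]])
damaged = [row[:] for row in block]
damaged[0][1] ^= 1                              # corrupt one bit in transit
print(correct_single_error(damaged) == block)   # True
```

A single parity bit on its own can only detect an odd number of flipped bits; it cannot say which bit is wrong.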

17.1.5 Modulation And Digital Transmission

Analogue Systems

MODEM stands for Modulator-Demodulator.


To send digital data, the data is imposed or modulated onto an analogue wave.

The reverse process (demodulation) retrieves the digital information.
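
A minimal sketch of modulation and demodulation, assuming a simple amplitude-based
scheme (a strong carrier for 1, a weak carrier for 0). The notes do not say which scheme
a modem uses, and the constants and function names below are illustrative assumptions.

```python
import math

SAMPLES_PER_BIT = 20      # assumed number of samples taken for each bit
CYCLES_PER_BIT = 2        # assumed number of carrier cycles in each bit period

def modulate(bits):
    """Impose the bits on a sine wave: full amplitude for 1, low amplitude for 0."""
    wave = []
    for bit in bits:
        amplitude = 1.0 if bit else 0.2
        for n in range(SAMPLES_PER_BIT):
            angle = 2 * math.pi * CYCLES_PER_BIT * n / SAMPLES_PER_BIT
            wave.append(amplitude * math.sin(angle))
    return wave

def demodulate(wave):
    """Recover the bits by measuring the peak of the wave in each bit period."""
    bits = []
    for start in range(0, len(wave), SAMPLES_PER_BIT):
        chunk = wave[start:start + SAMPLES_PER_BIT]
        peak = max(abs(sample) for sample in chunk)
        bits.append(1 if peak > 0.6 else 0)
    return bits

original = [1, 0, 1, 1, 0, 0, 1, 0]
print(demodulate(modulate(original)) == original)
```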

Digital Systems

Data is sent digitally via:

• An ISDN line
• A LAN connection
• A digital radio network
• A phone line with ADSL

ISDN (Integrated Services Digital Network)

Data is sent digitally down a special ISDN line.

ADSL (Asymmetric Digital Subscriber Line)

This has a maximum bandwidth of 9Mbps and runs over a standard telephone line.

17.1.6 Parallel To Serial Transmission

Parallel transmission is not practical for long distance communication. This is because
of the cost of manufacturing the wires.
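
A minimal sketch of parallel to serial conversion: the 8 bits of a byte, which would
travel side by side on 8 wires, are sent one after another down a single wire and
rebuilt at the other end. Sending the most significant bit first is an assumption made
here for readability, and the function names are hypothetical.

```python
def to_serial(byte):
    """Turn one 8-bit value into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for shift in range(7, -1, -1)]

def from_serial(bits):
    """Rebuild the 8-bit value from bits received one at a time."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

stream = to_serial(65)             # 65 is the ASCII code for 'A'
print(stream)
print(from_serial(stream))
```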

17.1.7 Multiplexing

Multiplexing allows more than one signal to be sent over a single link at one time.
There are two main methods of doing this:

Time Division Multiplexing

The transmission time is broken up into slices and each device wishing to transmit can
take a packet in turn.
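
Time division multiplexing can be sketched as taking one item from each device in turn.
This sketch assumes every device has the same amount of data to send, and the function
names are hypothetical.

```python
def tdm_multiplex(channels):
    """Interleave equal-length streams: one item from each device per time slice."""
    link = []
    for time_slice in zip(*channels):
        link.extend(time_slice)
    return link

def tdm_demultiplex(link, device_count):
    """Hand every device_count-th item on the link back to the same device."""
    return [link[i::device_count] for i in range(device_count)]

channels = [list("AAA"), list("BBB"), list("CCC")]
link = tdm_multiplex(channels)
print("".join(link))
print(tdm_demultiplex(link, 3) == channels)
```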

Frequency Division Multiplexing

Different frequency carrier waves are used to send many signals over the link. A high
bandwidth is required for this.

17.1.8 Data Compression

Data compression is frequently used when transmitting large amounts of data.

At the heart of the process is the method of looking for repeated patterns and replacing
them by one copy of the bit pattern plus the number of times it occurs.
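
The method described above is known as run-length encoding. A minimal sketch, working on
a string of characters rather than raw bits, with hypothetical function names:

```python
def rle_encode(data):
    """Replace each run of repeated symbols with a (symbol, count) pair."""
    runs = []
    for symbol in data:
        if runs and runs[-1][0] == symbol:
            runs[-1] = (symbol, runs[-1][1] + 1)   # extend the current run
        else:
            runs.append((symbol, 1))               # start a new run
    return runs

def rle_decode(runs):
    """Expand each (symbol, count) pair back into the original run."""
    return "".join(symbol * count for symbol, count in runs)

encoded = rle_encode("0000011100")
print(encoded)
print(rle_decode(encoded))
```

This only saves space when the data contains long runs. Data with few repeats can come
out larger than it went in.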

17.2 Local Area Networks


A LAN is a collection of computers, peripherals and the links between them. It is usually
confined to one site or building.

Benefits of networking standalone microcomputers:

• Sharing of Hardware Resources
• Shared Access to Software
• Sharing of User Information
• Ability to Communicate

17.2.1 Star Network Topology

This network topology involves connecting each workstation to the server separately.

Advantages

• If one cable fails the other workstations are unaffected.
• Consistent performance even if there is a good deal of network traffic.
• Easy to add extra workstations.
• No problems with data collisions in network cabling.

Disadvantages

• Costly to install.

17.2.2 Bus Network Topology

In this topology all devices share a single cable.


Advantages

• Easy and inexpensive to install as this topology requires less cable than any other.
• Adding more stations is relatively easy.

Disadvantages

• Cable failure will lead to total network failure.
• Such failure is difficult to locate.
• Network performance deteriorates under heavy loads.

17.2.3 Ring Network Topology

In a ring network there is no central controlling computer. Messages are passed from
computer to computer (in one direction) until they reach their destination.

Advantages

• No dependence on a central computer.
• High transmission rates are possible.

Disadvantages

• If one station breaks down the whole network fails.
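
The way a message travels round a ring can be sketched as below. The station names and
the function name are hypothetical, and the sketch assumes every station is working.

```python
def ring_route(stations, source, destination):
    """Return the stations a message visits travelling one way round the ring."""
    if source not in stations or destination not in stations:
        raise ValueError("both stations must be on the ring")
    position = stations.index(source)
    path = [source]
    while path[-1] != destination:
        position = (position + 1) % len(stations)   # pass to the next station
        path.append(stations[position])
    return path

ring = ["A", "B", "C", "D"]
print(ring_route(ring, "C", "A"))
```

Because the message passes through every station between source and destination, one
broken station stops it getting through, which is the disadvantage listed above.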

17.3 Wide Area Networks


Both national and international networks are WANs.

There are usually many possible routes across a WAN.

17.3.1 Circuit Switching

A circuit switched network is one where switches are used to connect two computer
systems for a given length of time. The path does not change.

Advantages

• No waiting for messages to arrive.

Disadvantages

• There may not be a line available.

17.3.2 Packet Switching

With packet switching the messages (or other data) are divided up into fixed length
blocks of data called 'Packets'.

These contain:

• a source and destination address
• a packet sequence number
• the data and its checksum

These packets can travel along different routes.
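
A minimal sketch of splitting a message into packets and putting it back together after
the packets arrive out of order. The packet size, the padding character, the simple
sum-based checksum and all names are assumptions made for illustration, not a real
network protocol.

```python
import random

PACKET_SIZE = 8   # assumed fixed length of the data in each packet

def checksum(data):
    """A very simple checksum: the character codes added together, modulo 256."""
    return sum(ord(ch) for ch in data) % 256

def make_packets(message, source, destination):
    """Split the message into fixed length packets, padding the last one."""
    packets = []
    for seq, start in enumerate(range(0, len(message), PACKET_SIZE)):
        data = message[start:start + PACKET_SIZE].ljust(PACKET_SIZE, "\0")
        packets.append({"source": source, "destination": destination,
                        "sequence": seq, "data": data,
                        "checksum": checksum(data)})
    return packets

def reassemble(packets):
    """Put packets back in sequence order, checking each one for corruption."""
    ordered = sorted(packets, key=lambda packet: packet["sequence"])
    for packet in ordered:
        if checksum(packet["data"]) != packet["checksum"]:
            raise ValueError("a packet was corrupted in transit")
    return "".join(packet["data"] for packet in ordered).rstrip("\0")

message = "Packets can arrive out of order"
packets = make_packets(message, "A", "B")
random.shuffle(packets)            # simulate packets taking different routes
print(reassemble(packets) == message)
```

The sequence number is what lets the receiver restore the original order, whichever
route each packet took.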

Advantages

• Communications channels are used more efficiently because channels are not 'tied
up' when transmission is not in progress. Also, because data is sent in packets,
data compression can be used.
• More resilient to network failure.
• Data is more secure because packets are sent down various routes.

Disadvantages

• May be significantly slower.

17.4 The Internet


The Internet is a network of networks. Every computer connected to the Internet can be
considered part of the Internet.

Data can be transmitted in several ways:

• FTP - File Transfer Protocol
• Email
• HTTP - Standard for the world wide web

Some information services available:


• Electronic Shopping
• News Reports
• Online Banking
• Travel Timetables
