
SECTION 1 - Computing Today

Historical Thoughts
While the activity of counting objects and remembering things extends back to the earliest times of
humans, the idea of a mechanical device that could aid in the counting process, or that could actually
do the counting, is relatively recent. There are individual examples of record-keeping devices in
human history, but these are few: the quipu of the Incas and the abacus of ancient China. The
Greeks, Romans, Persians, and many other ancient cultures used variations of writing for the keeping
of records or the remembering of partial answers in computations - for instance, wax tablet and
stylus, clay tablets, papyrus and ink, and, later, paper. But the idea of a machine that could actually
do the work of computing, rather than simply aiding the human in doing the thinking, dates only to
the 1600's.
The computer as we know it today is the product of several major eras of human technology.
Technology is the application of tools and techniques to improve the likelihood of human
survival. In addition to the survival aspect, the use of tools and techniques to solve non-essential, but
still needed or interesting problems, has given rise to many great inventions. These include things
like the automobile, bicycle, radio, etc. The evolution of the computer spans these phases of
development:
1. The Mechanical era, in which the "industrial revolution" provided the mechanical techniques
and devices needed to build machines of any sort;
2. The Electronic era, in which the use of electrical devices and techniques made mechanical
methods obsolete; and the
3. Semiconductor era, in which the relatively new science of semiconductor physics and
chemistry extended the original ideas of the Electronic era to new heights of performance.
A few events will illustrate these time frames:

1642: Blaise Pascal designs and builds a decimal counting device that proves that mechanical
counting can be done.
1801: Jacquard devises a weaving loom that uses a chain of punched cards to define the
functions of the shuttles through the warp, thereby defining the color pattern and texture of
the cloth.
1822: Charles Babbage presents his Difference Engine to the Royal Astronomical Society,
London, where he demonstrates the mechanical computation of mathematical tables.
1834: Charles Babbage begins work on his Analytical Engine, a general-purpose mechanical
computer; it is never completed.
1840's: Augusta Ada, daughter of George Gordon, Lord Byron (English poet), works with
Babbage to develop a schema for laying out the logical steps needed to solve a mathematical
problem, and becomes the first "programmer". She also invests most of her husband's money
in a scheme to beat the horse races, supposedly built on Babbage's work - they lose their shirts.
1893: "The Millionaire", a machine that could multiply directly rather than by repetitive
addition, is announced.
1888: In response to an invitation from the US Census Bureau, Herman Hollerith presents
his tabulating machines including a card punch, card reader, tabulator (electric adding
machine), and card sorter. He wins the contract for equipment for the 1890 census.
1900's: Hollerith's machines are a success, and he sells them to countries for census work, and
to companies for accounting. His patents are infringed by competitors. He joins with two other
companies to form the Computing-Tabulating-Recording (CTR) company (tabulating
machines, time clocks, and meat scales).
1914: The CTR company hires Thomas J. Watson, Sr., who soon becomes president. His job is
to beat the competition and put the outfit on the map. He immediately starts training salesmen
on company time, and in 1924 renames the company the International Business Machines
Corporation (IBM).
1920's: IBM and several others manufacture ever-more complex electro-mechanical
tabulating equipment. The stock market crash of October, 1929, puts millions out of work
and many companies fold. IBM reduces its activities, but never lays anybody off. They hire
more salesmen.
1935: President Franklin Roosevelt signs the Social Security Act, which requires that
everybody in the country have a number. Great quantities of tabulating equipment are
purchased to support this effort.
1931: Vannevar Bush demonstrates the Differential Analyzer, the last great purely
mechanical calculator.
1941: The United States enters World War II. Most companies refit for the manufacture of
munitions.
The War Years: Many new advances are made in electronics that will affect the
tabulating business after the war, including radio, radar, sonar, and television.
1943: J. Presper Eckert and John Mauchly are given a contract to develop a purely
electronic calculator for the computation of ballistics firing tables. They build the Electronic
Numerical Integrator and Computer (ENIAC) at the Moore School of Engineering,
University of Pennsylvania.
John von Neumann describes the stored program concept in his 1945 report on the EDVAC;
early machines built on the idea, such as the EDSAC at Cambridge and von Neumann's own
machine at the Institute for Advanced Study, are the first to hold their programs in memory.
1947: The transistor is invented at Bell Labs by Shockley, Bardeen, and Brattain.
1951: A company started by Mauchly and Eckert to build electronic computers runs out of
money and is bought by Remington Rand. With this help, the two deliver the first UNIVersal
Automatic Computer (UNIVAC) to the US Census Bureau, the first computer sold for
commercial, non-military purposes.
1954: IBM introduces the 704 series computers, the first mass-produced large-scale systems
with core storage and built-in floating-point hardware.
1959: IBM introduces the 1401 and related systems, bringing card-based data processing to
the average company.
1964: IBM bets the company on the introduction of the System/360, using hybrid
microcircuits and mass-produced core storage devices, and the idea of the "non-dedicated",
microprogrammed system. The product line is upward-compatible - a huge success that
ultimately defines the mainframe market.
Late 1960's: Several companies begin to develop and deliver true integrated circuits.
1971: The Intel Corporation delivers the 4004, the first single-chip device capable of
executing a stored program; the microprocessor is born. The more capable 8080 follows in 1974.
1976-77: The Apple Computer Company is started by two college dropouts in their garage,
Steve Jobs and Steve Wozniak. The first model is sold in kit form; the Apple II that follows
uses inexpensive parts and the home color television to bring computing to the masses. A later
BASIC for the machine, Applesoft, is licensed from Bill Gates's Microsoft.
1981: IBM introduces the IBM Personal Computer, and coins a term that will live forever.
At first aimed at the home market, the PC is immediately adopted by businesses large and
small. Since the design of the system is published, many begin to write programs for the
machine and to copy its design. The use of the Intel 8088 processor ensures Intel's survival.
Microsoft provides the Disk Operating System (DOS).

1987: IBM introduces the Personal System/2 (PS/2) product line including the Intel 80386
processor, the Micro Channel bus, and the OS/2 operating system.
Late 1980's: The use of the Windows operating shell produced by Microsoft provides a
Graphical User Interface (GUI) for users.
1990's: Intel introduces the 80486, the Pentium, and the Pentium Pro processors. Speeds
approach 200 megahertz. Advances in memory semiconductors permit millions of characters
of storage to be available for a small price.
Further advances in microprocessor technology bring us the Pentium II, III, and 4 processors,
the AMD Athlon, the PowerPC, and clock speeds passing 3 gigahertz. Embedded processors
are used in everything from automobiles to refrigerators, spacecraft, and graphics
accelerators for displays.

Introductory Terms
A primary purpose of an introductory course in computers is to provide definitions of the terms
commonly used in the business, and to ensure that the student understands what the terms really
mean. We shall define a set of terms as an introduction, and provide narrative where suitable.
Information in any business or science is the core of the concern's operation. With good information,
a business can keep track of its accounting, improve sales by knowing its competition, ensure that its
employees are paid, and carry on the business at hand. If information is lacking or inaccurate, a
concern can suffer and may not survive. Computers are excellent tools for remembering, processing,
and manipulating such information, hence they have become indispensable in business and industry.
Information in the computer business is usually called data, from a Latin word meaning "things
given." Strictly, data are the plural, indicating many lumps of information, while datum is the
singular form, indicating one lump. Data can take two physical forms in computers. The first is raw data such
as time cards, order sheets, billing statements, invoices, bills of lading, etc. The important
information is present in these forms, but the computer can't use it right away. The data must be
converted to computer-usable form, which the machine can then take in and use without further
delay. A conversion must take place between the raw form and the computer-usable form in order for
the machine to have access to it.
This conversion traditionally took a good deal of time and expense. The classic example is a room
full of hundreds of keypunch machines in which tabulating cards are punched with hole patterns
that specify particular items of data. Later technology has allowed the conversion process to be
automated or eliminated altogether, as in the use of a credit card in an Automated Teller Machine. In
this case, the card is already in computer-usable form by virtue of the magnetic stripe on the back
which is encoded with the essential information of the card holder.
The computer-usable data are entered into the system by an input device, such as a card reader, light
pen, or credit card reader. The word input has multiple meanings in the business and here are two of
them. The input device is the mechanical and/or electrical means by which the data enters the circuits
of the system from the outside world. The computer-usable form of the information thus entered is
the input data. Many words in the business have multiple meanings.
The data thus entered are processed or converted in some way to a form more usable to the operators.
For instance, a time card may form the input data, and from the information stored on it can be
determined the number of hours worked, the employee's name, etc. This data, stored with other
information such as rate of pay and tax information, can then be returned to the human world in a
more useful form, such as a paycheck for the worker and a W-2 for the IRS. This returned information
is called output data, and the printer on which the check is made is an output device. Hence, the
word output also can be defined in more than one way.
In between the capture of the input data and the resulting output data is the processing. This is the
conversion of the one form of data to the other. The processing is done by the Central Processing
Unit (CPU), which is the "magic box" that does all the work and which has held so much intrigue for
the uninitiated over the years. If a CPU is built around a microprocessor device such as the Intel
Pentium, then it may be referred to as the Microprocessor Unit, MPU.
The computer system consists of two major parts, the hardware and the software. The hardware is
the actual mechanical and electronic device that does the work of deriving results data from given
data. The hardware is designed according to the theory of computing and according to what can be
done with the semiconductor devices of which the CPU is made. However, it cannot in itself do the
whole job. It must be instructed how to use its circuitry to arrive at desired results that are defined
by the human beings that are working with it. The humans create this plan of processing, or
program, using software, that is, the programming language and its syntax. So the software runs on
the hardware, and the combination follows the plans of the humans to process input data to generate
output data.
Data Storage
Data are stored in the computer and its related parts using the Binary numbering system. Whereas
the Decimal system uses ten different symbols (0, 1, 2, . . , 9) in computation, the Binary system uses
only two, 0 and 1. This is because two-state devices are the easiest kind to build. A switch can
be open or closed, the light can be on or off, etc. Building circuits that can exist in two different states
is easy, and so we use the Binary system because it fits nicely into this plan.
The smallest unit of data is the bit, short for binary digit. Bits can be either a 0 or a 1. If taken in
groups of eight, they become a byte. The byte unit is very common as a means of remembering
simple lumps of data because it can be easily handled by simple circuitry and can represent a variety
of different things in various codes (as we shall see). Two bytes together are called a word in the
type of computers that we will be working with initially. The locations of bits within a byte or word
are referred to as bit positions. They are numbered like this.
In a byte:

     7    6    5    4    3    2    1    0

In a word:

    15   14   13   12   11   10    9    8    7    6    5    4    3    2    1    0
It is common to use large numbers to represent the number of bytes involved in memory or disk
storage. A kilobyte is 1,000 bytes, a megabyte is 1,000,000 (one million) bytes, and a gigabyte is
1,000,000,000 (one billion) bytes. Actually, these values are not quite correct, but will do for the
moment. There are variations on these terms as you will see later in the course.

Similarly, we have terminology for increments of time. Since the CPU is operating at such a high
speed, it is common to refer to very small increments of time needed for single operations. If a
second is the standard increment of time, then a millisecond is 1/1,000 of a second, a microsecond is
1/1,000,000 of a second, and a nanosecond is 1/1,000,000,000 of a second. These amounts are
incomprehensibly short for the human being whose average cycle time is about 1/18th of a second.
However, computer circuitry regularly operates in these ranges. (Example: If a signal travels in
free space at the speed of light, 186,282 miles per second, that is, about 300,000,000 meters per
second, how far will it travel in one nanosecond?)
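Working the example out (a back-of-the-envelope sketch; the speed is rounded to 300,000,000
meters per second):

    speed = 3.0e8    # meters per second, rounded
    t = 1.0e-9       # one nanosecond, expressed in seconds
    print(speed * t) # 0.3 meters - roughly one foot per nanosecond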
Storage and Memory
In addition to the entry and exit of data into and out of a CPU, the CPU contains many data paths,
logical units, and functional parts. These will be discussed later. However, one of these parts has
become so important in its own right that it should be presented from the beginning in the
introduction of computer theory. Originally part of the CPU and discussed like any other, the part of
the computer that stores data for processing and the results thereafter is both an essential part of the
design and often one of its chief technical problems. This is referred to as memory or storage.
There are two types of memory mechanism in typical modern computers. The first is the main
memory, or that circuitry which is directly accessible automatically and at high speed by the rest of
the circuitry of the processor. In the early days, this device consisted of cores, or doughnut-shaped
pieces of a magnetic ceramic material that were strung like beads on a grid of wires. By passing
current through the wires, binary 1's and 0's could be stored in and retrieved from the cores. These
worked well, but due to the laws of physics there was a certain upper limit of performance that
could be achieved. When semiconductor technology advanced to the point of making integrated
circuits practical, semiconductor memory devices were a major product. You may be familiar with
these as SIPs or SIMMs or DIPs used in personal computers on their mother boards. The term
memory now generally refers to these semiconductor devices.
The term storage originally referred to the magnetic core system discussed above. However, the
word is now used primarily to discuss external data-holding mechanisms such as disk and tape
drives. In the old days, disk drives and tape drives were referred to as bulk or auxiliary storage. We
tend to think today of either floppy diskettes or Winchester-style small fixed disks or hard disks used
in personal computers as storage. These devices have revolutionized the way data are handled in
many computer systems, and a system can find its drives to be the main determinant of
the computer's overall performance.
So, generally, the term memory refers to solid-state, printed-circuit board things, and storage to disk
drives and similar devices.
Some terms involved with memory include:

Read-Write Memory (RWM), typically your motherboard main memory, into which a
computer can place data and from which that data can be later retrieved.
Read-Only Memory (ROM), typically also found on the motherboard, but which can only
be read from, and not written to.
Random Access Memory (RAM), which is a term misused by those who really should say
RWM most of the time.

The term RAM really means that a device supports the ability of the computer to access data in a
random order, that is to store or retrieve bytes in a non-sequential way. Both RWM and ROM are
RAM in nature. However, the acronym RWM is hard to pronounce, so RAM became the norm.
Certain types of memory and storage devices can remember what they contain with or without the
power being applied to the system. Such storage devices are called non-volatile. Magnetic disk
drives and cores are good examples. Other types of memory need power to remain stable or they
will forget what they contain. The memory chips on a motherboard are a good example of this type,
called volatile.
More on Hardware
In the good old days, you could tell whether a computer was a mainframe or a minicomputer by
looking at it and measuring the floor space the cabinets took up. Today, with so much computing
horsepower contained in such small devices, physical size is no longer a criterion. Today we measure
computer size in throughput, that is, how many instructions the system can execute in a given amount
of time. Depending upon the design of the system, we have computers which work in Millions of
Instructions per Second, called MIPS. Some computers that are designed primarily with scientific
processing in mind do many Floating Point Operations per Second, referred to as FLOPS. A
floating point operation is one where the decimal point in a decimal fraction is taken into account and
is included in the design of the numbers used by the computer. More on this later.
The original idea of a minicomputer was that it was smaller, slower, and cheaper than a mainframe,
which traditionally cost a great deal and required a lot of space and people to work it. The
minicomputer was almost "personal" in its design. This definition persisted until the advent of the
microprocessor, at which time we had the microcomputer, which contained a microprocessor as its
primary computing element. As the microprocessor device progressed in capability, the
minicomputer became obsolete and the size of the traditional mainframe began to shrink.
Microprocessors are now to the point that they can do what minicomputers and small mainframes did
just a few years ago. Accordingly we have table-top or table-side systems, floor standing systems,
and laptop and palm-sized computers. We measure these by throughput and performance, regardless
of physical size.
The term supercomputer is used to identify large mainframe systems that are designed for particular
types of scientific calculations. These systems are designed to work with numbers at great speed,
doing everything from preparing weather maps from satellite data to keeping track of aircraft in the
sky. They will remain a specialty item for that type of computing.
Just as the design and inner workings of the CPU have evolved with technology, so have the
input/output devices evolved. In the beginning, the primary form of input was the reading of
tabulating cards into which holes were punched in patterns or codes that represented numbers and
letters. Although several such coding systems were devised, the Hollerith card code with 80
columns and 12 rows of holes became the standard. The cards were read, the data processed, and the
result was either more punched cards or a simple printout of numbers and accounting data. The
system was limited by how fast the cards could be moved, and some early tab systems had no
storage at all.
Currently we have a variety of I/O devices that have taken advantage of technology. In the old
days, the conversion process from human-usable form (orders, waybills, etc.) to computer-usable
form (punched cards) was an essential step in the process. Now, many items used by humans
every day are also computer-usable, such as credit cards, touch panels and screens, scanning laser
badge and tag readers, etc. Every school kid is familiar with a mouse and keyboard, it seems, as these
are easy to use if the software is provided.
Output today falls into two main categories, softcopy and hardcopy. Softcopy is what you see on the
screen. It is soft because you can't take it with you except in your mind and memory. The Cathode
Ray Tube (CRT) display of the video monitor of the typical personal computer is a prime example.
Although the CRT is being replaced with Liquid Crystal and Plasma displays, the venerable video
monitor is still the standard for video output. Hardcopy is a term used to indicate a piece of paper on
which something is printed. This paper can contain the resulting data in the form of numbers and
letters, pictures, and various other images, both in color and monochrome. The method of placing the
image onto the paper has evolved also. Originally, the impression of a piece of type that pinches an
inked ribbon against paper was the common method, and is called impact printing. The process is
similar to a typewriter. We now can generate images using heat as in thermal printers, light as in
laser printers, and refinements of other printing techniques, as in dot matrix and
inkjet printers.
Disk drives fall into the category of I/O as well as that of storage. Two types are currently in use in
common systems, the "floppy" or diskette, and the fixed or hard or Winchester disk drive. The floppy
is an IBM invention, and originally was released in an 8-inch diameter. This large diskette could
store 256,000 bytes of information on one side of the diskette. We now have diskettes in the
5.25-inch size (although this size is fast fading from view), and the 3.5-inch size, whose capacity is
increasing with technology. The floppy is designed for portability, backup, and small storage and is
supported almost universally as a simple means of data exchange and retrieval.
The fixed or hard disk is also an IBM invention, and although many makers produce the devices,
IBM holds the most design patents and has done the most to improve the capacity and reduce the
size. The size of the device has been reduced from 28" to 14" to 8" to 5.25" to 3.5" to 1.8" in
diameter, the speeds have increased from 1,500 rpm to 7,200 rpm and beyond, and the chemistry used to store the
data as magnetic lines of force on the surface of the disk has undergone radical changes. Until such
time as a solid-state device takes over the whole job of data storage, such drives will form the
primary means of bulk data storage.
More on Software
Generally, software is divided into two categories. These are applications software and system
software. Application software is the programs you would use to get a kind of work done. Examples
are WordPerfect as a word processor, Excel or Lotus as a spreadsheet, etc. These are software
packages that are interacted with directly by the user or operator of the computer. Enormous effort
goes into writing and maintaining ever-larger applications programs. Programmers can
specialize in applications of a specific nature, such as for banking, etc.
Systems software is used by the computer itself for its own management, or to support the
application software. An example of this is the DOS used in personal computers. The computer itself
consists of hardware and some small amount of programming in ROM, but to fully support an
application such as WordPerfect, that is, to drive the screen images, work with the keyboard or
mouse, save data on disks, and generate printouts, the application needs to ask DOS for help from the
hardware. So the system software consists of the operating system itself, a wide variety of support
utility programs, and programming support in the form of compilers for different languages.

So, who does the programming? The first programmer is considered to be Augusta Ada Byron, and
she has a language (Ada) named after her. The first machines that could actually follow a stored
program, such as the EDSAC built at Cambridge, appeared in the late 1940's, based on what John
von Neumann had described as the stored program concept. While previous machines such as
ENIAC simply took one piece of data at a time, processed it, and returned it to the world before
taking a second piece, the EDSAC took in a large amount of data, processed it all automatically, and
then returned the entire result. While ENIAC's problems were defined by miles of plugboard wiring
that had to be removed and inserted for each problem, EDSAC used the same mechanism that stored
the data to store the coded steps that the machine was to follow automatically to process that data.
Thus, each step became an instruction, and the instructions together as a group formed a program.
The program was stored in the same automatically accessible mechanism as the data it was to
process. Arranging the steps that the computer will take to logically solve a problem, in the manner
of Ada Byron, is called programming.
The job of programming goes to people of different skill levels and experience. Some choose to
specialize in applications while others choose systems. Typically individuals become expert in some
particular language or system architecture, and that will define their careers. Generally the beginning
programmer, with community college or similar training, will start as a coding clerk whose
primary function is mastering a particular computer language and getting to know the system in use.
The programmer then will write programs to solve problems. In some cases the problem is big or
complex enough that a specialist is needed to lay out the plan for the programmer, and this person is
called a systems analyst. It should be noted just what a systems analyst is. This is a person who is an
expert in some field such as accounting, aircraft design, environmental sciences, etc., who also is
knowledgeable in computers and how they can be used as a tool to solve problems. The analyst is the
technical expert of the particular project, and also knows computers well enough to guide others in
the work of programming the various project parts.
An end user is a person who is the last one in the food chain in the writing and marketing of
software. If you use WordPerfect to write a term paper, then you are the end-user as far as Word
Perfect is concerned. As such, you have significant importance and clout. End users can dictate to a
certain extent what products survive and what products fail. The acceptance of WordPerfect in the
marketplace is a classic example of a product that was at the right place at the right time and caught
the public's fancy.
Specialty Items
Here are a couple of terms with which you can astound your friends and family.

Multitasking is defined as the ability of a computer system to execute what appears to be
several programs at the same time. Although this is not really what the computer does, it does
switch between tasks so fast that it appears to be several computers instead of just one.
Windows 2000 and XP and Unix are multitasking systems. It's a hardware/software
combination.
Timesharing is the apparent use of a system by several people at the same time. The classic
example is a mainframe or large minicomputer into which many terminals with screens and
keyboards are connected to the CPU. The system gives each person at a terminal a slice of
time ("timeslicing") for their processing, after which the system moves on to the next user for
his/her timeslice. This technique makes use of multiprogramming as it attempts to serve all
the attached users who may be doing different things.

Front-end Processors are computers that do initial processing and data form conversion
before sending the concentrated data to a bigger faster system. Examples include a
supercomputer that accepts input only from a machine that builds problems for it to solve, or
a communications processor that handles communications protocols that would
otherwise slow down the primary system.
Embedded Processors are computers or microprocessors embedded within a larger system.
These provide intelligence and control at a local level. Flight control computers within an
aircraft cockpit are examples, or the processor that controls the firing of sparkplugs in an
automobile engine.

SECTION 2 - Hardware, Part 1


Numbering Systems and Codes
When working with computers it is necessary to deal with numbering systems other than the decimal
system. While the decimal system has served mankind well for thousands of years, it is not easily
adapted to electronics. The primary numbering system used in digital systems is binary, with the
octal and hexadecimal systems going along for the ride.
The reason for the use of the binary system is that each position of magnitude can have only two
possible values, 0 and 1. It happens that in the laws of physics and nature the two-state condition is
easiest to implement. Switches can be open or closed; current can be flowing or not flowing; current
can be traveling left to right, or right to left within a wire; magnetic lines of force can be clockwise or
counterclockwise around a core; lamps can be on or off. Computers make great use of circuits called
bistable multivibrators, or flip-flops, that are stable in two different electrical conditions. So, it is
extremely easy to implement the binary system in electronic devices.
By contrast, the decimal system can have ten values or symbols in each position: 0, 1, 2, 3, 4, 5, 6, 7,
8, and 9. It would require a device with ten different stable states to directly implement a purely
decimal computer.
The binary system is based on the powers of the number 2, starting with 2^0, which is equal to the
number 1 in decimal (any number raised to the 0th power is equal to 1). The next order or position
of magnitude is 2^1, equal to 2 (any number raised to the first power is equal to itself). The
same applies for the higher powers of two: 2^2 = 4, 2^3 = 8, 2^4 = 16, 2^5 = 32, etc. Notice that the
decimal equivalents of the binary powers double as the value of the power increases by
1. A table of the first 16 powers of 2, which we will use often, looks like this.
    2^15   2^14   2^13   2^12   2^11   2^10   2^9   2^8   2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
    32768  16384   8192   4096   2048   1024   512   256   128    64    32    16     8     4     2     1

If you add up all of the values in the second row of the table, the total comes to 65,535, the largest
16-position binary number; counting zero itself, that gives 65,536 possible values in 16 positions.
Each 1 or 0 that can occur in a binary number position is called a bit, which is short for Binary Digit.
Since we can have a 0 or a 1 in each of the binary power positions, we call these bit positions, and
name them after the power of two for that position. So, on the extreme right of the table, we have a
position that represents 2^0, and we call it "bit position 0". The next position to the left represents
2^1, so we call this "bit position 1". Similarly, at the far left end of the table we have "bit position 15".
The use of the term "bit position" becomes important in programming and in dealing with computer
hardware-software interaction, where the bit positions represent locations within an 8-bit byte or
16-bit word, and we are interested in whether a specific bit position contains a 1 or a 0.
When dealing with actual numeric values, it is convenient to understand the relationship between
decimal and binary. From the table you can see that there is a decimal equivalent value for each bit
position that corresponds to the power of two for that position. If we wish to convert a binary number
to decimal, we simply add up all the decimal equivalents for those bit positions that contain binary
1's, and ignore those that contain 0's.

BIT POSITIONS           7    6    5    4    3    2    1    0
DECIMAL EQUIVALENTS   128   64   32   16    8    4    2    1

Examples:   76 = 01001100 (64 + 8 + 4)
           176 = 10110000 (128 + 32 + 16)
           135 = 10000111 (128 + 4 + 2 + 1)

TO CALCULATE THE DECIMAL VALUE OF A BINARY NUMBER, add together the decimal values
of all the bit positions that contain binary 1's.
TO CALCULATE THE BINARY VALUE OF A DECIMAL NUMBER,
1. By inspection, determine the largest power of two decimal equivalent that will successfully be
subtracted from the given decimal value (successful means that the subtraction returns an
answer that is either positive or equal to zero).
2. Subtract the decimal equivalent of this power from the given number. Keep record of this
successful subtraction by placing a 1 in the bit position of that binary power.
3. Now try to subtract the next smaller power of two from the result of step 2. It may be too big;
if so, place a 0 into the bit position for this power of two. If the subtraction is successful,
place a 1 into that bit position.
4. Continue as in step 3 until all of the given decimal number is used up. You should end at the
2^0 position. (A code sketch of both conversions follows.)
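Here is a minimal Python sketch of both conversion procedures (the function names are our own,
chosen for illustration):

    def binary_to_decimal(bits):
        """Add up the decimal equivalents of the bit positions that contain 1's."""
        total = 0
        for position, bit in enumerate(reversed(bits)):
            if bit == '1':
                total += 2 ** position
        return total

    def decimal_to_binary(value, width=8):
        """Try to subtract each power of two, largest first, recording 1's and 0's."""
        bits = ''
        for position in range(width - 1, -1, -1):
            power = 2 ** position
            if value >= power:    # subtraction succeeds: record a 1
                bits += '1'
                value -= power
            else:                 # subtraction would go negative: record a 0
                bits += '0'
        return bits

    print(binary_to_decimal('10110000'))   # 176
    print(decimal_to_binary(135))          # 10000111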
The binary system is the basic method of all computer counting, but it generates numbers that
become very wide very fast, and this leads to human error. Two other number systems have been
used to generate a shorthand that makes the handling of larger numbers easier than with pure binary.
These are Octal and Hexadecimal.
The octal number system is based on the number 8. There are 8 symbols possible in each magnitude
position, 0, 1, 2, 3, 4, 5, 6, and 7. This system was widely used in earlier computers, but has been
replaced by the hexadecimal system for the most part. The hexadecimal system is based on
the number 16, and that means that there are 16 different symbols and values that can be placed in
each magnitude position. These are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Note that we use
letters when we run out of the decimal numbers.
Unlike the decimal system, the octal and hexadecimal systems work beautifully with the binary
system because their natural place to carry into the next higher magnitude position concurs with that
of the binary system. Look at the locations where the octal number or the hexadecimal number rolls
to a higher position, and you will see that it is in the same place as binary. Decimal values, however,
do not carry at the same place. Therefore, there is a direct correlation between binary and octal or
hexadecimal, but not between binary and decimal. This is why the decimal numbers entering a
computer are usually immediately changed to a binary or hexadecimal value, worked with in that
way by the program, and the answers returned to decimal just before they are returned to the outside
world.
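Python's number formatting can print the same value in all four bases, which makes the shared
carry points easy to inspect (a throwaway sketch):

    for n in (7, 8, 15, 16, 176):
        print(f"{n:>3}  binary={n:08b}  octal={n:o}  hex={n:X}")

Notice in the output that 8 (binary 1000) is where octal carries to 10, and 16 (binary 10000) is
where hexadecimal carries to 10.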

BINARY   OCTAL   HEXADECIMAL   DECIMAL
     0       0        0              0
     1       1        1              1
    10       2        2              2
    11       3        3              3
   100       4        4              4
   101       5        5              5
   110       6        6              6
   111       7        7              7
  1000      10        8              8
  1001      11        9              9
  1010      12        A             10
  1011      13        B             11
  1100      14        C             12
  1101      15        D             13
  1110      16        E             14
  1111      17        F             15
 10000      20       10             16

Coding Schemes: Given that data are stored or passed along inside a computer as binary bits, it soon
became obvious that a method of organizing the bits into groups to represent letters, numbers, and
special characters was needed. Although the process of calculating with binary digits is at the root of
the design of the system, a great deal of data are represented as letters, not numbers. Therefore,
several codes have been developed over the years to deal with letters and special characters.
Baudot Code is named after Émile Baudot, an engineer in the French telegraph service. He
developed a five-bit code named after him that became the standard method of sending data between
teletype machines as these became available. A teletype is essentially a mechanical typewriter and
keyboard connected to a similar unit at a remote distance, and which communicated with each other
by sending telegraph-like bit patterns over telegraph lines. Operating the keyboard on one terminal
would send five-bit groups over the wires where they would actuate the typewriter of the remote
terminal. This technique was standard for the Western Union telegraph service and others during the
first half of the twentieth century.
American Standard Code for Information Interchange (ASCII) is a code that grew out of the
expansion of digital devices that needed to communicate, but which needed a greater range of
representable characters than Baudot code could provide. This code was developed and agreed upon
by a consortium of companies that acknowledged the need for competitors to be able to
communicate. The code comes in two versions, ASCII-7 and ASCII-8. ASCII-8 is little used now,
but finds use in communications overseas where technology may not be to 1999 levels. ASCII-7 is
12

the code most used by minicomputers and microprocessor-based machines, including personal
computers. The seven bits of each byte used can represent 128 different combinations of printable
and non-printable characters, these last used to control equipment rather than to print on them. It has
many of the earmarks of the earlier Baudot code in that it grew out of the teletype paradigm. The
code can represent both upper and lower case letters as well as numbers. However, the organization
of the bit patterns that represent the numbers is not binary-aligned (similar to the problem with
decimal discussed above). Therefore, a translation function is almost always included in
programming where the data are brought into the system or sent from the system as ASCII, but is
used in computations as pure binary.
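A sketch of that translation in Python: each incoming ASCII digit code is reduced to its numeric
value and accumulated into a pure binary quantity (the variable names are illustrative):

    text = "176"             # digits as they might arrive from an input device
    value = 0
    for ch in text:
        value = value * 10 + (ord(ch) - ord('0'))   # strip the ASCII code down to the digit
    print(value)                          # 176, now usable directly in computations
    print([hex(ord(ch)) for ch in text])  # ['0x31', '0x37', '0x36'] - the raw ASCII codes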
IBM PC ASCII is essentially the ASCII-7 code, and the lower 128 bit combinations are used in PC
I/O devices such as screens and printers as usual. However, in an 8-bit byte, the 7-bit code leaves a
bit unused. In some systems, this can be used as a parity bit or ignored. However, IBM elected, when
the PC was designed, to use the 8th bit to double the ASCII-7 code range (remember, adding another
bit position to a binary number doubles the number of combinations you have). The new 128
characters thus created were used by IBM for characters for non-English languages, Greek letters and
symbols for mathematics and science, and simple graphics to form borders and grids on screen
displays. Although computer graphics has gone far beyond this phase of display, the PC ASCII
code is still used as a reference for simple text work.
Extended Binary Coded Decimal Interchange Code (EBCDIC) was first used by IBM when they
introduced System/360 in 1964. This code uses all 8 bits of the byte for 256 combinations that
represent letters, numbers, and various control functions. The decimal equivalent numbers are binary
aligned such that EBCDIC numbers coming from an input device can be directly fed to the processor
and into computations without translation. The code is based on an earlier Binary Coded Decimal
(BCD), which was the code used in earlier IBM products such as the 1401. This code was a 6-bit
code in which the decimal numbers that were to be involved in calculations were binary-aligned.
There are a variety of other coding systems, and the internal workings of each processor use the bits
of bytes and words in many different ways. Watch for variations on these themes.
Error Checking is used extensively in computers to make sure that the answers you are getting are
correct. The validity of the data and the results of the computations overall is referred to as Data
Integrity. Are the answers really correct? Error checking can be done with hardware and software.
Usually a system has several different implementations of both to ensure integrity.
Vertical Redundancy Checking (VRC) or Parity Checking is a means of counting bits that are set
to a 1 in every byte or word of a data stream. Suppose a magnetic tape drive is reading a record from
tape. The bits of the bytes of data on the tape stretch across the tape width (vertically), not along the
tape (horizontally). As each byte reaches the processor, the number of 1's in it are counted. We are
not interested in the binary value of the byte, but rather whether or not an even or odd number of bit
positions contain 1's. In an Odd Parity system, we want the number of 1's to be odd, that is one bit,
three bits, five bits, etc. If we count them and find that there are an even number of bits set to 1 (no
bits, two bits, four bits, six bits), we turn on a special bit called the Parity Bit or Check Bit to ensure
that the number of bits in the byte is odd. If we find that the number of bits is already odd, we leave
the parity bit off. So, we have a bit that goes along with every byte that contains no value, but is used
to ensure parity. An Even Parity system works similarly, except that we use the parity bit to ensure
that the total number of bits in the byte is even.
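A minimal sketch of generating the check bit for an odd parity system (Python used only for
illustration):

    def odd_parity_bit(byte):
        """Return the parity bit that makes the total count of 1's odd."""
        ones = bin(byte).count('1')          # count the 1 bits in the byte
        return 0 if ones % 2 == 1 else 1     # already odd: leave the parity bit off

    for b in (0b00000000, 0b00000111, 0b11111111):
        print(f"{b:08b} -> parity bit {odd_parity_bit(b)}")   # 1, 0, 1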

Longitudinal Redundancy Checking (LRC) is a similar checking system that counts the numbers
of bits set to 1 horizontally along the data stream. We are not interested in each byte, but rather in one
of the bit positions of all bytes, say bit position 3. As bytes pass by, we add up all the bits in bit
position 3 that pass. We do this also for the other bit positions as well. At the end, we have a
character that represents the summation of all bits in each bit position for that data stream. This
character, called the LRC Character, is written on the tape or follows the data transmission from the
sending end to the receiving end. As the data arrive at the destination, a similar LRC character is
gathered. The two are compared, and if they match we assume that the transmission or tape reading
was OK. If they do not match, we assume a reading or transmission error.
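If even parity is kept in every bit position, the LRC character reduces to the XOR of all the bytes
in the stream. A sketch under that assumption:

    from functools import reduce

    def lrc_character(data):
        """XOR all bytes together; each bit of the result is one column's parity."""
        return reduce(lambda a, b: a ^ b, data, 0)

    record = bytes([0b00001010, 0b00000110, 0b00000011])
    print(f"{lrc_character(record):08b}")   # 00001111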
The problem with both of these methods is that if an even number of bits is picked up or dropped
during the transmission or reading of the media, it is possible for errors to go undetected. Therefore,
on magnetic recording systems and some networking, a Cyclic Redundancy Check (CRC) is used.
The CRC check character is gathered similarly to the LRC character above, but it is processed by a
shift-and-add algorithm rather than by simple addition. The result may be more than one check
character in length. When all three check methods are used and no errors are found, the assumption is
that the data are clean.
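Real systems differ in polynomial and register width; this minimal CRC-8 sketch (the 0x07
polynomial is an assumption chosen only for illustration) shows the shift-and-XOR flavor of the
algorithm:

    def crc8(data, poly=0x07):
        """Shift each byte through an 8-bit register, folding in the polynomial."""
        crc = 0
        for byte in data:
            crc ^= byte
            for _ in range(8):
                if crc & 0x80:                        # high bit set: shift, then divide
                    crc = ((crc << 1) ^ poly) & 0xFF
                else:
                    crc = (crc << 1) & 0xFF
        return crc

    print(hex(crc8(b"HELLO")))   # one check byte covering the whole stream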
The Central Processing Unit
The Central Processing Unit (CPU) is the heart of the computer system. It contains all the circuitry
necessary to interpret programs that define logical processes that the human programmer wants to do.
It consists primarily of electronics which implement logical statements. These statements are worked
out in Boolean Algebra, a non-numeric logical algebra that defines the logical relations of values to
each other.
The CPU is responsible for the interpretation of the program, and, following the instructions in the
program, causes data to be moved from one functional unit to another such that the results desired by
the programmer are obtained. Input data are given to the CPU and processed by being moved about
within the CPU's functional units, where they undergo logical or numeric changes along the way.
When the processing is done, the data are returned to the human world as output data.
There are historically two designs that have been used in CPU's. The first dates from the time of John
Von Neumann, and may be referred to as a "dedicated system". This system has circuitry that is
dedicated to specific purposes - an adding circuit that does addition, a subtracting circuit that does
subtraction, a circuit that only compares, and so on. None of the circuits are active except the one that
is needed at the moment. This is wasteful of circuitry and makes the system larger and more
power-hungry.
The second type of system appeared commercially with the advent of IBM's System/360 in 1964.
This system may be defined as "non-dedicated". The individual circuits needed for discrete functions
in the earlier machines were replaced by a single multipurpose circuit that could act like any of them
depending on what it was told to do. This circuit was called the Arithmetic Logic Unit (ALU). It
could act like an adder, a subtractor, a comparator, or any of several other functions based on what it
was told to do.
A block diagram of a modern CPU includes the following functional units:

Registers: Registers are groups of circuits called bistable multivibrators, or flip-flops, for
short. These are circuits made of pairs of transistors that have the ability to remain stable in
one of two logical states. They can be said to contain a binary 0 or 1 at any specific time.
Groups of flip-flops can be used to store data quantities for a short period of time within the
CPU; 8 flip-flops could store one byte, and 16 could store one word.
General Purpose Registers (GPR) are groups of flip-flops that act to hold bytes or words for
a period of time. Unlike most registers in the system, these registers are visible to the
programmer as he/she writes the instructions to implement the program. The programmer can
refer to these registers in the program and put data into, and take data out of, them at any
time.
The Arithmetic Logic Unit (ALU): This unit has responsibility for all of the arithmetic and
logical functions of the system. It is composed of one fairly complicated circuit that can act
like any of several types of mathematical or logical circuits depending on what it is directed
to do. This device has no storage capability, that is, it does not act like a register or memory
device. It introduces a small delay as the data passes through it, called transient response time.
The Instruction Register receives the incoming instruction and holds it for the duration of a
machine cycle or longer. It makes the instruction available to the system, particularly the
Control Unit.
The Program Counter is a register that keeps track of the location of the next instruction to
be processed after the current one is finished. It contains memory addresses in binary form.
The Control Unit (CU) accepts the instruction from the Instruction Register and, combining
the instruction with timing cycles, causes the various functional units of the CPU to act like
sources or destinations for data. The data moving between these sources and destinations may
be processed on the way by moving through the ALU.
System Clock: The system clock is a timing cycle generator that creates voltage waves of
varying periods and duration which are used to synchronize the passage of data between
functional units.
Input/Output System: This system provides the means by which input data and instructions
enter the system, and output data leaves the system. Remember that in a Von Neumann
machine, the data and the instructions that direct its processing sit side-by-side in the same
memory device.
Main Memory is contained within the CPU, and stores the data and instructions currently
needed by the program execution. The speed with which the memory and the rest of the
system communicate is a critical issue, and the center of much development. This may also be
called Primary Storage.

Here is more detail on some of these items.


The Control Unit has undergone major design changes over the years. The current procedure is to
make the CU essentially a computer within a computer. Just like the CPU has I/O devices between
which it can move data, the CU treats the functional units of the CPU as sources and destinations.
The CU takes the instruction from the Instruction Register and the timing cycles from the System
Clock. It combines these by stepping through what amount to microinstructions contained
within its own circuitry. By following the microinstruction pattern built into itself for a given
instruction, the CU implements the instruction desired by the programmer by moving data between
and through the various functional units in step with the system clock. The effect is one of doing the
required instruction as far as the outside world can see.
An example would be the process of executing an Add instruction. The programmer writes an Add
instruction along with additional information such as where the two data items are in the system that
are to be added. Given this as a starting point, the CU starts to follow its own set of microinstructions
to find the two data items, pass them through the ALU to accomplish the Add, and catch the sum at

the output of the ALU. It then returns the sum to a functional unit such as a register to hold the
answer for the next instruction.
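A toy sketch of that walkthrough at the register level (everything here is illustrative; real
microcode manipulates wires and gates, not Python dictionaries):

    # Two general purpose registers with sample contents
    regs = {"R1": 5, "R2": 7}

    # I-cycle: the fetched instruction, already decoded
    op, dst, src = "ADD", "R1", "R2"

    # E-cycle: route both operands through the "ALU" and catch the sum
    if op == "ADD":
        regs[dst] = regs[dst] + regs[src]

    print(regs["R1"])   # 12 - the sum, held for the next instruction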
Because the sequence of events in the earlier dedicated systems operated at the binary level, and
because the programmer and technician originally could work directly with the circuitry either from
the front panel via lights and switches or via a program, the lowest or binary level of programming
became known as Machine Language (ML). With the advent of the microprogrammed Control
Unit, the instructions contained within the CU became known as microprogramming or microcode.
This means that currently, the Machine Language which is the lowest level that the programmer can
see is made up of microprogram instructions or steps. The technician can work at the microprogram
level, but the programmer typically would not. When programming logic and instructions are
embedded permanently into the circuitry of a device, it is referred to as firmware.
The System Clock is a timing signal generator that creates a variety of voltage waveforms used to
synchronize the passage of data through the functional units. There are two types of electronics or
logic in the system, synchronous and asynchronous. The word synchronous means "in step with the
passage of time", while asynchronous means "not in step with the passage of time". Synchronous
circuitry is that which has a clock timing signal of some kind involved with it. GPRs, for example,
are synchronous, because they accept data into themselves at a particular moment or clock time. The
ALU is an asynchronous circuit - it doesn't store data, it passes it through as quickly as possible and
does not rely on a clock signal to do so.
In synchronous systems, the system timing is divided into regular intervals of time called Machine
Cycles. All system activity is based on the elapse of the machine cycles. These are further roughly
divided into two types of cycles. These are Instruction Cycles, or I-Cycles, and Execution Cycles,
or E-Cycles. Instruction cycles are those that are responsible for obtaining an instruction from the
main memory, placing it into the Instruction Register, and starting the CU's process of analyzing the
machine language instruction to determine what microprogram to execute. By the time the I-Cycles
are completed, the instruction is ready to execute, and the system already knows where the next
instruction to execute will be found in main memory after the current instruction is completed.
E-Cycles have the responsibility of actually causing the instruction to be accomplished. This involves
a series of microinstructions that move data around between the functional units of the system so that
the desired result is achieved. E-Cycles must recognize when the instruction has run to completion,
and hand off the system to the I-Cycles again for the next instruction.
In modern systems, including those based on the microprocessor device, these cycles can be
overlapped. The I-Cycles for instruction number 2 are getting underway while the E-Cycles for
instruction number 1 are being performed. This is a simple example of parallel computing.
Main memory or primary storage is tightly connected to the dataflow of the CPU. It is a primary
source for instructions and data needed for program execution and a primary destination for result
data. The programmer really has little else to specify other than a main memory location or a GPR for
the most part.
Data are stored in main memory at locations called addresses. Each address can contain one or more
bytes of data. If the smallest lump of data that can be referred to with a single address is a byte, then
the machine is referred to as byte-addressable. If the smallest lump of data that a single address can
refer to is a two-byte word, then the system is called word-addressable. Some special purpose
devices can use an address to refer to a single binary bit within a byte in the memory. These
machines are called bit addressable.

The number of addresses and therefore the number of storage locations a memory system can have is
determined by the width of the address bus of the CPU. This total number of addresses is called the
address space. The address bus is a set of parallel wires that distribute binary 1's and 0's to the
memory system in a synchronous manner. Each additional bit of width given to the address bus
doubles the size of the memory possible. An address bus one bit wide, A0, could specify one of two
addresses, number 0 and number 1 (remember there are two states possible for a binary bit). If the
address bus were two bits wide, A0 and A1, then there would be four addresses in the memory
system because there are four possible combinations of two bits: A0=0, A1=0; A0=1, A1=0; A0=0,
A1=1; and A0=1, A1=1. If we have three bits of address bus, A0, A1, and A2, then we would have 8
addresses possible, and so on. Following this plan, what would be the address space for address bus
widths of 16, 20, 24, and 32 bits?
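Working the question (each added address line doubles the space):

    for width in (16, 20, 24, 32):
        print(f"{width}-bit bus: {2 ** width:,} addresses")

    # 16-bit bus: 65,536 addresses         (64 KB)
    # 20-bit bus: 1,048,576 addresses      (1 MB)
    # 24-bit bus: 16,777,216 addresses     (16 MB)
    # 32-bit bus: 4,294,967,296 addresses  (4 GB)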
The interaction of the main memory with the rest of the CPU is a critical factor in the overall
performance of a computer. Typically, when core storage was used, the speed of the core system was
slow enough compared to the speed of the electronic circuitry that the electronics had to wait for the
memory to respond to a request. The CPU would add machine cycles of wasted time, (in
microprocessors, called wait states), to slow the circuitry down and give the memory time to
respond. With the advent of microprocessors and solid state memory, we still have this problem
because the speed of the microprocessor device is still significantly greater than that of the main
memory connected to it. We overcome this problem by the addition of a cache memory. The cache
is a small amount of high speed memory that is able to keep up with the processor with no waiting. It
interfaces the processor at its speed to the main memory at the slower speed. This is tricky to do, and
there are a variety of cache controller devices and methods currently in use to make this
process as efficient as possible. With a good cache controller, it is possible for the memory to have
the needed data or instruction information available to the processor about 99% of the time. This
percentage is called the hit rate.
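The payoff of a high hit rate shows up in a simple weighted average. In this sketch the timings
are assumptions chosen for illustration, not measurements of any real system:

    hit_rate = 0.99
    t_cache, t_main = 1.0, 60.0     # access times in nanoseconds (assumed)
    t_avg = hit_rate * t_cache + (1 - hit_rate) * t_main
    print(f"average access time: {t_avg:.2f} ns")   # 1.59 ns - close to cache speed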
Currently, there are two fields of processor design of which you should be aware. The first is called
the Complex Instruction Set Computer (CISC) approach. This is the traditional mainframe approach, and the
System/360 was famous for it. The CISC machine uses complex instructions to do its work. One
instruction might cause one number to be incremented, another to be decremented, the two results
compared, and a change in execution direction (jump or branch) depending upon whether the two
numbers are equal. That is a great amount of work for one instruction to do, but it is fairly easy to
implement since by the use of microprogramming, it is an easy task to simply connect microroutines
together to accomplish it. Such systems can be rather slow in execution, but are easy to program
because they have what we call a "rich" instruction set. Most microprocessors, including the Intel
machines are of this type.
Reduced Instruction Set Computer (RISC) machines are designed in the reverse. These machines have a
small number of simple instructions, but they execute very, very fast. Their electronics is hardwired
or dedicated, as opposed to microprogrammed. The results of the complex instructions can be
obtained by writing routines that implement the logic of the complex instruction using the small
instructions. The result is that a RISC machine can execute at an overall faster rate, even though it
seems to be doing more instructions to get the result. Various tricks with clock distribution, internal
pipelining, and similar approaches are also used in the RISC design to further improve the
throughput. RISC machines are finding use as large workstations for CAD, design, engineering, and
related uses.
In a CISC machine, the ultimate throughput depends on how fast the ALU can be supported by the
rest of the circuitry. Indeed, no matter how fast the support electronics are, if the machine has only
one ALU, then it can execute only one instruction at a time. Parallel Processing involves a design in
which there may be more than one ALU. This would allow more than one thing to be processed at
one time, thereby increasing the performance of the system. The parallelism is not limited to just the
ALU. It is possible to have more than one set of GPR's, data paths, and I/O paths as well. The
primary difference between the Intel '386, '486, and Pentium is their internal architecture, which uses ever-increasing numbers of parallel functional units to increase throughput.
Another method of increasing throughput is by use of a Coprocessor. This device acts as a parasite
on, and in concert with, the main processor. It cannot operate by itself. It shares the bus access and control signals of the primary processor and uses them at will. The coprocessor is designed to do a specific
set of small tasks, but to do them very fast. The best example is the 8087 math coprocessor from
Intel, which works in concert with the 8086 processor. The 8087 can perform an additional set of instructions over and above the instruction set of the main processor. As the instructions
enter the processor and coprocessor together, the 8087 watches for one of the instructions that
belongs to its set. When such an instruction comes along, the 8086 hands control of the system over
to the 8087 for it to do its thing. When the instruction or instruction stream is complete, the control
passes back to the 8086 again. The 8087 can deal with floating point numbers and very large
numbers that would take the 8086 much longer to process.
Peripheral Devices, Character-based
Peripheral devices are those that support the processor to deliver data to the processor, take results
away, or store data and instructions so that they can be accessed by the processor at any time. In this
section we will discuss those peripheral devices that are primarily character-based, that is, they deal
with data one character or byte at a time.
Source documents are those documents that come from the human world to the computer. They can
be order sheets, sales tags, handwritten receipts, or an infinite number of similar things. They are
empirical; that means that they are gathered at the source of the related activity, which may be miles
from the nearest PC. Computer-usable documents are those pieces of paper or media that can be
accessed by the computer's input/output devices without need for further preprocessing. These
include the venerable punched card, optically read documents, magnetic stripe credit cards, or
keyboard entry. In the old days, there was a major conversion that had to occur to make the source
documents computer-usable. Traditionally, the source documents were brought to the computer site
where they were read by a keypunch operator who generated a deck of punched cards that
represented the data on the source items. This step consumed time and money. Therefore, a great
variety of data entry techniques have been developed to eliminate the translation process. Credit
cards with magnetic stripes, optically read lotto tickets, and laser-scanned canned goods and potato chip packages are just a few examples.
Early methods of generating computer-usable documents centered around punching holes in things.
These included the Hollerith punched card and paper tape, which was used in teletype systems and
various early data recording methods. The punched card had twelve rows for holes named 12, 11, 0,
1, 2, 3, 4, 5, 6, 7, 8, and 9, from the top down. The card was divided into 80 columns, left to right.
The top three rows were called zones, and the bottom rows were called digits. If the area or field, or
group of columns, of the card being discussed contained numeric data, then the 0 row was
considered a number 0. If the field of the card being discussed contained alphabetic or alphanumeric
information, the 0 row was considered a zone. The three zones could represent thirds of the alphabet,
so that a punched card could contain numbers, upper case letters, or a few special characters. The
holes were placed into the card by a keypunch machine, an electromechanical device with a
keyboard and card path through which the cards passed to be punched or read.
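The letter encoding can be sketched in a few lines of code. This follows the common IBM card code, where A-I combine zone 12 with digits 1-9, J-R combine zone 11 with digits 1-9, and S-Z combine zone 0 with digits 2-9; actual card codes varied somewhat by machine and era:

    # Rows punched for a digit or upper-case letter, per the common
    # Hollerith card code (variations existed).
    def hollerith_punches(ch):
        if ch.isdigit():
            return [int(ch)]                  # digits: one punch in rows 0-9
        n = ord(ch.upper()) - ord('A')        # 0..25
        if n < 9:
            return [12, n + 1]                # A-I: zone 12 plus digits 1-9
        if n < 18:
            return [11, n - 8]                # J-R: zone 11 plus digits 1-9
        return [0, n - 16]                    # S-Z: zone 0 plus digits 2-9

    print(hollerith_punches('A'), hollerith_punches('S'))   # [12, 1] [0, 2]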
Using the keyboard as the human interface, other key-entry machines have come and gone. These
include key-to-tape devices in which key data was written into 80-column records on magnetic tape
(to match the organization of the punched card). They also include stations for key-to-disk and key-to-diskette entry. The first took information from the keyboard and placed it onto a fixed disk, while
the second placed the data on a floppy diskette. Today we normally assume that a PC or PC-like
station will be the entry point and that it will be connected to the computer by a network of some
kind.
Other types of character entry besides keyboards include the mouse, a small movable device whose
position on the table is represented by a pointer on the screen; Optical Character Recognition
(OCR) devices that read printed characters from a medium by doing pattern recognition; Magnetic
Ink Character Recognition (MICR) used in the banking industry to encode values on checks;
Light Pens that are used to indicate a certain point on the screen to which the user wishes to call the
computer's attention; Touch Panels which can receive input in the form of a person's finger touching
a point on the screen; Bar Codes which are scanned by laser to generate a pattern of 1's and 0's that
can be interpreted as binary data; Point-of-Sale (POS) terminals which act like computerized cash
registers and checkout stands, where commercial selling is done, and which might have other I/O
devices like laser scanners included within them; and Voice Recognition and generation which
attempts to communicate with the user by the spoken word.
The word terminal refers to any of a wide variety of keyboard-plus-display machines that can
interact with a user on behalf of a computer. The earliest was the teletype, which could display
information to the user by printing it on paper. Video display terminals came into their own only in
the early 1970's, because the semiconductor memory devices needed to store the image for the screen
were not plentiful until that time. We now have completely intelligent terminals such as the PC that
can do their own processing most of the time, and need to communicate only at certain times.
Video Display devices are those that can provide a text or graphic image on a surface, usually a
Cathode Ray Tube (CRT). The text image is stored as ASCII data, and a refresh circuit cycles through the storage device, or buffer, many times a second to generate the image on the screen. The circulation of data from the buffer is synchronized with the vertical and horizontal timing of the raster on the display tube so that a stable display of letters is produced.
In graphics displays, the field of the CRT's screen is divided into picture elements or pixels. A
pixel is a dot of light on the screen, or the place where such a dot of light can be. Resolution is a
word that indicates the number of pixels horizontally across the screen, and the number of pixels vertically down the screen, that a given video display can produce. A display with a Video Graphics
Array (VGA) image will have 640 pixels horizontally and 480 vertically. This is a standard
reference value in current PC technology.
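The resolution figures also tell us how much memory a display adapter must devote to one frame. A quick calculation, assuming several common color depths in bits per pixel:

    # Frame buffer sizes for the VGA pixel grid at several color depths.
    width, height = 640, 480
    pixels = width * height                    # 307,200 pixels
    for bits_per_pixel in (1, 4, 8, 24):
        frame_bytes = pixels * bits_per_pixel // 8
        print(f"{bits_per_pixel:2d} bits/pixel -> {frame_bytes:,} bytes per frame")

At 8 bits per pixel (256 colors) a single 640 x 480 frame occupies 307,200 bytes, more than the 256K of memory on the original VGA adapter; this is why its 256-color mode used a smaller 320 x 200 grid.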
Each pixel has certain characteristics. These include its size, or dot pitch, which is a function of the
manufacturing process of the CRT, and is the diameter of the dot of light in millimeters (e.g. 0.28mm
dot pitch), and the number of colors it can represent. This is determined by the video display adapter
to which the display itself is attached. It is important to make sure that the display itself and the
adapter to which it is to be attached are compatible in sweep speeds and interfacing. It is possible to
damage a display if it is connected to an incompatible adapter.
Displays that can generate only one color are called monochrome displays, while those that do
colors are called color or polychrome displays. The early PC had two monitors and display adapters
available. The Monochrome Display Adapter (MDA) and its display generated a green image with a characteristic character shape that is still with us, but this display had a higher resolution than a regular television and therefore was not compatible with television monitors or standards. It was
modeled after IBM's mainframe display device, the 3270. The Color Graphics Adapter (CGA) and
display was designed to be NTSC compatible so that a buyer could use a color home television as a
display. The resolution was poor, but the device could generate graphics and color reasonably well.
In 1984, IBM introduced the Enhanced Graphics Adapter (EGA) with the PC/AT. This allowed the resolution of the MDA device to be viewed in color. The issue of the use of the home television had
by this time become unimportant. In 1987, IBM introduced the PS/2 product line and with it the
VGA device. This set a baseline standard for display resolution and performance. The Extended
Graphics Array (XGA) was an attempt by IBM to define a standard for higher-than-VGA
resolutions, but most makers did not adhere to the specifications. The Super VGA (SVGA) was
proposed instead, with a pixel map of 800 (h) x 600 (v) pixels. This has since been adopted by all
makers including IBM, but it was never a fully agreed-upon standard.
Liquid Crystal Displays (LCD) and their derivative, the Active Matrix Display, also called the
Thin-Film Transistor Display (TFTD) make use of a liquid crystal sandwiched between two pieces
of glass that have been coated with conductive transparent oxides. By controlling the voltage between
the two pieces of glass, the liquid crystal will turn either opaque (no light passes through) or
transparent (light passes through). By installing a transistor at each pixel location on the glass, the
TFTD can increase the contrast ratio of the opacity of the crystal, generating a clearer, crisper image
that changes instantly instead of slowly as does the LCD.
The term printer refers to a variety of devices that place characters on a receiving medium. The
methods of doing this come under the categories of impact printing and non-impact printing.
Impact printing has its beginnings in the press of Gutenberg, which used fonts, or the shape of the
desired character, carved into blocks of wood in high relief. Each letter had to be carved by hand. The
letters were placed together in a frame so that they were compressed on all sides and did not move.
The frame was placed onto a rolling carriage and ink was spread onto the tops of the fonts. Paper was
then placed onto the fonts; the carriage was rolled under a heavy metal plate, or "platen", which was
pressed down onto the back side of the paper. The ink on the fonts was thus transferred onto the
paper in the shape of the fonts. While this method used pressure rather than a fast-moving impact, it
was nonetheless the beginnings of mechanical printing as we know it.
Today, a wide variety of printers use some variation of this process. They all have these five things in
common:
1. The character shape, or font, which can be carved or cast from metal or plastic, or formed by
a pattern of dots. The term "font" also applies to a space where a character could be but is not.
2. Paper, or the medium to which the coloring element or ink is transferred. Paper is the most
common, but printing can be done on plastic, metal, wood, or just about any surface.
3. Ink, usually found in the form of a ribbon that has been saturated with ink. Ribbons can be
made of many fibers, but the standard now is nylon.
4. The platen, or some related device that provides a backstopping action to the printing
movement.
5. Physical motion, which brings all of the above together with sufficient pressure or force to
cause the ink to be transferred to the receiving medium.
Almost always we will find that two or more of the five elements of impact printing are combined
into one physical mechanism. Examples include:

Typewriter, in which the font, in the form of a cast slug of metal on the end of an arm, is
thrown toward the ribbon so that it impacts the ribbon to transfer the ink to the paper. The
rubber roller around which the paper wraps is called the platen, and it serves to backstop the
flying key. The font and physical motion are combined into one mechanism.
Dot matrix printer, similar to those found in the student labs. In this case, the font consists
of a dot pattern that is formed by striking the ribbon against the paper with the ends of a set of
wires that are electro-mechanically moved forward, then retracted, at high speed. The font
and physical motion are represented by the print head with the wires inside. The platen is a
small smooth metal piece behind the paper.
Drum printers are found on larger systems and have largely been replaced by large laser
printers. They have a metal drum whose surface is covered with fonts in lines and circles such
that each time the drum makes one complete revolution, every possible font is exposed to
every possible character position. The paper is pushed from behind by a hammer mechanism
that causes the paper to move forward and be pinched between the drum and the ribbon. The
combination here is the platen, formed by the drum, and fonts on it.
Chain and Train printers work similarly to drum printers. However, instead of a drum with
characters on it, the fonts are made on metal slugs that travel around in an oval on a race
track. The chain, where the slugs are hooked together, or the train, where they are not hooked
but push each other around, spins across the width of the paper. A hammer mechanism for
each print position fires from behind the paper to press the paper against the ribbon on the
other side, which is then pressed against the font as it passes by. This method combines font
and platen.

Non-impact printing uses modern techniques to form characters of a contrasting color on a medium.
There have been many non-impact printing methods over the years; the three most common now are
thermal printing, where heat is used to form the characters; optical printing, where light is used;
and ink jet printing where ink is simply sprayed onto the paper.
Thermal printing involves a specially treated paper that has a light background tint, but which can
turn darker with exposure to heat. The heat is often formed by a pattern of dots that is created as a
print head passes slowly over the paper surface. The print head consists of a row of diodes encased in
glass bubbles that are turned on and off very quickly, and which can heat up or cool down almost as
fast. As the glass bubbles on the printhead contact the paper, the current is turned on and then off
quickly, causing the glass bubble to heat up, then cool down rapidly. This in turn causes the area of
the paper which was in contact with the bubble at the time of the heating to turn darker, typically
either blue or black. This method is used in desk calculators and many credit card and cash register
applications.
Optical printing is best illustrated by Laser Printers. These devices have a rotating drum that is
covered with a cadmium sulfide compound that is sensitive to light. When light shines onto the drum,
the surface on which the light impinges becomes electrostatically charged. As the drum turns, the
charged area is then exposed to a very fine black powder or toner, which sticks to the areas where the
charge was placed. This area is then further rotated to a point where the toner is transferred to a piece
of paper as the two are pressed together. Finally, the paper is heated as it exits the machine to seal the
ink into the paper. The character shapes can be drawn onto the rotating drum by a focused laser beam, and this beam can be steered to create the desired pattern of dots. The characters are not whole fonts - they are formed by very small dot patterns, typically at a resolution of 300 x 300 dots per inch.
Ink jet printing involves the spraying of minute ink droplets onto the paper as a spray nozzle moves
across the width of the page. The ink is pumped under pressure to a nozzle that generates a very fine
stream. This stream is passed through electrodes that are charged with an ultrasonic signal so that the
stream becomes a stream of tiny droplets. These are further "steered" by more electrodes to guide the
droplets up or down as the print head makes its excursion. The result is a finely generated printing
that can come in colors and do excellent graphics.
Plotters are large printers that generate drawings or graphics as opposed to print. Pen plotters use
real ink pens in varying sizes and widths that are moved over the paper in an X-Y fashion to generate
the desired line drawing. These can be very fast, but have certain limitations on accuracy and
resolution. Photoplotters are essentially giant laser printers (although the original ones did not use
lasers) which use a dot matrix of light points to generate a high-contrast pattern on film. These are
used to create printed circuit boards and integrated circuit device masks.
A few more terms to round out the printer discussion:

Paper Feed techniques include methods of moving paper through a printing mechanism. The
most common form is called a pressure roll or pressure platen technique, in which the paper is
pinched between two rubber rollers or a roller and a platen which is then rotated to move the
paper. Single sheet or cut sheet paper is most frequently used in these machines. Tractor
feed is used in high speed paper motion to move the paper by tractor pins that pass through
holes along the edge of the paper so that the paper is mechanically positively moved. Paper
used in this type of machine is called continuous forms.
Dot Matrix Printing indicates that the form of the character is made up of a pattern of dots in
an X-Y arrangement rather than by complete unbroken lines. Most printers today use this
technique, as the resolution improves to the point where it is hard to tell the real thing from
the dot pattern.
Near Letter Quality (NLQ) is a term that is used to indicate a dot matrix printed output that
is very close in quality to the results that could be obtained by whole-character, that is impact,
printing.
SECTION 3 - Hardware, Part 2


Data Storage Organization
When confronted with storing data, and particularly large amounts of data, it is necessary to organize
the bytes of information in a way that makes sense to the nature of the data, and also to the
mechanism in which the data are being stored. The user wants to see the information in such a way
that makes sense to him or her. For instance, if the user wishes to keep a name and address list of
club members, the interaction between the user and the computer should be in a way that makes
sense to the nature of the list. That is, to add a new member to the list, for example, the user would
enter the member's name on the first line of a screen, the first address line on the second line of the
screen, and the city, state and zip code on the third line. This would match a typical hand-addressed
envelope. The data, however, is not stored in handwritten form, but as bytes on magnetic disk. The
disk drive, being a mechanical device, has certain characteristics and limitations that must be met if it
is to be useful. The data must therefore be converted to a different organization than the simple three
lines when it is sent from the screen/keyboard of the system to the drive. It must be reorganized to fit
the limitations of the disk drive device. Also, when data are retrieved from the drive later, it must be
converted from a disk drive organization to a different organization that better fits the understanding
of how the user will deal with it. It is up to the computer and the operating system between the user
and the drive to make these organization conversions.
The original standard for data organization was the Hollerith punched card. This piece of stiff paper
was arranged as 80 vertical columns of twelve horizontal rows each. The card could therefore contain
as many as 80 letters or numbers. Because it was of fixed length, the card and the 80-character
grouping were referred to as a unit record. The gray covered electromechanical machines used to
read, punch, process, and print the data on the cards were called unit record machines. From the
inception of IBM up through the mid-1950's, unit record machines were the mainstay of the data
processing industry.
In the early days of computers, the group of 80 characters was maintained as a reference quantity of
data. The record, as a standard unit of data, was composed of one or more fields, which in turn were
composed of one or more characters. An example would be a punched card or computer record that
contained the entry of one person's name in a list of names for a club membership. The first field
might contain the member's name, and be 20 characters long. The second field could be an address of
20 characters; the third might be a city name of 15 characters; the fourth field might be a state code
of 2 characters; and the next field might be a ZIP code of 5 characters. Together, these 62 characters
make up one member's address for the club roster.
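Because each field sits at a known position, a program can pull the pieces out of a fixed-length record with simple column arithmetic. A minimal sketch, with an invented sample member:

    # Slicing one 62-character record into its fields:
    # name (20), address (20), city (15), state (2), ZIP (5).
    record = ("DOE JOHN".ljust(20) + "123 ELM ST".ljust(20) +
              "SPRINGFIELD".ljust(15) + "CA" + "90210")

    name    = record[0:20]
    address = record[20:40]
    city    = record[40:55]
    state   = record[55:57]
    zipcode = record[57:62]
    print(name.strip(), city.strip(), state, zipcode)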
Notice that in the membership list, the fields each represent one part of an address or identify the
person. Together, all the fields make up a record that identifies the person and his/her address. The
organization of these data must make sense to the user, the person working with the information. It is
organized as one might arrange a holiday card list or other simple mailing list. It makes sense to the
user to organize the information this way.
However, the storage device design is such that it doesn't know about the nature of the data; indeed,
disk drives are dumb devices. The disk drive's electronics know only how to find tracks and sectors.
So, in between the user at the keyboard and screen and the disk drive is the computer along with the
operating system that together rearrange the organization of the data as they pass between the source
and destination. If the data are going to the disk from the screen/keyboard, then the data are taken out
of the mailing list organization described above which made sense to the user and arranged into
sectors so that the data can be written on the disk surface. When the data are retrieved, the computer
and operating system reorganize the data in reverse. A major amount of work is done to accomplish
these conversions. A significant amount of the operating system is dedicated to disk handling.
As technology progressed away from the punched card to screen and keyboard data entry, the unit
record gave way to a more general arrangement of data coming from the source. The idea of
characters, fields, and records remained. In addition, the records collectively were grouped into a file.
So a file is one or more records of data. The number of characters in a field, and therefore in a record
and file, can now be variable; we no longer need to deal with a fixed length of 80 characters.
Programmers today deal with data storage in an infinite number of ways that make sense to the nature
of the data being stored, be it accounting data, scientific data, school records, or word processing
documents. However, when the data are sent to the disk, the system hardware and software must
make the conversion to the arrangement that the disk drive can accommodate. The file is the unit that
appears in the directory of a disk drive. If you issue the DIR command at a DOS prompt on a PC, the
listing you get is at the file level. It is assumed that each of the files listed contains one or more
records made up of one or more fields that are made up of one or more characters. To see what is
inside a file, you must execute some sort of program or DOS command that will show the file to you.
A database is made up of one or more files that contain data of a related nature. Again, just how the
data are arranged between these files is up to the programmer, who creates a file set that makes sense
to the nature of the data and the nature of the use to which it will be applied. One of the files usually
contains either all the data as a base reference, or, if not all the data, at least the essential data against
which the other files may be referenced. This most important file is called a master file.
Fields come in different types, too. First, there is a key field, which is regarded in the database as
being the one first looked at by the program that is using the data. For instance, to prepare the
monthly meeting notice for the club membership, the corresponding secretary might define the ZIP
code field of the mailing list records as the most important. This is because when mailing large
numbers of fliers, the post office will charge less per piece if they are presorted in ZIP code order.
When preparing the list for printing, sorting the records into ZIP code order via the key field will
save the club mailing costs.
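In code, preparing the mailing amounts to sorting the records on the key field before printing. A short sketch with invented members:

    # Sort club records on the ZIP code key field for presorted mailing.
    members = [("ADAMS, A.", "90210"),
               ("BAKER, B.", "10001"),
               ("CLARK, C.", "60601")]
    for name, zipcode in sorted(members, key=lambda rec: rec[1]):
        print(zipcode, name)       # labels come out in ZIP order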
Fields can be described by the nature of the data they contain. Fields which contain only alphabetic
letters are called alphabetic fields; those that contain only numbers are called numeric fields; those
that contain a mix of letters and numbers are called alphameric or alphanumeric fields. Alphabetic
fields and alphanumeric fields contain data that usually are stored as-is. Numeric fields, however, are
kept pure so that their contents can go directly to a mathematical processing routine. Numeric fields
can also be compressed and stored in dense form; this saves disk space if the amount of numbers to
be stored is large. There is also a logical field, composed of one or more bytes, whose contents or bit
positions represent answers to "yes or no" questions. For instance, it would be possible to store a single
byte in the record along with the name and address to indicate 8 different yes-or-no answers. These
could include "has the member paid this year's dues? Yes or No", with a 1 for a yes and 0 for a no.
Records have a set of characteristics as well. The most obvious is whether the record is a fixed-length record or a variable-length record. The fixed-length record is easy to deal with since all the
records are the same length. This is easily seen in the punched card, where the data were physically a
fixed length as well as logically. Dealing with this kind of record is easy to do in programming.
Accordingly, this type of record storage is the most common and is used most of the time. Variable-length records mean that the length of each record is not fixed: a record can be longer or shorter than the last, because there is no reason to store blank characters if shorter, and there is a reason to
store non-blank characters if longer. Programming with variable-length records is difficult because a
method must be devised to determine where one record ends and the next begins - the programmer
can no longer depend on a fixed number of characters per record.
At the file level, the method of approach to accessing data in a file can take several directions. The first question is how large a body of data is to be stored in the file. For example, a credit card company might have millions of customers, many of whom have multiple cards. How do you find the one client's records in all those millions? Several ways of storing the data within the file
address this question.
The simplest and most obvious way of storing data is the sequential file. This file contains records
in an order sorted by a key field within the records. Again, a file sorted to ZIP code order is a good
example. To find the address of a single person within the file, the program begins at the beginning,
and looks at the first record to see if it is the one desired. If it is, the search is over quickly. However,
if the first record of the file is not the one desired, the program then reads in the second record, and
tries again. If the record we are looking for is close to the beginning of the file, it takes little time to
find it. However, if the record we want is near or at the end of the file, it might take a long time to go
through the thousands of records we don't want to find the one we do want. This usually is
unacceptable in anything other than small data sets.
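The sequential search is only a few lines of code; its weakness is simply that the loop may have to visit every record. A sketch, assuming each record carries a "zip" key field:

    # Sequential search: examine records in order until the key matches.
    def sequential_find(records, key):
        for position, record in enumerate(records):
            if record["zip"] == key:
                return position
        return None                      # reached the end without a match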
An improvement on the sequential method of file access is the indexed-sequential file. This method
consists of two files. The first is the large file of many records that contains all the details about each
person in the club or credit card client or machined part. This file is in random order; it is not
necessary to keep it organized. The only thing we need to do is to make sure that the records are
filled in correctly. Then, we build a small file called the index file, which acts like the index in a
textbook. At the beginning of the processing session, a pass is made through the large file, and the
key fields of each record, along with the position of that record in the large file, are stored in the index
file. When complete, the index file contains 2 pieces of information about each of the master file
records: The key field contents, and the location of that record in the master file. When we wish to
find a particular record, we look it up sequentially in the index file - this takes little time because the
file is small and the entries in it are short. When we find the item we want, the entry in the index file
gives us the record location for that item in the big file. So we take this information and find the item
in the big file directly, that is, without going through all the entries ahead of the one we want.
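A sketch of the idea, using a Python dictionary to stand in for the small index file and a list to stand in for the large master file; the sample records are invented:

    # Indexed-sequential access: one pass builds (key, position) pairs;
    # later lookups jump straight to the right master record.
    master = [{"zip": "60601", "name": "CLARK"},
              {"zip": "10001", "name": "BAKER"}]

    index = {record["zip"]: pos for pos, record in enumerate(master)}
    print(master[index["10001"]]["name"])    # fetch BAKER without a full scan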
Another method of finding information in a large file is called a binary search. In this method, the
large file is sorted by a key field in each record. This takes time, but it puts all the records in some
sort of logical ascending order. Again, the ZIP code field in a large set of records is a good example.
When we wish to find a particular entry in the file, we go to the record in the middle of the file,
obtain its key field data, and compare it to the one we want. If the desired data has a higher value
than the record obtained from the middle of the file, we know that the one we want is in the second
half of the file, above our current location. We therefore know that the desired data are not in the first
half of the file. Conversely, if the desired key field data are less than that of the middle record of the
file, we know that the data we want are in the first half of the file. Immediately, we have eliminated
half the file as not having the data we want.
We continue with the half of the file that contains our data, and go again to the middle of that group of records. Again, the item we want is either above the middle of that half (the upper 1/4 of the file) or below it (the third 1/4 of the file). We can repeat this "divide and conquer" step several times until we zero in on the target record. This is a very fast
method, and it takes at most about the same small number of steps to find any record in the file, regardless of whether the desired record is at the beginning of the file, at the end of the file, or in the middle.
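The whole procedure is short when written out. A sketch, again assuming records sorted on a "zip" key field:

    # Binary search over a file sorted on its key field.
    def binary_find(sorted_records, key):
        low, high = 0, len(sorted_records) - 1
        while low <= high:
            mid = (low + high) // 2            # middle of what remains
            probe = sorted_records[mid]["zip"]
            if probe == key:
                return mid                     # found the target record
            if key > probe:
                low = mid + 1                  # discard the lower half
            else:
                high = mid - 1                 # discard the upper half
        return None

Each pass discards half of the remaining records, so even a million-record file needs only about 20 comparisons.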
A Few Words About Magnetism
A student of basic electronics or physics soon is confronted with the ideas and theories behind
magnetism. Unlike electronic current flow, in which actual matter, the electron, is moving,
magnetism is concerned with pure energy levels that have no weight and take no space. As such, it is
sometimes difficult for the student to visualize the ideas behind it.
Electron flow in a copper wire or other conductor is the result of a pressure placed on the ends of the
wire that the electrons within the wire cannot resist. The electrical pressure is called Electromotive
Force, or EMF, and its unit of measure is the Volt. EMF is created by storing a bunch of electrons at
one end of the wire and a bunch of positive ions, or atoms missing electrons, at the other end of the
wire. The electrons in the copper atoms within the wire feel an attraction for the positive-ion end of
the wire and are repelled by the end that has too many electrons already. These electrons, therefore,
tend to move toward the positive end of the wire. When electrons move, it is said that we have
Current flowing in the wire. The unit of measure of current is the Ampere.
Electrons are spinning about their own axes as they move along the wire. This spinning creates a
magnetic field between the poles of the electron, just like the earth's magnetic field between the
North and South Poles. As the electron moves along, it takes its magnetic field with it. This traveling
field is the basis of the science of electromagnetics. This is the science of magnetic lines of force
created by the movement of electrons. It provides us with all the theory necessary to build motors,
generators, electric lights, stereo sets, radio and television, and all the goodies of the plug-in world.
Magnetism is made up of lines of magnetic force. As we said, these are pure energy, not matter in
motion. It is the same basic idea of energy as the light showering down from the fluorescent tubes in
the classroom ceiling. If light were matter, we would gradually fill the room with it, and we would all
walk around glowing on the head and shoulders where the light had fallen. Magnetic lines of force,
like electrons, travel in some materials better than others. Iron, nickel, cobalt, and
various alloys are used to conduct lines of force. However, where electrons won't travel through
things like wood, plastic, and glass, lines of force pass through these unchanged. So electrons don't
flow unless they are allowed to, while lines of force flow unless they are stopped.
If we take a wire and wrap it about a core made of a magnetic substance, and then pass an electron
current through the wire, the lines of force created by the moving electrons will be concentrated into
the magnetic core. This in turn will tend to hold the lines, and may continue to hold some after the
current is turned off. Lines of force remaining in a core that has no current nearby are called residual
magnetism.
If we take a coil of wire and connect it to a sensitive meter or measuring device, and then pass a core
with residual magnetism past or through the coil, the meter will indicate that as the core passed, a
current attempted to flow and an electromotive pressure was created. If a complete path from one end
of the wire to the other is present, the current will indeed flow because the magnetic fields of the
electrons (remember they are spinning) will interact with the passing magnetic field of the core and
this will force the electrons to move - this is called motor action. If the ends of the wire coil are
connected to an amplifier device, the electromotive pressure or voltage built up at its ends can be
seen by the circuitry and put to use, perhaps as 1's and 0's.
Magnetic Data Storage


We can take advantage of these phenomena with the laws of Physics dealing with induction.
Induction is the event in which magnetic lines of force created by one device can create, or induce,
lines of force to flow in a nearby magnetic medium. The simple example of a permanent magnet
attracting nails on a table is an example of induction - the lines of force in the permanent magnet
create lines of force in the nail. If there is sufficient induction, the attraction between the two metals
is so great that motor action will draw the nail toward the magnet.
You will recall that the storage of data as binary 1's and 0's is easy to do in electronics since the laws
of physics governing electronics are inherently two-state in nature. The existence of lines of force is
also a two-state system. The lines can either be there (1) or not (0), or, more likely, traveling
clockwise (1) or counter clockwise (0) around a core or wire.
Storage of data magnetically requires two main items. The first is a medium, that is, something to
store the lines of force between uses. This is usually a paint-like magnetic coating applied to a strip of
plastic, forming magnetic tape, or to a circle of plastic for a floppy diskette or a rigid metal disk for
fixed disks. The coating has undergone extensive development over the years as the industry
continues to cram more and more data in a smaller and smaller space, using chemistry as a tool. The
coating is sprayed onto the medium and cured, polished, and honed to a smooth finish. The coating
consists of extremely fine particles of magnetic elements including iron in a liquid binding agent.
When dry, the surface can store lines of force in its magnetic material.
The second item is a doughnut-shaped core of magnetic material around which is wrapped a coil of
wire. If a current is passed through the coil, lines of force will build up around the wire, and be
captured in the material of the core. Thus, with current flowing in the coil, the core will become
magnetized with flux flowing in, say, a clockwise direction around it. If the direction of current flow
is then reversed, the lines of force created in the coil will reverse direction, and the lines in the core
will also reverse direction. So, by passing current in one direction or the other in the coil, we can
force lines of force to flow in the core in one of two directions. This device is called a magnetic
head. At one point, a notch is cut into the core to force the lines to hop across through the air. This is
called an air gap. It is the place that is directly opposite the medium.
To store data, we simply pass the medium by a head that has current flowing in its coil. As the
magnetic surface passes the head, the lines of force jumping the air gap in the nearby head will find
an easier path through the magnetic surface of the medium than it will through the air in the gap. The
lines therefore pass to the medium, along it, and return to the core on the other side of the gap. This
induces lines of force to be created and stored in the medium that is passing the gap. If the direction
of current flow is reversed in the head winding, the lines of force jumping the gap reverse, and the
induced lines captured by the passing medium are also reversed. Thus it is easy to create areas on the
medium that are magnetized in one direction, then in the other. These areas are called dipoles, and
creating them on a medium is called writing data.
If the head coil is removed from the current source and instead connected to the inputs of a sensitive
amplifier, we can again use the laws of induction in reverse. As the medium that has dipoles on it
passes the head gap, the dipoles induce magnetic lines into the head core. As the lines change, they
induce a voltage or current in the head winding, and this, when amplified, can be interpreted as data.
This process is called reading data. So we can use the same read-write head to both write the data
onto a medium, and read it back later.
In reality, the act of reading is looking for the points on the track where one dipole ends and the next begins. At these points, the interaction of the lines of force on the medium with the head will be greatest, according to the laws of induction.
Magnetic Tape
Magnetic tape consists of a ribbon of plastic upon which the magnetic coating is sprayed. The plastic
can be any of several types, with polyvinyl and polyurethane being common. The standard width for
most tapes today is 0.5", and smaller sizes down to 8 mm are available for cassettes and other data
storage devices. The standard thickness is 0.5 mil, that is, 1/2 of 1/1000 of an inch. A standard 10-inch reel will contain 2,400 feet of tape.
The coating for tape must be flexible, since the tape makes many twists and turns as it passes through
a drive. The coating must also be as resistant to friction and wear as possible, since the tape touches
not only itself on the reel, but also the head metal and other guides and rollers. The use of tape
assumes a contact between the tape and the head mechanism. This contact can cause wear and flaking
of the coating as the tape deteriorates with age. As data are sometimes stored for long periods on
tape, it is common to find a specially prepared storage room with temperature and humidity control
to minimize the aging process.
Data on tape is arranged in tracks, with both 7-track and 9-track tape being common. The tracks run
longitudinally along the length of the tape. They are defined by either 7 or 9 read-write heads in the
head assembly across the tape width. The 7-track tape was created to store data in Standard Binary
Coded Decimal, while the 9-track tape was used for EBCDIC coding. Each track represented one bit
of a binary byte.
Data are arranged on tape in a parallel-by-bit, serial-by-character order. The bits of an entire byte
or character are all read or written across the width of the tape at the same instant. However, fields
and records of data are written one byte or character at a time along the length of the tape. Obviously,
tape is a sequential access method of data storage.
The amount of data a given tape can hold is determined by several things. First, the tape or bit
density defines how many characters or bytes can be written in a linear inch of tape. This is usually
referred to as Bits Per Inch, abbreviated BPI. Depending on the nature of encoding used for the data,
values of 200 and 556 (both obsolete), and 800, 1600, and 6250 are common. Second, the
organization of the data as they are written is important. Data are written in records similar to
punched cards. Between each record is an Inter-Record Gap, or IRG, that is required to allow the
tape to stop and start motion between records. These gaps are typically 0.6" long. If the data are
written in many short records, a good portion of the tape will be wasted as IRGs. If the data are
written in long records, there will be fewer IRGs and more data on the tape.
Also important is the speed of the tape through the drive. This is measured in Inches Per Second,
abbreviated IPS. Typical speeds have included 16.25, 37.5, 75, 112.5, 100, and 200 IPS. The data
rate of a given drive may be determined by multiplying the tape density by the drive speed. This is
the speed that the data have as they pass to the drive on a write or from the drive on a read.
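These figures combine into simple capacity and speed arithmetic. The sketch below uses 1600 BPI, 75 IPS, a 0.6-inch gap, and a 2,400-foot reel, all values taken from the discussion above:

    # Tape capacity shrinks as records get shorter (more gaps),
    # and data rate is density times speed.
    bpi, ips, gap_inches = 1600, 75, 0.6
    tape_inches = 2400 * 12                    # a 2,400-foot reel

    def capacity_bytes(record_len):
        inches_per_record = record_len / bpi + gap_inches
        return int(tape_inches / inches_per_record) * record_len

    print(capacity_bytes(80))        # card-image records: about 3.5 million bytes
    print(capacity_bytes(4000))      # long records: about 37 million bytes
    print(bpi * ips, "bytes/sec")    # data rate: 120,000 bytes per second

Note how the same reel holds roughly ten times the data when records are long, because so much less tape is spent on inter-record gaps.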
Floppy and Fixed Disks
We have seen two means by which a specific item of data can be located in a storage device. In the
case of a computer's main memory, the data are located randomly. This means that we can go directly
to the specific item of data without going through any other storage location to get there. We have
also just seen the classic example of sequential access, the magnetic tape. On a full reel of tape, it
might take considerable time to find the last record at the end of the tape if you are beginning at the
start of the tape. Both techniques have their good and bad points, depending on the need of the
moment.
There is a third method for data storage called Direct Access Storage Devices, abbreviated either as
DASD or DASDE. DASD takes into account the best of both random and sequential access. Data are
arranged on the medium in concentric circles called tracks. Since these tracks may contain a great
deal of data, each track has a number of divisions called sectors, which can be thought of as slices of a pie in shape. DASD devices first find the major storage portion randomly, by accessing any track directly. The device can go from track 3 to track 25, then back to track 7 without having
to process data in the intervening tracks in any way. The device then finds the specific item of data
sequentially starting out at the beginning of the track it has found and sequentially proceeding
through all the sectors until it finds the one it needs. So we combine random and sequential
techniques to gain efficiency and speed of data retrieval and storage.
The general approach to data storage is the same in floppy diskette drives and fixed disk drives.
Generally, data are stored in concentric tracks that can be 40 or 80 in floppy diskettes, and many
thousands in fixed drives. Each track has an arbitrary starting point for its rotation called an index.
Starting at the index, the data are written in a serial-by-bit, serial-by-character format along the
track. This is because, unlike a tape drive head, there is only one gap in the disk drive head.
Data records are typically fixed-length, and the length corresponds to the size of the sector, or pie
slice. The actual starting point of each sector around the circle can be defined by a notch or hole in
the disk, and a system using this approach is called hard-sectored. More typically for PC's, the
sectors' starting points are defined by a timing loop in the disk drive controller that counts
microseconds from the index and compares this to the speed of rotation of the disk. Such systems are
called soft-sectored. The only hole or notch is therefore the index. The sectors can be variable in
number, from 8 through 27, and can contain variable amounts of bytes, from 128 on up. In addition
to the data, each sector contains a header, which identifies the sector uniquely on the disk surface,
and several gaps which are waste space to allow the controller electronics some time to compute and
respond (remember that the disk is always turning, whereas tape can be stopped between records).
The storage of data on a disk at the request of a program using an operating system like DOS requires
a great deal of further study beyond the scope of this course. However, the CT001'er will encounter
one other term that should be defined. This is the cluster. Depending on the medium being used, a
cluster is defined as one or more sectors taken as a logical whole. In the case of the 3.5" diskette, for
example, a cluster is two sectors long. DOS never sees anything but clusters, while the drive only
knows sectors. The translation between the two is up to the lower level coding in DOS and the ROM
BIOS.
Another concept that is similar between floppy and fixed disks is the cylinder. A cylinder is easier to
see on a fixed disk because the drive may have many platters and therefore many recording sides.
The floppy can have cylinders too, but it has only two sides to work with. Basically, a cylinder is
defined on a multiple-head drive as the same head position, or track, on all of the surfaces at the same
time. For example, if the head mechanism on a drive is positioned at track 12, then every head on the
drive can see track 12 at the same time. This describes a vertical cylinder logically on all the
media surfaces. A floppy cylinder has a maximum of two tracks, while a fixed disk can have as many
tracks in a cylinder as it has surfaces.
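Cylinder, head, and sector together locate any block on the drive, and a running block number can be converted to that geometry with two divisions. A sketch using an invented geometry of 2 heads and 18 sectors per track; real drives vary, and sectors are conventionally numbered from 1:

    # Convert a running block number to (cylinder, head, sector),
    # filling every head of a cylinder before stepping the carriage.
    def block_to_chs(block, heads, sectors_per_track):
        cylinder, rest = divmod(block, heads * sectors_per_track)
        head, sector = divmod(rest, sectors_per_track)
        return cylinder, head, sector + 1

    print(block_to_chs(0, 2, 18))     # (0, 0, 1)
    print(block_to_chs(18, 2, 18))    # (0, 1, 1): same cylinder, next head

Filling all the heads of one cylinder before moving the carriage is the reason cylinder ordering matters, as the seek discussion below explains.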
There is a particular reason why cylinders are important in data storage. This has to do with seek, the
process of finding a particular place on a disk to read or write. In a DASD device, seek is divided into
two types, mechanical seek and electrical seek. Mechanical seek deals with the random selection of
a track, as discussed above. The carriage on which the heads are mounted is moved toward or away
from the center of the disk and can go from one track to any other randomly. When the carriage
arrives at the target track, the mechanism waits to find the index. When the index is found, the head
for the specific surface is activated and the electrical seek begins. The sequential search into the track
to the desired sector is performed.
You will note that mechanical seek involves a physical motion of the heads, and will therefore take some time to complete. Electrical seek is faster, because there is no mechanical motion of the heads. It follows that if we can minimize the mechanical seeks, we can have an overall faster response from the drive. If a large amount of data is to be transferred, say more than one track's worth, then if we go from track to track on a single surface to accomplish this we will have used multiple mechanical seeks and wasted time.
This is called track mode. If, however, the data are arranged vertically at the same track position on
every surface, then we can go from track to track vertically using only electrical seeks, and save
much time. This is the standard method used by DOS for both floppy and fixed disks in current
systems, and is called cylinder mode.
The primary differences between fixed and floppy disk systems center around capacity and speed.
Floppy disks, like magnetic tape, are a contact medium. The head is in contact with the surface of the
diskette coating anytime the diskette is rotating (red-light time). As a result, diskettes are considered
consumable media. This means that they are guaranteed to wear out, and they usually do it when
you are least prepared to have it happen. Hence, the habit of backup is important to develop early.
Backup simply says that data and program material that you create, that are your own intellectual
property, should be copied every time you make changes in it, so that if the diskette dies without
warning, you have at least some of your work saved.
Because a diskette is a contact device, the speed of rotation of the diskette is limited. Five and one-quarter
inch diskettes topped out around 350 rpm, while the 3.5" diskettes rotate at 600 rpm. The access
time, essentially the time for mechanical seek, in a floppy drive is great, on the order of hundreds of
milliseconds. The data density, that is, how many bytes can fit on one track, is limited by the speed
of the medium, since we can't put bits too close together without them mixing.
Because fixed drives do not have their heads in contact with the medium, the speed of rotation can be
much greater. Rotational speeds of 3600, 4200, 4500, 5500 rpm, and greater are becoming common. This speed
allows the data rate to be much greater for transfer between the drive and the computer. The
existence of multiple surfaces allows a more efficient cylinder mode arrangement for the data.
Disk Drive Controllers
Of interest to PC enthusiasts are several controller options that are common. It is beyond the scope of
this course to address all the details, but here are some essential facts for reference.
The disk controller is either a discrete card inserted into a slot or circuitry built onto the motherboard. It
logically stands between the drive and the computer system and operating system. All the requests
from DOS or BIOS to move data to or from a drive must pass through the controller. The level of smartness built into the controller, or the lack of it, can be seen by looking at the evolution of
the PC.
PC/XT: This machine was introduced in 1983 and was a standard IBM PC with a 10-megabyte drive,
a controller, and a larger power supply included. Both the controller and drive were dumb. They
depended on the motherboard processor to do all their computation for them. When the drive
controller card was plugged into a slot, a block of programming was inserted into the memory map in
the upper 384K of the system. This code was an extension of the motherboard's ROM BIOS, and
allowed the machine to deal with the larger volume of the drive. PC DOS version 2 was introduced to
accompany this addition, and it provided for subdirectories, larger FAT tables, and API's. The
controller was an 8-bit device, transferring single bytes at a time to the computer. The drive used an
ST412/506 interface defined by Seagate.
PC/AT: This machine was introduced in August of 1984. It provided a 16-bit controller, a 20-megabyte fixed drive, and an Intel 80286 processor. Although IBM and Microsoft soon released
OS/2 version 1 for the machine, it was almost exclusively used by clients as a fast PC. The drive was
connected to a large controller by an ST412/506 interface. However, much more control of the drive
was given to the controller to off-load the motherboard processor. Buffering was included so that
data could be stored temporarily allowing less interaction and interference on the motherboard. The
speed of the drive was increased as well. PC DOS 3.0 was introduced to address the directory
structure of the larger drive.
INTEGRATED DRIVE ELECTRONICS (IDE): This interface is a result of the miniaturization of
the traditional PC/AT interface. As semiconductor technology advanced, the drive electronics needed
to move the heads and transfer data became smaller to the point of needing only a couple of high-density chips. Similarly, the electronics needed for the controller was shrinking. Eventually, it was
possible to place the electronics for the drive and the electronics for the controller together on the
drive itself. This made the board plugged into the slot nothing more than a connector, since all the
brains were now on the drive. These drives also were smarter, further off-loading the motherboard
system.
SMALL COMPUTER SYSTEMS INTERFACE (SCSI): There are a variety of more advanced
interfaces that have been used, but the one most successful and most widely supported is SCSI. This
is a subsystem, not simply an interface. The controller board contains not only an interface to the
motherboard but also a complete intelligent controller that can carry on business with the drives with
no intervention. The drives need to be smart too, and they have their own controllers built in as well.
When a request from the operating system comes to the SCSI controller for an attached drive, the
controller will contact the drive, send it commands, monitor the data transfer, and handle ending
status with little or no help from the system. When the transfer is completed, the SCSI controller will
simply advise the system that it is done. Other types of devices can be SCSI besides disk drives,
including tape drives, printers, video display adapters, and network adapters. There are several
versions of SCSI, including a fast/wide version, that can handle 20 megabytes of data transfer per
second. Unfortunately, there is a variety of non-standard products in the market, and there is no
guarantee that brand x drive will work with brand y SCSI adapter, even if they say so.
Data Communication
Data Communication has also been called Telecommunication and Teleprocessing. The essential
idea is that information (data) is sent over long distances between systems through a hostile, non-digital medium. The classic example is using a modem to communicate between a terminal and a
mainframe over telephone lines. The reality has now expanded to a point where the communication is
likely to be digital as well as analog, and the distances can be a few feet to around the world.
In the beginning, there was the mainframe and the terminal. The mainframe was the center of the
universe, and did all the work. When video display devices and automated typewriters became
affordable, attaching these to the mainframe within the same building became the preferred method
of data entry and interaction with the frame. Initially, these were dumb terminals, which had no
ability to do anything other than display letters and numbers on the screen. They had no processor,
and could not compute anything. They had no memory other than that needed to keep the characters on the screen. The display was rudimentary, with little or no variation other than plain text.
As technology advanced, and the microprocessor became available, smart terminals appeared, both
as printing and video display devices. These could remember to give certain characters on their
displays attributes, such as blinking, reverse video, automatic underlining, etc. They could protect
certain fields on the screen and unprotect others, thereby allowing the filling in of blanks on a form
without destroying the labels for the blanks unintentionally. The video displays could have an
attached printer and could be directed to feed data to it under control of the mainframe.
Just before the advent of the personal computer, a variety of intelligent terminals began to appear.
With or without microprocessors, these machines were able to carry on their own business locally,
and needed to communicate with the mainframe only occasionally. These first appeared as automated
cash registers in department stores, followed by supermarkets. With the advent of the PC, the
ultimate intelligent terminal appeared that could do everything for itself, and contact other systems
only to share data or results.
Using these terminals and the still present mainframe, a variety of methods sprang up. Remote Job
Entry was an early attempt to allow a computer at a remote location to do your local work. Card
readers and printers with communication controllers could feed a deck of punched cards to the
remote system over the phone lines, then print the result. The cost of the phone lines and the machines
were still much less than the cost of another mainframe. Timesharing was a similar technique in
which terminals either remotely or at the system site got little slices of a computer's time (time
slicing) and shared the system with others. Each user, the average human being much slower than the
average mainframe, felt as if the system was dealing just with him/her. However, the computer was
dealing with many users at once, and taking advantage of the techniques of multiprogramming, all
the users could be helped.
As technology has progressed further yet, we have had two approaches spring up that take advantage
of the ideas of data communication and intelligent terminals. Toward the end of the 1970's, the term
distributed processing became popular. As people saw that the microprocessor and intelligent
terminals were taking on more and more of the work formerly done by mainframes, the idea was put
forth that we didn't need a mainframe at all, but rather a network of highly intelligent terminals that
could do their own local work and share data as needed. An example would be the LA Community
College District, which at one time wanted to place minicomputers at each campus and have no mainframe
at the central office. The processing needed by the district as a whole would be distributed to the
campuses and a sort of parallelism would result.
The current thinking along these lines is based on the fact that distributed processing sometimes worked and sometimes did not, depending on a large number of variables. In many cases, it was found that although
PCs and intelligent terminals could do a lot of the work, there were some things that the mainframe
could do better. An example is a centralized archiving function. The student records for the school
district are an enormous collection that simply can't be split up among campuses without losses in
speed and accuracy. So the idea is to let the PCs do the work of interfacing with the users, which they
do well, and let the mainframe do the archiving and record keeping. The two can communicate
whenever necessary. This is called client-server computing.
With the advent of the PC in more and more homes and offices, a number of subscriber services have appeared. These are simply large systems with extensive
communication ability that can connect to hundreds of dial-in users at once, and can also
communicate with other systems that have specific services to offer. Examples include Prodigy,
CompuServe, America OnLine, etc. All of these offer access to services such as airline reservations,
stock market quotations, purchasing services, etc. One of the most popular services is electronic
mail, or Email. This is an outgrowth of several different services that allows you to send and receive
messages to and from others at remote locations, similar to a typed letter or telegram.
The Internet is best defined as a "network of networks", where initially university campuses and
Department of Defense offices, each having their own networks, connected together to share data and
email. Only recently, in the last three years, have the commercial interests taken over and tried to
make money with it. Originally, it was designed as an open, free environment where a "hacker" was a
good thing to be, and where a great deal of experimentation by college students at all hours
developed a lot of the technology and software we now use. The use of the World Wide Web
(WWW) as a resource for research and communication has been overshadowed by entertainment and
unnecessary traffic.
Data Communication Hardware
A complete examination of the current state of data communication technology could take whole
semesters in itself. However, as an introduction, here are some essential facts. Keep in mind that the
original way of dealing with the technology of data transfer between computers was via the telephone
line. The telephone system, particularly the one in the typical home or office, may be fine for
humans, but it is an exceptionally hostile environment for digital data to use as a path. It is an analog world, and until recently, it was necessary to take a digital data stream, convert it to analog signals,
pass it from point A to point B as such, and then convert the analog signal back into digital data at the
receiving end. This is still how it is done for the average user. The device used to do this conversion
is called a MODEM, which stands for modulator-demodulator, with the modulation being the
conversion from digital to analog, and the demodulation being that from analog back to digital again.
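To make the modulation idea concrete, here is a small sketch in C. It is not any real modem's code: it simply maps each bit to one of two audio tones (frequency-shift keying, the scheme used by early 300-baud modems) and prints the samples of the resulting analog signal. The 1270/1070 Hz tone pair follows the Bell 103 originate-side convention; every other name and figure is invented for the illustration.

    /* A sketch of frequency-shift keying: each bit of the digital
       stream selects one of two tones, and the tone samples are the
       "analog" signal that would go out on the phone line. */
    #include <stdio.h>
    #include <math.h>

    #define PI          3.14159265358979
    #define SAMPLE_RATE 8000.0    /* samples per second            */
    #define BAUD        300.0     /* signalling events per second  */
    #define MARK_HZ     1270.0    /* tone used for a 1 bit         */
    #define SPACE_HZ    1070.0    /* tone used for a 0 bit         */

    int main(void)
    {
        const char *bits = "10110010";        /* the digital data  */
        int samples_per_bit = (int)(SAMPLE_RATE / BAUD);
        double phase = 0.0;

        for (const char *b = bits; *b; b++) {
            double freq = (*b == '1') ? MARK_HZ : SPACE_HZ;
            for (int i = 0; i < samples_per_bit; i++) {
                phase += 2.0 * PI * freq / SAMPLE_RATE;
                printf("%f\n", sin(phase));   /* one analog sample */
            }
        }
        return 0;
    }

A real modem does the reverse at the far end, detecting which tone is present in each interval to recover the bits - the demodulation half of the job.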
LINE CLASS: Line class defines how well a particular transmission line can carry certain types of
signals. Voice grade lines are the class that you would have on a home telephone, and they can allow
two people to speak to each other without noise or confusion. Sub-voice grade lines have so much
static and interference that people would quickly hang up. These can still be used for some types of
telegraphy. Leased lines have better than voice-grade ability and can handle data at rates of thousands of
characters per second.
BAUD RATE & BITS PER SECOND: The idea of baud rate was put forth by Émile Baudot, a French telegraph officer who was interested in the automation of telegraphy. The unit named for him, the baud, counts signalling events per second - the reciprocal of the duration of the shortest signal element. With one bit carried per signalling event, it is the theoretical maximum number of binary bits that can be sent in one second from a given device. Bits
Per Second, however, is the actual number of bits sent in a given second. If the baud rate of a
machine is 2400, but no data was sent for a second, the baud rate would still be 2400 and the BPS
rating would be 0.
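The distinction is easy to see with a little arithmetic. The C fragment below is only a worked example with made-up figures: a 2400-baud line carrying one bit per signalling event has a capacity of 2400 bits per second, but if just 120 characters actually move in a second, at 10 line bits per character for 8N1 framing (a start bit, 8 data bits, and a stop bit), the actual traffic is 1200 bits per second.

    /* A small illustration of baud (capacity) versus bits per
       second (actual traffic). All figures are hypothetical. */
    #include <stdio.h>

    int main(void)
    {
        double baud = 2400.0;        /* signalling events per second */
        double bits_per_event = 1.0; /* simple two-level modulation  */
        double capacity = baud * bits_per_event;

        int chars_sent = 120;        /* characters moved in one second  */
        double line_bits = 10.0;     /* 8N1: start + 8 data + stop bits */
        double actual_bps = chars_sent * line_bits;

        printf("capacity: %.0f bit/s\n", capacity);    /* 2400 */
        printf("actual:   %.0f bit/s\n", actual_bps);  /* 1200 */
        printf("idle line: still %.0f baud, but 0 BPS\n", baud);
        return 0;
    }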

HALF & FULL DUPLEX: Duplexing has to do with how many data streams are open between two
systems at a time. Half duplex transmission states that data can go either way, but only in one
direction at a time. Full duplex transmission states that data can go both ways at the same time.
MICROWAVE & SATELLITES: These are methods of sending large amounts of data over long
distances, both of which are expensive and no longer considered except for certain purposes.
FRONT END PROCESSOR: A Front-End Processor (FEP) is a specialized computer that is
designed to operate between a mainframe and a group of dial-in circuits. The mainframe is fast and
deals with large amounts of data at a time, while dial-in terminals are much slower and deal with data
on a character-by-character basis. The FEP does the work of speed and protocol conversion so that
the mainframe doesn't spend time on trivia.
Local Area Networks
Local Area Networks (LANs) have become a major player in the business in the last ten years. As the
idea of the PC caught on, and as more products and technical advances appeared, it became obvious
that it should be possible to connect PCs together so that they could share data and printers. This was
the first approach: the sharing of files between systems, and of printers, which were expensive.
The first commercial approaches were based on the idea of peer-to-peer networking. This implied
that all the PCs connected together were of equal capability and importance. All had fixed disks, all
had sufficient processing power, etc. Each peer would make available, or share, the resources of its
machine that it would allow others to see, and keep private those files and other items that were secret to the one system. When a peer wanted a file that was located on a different system, it would make a
request for it. If the station where the file was located had marked it as sharable, the requester was
granted access to it. A more recent version of this is Windows for Workgroups.
Now, the standard approach is to have one or more large systems attached to a bunch of smaller ones,
with the larger one being the reservoir, or storage bucket, for all the shared programs and important
files. This machine is called a server. It might also have special printing or communication hardware
as well, that could be shared on demand.
There are several types of data conductor in current use in LANs. The original was called coaxial
cable, in which an inner wire is surrounded by insulation, then by an outer shield or conductor, which
in turn is coated with an outer insulation. The two conductors are "co-axial", that is, they share a
common center. Coax has been used for many years for radio and television transmission, and can be
used for data, although it is subject to many problems based on the laws of physics dealing with
transmission lines (reflected and standing waves, etc.).
Another type of wiring is called Shielded Twisted Pair (STP). Twisted pair is simply two wires
twisted together over a long distance, which keeps them together and provides a certain small amount
of noise cancellation. STP is a heavy duty version of this, where the wires are shielded by a braid or
heavy foil. This protection does two things. First, it protects the signals in the cable from being
affected by outside interference. Second, it gives the line a characteristic impedance as a means of
dealing with the transmission line laws of physics.
Unshielded Twisted Pair (UTP) is an attempt to reduce the cost of STP and was originally intended
to make use of the telephone lines strung through commercial buildings. This is a hostile
environment, and so the term UTP can mean anything from junk telephone wire to expensive teflon-coated cable.
Coax, UTP, STP, and any type of conductor that uses metal and passes electrons is subject to the
laws of physics for transmission lines and induction. Current passing in a wire can induce
interference into adjacent cable. Outside interference, such as spark plug noise, motors, and lightning
can induce interference into the cable. Ideally there should be a method of data transfer that is not
subject to these problems.
Fiber optic cable is subject to none of the problems of induction. It does not generate interference, nor is it affected by it, and it supports far higher data rates than electrical conductors. This is because the signal is conducted by light, rather than electron flow. Electrons, you will recall, are matter - they take space, have mass, and carry electric charge, and a current of them creates magnetic fields that can interact with nearby conductors. Light, on the other hand, is electromagnetic energy carried by photons, which have no charge. Since a light beam neither creates such fields around the fiber nor responds to fields from outside, there are no problems with interaction between the light and other conductors. In fact, you can shine light in opposite directions down a fiber and the two will not interfere with each other.
The fiber itself is a strand of flexible glass that is coated with an outer layer of glass having a different optical density. The boundary where the two glasses meet, along the length of the cable, forms a reflective surface. Light coming down the inner fiber will attempt to leak out, but is returned into the fiber by the reflection. If coherent light is used (light of one frequency, such as from a LASER), the losses are very small, so the light can travel over long distances with little deterioration in signal. The advantage is not that light outruns an electrical signal - both travel at a large fraction of the speed of light - but that a light beam can be switched at enormous rates and suffers far less loss, so data transfer can be much faster over longer distances than with electronic means.
The topology of a LAN is the road map that the information follows as it makes its way around the
system. Data are sent between computers in packets, which are fixed-length groupings of bytes that
have an addressing scheme and error checking built-in. Common topologies include the star, where
all the stations are wired to a central point and the cabling radiates outward from that point in a star
pattern; bus, a general term that describes a central long path with side paths attached to it along the
way; ring, in which the data path is a giant circle, and the packets pass through all the stations as they
go around. The term backbone is used to describe a high-speed data path that distributes the signals
to local areas. It is not restricted to a particular type of network.
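To show what "addressing scheme and error checking built-in" might look like, here is a hypothetical packet layout in C. The field sizes and the simple additive checksum are invented for the example; every real network defines its own frame format.

    /* A sketch of what a LAN packet might carry: station addresses
       and an error check built in. Purely illustrative. */
    #include <stdio.h>
    #include <string.h>

    #define PAYLOAD_SIZE 64

    struct packet {
        unsigned char dest[6];     /* destination station address */
        unsigned char source[6];   /* sending station address     */
        unsigned short length;     /* bytes of payload in use     */
        unsigned char payload[PAYLOAD_SIZE];
        unsigned long checksum;    /* simple error check          */
    };

    /* Add up every byte of the packet except the checksum itself. */
    unsigned long compute_checksum(const struct packet *p)
    {
        const unsigned char *bytes = (const unsigned char *)p;
        size_t n = sizeof *p - sizeof p->checksum;
        unsigned long sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += bytes[i];
        return sum;
    }

    int main(void)
    {
        struct packet p;
        memset(&p, 0, sizeof p);
        memcpy(p.payload, "HELLO", 5);
        p.length = 5;
        p.checksum = compute_checksum(&p);
        printf("packet of %u payload bytes, checksum %lu\n",
               p.length, p.checksum);
        return 0;
    }

The receiving station recomputes the sum over the bytes it got; if it does not match the checksum in the packet, the data was damaged in transit and the packet can be rejected or re-sent.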
Two types of networks are very common today. The first is called ethernet, and it was developed in the early 1970's at Xerox and later standardized with others. It is characterized by using coaxial cable as its main transmission medium, both in a "thick" version that looks like water pipe, but is low loss, and a "thin" version that is for short distances. In its original form it is limited to a 10 megabit-per-second data rate, and is subject to the physics limitations of coaxial cable. There are a variety of expansions of
this type of network, using multiple conductive paths, etc., to get more speed.
The token ring technique was proposed by IBM and has been adopted by NASA for the space
station. This is a ring topology, and the top speed is 16 megabits per second. It uses UTP or STP, but no coax.
The CITYnet system at LACC is of this type.
Fiber Distributed Data Interface (FDDI) is an outgrowth of the token ring. It assumes that the data path is 100% fiber, with no copper conductors involved. As such, it has a 100 megabit-per-second data rate. It is essentially a token ring topology, and similar to token ring in operation.
Asynchronous Transfer Mode (ATM) is the state of the art in data transmission and is aimed at users who need to send data at high speed to and from a large number of users at the same time. It uses multiplexers and small fixed-length packets called cells to merge a large number of users into a single data stream,
transmit the stream, then break the users' streams apart at the receiving end. It is supported by IBM
and others.
The word protocol is very important in networking. A protocol is an established way of doing
something. There are a great many protocols in data communication, each addressing some particular
aspect of sending bytes somewhere. A lot of these are historical, or replace old protocols with updated
versions. Many are the result of promotion by a particular company that works only with their
equipment. Here are three of the most important to PC networks.
LAN Manager/LAN Server: These are a set of software server packages that support protocols
originally proposed by Microsoft and IBM before they got their divorce. The protocols include
NETBIOS, an API that allowed programmers to access the network easily, a later version called NETBEUI, and finally NDIS, a specification for the interface between protocol software and the network adapter drivers beneath it.
IPX: This protocol is part of the Novell Netware suite of server software, and was their own approach at trying to be the big canary. Companies that have a large Novell installed base use it as
their standard procedure. However, the netware methods and those of IBM/Microsoft were
diametrically opposed and a network with both is difficult at best to manage.
TCP/IP (Transmission Control Protocol / Internet Protocol): This series of protocols was developed
long ago when the internet was first developed under the Department of Defense. It is a simple
protocol compared to others, and has been accepted as an industry standard. It does not support a lot
of the flashy methods of the other systems, although a great deal of development has been done to
accommodate graphics, sound, etc. While each company in the business has tried to make its
protocols the world standard, none have succeeded, and TCP/IP still reigns as the only standard
supported by everyone.

SECTION 4 - Software
Which is it?
This course can only begin to offer the ideas and practices involved in the creation and operation of
programs and software that have evolved since computers first began. There are entire sciences that
have built up around even small ideas that combine with others to form an enormous body of
knowledge, skills, and tools with which programs may be created. As a beginning, it is convenient to
divide software into two areas, system software and applications software. System software includes
programs and related material that is essential to the function of the computer itself. It deals with the
computer at the hardware level, and contains programs and routines that enable the computer to
function, to communicate with the I/O devices, and to communicate with the user. It provides a
platform or base of software support upon which the applications programs can operate, making
available the resources, storage, I/O, and hardware of the system to the user's needs. It is generally
referred to as the operating system.
Application software consists of those programs and material that are designed to provide the user
with some particular type of service. Examples include word processing, accounting, drawing,
communicating, etc. Application programs are designed to deal with the user of the system, that is,
interact with the user via input and output devices such as keyboards, mice, screens, and printers.
When an application program needs the assistance of an I/O device or system resource, it asks for
help from the operating system. The operating system will carry out the request, such as reading or
writing from/to disk, displaying something on the screen, or sending something to a printer. If data
has been entered into the system as a result of the request, the data is forwarded by the operating
system to the application to allow it to continue processing.
The interface between the operating system and the application software is the topic of much
development and contention. Generally, the operating system provides a common method by which
the application can ask for help. The MSDOS "Interrupt 21h" is a classic example. In this case, the
application sets up values in certain registers of the computer and in certain memory areas, then
executes an INT 21h instruction. The operating system responds to the interrupt, inspects the
registers, and acts accordingly. When the requested process is finished, the operating system returns
control of the computer back to the application. As operating systems became more complex and the
use of the Graphical User Interface (GUI) evolved, different such interfacing techniques were
developed. There now exist a number of method sets by which the application communicates with
the operating system, and the one used depends largely upon which operating system is in use in the
machine. These interfaces are called Application Program Interfaces (API).
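As a concrete illustration of the INT 21h convention, here is roughly what a program would do to ask MSDOS to display one character. The sketch uses the int86() helper that old 16-bit DOS C compilers (Turbo C, Microsoft C) supplied in dos.h, so it builds only in that environment; DOS function 02h displays the character placed in register DL.

    /* A sketch of calling the MSDOS INT 21h interface from C on an
       old 16-bit DOS compiler. Real-mode DOS only. */
    #include <dos.h>

    int main(void)
    {
        union REGS regs;

        regs.h.ah = 0x02;           /* DOS function: display character */
        regs.h.dl = 'A';            /* the character to display        */
        int86(0x21, &regs, &regs);  /* raise interrupt 21h             */

        return 0;
    }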
THE OPERATING SYSTEM
The operating system can be roughly divided into three parts, the Kernel, the I/O Control System
(IOCS), and the Shell.
The Kernel is the heart of the system and provides the brains and the personality of the software
environment. The kernel manages the passage of program instructions to the processor,
communicates with the I/O system, sends and receives information to/from the shell, and determines
the overall operation of the computer. In those systems where multiprogramming or multitasking is
supported, the kernel makes the decisions about which request for processing time is allowed first or
last, and how long the time slice will be for the selected process. The overall nature of how the
system functions, how you write programs to run on it, and its general personality are determined by
the kernel.
The Input Output Control System (IOCS) provides the interface between the kernel and the
computer's hardware, particularly the I/O devices. The IOCS can be divided into two parts, the
physical IOCS and the logical IOCS. The physical IOCS is responsible for communicating with the
electronics of the computer and its I/O devices. It deals with sending data to a device or receiving
data from it, checking to see if a device is available for transmission, waiting on a device if it is not
ready, and a myriad of error checking and reporting techniques. In the PC, the programming
contained in the ROM BIOS is the classic example of low-level code that deals directly with the
hardware.
The logical IOCS is involved with data blocking and unblocking, high level error checking, and
direction of data flow through the computer. Data blocking involves the conversion of data from an
organization that makes sense to the programmer to one that is demanded by the hardware. For
instance, in MSDOS, the programmer may decide that 70 characters is sufficient for one data storage
record. However, the disk drive deals in 512-byte sectors. As the program executes and data is taken
from the user via the keyboard and sent to the disk, the logical IOCS portion of MSDOS converts the
70-byte records into groups of 512 bytes so that they will fit into a disk sector; this is the process of
blocking. Since 70-byte records will not fit easily into a 512-byte group, the point at which one
record must be divided and carried into a second 512-byte block is dealt with by the logical IOCS.
When reading information from a disk 70 bytes at a time, the data unblocking must occur to make
the 70-byte records whole as they are taken from the 512-byte sectors.
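A sketch of the blocking side of this job in C might look like the following. The routine names and the pretend disk are invented; the point is only the mechanics of packing 70-byte records into 512-byte sectors, splitting a record across a sector boundary when necessary.

    /* A sketch of data blocking: 70-byte logical records packed
       into 512-byte physical sectors, the way a logical IOCS would
       do it. Record and sector boundaries do not line up, so a
       record can be split across two sectors. */
    #include <stdio.h>
    #include <string.h>

    #define RECORD_SIZE 70
    #define SECTOR_SIZE 512

    static unsigned char sector[SECTOR_SIZE];
    static int fill;               /* bytes used in current sector */
    static int sectors_written;

    /* Pretend to write the full sector to disk, then start fresh. */
    static void flush_sector(void)
    {
        sectors_written++;
        printf("sector %d written\n", sectors_written);
        fill = 0;
    }

    /* Block one logical record into the sector buffer, splitting
       it across a sector boundary when necessary. */
    static void write_record(const unsigned char *record)
    {
        int done = 0;
        while (done < RECORD_SIZE) {
            int room = SECTOR_SIZE - fill;
            int left = RECORD_SIZE - done;
            int chunk = (left < room) ? left : room;
            memcpy(sector + fill, record + done, chunk);
            fill += chunk;
            done += chunk;
            if (fill == SECTOR_SIZE)
                flush_sector();
        }
    }

    int main(void)
    {
        unsigned char record[RECORD_SIZE];
        memset(record, 'x', sizeof record);
        for (int i = 0; i < 10; i++)   /* ten 70-byte records */
            write_record(record);
        printf("%d bytes buffered, waiting for more\n", fill);
        return 0;
    }

Running this, ten 70-byte records (700 bytes) fill one complete sector, with 188 bytes left buffered toward the next one. Unblocking on a read is the same logic in reverse.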
The Shell is the user interface for the operating system. It is easy to see in MSDOS, since the user at
the keyboard, who is watching the screen, is interacting with the system via the shell. MSDOS
provides a default shell called COMMAND.COM. This is the module that provides interaction with
the user via keyboard and screen at the command-line level. When a user enters a command such as
COPY at the "C:\" prompt, the prompt, the motion of the cursor, the appearance of the letters C-O-P-Y on the screen, and the subsequent function are all provided by the shell. It knows how to interact
with the kernel, and will direct the appropriate requests to the kernel based on the entry of the user.
While COMMAND.COM is the usual MSDOS shell, it is possible to replace it with another of your
own creation. A common example of a replacement shell is the Graphical User Interface (GUI) of
Windows on a PC. Here, the command line system has been replaced with a graphical representation
of functions and programs through the use of icons, little pictures that represent a program or
function. The entire screen of the display is treated as a graphical entity. The placement of multiple
windows, or display boxes for programs and functions, and the execution of the programs themselves
treat the display in a graphical (pixel by pixel) rather than a text (character by character) manner.
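At its heart, a shell is just a loop: prompt, read a command, interpret it, act. Here is a toy sketch in C; the commands it recognizes are invented, and a real shell such as COMMAND.COM does enormously more.

    /* A toy command-line shell: print a prompt, read a command,
       and act on it. Real shells direct requests to the kernel;
       the commands here are invented for illustration. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[128];

        for (;;) {
            printf("C:\\> ");                  /* the prompt */
            fflush(stdout);
            if (!fgets(line, sizeof line, stdin))
                break;                          /* end of input */
            line[strcspn(line, "\n")] = '\0';   /* strip newline */

            if (strcmp(line, "EXIT") == 0)
                break;
            else if (strcmp(line, "VER") == 0)
                printf("Sketch Shell version 0.1\n");
            else if (line[0] != '\0')
                printf("Bad command or file name\n");
            /* a real shell would now pass the request to the kernel */
        }
        return 0;
    }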
Operating System Characteristics
The development of the operating system over the years has followed that of the computer hardware.
As the hardware became faster and was given more ability to do work, the operating system followed
suit. In some cases, the operating system was modified to implement a particular feature, and, when
complete, the hardware was modified to implement the feature more easily. Some of these features
follow.
Multiprogramming is the technique of allowing what seems to be more than one program to execute
at a time in a system. However, if the processor has only one ALU, then only one instruction can be
executing at a time, in a traditional system. Therefore, the system gives small amounts of time to
each of the several programs that are running at the same time (timeslicing), and the appearance is of
several things happening at once, even though only one is truly happening at any instant.
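A very rough sketch of the idea in C: one loop stands in for the processor, handing a small slice to each "program" in turn. A real system switches on timer interrupts and saves each program's registers and state at every switch; none of that bookkeeping is shown here.

    /* A sketch of timeslicing: one processor, several "programs",
       and a loop that gives each a slice in turn. */
    #include <stdio.h>

    #define NPROGS 3
    #define ROUNDS 4     /* slices each program receives overall */

    int main(void)
    {
        int work_done[NPROGS] = {0, 0, 0};

        for (int round = 0; round < ROUNDS; round++) {
            for (int p = 0; p < NPROGS; p++) {
                /* the slice: program p runs briefly, then waits */
                work_done[p]++;
                printf("slice to program %d (total %d)\n",
                       p, work_done[p]);
            }
        }
        /* All three advanced "at the same time", although only one
           ever ran at any instant. */
        return 0;
    }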
Multiprocessing implies that the computer hardware actually has two or more ALUs, that is, that it
can indeed carry on two independent processing streams at the same time. This is also called parallel
processing. Large mainframes were designed with two processors sharing common memory and I/O,
but carrying on independently most of the time. This technique has been expanded greatly, and is
used in current microprocessors where there are several types of process going on simultaneously.
Multitasking is a method by which one system and one user can accomplish multiple tasks in a
circular fashion by multiprogramming and time slicing. What makes multitasking different is that it
implies one user, whereas the traditional multiprogramming with timesharing assumed multiple
users.
Real Time Systems are those that deal with program execution in step with the actual elapse of time
in seconds or microseconds. In a traditional batch system, a program execution was started, and the
time it took to run depended on the bulk of data to be processed and the size of the program in
general. In real-time systems, however, the idea is to keep track of the actual elapse of time and
interlock the execution of the program with it. This is used in industrial and control applications,
where a valve needs to be open for only five seconds, or a motor needs to come up to speed in ten seconds.
Embedded Systems are those where the hardware and/or software are placed inside a large
mechanism, or are designed to work independently at a remote distance from their control point.
These are common in robotics, industrial control, and environmental control systems.
Virtual memory is memory that does not exist, but can be used anyway. You can put data into it,
keep it there, and get it back when you need it. However, there are no memory chips involved. OK,
so where is the data stored? It is stored on disk. OK, so why the big deal - why not just call it disk
storage? The answer is not in the physical way data is stored, i.e. the disk drive, but in the logical
way that the data appears to be stored by the program.
Virtual memory requires both special hardware that allows addressing information to be converted to
disk drive locations on the fly, along with an operating system that monitors the need for this
conversion and causes it to occur. The advantage is that the programmer writes as if the memory
available to him/her was virtually infinite in size. The programmer is not hampered by a small real
memory in the machine. In the old days, if the programmer needed to process more data than would
fit into the system's real memory, the saving of data onto disk and the retrieving of data off the disk
was up to him/her. There were no provisions in the hardware or software to ease the situation. Now,
however, if a programmer needs to store 8 megabytes of data on a machine with only four megabytes
of memory installed, the virtualizer of the system will take over and allow the 8 megabytes to appear to
be stored, even though only a small amount of the data may be in memory at any one instant. This is
a major service to programmers working in large systems.
Swapping of data to and from disk comes into play in multiprogramming or virtual memory systems
where the demands for storage made by the current programs exceed the available memory. The real
system memory is divided into "page frames", usually 4 kilobytes in size. These are sometimes
referred to as swap blocks. The system will look at the page frames in memory, and find the one that has
been least recently used (the LRU rule). This block of 4K is swapped to disk in a special area, and the
frame is then made available to the needing program. If the data in the page to be swapped has not
changed since it was brought into the memory from disk, the "dirty bit" is clean (no change was
made), and the new data coming in can be written over the old directly. If the dirty bit is on, indicating
that data had changed in that frame, the data is stored first, or swapped out to the disk.
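Here is a small sketch in C of the LRU rule and the dirty bit working together. The frame table, the reference "clock", and the printed swap messages are all invented for the illustration; real systems keep this bookkeeping with hardware assistance.

    /* A sketch of LRU page replacement with a dirty bit. When no
       frame is free, the least recently used frame is the victim,
       and it is written back to disk only if it was changed. */
    #include <stdio.h>

    #define NFRAMES 4

    struct frame {
        int page;        /* which virtual page is here, -1 = free */
        int dirty;       /* changed since it came in from disk?   */
        long last_used;  /* "clock" value of the last reference   */
    };

    static struct frame frames[NFRAMES];
    static long clock_ticks;

    static void touch(int page, int writing)
    {
        int victim = 0;

        clock_ticks++;
        for (int i = 0; i < NFRAMES; i++) {
            if (frames[i].page == page) {        /* already resident */
                frames[i].last_used = clock_ticks;
                frames[i].dirty |= writing;
                return;
            }
            if (frames[i].last_used < frames[victim].last_used)
                victim = i;                      /* LRU candidate */
        }
        /* Page fault: evict the least recently used frame. */
        if (frames[victim].dirty)
            printf("swap out page %d (dirty)\n", frames[victim].page);
        printf("swap in page %d -> frame %d\n", page, victim);
        frames[victim].page = page;
        frames[victim].dirty = writing;
        frames[victim].last_used = clock_ticks;
    }

    int main(void)
    {
        for (int i = 0; i < NFRAMES; i++)
            frames[i].page = -1;
        touch(1, 0); touch(2, 1); touch(3, 0); touch(4, 0);
        touch(5, 0);   /* no free frame: page 1 is the LRU victim */
        touch(2, 0);   /* still resident, no fault                */
        return 0;
    }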
It is possible that a system can have so little memory and so many requests for service that it spends
all its time swapping blocks and no time processing. Nothing is accomplished, even though the
system is working very hard. This is called thrashing. An example of swapping and thrashing can be
seen in Windows, where the machine has a small RAM memory and a large program is started. The
system becomes very slow, with the drive light on all the time.
Examples
MicroSoft Disk Operating System (MSDOS) is an operating system that is very common and well
known in the small computer area. It provides sufficient operating flexibility to support most
commonly needed application programs, and indeed has a lot of features that have been little used. It
works best in smaller systems, and where character-based use is the norm. PCDOS from IBM is one
OEM's version of MSDOS, and has been modified at the lower levels to fit the needs of the IBM
product line. Today, MS-DOS is used primarily in embedded and development systems that do not
need a graphical user interface.
Windows 3 is a shell that rides on top of MSDOS, and provides a GUI via the mouse and screen to
allow graphically oriented programs to be used. The question of whether the program needs to be
graphical or not is left to the reader. Although the idea was to allow the user to have more than one
program active at a time, that is to dummy up what would appear to be multitasking, the result is that
a user can interface with only one program at a time even though they may be able to see more than
one window at once. When wishing to start using a visible program other than the current one, the
user selects the new program with the mouse. This causes the window of the new program to come
forward and the current program window to recede. There is a great deal more to this, however, as
such an action causes a "context switch". All the data of the current program must be taken out of the
lower 640K of RAM and stored either in high memory or onto disk (swap blocks). The data and code
for the new program must then be loaded from disk or high RAM into the bottom 640K of the system
before it can take over. This swap can take many seconds in some cases.
Windows NT and OS/2 began as the same product in a cooperative effort between MicroSoft and IBM in the mid-1980's, shortly after the PC/AT was introduced. They cooperated in its development as OS/2 until a
falling out occurred between the companies. IBM carried on the project with OS/2 version 2, while
MicroSoft did things somewhat differently as Windows NT. The two operating systems work very
much the same. Both systems take advantage of the Intel 80386 processor design and its derivatives
that support multitasking, virtual memory, the use of protected mode and virtual real mode (486, the
Pentiums, etc.). Both work very well as network server platforms. Because of the preponderance of
Windows in desktop business applications, OS/2 has pretty well vanished from the scene except in
IBM mainframe installations.
Windows 95 is a MicroSoft product that is intended to replace Windows 3 in the home and small
business environments. It makes use of the hardware of the '386 to support multitasking, unlike
Windows 3 that was essentially a single-tasking device. It has extended the file system of Windows 3
to accommodate long file names, but this is a kluge, not a better way, and may not be a good idea in
the long run.

Windows 98 is an upgrade of Windows 95 and provided a more reliable platform for programming.
Windows 2000 and XP are later generations of Windows NT, with extensive additions to implement server requirements. Windows XP is supposedly all 32-bit, that is, there is nothing left of the original 16-bit programming that was typical of earlier Windows systems.
UNIX is an operating system first developed at the Bell Labs of AT&T as an in-house product. It has been widely used in universities and in engineering and scientific circles, and is the primary system found on the internet. It is user-unfriendly, and has a steep learning curve for the new user. It is used in heavy graphics environments such as CAD and engineering. Most servers on the world-wide web are UNIX based, since all of the Internet protocols were developed on UNIX systems. AIX is the IBM version of UNIX.
LINUX is an open-source, free UNIX clone that was developed and runs on Intel systems. The open-source initiative is extensively supported world-wide, and this is the primary operating system to
support these efforts.
OS X (Operating System 10) is the latest Apple Macintosh operating system. It is based on BSD UNIX and is a major departure from the earlier Macintosh systems. It is the first Mac system to have a true
command line interface in addition to the usual Mac GUI.
VMS is the general name of the systems used in the Digital Equipment Corporation VAX series of
computers. VM/IS is one of the systems used in IBM mainframes. Both of these provide
multiprogramming, multitasking, timesharing and a wide variety of development tools.
Solaris is the UNIX operating system for computers built by Sun Microsystems. Sun machines run many of the world's larger commercial web servers. The latest version of
Solaris, Solaris 9, is extremely secure and runs on both Sun and Intel systems.
APPLICATIONS SOFTWARE
Applications software and programs are those that provide the average user with some sort of desired
result. These can be anything from accounting reports, business letters, graphical designs, and
engineering drawings to data exchange between offices and information lookup. In all cases, the
expectation is that 1.) the user will be directly interacting with the software, and 2.) the software will
need the help of the operating system beneath it to get the job done.
Applications programs can be divided into two broad areas, horizontal and vertical. Horizontal
applications are those that can be used by a wide variety of people and businesses; they are not
business-specific. Word Processing is the classic example. Word processing for the preparation of
printed documents can be used to generate business letters and forms, school homework, personal
letters, documentation and record keeping, etc. So, a horizontal application is of a general nature,
with no particular user in mind.
Vertical applications are used only by certain people or types of businesses. A billing system for a
doctor's office is a good example. This system can keep track of visits, patient histories, medical
procedures, and anything else of a record-keeping nature that would be common to that location. It
would not be of benefit, however, to an engineer who wishes to put together cost estimates for a
building.
A question that regularly appears in users' minds is whether to purchase programs that are already
written, or to write them yourself or have them written to your specifications. Packaged programs
have several advantages, such as already having been debugged, adhering to standards, using
conventional file formats, and having technical support available from someone that hopefully knows
a lot about it. It would be unwise to write yet another word processor, for example, since there are
many already available that have universal acceptance, work well, and are all ready to go. No need to
reinvent the wheel.
Custom programs are those that are written by the user, or by a contractor, to certain
specifications. Typically, these would not have universal appeal. They might be used only in certain
places or conditions, or to record data or solve problems that belong to a certain job or need only.
Examples include data management for a particular kind of business (Federal Express), a certain
highly specialized need (Space Shuttle), or because the user wishes to learn how to program or to
arrange data a certain way.
Examples
Examples of application programs include:

Word processors that allow the organized writing and printing of documents, letters, etc.,
(Word Perfect, Word)
Desktop publishers, which allow word processing to take advantage of page layout including
graphics, word art, and embedded pictures (PageMaker, Framemaker, etc.)
Spreadsheets, which allow numbers to be related to each other in an X-Y matrix, usually
used in accounting, but usable in many other areas including engineering and sciences (Lotus,
Excel)
Drawing programs that allow graphic arts to be created using the computer as easel and
palette (Corel, Micrographix Designer)
CAD (Computer Aided Design) programs that permit more accurate graphics to be created,
including scaling and 2D and 3D presentation, and image rotation (AutoCad)
Simulation programs that allow engineers to construct circuits or processes on screen and
that then execute a simulated version of the process or circuit (Multisim)

The Art of Programming


Of course it is debatable as to whether programming is a true art, but it can be shown over and over
that some people can develop programs quickly and elegantly, that run smoothly with a minimum of
system resources, while others end up with clunky messes that never work quite right.
Suppose that a company wishes to replace an old computer and accounting system with new
equipment and an accounting package that can be expanded as the company grows. The proper
procedure to accomplish this would include
1. An analysis of the way the company does accounting now, followed by an analysis of how it
can be improved and made extensible;
2. A planning or laying out of the big steps needed to implement the new ways, using block
diagrams and "big picture" methods;
3. A generation of a series of detailed plans that define the specific steps needed to implement
the block diagram blocks;
4. The writing of program code in a selected language that implements the details of step 3;
5. Trying the program modules out, one at a time, and correcting problems;
6. Integrating the modules and correcting problems;
7. Putting the programs into place, running along with the old system to compare results;
8. Upgrading and expansion as needed.


Each of these steps may be done by different people, depending on the size of the company and of
the job. The first person involved is the system analyst, who is responsible for steps 1 and 2, and
oversees much of the rest. The analyst is a specialist in one or more areas, in this case corporate
accounting, and who is also skilled in the use of computers to solve problems. The first requirement
is the more important. The analyst is an accountant first, and a computer wizard second, not the other
way around. It is possible that the analyst may never see the computer at all. Most of his/her work is
done at a desk with pencil and paper. The result of the analysis is called a system flowchart.
The system flowchart is then broken down into a series of flowcharts that define the steps needed to
implement each of the blocks in the system flowchart. These are called program or detail
flowcharts. Again depending on the size of the project, these may be created by the analyst, or by a
senior programmer or other staff member. In this step, everything that the computer must do to
implement the new system must be logically explained, and the order of execution specified.
The programmer then will convert the detail flowcharts to program language. The language to be
used depends on the nature of the project and hardware, the language already preferred and in use in
the company, the nature of the job, etc. There are many levels of programmer, from trainee or
associate to senior. Generally, the rank is based on years of experience. However, it is the
programmers' job to create the original code in the language selected to implement the new system.
This is called source code.
Depending on the circumstances, a computer operator may be involved in converting the source
code to machine-usable form. The new program is run, and will almost certainly not work. The
programmer, operator, and analyst now go into a circular sequence of write, test, crash, look over,
until the product eventually works to satisfaction. This is called debugging, that is, getting bugs out
of the program.
Programmers will also be involved in keeping the programs up to date after the initial product is put
into operation. If smart, the people involved will run both the new system and the old in parallel for a
while and check to see that the results are the same between them. Eventually the new system will be
declared golden, and the old system discontinued. As time elapses, changes will be necessary to
accommodate new business practices and conditions. Modifying a program after it is in place to keep
up with such minor changes is called maintenance.
The actual preparation of a program for the system includes the writing of the source code in the
selected language, the conversion from source code to object code, further conversion to executable
code through linking, and trial on the system as an execution.
Taking the detail flowchart as a guide, the programmer writes lines of code in the selected language
either into a terminal or PC, or on paper. The lines are captured in ASCII, and must adhere to a set of
standards and procedures as defined by that language. These are called the syntax and grammar of
the language, and are similar to the syntax and grammar of any language. The position of special
characters, spelling of certain words and organization of characters in the lines are all spelled out in
the language in use.
The above discussion illustrates top-down design. The idea is that the problem of a new accounting
system is approached from the top and detailed downward to the smallest details at the bottom. As
the analyst designs the system and works with the senior programmers to implement it on a particular
computer with a particular language, they may elect to use structured programming as a method.
Structured programming is simply a method of writing code such that every process that depends on
another, smaller process, expects the smaller process to complete first before continuing. Certain
languages are particularly suited to this approach. It may also be decided to implement object-oriented programming, which takes advantage of language characteristics to build logically
complete blocks called objects that include all their own code and data, and can take on a life of their
own within the system.
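The flavor of top-down, structured code can be shown in a few lines of C. The accounting routines are empty stubs invented for the example; what matters is the shape - the big step at the top is a function whose smaller steps each complete before the next begins.

    /* A sketch of top-down, structured style: the big problem at
       the top, each block a function that finishes before its
       caller moves on. The accounting functions are stubs. */
    #include <stdio.h>

    static void post_transactions(void) { printf("posting...\n"); }
    static void update_ledger(void)     { printf("updating ledger...\n"); }
    static void print_reports(void)     { printf("printing reports...\n"); }

    /* One block of the system flowchart: it depends on three
       smaller processes and expects each to complete before the
       next begins. */
    static void run_month_end(void)
    {
        post_transactions();
        update_ledger();
        print_reports();
    }

    int main(void)
    {
        run_month_end();   /* the "top" of the design */
        return 0;
    }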
Languages
There is a hierarchy of computer languages that have developed over the years as technology has
improved. Prior to 1964, commercial computers were designed such that the electronic level of logic
was the lowest level available. This was called the "red light" level, and it was the point where the
electronics via the front panel's red lights and switches were visible and accessible. When interacting
with the machine at this level, such as a technician testing the system by entering bits into the
switches, it was said that we were interacting at the machine language level. This is the level at
which the circuits of the system functioned and implemented the logic.
Each machine language instruction consisted of two parts. The first was the operation code or OP
Code, which was the action word that explained what was going to happen to the data. These were
typically ideas like "add", "subtract", and "move". The second part was called (incorrectly) the
operands. These could include the data itself that would be acted upon, or assist in finding the data
in the system, such as by specifying a register or a memory location. Usually, the operands specified
where the data was coming from for the operation, and/or where the data was going to when the
operation was completed. These are referred to as sources and destinations.
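A tiny C sketch can show the two parts in action, using an invented 8-bit instruction format (no real machine need look like this): the top four bits are the OP code, and the low four bits name a register operand.

    /* A sketch of the two parts of a machine instruction, using an
       invented encoding: high 4 bits = OP code, low 4 bits = a
       register operand. */
    #include <stdio.h>

    int main(void)
    {
        unsigned char instruction = 0x23;       /* invented encoding  */
        unsigned op      = instruction >> 4;    /* action: what to do */
        unsigned operand = instruction & 0x0F;  /* where the data is  */

        const char *names[] = {"HALT", "LOAD", "ADD", "SUB", "MOVE"};
        printf("OP code %u (%s), operand: register %u\n",
               op, op < 5 ? names[op] : "?", operand);
        return 0;
    }

Decoded this way, the byte 0x23 means "ADD, using register 3" - the OP code says what happens, and the operand says where the data is.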
In 1964, IBM introduced the System/360, and with it the idea of the microprogram. This approach
treated the registers and data paths of the machine as sources and destinations, much as the machine
language had done. However, instead of static electronic circuits built to implement every possible
combination of source and destination, the ALU and other system parts were treated dynamically
following little instructions or steps in the microprogram. The machine language instruction "add"
could be implemented by several microinstructions that caused the data to move through the system
dataflow so as to accomplish the effect of an add. We therefore have a level of programming
below that of machine language, called microprogramming. While clever operators and programmers
could deal with the system at the red light level, only technicians are involved with
microprogramming.
Above the machine language level we are into symbolic languages. These use ASCII words or
character groups, called "symbols", to represent OP Codes, registers, locations in memory, and I/O
devices. Some of the character groups are already fixed by the language, and are called reserved
words. The idea of a symbolic OP Code is called a mnemonic. It is a group of letters that make it
easier to remember the function represented, instead of a bunch of 1's and 0's. Examples include
ADD, SUB, MOVE, etc. The operands can also be given names by the programmer. If we use a
mnemonic for an OP Code, a reserved word for a register, and assign a name of our own for a data
location in main memory, we might get an instruction like
MOV R3,WIDGET

Three types of symbolic languages have evolved. The first was called assembler language. The
language and its grammar and syntax are very closely determined by the machine on which it is to
run. Its characteristics are:
1. Product specific - assembler languages written for one maker's machine will not work on the
product of another company.
2. One line of code equals one instruction. Translation is on a one-to-one basis. The logic of the
instructions is simple, and limited to simple single events.
3. Complex operations may be provided by the language, or need to be created locally in a
macro.
Assembler is easy to learn (sort of) and is the basis for most operating systems and system-level
programming. Its results usually run faster than those of other types, and its code is compact. It can be very
tedious to build large products with it.
The second type of language is referred to as compiler language. In this case, the symbolics are
removed from the hardware of the machine. Its characteristics are:
1. Non-product specific - FORTRAN 90 will run almost without modification on any system
that supports it.
2. One line of code can create large quantities of instructions. The logic can be complicated and
multiple.
3. Complex functions are usually built-in, and can be used frequently throughout the code.
Compiler languages take a lot of computer resource in disk and memory. The compilation process
can be lengthy. The results can be difficult to debug. The results can be elegant and effective.
The third type of language is called interpretive. In this case, the lines of code are translated one at a
time, and immediately executed. This resembles compiler languages; however, the execution of a
single line immediately following translation requires certain coding practices and adjustments. The
downside of the interpretive languages is that they execute slowly, and are not very good for final
production software. The upside is that since a programmer can make a change on one line and then
immediately test it, development time can be minimized.
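The translate-a-line, execute-it-immediately cycle can be sketched in a few lines of C. The two-statement "language" below is invented; each line of the little program is decoded and run at once, and no object file is ever produced.

    /* A sketch of an interpreter: each line is translated and
       executed immediately, one at a time. The two-command
       "language" is invented for the illustration. */
    #include <stdio.h>

    int main(void)
    {
        const char *program[] = {       /* the source, line by line */
            "PRINT 7",
            "ADD 30 12",
            "PRINT 99",
        };
        int nlines = sizeof program / sizeof program[0];

        for (int i = 0; i < nlines; i++) {
            int a, b;
            /* "translate" the line, then execute it at once */
            if (sscanf(program[i], "PRINT %d", &a) == 1)
                printf("%d\n", a);
            else if (sscanf(program[i], "ADD %d %d", &a, &b) == 2)
                printf("%d\n", a + b);
            else
                printf("syntax error on line %d\n", i + 1);
        }
        return 0;
    }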
Once a program in an assembler or compiler language has been written into its source form, a
file exists that contains the lines of ASCII characters in the proper grammar and syntax. In the case of
the assembler language, a second program called the Assembler is used to translate the ASCII source
to binary object code. This may take one or two passes through the source code (one- or two- pass
assemblers). The object code that results is the binary equivalent of the source, but with certain
missing parts.
In the case of a program written in compiler language, the same process is followed. The source code
is given to a program called a compiler, which converts the source statements into blocks of binary
coding. This is more complicated than with assemblers, because the compiler languages allow more
complex statements and logic and are not restricted by the hardware so much. The resulting object
code, like that of the assembler, has missing parts.
The "missing parts" of the object code files are the result of the fact that the programming methods
used today make allowances for the fact that a complicated programming project must assume that
the code is written in parts, by different people, at different times, in different places. Bringing these
parts together into a common whole requires allowances be made for how the final product will be fit
together. Most important here is the allocation of main memory. Since the author of one module may
not have the details of the modules written by others, it is necessary to allow all of the modules to be
fit together all at once. This requires the use of a program called a linker, which links the object code
modules from the assembler or compilers into a final executable file. This file contains everything
needed for the program to execute in the operating system environment provided, including how to
find memory references made by various programmers in different places.
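As a concrete example of the "missing parts", here are two C modules, written as if by different programmers, that share one variable. The file names and build commands are illustrative. When main.c is compiled, the address of widget_count is unknown - a hole left in main.o that the linker fills when the object files are combined.

    /* widget.c -- one programmer's module:                       */
    int widget_count = 42;          /* the variable is defined here */

    /* main.c -- another programmer's module:                     */
    #include <stdio.h>
    extern int widget_count;        /* declared, address unknown --
                                       a "missing part" of main.o  */
    int main(void)
    {
        printf("%d widgets\n", widget_count);
        return 0;
    }

    /* Each file compiles to object code on its own:
           cc -c widget.c      (produces widget.o)
           cc -c main.c        (produces main.o, reference unresolved)
       and the linker supplies the missing address:
           cc main.o widget.o -o program                          */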
Examples:

MASM (Macro Assembler) - The assembler language from MicroSoft for the PC.
TASM (Turbo Assembler) - The assembler language from Borland Intl. for the PC.
FORTRAN (FORmula TRANslator) - The first compiler language for computers, developed by IBM in the 1950's and still in use for scientific and computational work.
COBOL (Common Business Oriented Language) - A compiler language for business data
processing and accounting.
BASIC (Beginner's All-Purpose Symbolic Instruction Code) - Created at Dartmouth as a
means of teaching the use of computers to non-computer students, e.g. students in Social
Sciences. An interpretive language, it has been widely used in microprocessor-based
machines. It encourages bad programming practices, however, and has been largely
abandoned.
PASCAL (after Blaise Pascal) - A language developed in the 70's in Switzerland to assist
teaching computer logic to beginning computer science students. It was brought to the US by
University of California at San Diego, and issued by them as an interpretive language with an
intermediate code that was also accessible, called pseudocode or p-code. It has replaced
BASIC as the initial programming language for colleges and universities. It is available as
both a compiler and interpretive language.
ADA (after Ada Augusta Byron) - A language designed and specified by the US Department
of Defense as a common standard for military projects. Its design was heavily based on PASCAL.
RPG (Report Program Generator) - A simple small-system accounting and business language.
C - Created by Bell Labs of AT&T as a language to support development in UNIX. It is a
compiler language, and has become the language of choice for systems development in microprocessor-based systems, or where UNIX is involved. It is not object-oriented itself (that extension came with C++), but it is heavily used in systems and GUI development.
C++ - A superset of C, this language adds extensive object-oriented functions and support.
SQL (Structured Query Language) - A 4GL approach to creating database queries via a
common language idiom, into which other programs can link if the SQL environment is
supported.
APL (A Programming Language) - An interpretive language that uses a variety of special
symbols to implement primitive mathematical or logical functions. It allows complex
relationships to be defined with a few characters, and the program interpreting these can
execute complicated statements quickly.
JAVA - A language compiled to an intermediate bytecode that is supposed to be "platform independent". Created
by Sun Microsystems, it is designed to allow intelligent mini-applications, or applets, to run
under the supervision of an internet browser such as Netscape.
PERL - An interpretive language heavily used in web servers to implement CGI, the
technique of allowing a user to interact with a web page. It is primarily a text-processing
language, but can do math, network communication, and disk processing as well. Heavily
used in UNIX and LINUX systems.

Last but not least, don't forget that the job isn't done until the paperwork is finished. This means that
it is essential to document your work with full comments in the source code, and detailed procedures
written down for its use. Do not assume that a programming language is "self documenting". There is
no such thing, and there is nothing better than an accurate document to explain how a system works.
