
HISTORY

In the late 1950s, faced with the need to rationalize its computer product lines, IBM instituted a
research program with the objective of creating a range of software-compatible computers that
would also preserve existing software investments. The result, introduced on April 7, 1964, was
the System/360, the first commercially available microprogrammed computer architecture (later
to become known as complex instruction set computer, or CISC, architecture). The success of
System/360 resulted in CISC architectures dominating computer, and later microprocessor,
design for two decades.

However, the ability to incorporate any instruction which could be microprogrammed turned out
to be a mixed blessing. During the mid-1970s, improved performance measurement tools
demonstrated that the execution of most application programs on CISC-based systems was
dominated by a few simple instructions, while the complex ones were seldom used. As a result, in
October 1975 a project was initiated at IBM's Watson Research Center which, four years later,
gave birth to a 32-bit RISC microprocessor named for the building in which it was developed:
the 801. In the immortal words of Joel Birnbaum, the first leader of the 801 project and later
designer of the PA-RISC architecture: "Engineers had guessed that computers needed numerous
complex instructions in order to work efficiently. It was a bad guess. That kind of design
produced machines that were not only ornate, but baroque - even rococo."

The 801 was never commercialized, but a derivative single-chip implementation, the
Research/Office Products Microprocessor (ROMP), was used in IBM's first production RISC
system, the RT PC (introduced, like the first PA-RISC system, in January 1986), and was the
progenitor of today's RISC processors. Several start-up server companies, notably Ridge
Computer (July 1983), Pyramid Technology (October 1983), and Computer Consoles (December
1984), took advantage of the availability of UNIX (which meant that a C compiler was all that
was needed to obtain an OS and applications) to beat IBM and HP to market with RISC-based
systems, but they lacked the staying power of their larger competitors once the latter caught on
to the benefits of UNIX (a winnowing process which is currently under way again with Linux).

Meanwhile, RISC had caught the attention of the academic community, principally at Stanford,
which followed the IBM example of relying on compiler optimization and pipeline efficiency
and produced what became the MIPS architecture, and Berkeley, which focused on minimizing
inherently slow calls to external memory with a register rich architecture adopted in 1987 by Sun
Microsystems. By 1988, RISC processors had taken over the high-end of the workstation market
from the Motorola 68000 and within a few years dominated the server market too. Today, RISC
cores are found at the heart of every workstation and server microprocessor, although many
(notably Intel's IA products and IBM's zSeries eServers) disguise the fact rather well.
Merchant RISC Microprocessor Shipments (thousands of units)

                   Thru 1994     1995     1996     1997     1998     1999     2000     2001
ARM/StrongARM          2,170    2,100    4,200    9,800   50,400  152,000  414,000  402,000
MIPS Rxx00             3,254    5,500   19,200   48,000   53,200   57,000   62,800   62,000
Hitachi SH             2,800   14,000   18,300   23,800   26,000   33,000   50,000   45,000
POWER/PowerPC          2,090    3,300    4,300    3,800    6,800    8,300   18,800   23,000
Total                 30,499   33,830   58,480   98,220  149,080  262,820  556,800  538,860

(The Total row exceeds the sum of the rows shown, as it includes RISC architectures not listed.)

During the late 1980s and early 1990s, RISC processors began to displace CISC in the embedded
applications which account for almost all microprocessor volume, and by the end of the decade
market consolidation was well under way (see table above). MIPS was the volume leader
through 1998 thanks to its Nintendo game-console win; ARM took over rather decisively in 1999
thanks to cell-phone usage; and the only one of the top four architectures to increase unit volume
in 2001 was PowerPC. However, all is not quite what it seems: the reality is that the market is
bifurcating into a low-power segment dominated by ARM, and a high-performance segment
which will be fought over by Intel's XScale (formerly StrongARM), which also increased unit
volume last year, albeit from a relatively small base, and the PowerPC architecture. The
implications of this for the remaining RISC architectures, SPARC and SH in particular, are a bit
grim.

RISC DESIGN PHILOSOPHY

In the mid-1970s, researchers (particularly John Cocke) at IBM, along with similar projects
elsewhere, demonstrated that the majority of combinations of the orthogonal addressing modes
and instructions offered by contemporary CPUs were not used by most programs generated by
the compilers available at the time. It proved difficult in many cases to write a compiler with
more than limited ability to take advantage of the features provided by conventional CPUs.

It was also discovered that, on microcoded implementations of certain architectures, complex
operations tended to be slower than a sequence of simpler operations doing the same thing. This
was in part an effect of the fact that many designs were rushed, with little time to optimize or
tune every instruction; only those used most often were tuned. One infamous example was
the VAX's INDEX instruction.

As mentioned elsewhere, core memory had long been slower than many CPU designs. The
advent of semiconductor memory reduced this difference, but it was still apparent that
more registers (and later caches) would allow higher CPU operating frequencies. Additional
registers would require sizeable chip or board areas which, at the time (1975), could be made
available if the complexity of the CPU logic was reduced.

Yet another impetus for both RISC and other designs came from practical measurements on real-
world programs. Andrew Tanenbaum summed up many of these, demonstrating that processors
often had oversized immediates. For instance, he showed that 98% of all the constants in a
program would fit in 13 bits, yet many CPU designs dedicated 16 or 32 bits to store them. This
suggests that, to reduce the number of memory accesses, a fixed-length instruction machine
could store constants in otherwise unused bits of the instruction word itself, so that they would
be immediately ready when the CPU needs them (much like immediate addressing in a
conventional design). This required small opcodes in order to leave room for a reasonably sized
constant in a 32-bit instruction word.
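The idea of carrying a constant inside a fixed-length instruction word can be sketched as a toy
encoder/decoder. The field widths below (a MIPS-like 6-bit opcode, two 5-bit register fields, and
a 16-bit immediate) are illustrative assumptions, not taken from the text:

```python
# Hypothetical 32-bit instruction layout (MIPS-like, for illustration):
# bits 31-26 opcode | 25-21 dest reg | 20-16 src reg | 15-0 signed immediate.
# Small opcodes leave room for the constant, so no extra memory access
# is needed to fetch it.

def encode(opcode, rd, rs, imm):
    assert 0 <= opcode < 64 and 0 <= rd < 32 and 0 <= rs < 32
    assert -(1 << 15) <= imm < (1 << 15), "immediate must fit in 16 bits"
    return (opcode << 26) | (rd << 21) | (rs << 16) | (imm & 0xFFFF)

def decode(word):
    imm = word & 0xFFFF
    if imm & 0x8000:            # sign-extend the 16-bit immediate field
        imm -= 1 << 16
    return ((word >> 26) & 0x3F, (word >> 21) & 0x1F,
            (word >> 16) & 0x1F, imm)

word = encode(0x08, 3, 1, -7)   # e.g. an "add-immediate r3, r1, -7"
assert decode(word) == (0x08, 3, 1, -7)
```

The constant travels with the opcode in a single 32-bit fetch, which is exactly the saving the
measurement data pointed toward.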

Since many real-world programs spend most of their time executing simple operations, some
researchers decided to focus on making those operations as fast as possible. The clock rate of a
CPU is limited by the time it takes to execute the slowest sub-operation of any instruction;
decreasing that cycle-time often accelerates the execution of other instructions.[4] The focus on
"reduced instructions" led to the resulting machine being called a "reduced instruction set
computer" (RISC). The goal was to make instructions so simple that they
could easily be pipelined, in order to achieve single-clock throughput at high frequencies.
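The arithmetic behind that goal is simple to sketch. Assuming an idealized pipeline with no
stalls (the five-stage depth below is an illustrative assumption, not from the text), throughput
approaches one instruction per clock:

```python
# Idealized pipeline timing: once the pipeline is full, one instruction
# completes per cycle, so n instructions take n + stages - 1 cycles
# rather than n * stages.

def cycles_unpipelined(n, stages=5):
    return n * stages

def cycles_pipelined(n, stages=5):
    return n + stages - 1       # fill latency, then 1 per clock

assert cycles_unpipelined(1000) == 5000
assert cycles_pipelined(1000) == 1004   # throughput ~1 instruction/clock
```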

Later it was noted that one of the most significant characteristics of RISC processors was that
external memory was only accessible by a load or store instruction. All other instructions were
limited to internal registers. This simplified many aspects of processor design: allowing
instructions to be fixed-length, simplifying pipelines, and isolating the logic for dealing with the
delay in completing a memory access (cache miss, etc.) to only two instructions. This led to RISC
designs being referred to as load/store architectures.
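The load/store restriction can be illustrated with a toy machine in which only LOAD and STORE
touch memory, while ALU operations work purely on registers. The instruction names and tuple
format here are invented for illustration:

```python
# Toy load/store machine: memory is reachable only through LOAD and
# STORE; ADD operates register-to-register, as the text describes.

def run(program, memory, nregs=8):
    regs = [0] * nregs
    for op, *args in program:
        if op == "LOAD":          # rd <- memory[addr]
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "STORE":       # memory[addr] <- rs
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "ADD":         # registers only, never memory
            rd, ra, rb = args
            regs[rd] = regs[ra] + regs[rb]
        else:
            raise ValueError(op)
    return memory

# memory[2] = memory[0] + memory[1]: what a memory-to-memory CISC ADD
# does in one instruction becomes load, load, ADD, store here.
mem = run([("LOAD", 0, 0), ("LOAD", 1, 1),
           ("ADD", 2, 0, 1), ("STORE", 2, 2)], [5, 7, 0])
assert mem == [5, 7, 12]
```

The payoff is the one named in the text: only LOAD and STORE ever have to cope with a
variable-latency memory access, so the rest of the pipeline stays simple.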

EARLY RISC

The first system that would today be known as RISC was the CDC 6600 supercomputer,
designed in 1964, a decade before the term was invented. The CDC 6600 had a load-store
architecture with only two addressing modes (register+register, and register+immediate constant)
and 74 opcodes (whereas an Intel 8086 has 400). The 6600 had eleven pipelined functional units
for arithmetic and logic, plus five load units and two store units; the memory had multiple banks
so all load-store units could operate at the same time. The basic clock cycle/instruction issue rate
was 10 times faster than the memory access time. Jim Thornton and Seymour Cray designed it as
a number-crunching CPU supported by 10 simple computers called "peripheral processors" to
handle I/O and other operating system functions.[9] Thus the joking comment later that the
acronym RISC actually stood for "Really Invented by Seymour Cray".

Another early load-store machine was the Data General Nova minicomputer, designed in 1968
by Edson de Castro. It had an almost pure RISC instruction set, remarkably similar to that of
today's ARM processors; however it has not been cited as having influenced the ARM designers,
although Novas were in use at the University of Cambridge Computer Laboratory in the early
1980s.

The earliest attempt to make a chip-based RISC CPU was a project at IBM which started in
1975. Named after the building where the project ran, the work led to the IBM 801 CPU family
which was used widely inside IBM hardware. The 801 was eventually produced in a single-chip
form as the ROMP in 1981, which stood for 'Research OPD [Office Products Division] Micro
Processor'. As the name implies, this CPU was designed for "mini" tasks, and when IBM
released the IBM RT-PC based on the design in 1986, the performance was not acceptable.
Nevertheless the 801 inspired several research projects, including new ones at IBM that would
eventually lead to their POWER system.

The most public RISC designs, however, were the results of university research programs run
with funding from the DARPA VLSI Program. The VLSI Program, practically unknown today,
led to a huge number of advances in chip design, fabrication, and even computer graphics.

UC Berkeley's RISC project started in 1980 under the direction of David Patterson and Carlo H.
Sequin, based on gaining performance through the use of pipelining and an aggressive use of a
technique known as register windowing. In a normal CPU one has a small number of registers,
and a program can use any register at any time. In a CPU with register windows, there are a huge
number of registers, e.g. 128, but programs can only use a small number of them, e.g. 8, at any
one time. A program that limits itself to 8 registers per procedure can make very fast procedure
calls: The call simply moves the window "down" by 8, to the set of 8 registers used by that
procedure, and the return moves the window back. (On a normal CPU, most calls must save at
least a few registers' values to the stack in order to use those registers as working space, and
restore their values on return.)
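The windowing idea above can be sketched with a toy register file. This is a simplified model:
real Berkeley RISC (and later SPARC) windows overlap between caller and callee so that
arguments can be passed in registers, which this sketch ignores; the register counts (128 total,
8 visible) are the illustrative numbers from the text:

```python
# Toy windowed register file: many physical registers, a small window
# visible at a time. A call slides the window instead of saving
# registers to the stack.

class WindowedRegs:
    def __init__(self, total=128, window=8):
        self.regs = [0] * total
        self.window = window
        self.base = 0               # start of the visible window

    def __getitem__(self, r):
        return self.regs[self.base + r]

    def __setitem__(self, r, v):
        self.regs[self.base + r] = v

    def call(self):
        self.base += self.window    # no memory traffic, just a pointer bump

    def ret(self):
        self.base -= self.window

rf = WindowedRegs()
rf[0] = 42          # caller's r0
rf.call()
rf[0] = 7           # callee's r0 is a different physical register
rf.ret()
assert rf[0] == 42  # caller's value survived with no stack save/restore
```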

The RISC project delivered the RISC-I processor in 1982. Consisting of only 44,420 transistors
(compared with averages of about 100,000 in newer CISC designs of the era), RISC-I had only
32 instructions, and yet completely outperformed any other single-chip design. The team followed
this up with the 40,760-transistor, 39-instruction RISC-II in 1983, which ran over three times as
fast as RISC-I.

At about the same time, John L. Hennessy started a similar project called MIPS at Stanford
University in 1981. MIPS focused almost entirely on the pipeline, making sure it could be run as
"full" as possible. Although pipelining was already in use in other designs, several features of the
MIPS chip made its pipeline far faster. The most important, and perhaps annoying, of these
features was the demand that all instructions be able to complete in one cycle. This demand
allowed the pipeline to be run at much higher data rates (there was no need for induced delays)
and is responsible for much of the processor's performance. However, it also had the negative
side effect of eliminating many potentially useful instructions, like a multiply or a divide.

In the early years, the RISC efforts were well known, but largely confined to the university labs
that had created them. The Berkeley effort became so well known that it eventually became the
name for the entire concept. Many in the computer industry argued that the performance
benefits were unlikely to translate into real-world settings due to the decreased memory
efficiency of multiple instructions, and that this was why no one was using them. But
starting in 1986, all of the RISC research projects started delivering products.

LATER RISC

Berkeley's research was not directly commercialized, but the RISC-II design was used by Sun
Microsystems to develop the SPARC, by Pyramid Technology to develop their line of mid-range
multi-processor machines, and by almost every other company a few years later. It was Sun's use
of a RISC chip in their new machines that demonstrated that RISC's benefits were real, and their
machines quickly outpaced the competition and essentially took over the
entire workstation market.

John Hennessy left Stanford (temporarily) to commercialize the MIPS design, starting the
company known as MIPS Computer Systems. Their first design was a second-generation MIPS
chip known as the R2000. MIPS designs went on to become one of the most used RISC chips
when they were included in the PlayStation and Nintendo 64 game consoles. Today they are one
of the most common embedded processors in use for high-end applications.

IBM learned from the RT-PC failure and went on to design the RS/6000 based on their new
POWER architecture. They then moved their existing AS/400 systems to POWER chips, and
found much to their surprise that even the very complex instruction set ran considerably faster.
POWER would also find itself moving "down" in scale to produce the PowerPC design, which
eliminated many of the "IBM only" instructions and created a single-chip implementation. Today
the PowerPC is one of the most commonly used CPUs for automotive applications (some cars
have more than 10 of them inside). It was also the CPU used in most Apple Macintosh machines
from 1994 to 2006. (Starting in February 2006, Apple switched their main production line
to Intel x86 processors.)

Almost all other vendors quickly joined. From the UK similar research efforts resulted in
the INMOS transputer, the Acorn Archimedes and the Advanced RISC Machine line, which is a
huge success today. Companies with existing CISC designs also quickly joined the
revolution. Intel released the i860 and i960 by the late 1980s, although they were not very
successful. Motorola built a new design called the 88000 in homage to their famed CISC 68000,
but it saw almost no use and they eventually abandoned it and joined IBM to produce the
PowerPC. AMD released their 29000 which would go on to become the most popular RISC
design of the early 1990s.

Today the vast majority of all 32-bit CPUs and microcontrollers in use are RISC-based. RISC
design techniques offer performance even at small sizes, and have thus become dominant for low-
power 32-bit CPUs. Embedded systems are by far the largest market for processors: while a
family may own one or two PCs, their car(s), cell phones, and other devices may contain a total
of dozens of embedded processors. RISC also completely took over the market for larger
workstations for much of the 1990s (until taken back by inexpensive PC-based solutions). After
the release of the Sun SPARCstation, the other vendors rushed to compete with RISC-based
solutions of their own. The high-end server market today is almost completely RISC-based, and
the #1 spot among supercomputers as of 2008 is held by IBM's Roadrunner system, which uses
Power Architecture-based Cell processors[10] to provide most of its computing power, although
many other supercomputers use x86 CISC processors.

ARM SYSTEM-ON-CHIPS

ARM9

Samsung S3C24xx

Based on ARM9 cores. The main member of this family that is relevant to RISC OS is the
S3C2440, based on the ARM920T core. However, interest has been expressed in some low-cost
netbooks based on the S3C2450, a successor to the S3C2443 (which is an upgraded version of
the S3C2440).

S3C2440 Features

• ARM920T CPU @ 300, 400, or 533 MHz (400 in A9home)
• 640x480 24-bpp LCD controller (not used in A9home)
• AC97 audio support
• USB 1.1 host and device support

S3C2450 Features

• ARM926EJ CPU @ 400 or 533 MHz
• 1024x1024 24-bpp LCD controller
• 5.1-channel and AC97 audio support
• USB 1.1 host and 2.0 device support
• ATA-6 (PATA) support

Used in

• MenQ EasyPC E790, among other similar Chinese clones

Anyka AK7802

Based on ARM926EJ core, clocked at 248 or 266 MHz. Used in some cheap netbooks, but not
suitable for RISC OS applications due to poor performance relative to other SoCs currently
available, and poor documentation. Windows CE is the only supported OS.

Features

• ARM926EJ CPU @ up to 266 MHz
• 800x480 LCD controller
• USB OTG and host support (unknown version)

Freescale i.MX515

Another Cortex-A8-based SoC, this one appears better suited to netbook use than the OMAP3,
due to its inclusion of PATA support. However, its maximum display resolution is lower.

Features

• ARM Cortex-A8 CPU @ 800 MHz
• 3D accelerator
• 1280x800 24-bpp primary display output
• USB 2.0 OTG and host support
• SD/SDIO support
• ATA-6 (PATA) support
