INTRODUCTION
[Figure: ASC system configuration. Labeled blocks: central memory, central processor (CP), peripheral processor (PP), disc storage, data communications to common carriers, peripherals.]
[Figure: memory control unit (MCU). Labeled blocks: secondary memory access ports, interleaved high-speed or medium-speed memory modules, primary memory access ports.]
[Figure: two-pipeline CP and four-pipeline CP. Legible labels: extension (optional), primary memory ports, MBU, AU.]
[Figure 4: AU sections used for floating add and fixed multiply. Sections shown: receiver register, exponent subtract, align, multiply, add, normalize, accumulate, output, result.]
Arithmetic unit
(2) exponent subtract, (3) align, (4) add, (5) normalize, (6)
multiply, (7) accumulate, and (8) output. Figure 4 shows how
different sections of the AU are utilized for execution of
particular instructions; i.e., floating point addition and fixed
point multiplication.
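The staged structure of the AU can be illustrated with a small software model. The following is only a sketch under stated assumptions: it assumes that a floating add passes through the exponent subtract, align, add, and normalize sections, as the section names suggest, and it uses a toy number format (an integer mantissa and a power-of-two exponent) that is not the ASC word format. All function names are hypothetical.

```python
# Toy model of a floating add moving through named AU sections.
# Assumption: a number is (mantissa, exponent) with value mantissa * 2**exponent.
# This is NOT the ASC floating-point format; it only illustrates the staging.

def exponent_subtract(a, b):
    """Exponent subtract section: difference between the two exponents."""
    return a[1] - b[1]

def align(a, b, diff):
    """Align section: express both mantissas at the smaller exponent."""
    if diff >= 0:
        return a[0] << diff, b[0], b[1]
    return a[0], b[0] << -diff, a[1]

def add(ma, mb):
    """Add section: add the aligned mantissas."""
    return ma + mb

def normalize(m, e):
    """Normalize section: strip trailing zero bits from the mantissa."""
    if m == 0:
        return 0, 0
    while m % 2 == 0:
        m //= 2
        e += 1
    return m, e

def floating_add(a, b):
    """Pass two operands through the four sections in order."""
    diff = exponent_subtract(a, b)
    ma, mb, e = align(a, b, diff)
    return normalize(add(ma, mb), e)

def value(x):
    """Convert (mantissa, exponent) to an ordinary float for inspection."""
    return x[0] * 2.0 ** x[1]

print(floating_add((3, 2), (1, 0)))  # 12 + 1, prints (13, 0)
```

In a pipelined unit each section would work on a different operand pair at the same time; the sketch runs the sections one after another only to show the order of the steps.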
An AU is a 64-bit parallel operating unit for most scalar
and vector instructions. Exceptions are double length
multiply and all types of division. In these circumstances
various combinations of the components of the AU are
[Figure: disc storage configuration. Expander memory connected to four disc units, each with a disc interface unit. Legible annotations: 25M words, 500K words/sec.]
[Figure: peripheral configuration. Text editing CRTs (two); two 1500 card/min card readers; three 1200 line/min line printers; two 100 card/min punches; operator communication, two CRTs; tape switching unit; tape controller; six dual-density 9-track 800/1600 BPI tape drives; three dual-density 7-track 556/800 BPI tape drives; channel number 1 secondary storage; channel number 2 secondary storage.]
[Table I, partially legible]
(1)       DO 10 K=1,50
          DO 10 J=1,50
          DO 10 I=1,50
       10 [statement not legible]
(2)    Z=X*Y
(3)    [vector instruction not legible]
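[Table, partially legible: a second compiler example. (1) Fortran source with the loop nest DO 100 K=1,10 and DO 100 I=1,144 (loop body not legible). (2) Eight generated vector instructions with mnemonics VAF, VMF, VSF, VMF, VAF, VMF, VSF, VMF and operand fields (#3B8, B2), (#3C0, B2), (#3C8, B2), (#3D0, B2), (#3D8, B2), (#3E0, B2), (#3E8, B2), (#3F0, B2).]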
MAXIMIZING PERFORMANCE
Experience thus far has shown that, for the applications
considered by ASC users, the most cost-effective performance is
realized when the capabilities of ASC Fortran and the optimizing
compiler are used. Although particular sequences of code can be
found in which hand coding improves execution speed,
compiler-generated object code is the best choice for the broad
range of programs that involve a large amount of applications
code. American National Standards Institute (ANS) Fortran is
completely sufficient, and vector instructions are readily
produced from it. ASC extensions to Fortran are sometimes useful,
not because they provide unique access to some hardware feature,
but because they simplify the notation of the program so that the
programmer can deal more directly with the mathematics of the
application.
The ASC system design gives the user easy access to performance
enhancement through additional central processor "pipes."
Compiler software is responsible both for generating vector
instructions and for partitioning these vector operations over
multiple pipes. The compiler also protects the user from vector
hazard conditions. Partitioning of scalar instructions over
multiple pipes is carried out by the CP hardware, which makes
extensive checks to protect the user from illegal scalar
conditions that might occur. For mixtures of vector instructions
and for mixtures of scalars and vectors, the compiler prevents
illegal conditions by issuing directive instructions that place
the CP in either parallel mode (FORK) or sequential mode (JOIN).
Thus, the burden is on the system instead of the user. Programs
compiled for one-pipe ASCs will execute correctly on
multiple-pipe systems; recompiling for the multiple-pipe machine
increases performance.
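The idea of partitioning one vector operation over several pipes can be sketched in a few lines. This is an illustration only: the text does not describe how the ASC compiler divides the work, so the contiguous, near-equal split used here is an assumption, and the names partition and vector_multiply are hypothetical. The point it shows is that the result does not depend on the number of pipes, which is why a program compiled for one pipe remains correct on a multiple-pipe system.

```python
# Sketch: one vector multiply divided over a number of "pipes".
# Assumption: contiguous chunks of near-equal length; the real ASC
# partitioning scheme is not given in the text.

def partition(n, pipes):
    """Split indices 0..n-1 into `pipes` contiguous (start, stop) chunks."""
    base, extra = divmod(n, pipes)
    chunks, start = [], 0
    for p in range(pipes):
        size = base + (1 if p < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks

def vector_multiply(x, y, pipes=1):
    """Element-by-element product, computed chunk by chunk."""
    z = [0.0] * len(x)
    for lo, hi in partition(len(x), pipes):  # each chunk stands in for one pipe
        for i in range(lo, hi):
            z[i] = x[i] * y[i]
    return z

x = [float(i) for i in range(10)]
y = [2.0] * 10
print(vector_multiply(x, y, pipes=1) == vector_multiply(x, y, pipes=4))  # prints True
```

The chunks here run one after another; on the hardware they would run in parallel, so only the elapsed time changes with the pipe count, not the values produced.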
Some typical examples of efficient code produced from
present applications will illustrate the optimization level
provided by the system. Table I shows the type of instruction
generated by the compiler from a typical triple-nested DO
loop:
(1) gives the Fortran source with three levels of indexing,
(2) is an alternate notation that could be used, and
(3) is the single vector instruction produced.
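The reason a triple-nested loop can collapse into one vector instruction can be shown with a short sketch. It assumes that the loop body is an element-by-element product such as Z(I,J,K) = X(I,J,K) * Y(I,J,K), consistent with the alternate notation Z=X*Y, and that the arrays are stored contiguously in Fortran (column-major) order; the loop body itself is an assumption, and the function names are hypothetical.

```python
# Sketch: a triple-nested element-by-element loop over contiguous arrays
# touches every storage location exactly once, so it is equivalent to a
# single vector operation of length n*n*n.
# Assumption: loop body is Z(I,J,K) = X(I,J,K) * Y(I,J,K).

N = 50

def index(i, j, k, n=N):
    """Zero-based column-major offset of element (i, j, k)."""
    return i + n * (j + n * k)

def triple_loop_multiply(x, y, n=N):
    """Three levels of indexing, as in form (1)."""
    z = [0.0] * (n * n * n)
    for k in range(n):
        for j in range(n):
            for i in range(n):
                z[index(i, j, k, n)] = x[index(i, j, k, n)] * y[index(i, j, k, n)]
    return z

def single_vector_multiply(x, y):
    """One pass over the whole storage, as in forms (2) and (3)."""
    return [a * b for a, b in zip(x, y)]

x = [float(v) for v in range(N * N * N)]
y = [float(v % 7) for v in range(N * N * N)]
print(triple_loop_multiply(x, y) == single_vector_multiply(x, y))  # prints True
```

With N = 50 the single operation covers 125,000 elements, which is the vector length implied by three loops of 50 iterations each.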
[Table, partially legible: vector execution rates in results/sec for ADD, MULTIPLY, and DOT PRODUCT, with 64-bit and 32-bit operands. Legible values lie between 4.0 x 10^6 and 64 x 10^6 results/sec; row and column assignments are not recoverable.]
[Table, partially legible: relative speed of computer models. Models listed: S/360 Models 65, 75, 91, 95, and 195; S/370 Model 165; 6500; 6600; 7600; HITAC 8800. Legible relative speeds: 1.5, 1.5, 2.5, 3.5, 5, 5, 7, 8, 8.]
* Data taken from Table E, page 546, Program for the study conference
on the Modeling Aspects of GATE, Bulletin of the American
Meteorological Society, Vol. 54, No. 6, June 1973.
ACKNOWLEDGMENTS
It would not be possible to acknowledge all the contributors
to the development of the ASC; but particular recognition