Chapter01 Exercises

dce
2010
COMPUTER ARCHITECTURE CE2010

BK
TP.HCM
Faculty of Computer Science and Engineering
Department of Computer Engineering
Nam Ho
Chapter 1
Computer Abstraction and Technology - Exercises
Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, 2008
Computer Architecture Chapter 1
2010, Dr. Dinh Duc Anh Vu
Pitfall: MIPS as a Performance Metric

MIPS: Millions of Instructions Per Second
It does not account for:
- Differences in ISAs between computers
- Differences in complexity between instructions
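$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^6} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6}$$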
CPI varies between programs on a given CPU

Using MIPS
There are three problems with MIPS:
- MIPS specifies the instruction execution rate but not the capabilities of the instructions
- MIPS varies between programs on the same computer
- MIPS can vary inversely with performance (see the next example)
Exercise 1
Consider the machine with the following three instruction classes and CPI:
Instruction class    CPI for this instruction class
A                    1
B                    2
C                    3
Now suppose we measure the code for the same program from two different compilers and obtain the following data:
Instruction count (in billions) for each instruction class:

Code from     A     B     C
Compiler 1    5     1     1
Compiler 2    10    1     1
Assume that the machine's clock rate is 500 MHz. Which code sequence will execute faster according to execution time? According to MIPS?
Exercise 1
Using the formula for CPU clock cycles:
$$\text{CPU clock cycles} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \text{Instruction count}_i\right)$$
Execution time = CPU clock cycles / clock rate
Compiler 1:
CPU clock cycles = $(5 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 = 10 \times 10^9$ cycles
Execution time = $(10 \times 10^9) / (500 \times 10^6) = 20$ seconds
Compiler 2:
CPU clock cycles = $(10 \times 1 + 1 \times 2 + 1 \times 3) \times 10^9 = 15 \times 10^9$ cycles
Execution time = $(15 \times 10^9) / (500 \times 10^6) = 30$ seconds
Therefore compiler 1 generates the faster program.
Exercise 1
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6}$$
Compiler 1:
$$\text{MIPS} = \frac{(5 + 1 + 1) \times 10^9}{20 \times 10^6} = 350$$
Compiler 2:
$$\text{MIPS} = \frac{(10 + 1 + 1) \times 10^9}{30 \times 10^6} = 400$$
Although compiler 2 has the higher MIPS rating, the code generated by compiler 1 runs faster.
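The arithmetic of Exercise 1 can be checked with a short script. This is a minimal sketch: the CPI values, instruction counts and clock rate come from the exercise, while the function and variable names are only illustrative.

```python
# CPI per instruction class and clock rate, as given in Exercise 1.
CPI = {"A": 1, "B": 2, "C": 3}
CLOCK_RATE_HZ = 500e6

# Instruction counts per class (raw counts, i.e. billions written out).
COUNTS = {
    "compiler 1": {"A": 5e9, "B": 1e9, "C": 1e9},
    "compiler 2": {"A": 10e9, "B": 1e9, "C": 1e9},
}


def clock_cycles(counts):
    """Sum of CPI_i * instruction count_i over all classes."""
    return sum(CPI[cls] * n for cls, n in counts.items())


def execution_time(counts):
    """Execution time in seconds = clock cycles / clock rate."""
    return clock_cycles(counts) / CLOCK_RATE_HZ


def mips(counts):
    """MIPS = instruction count / (execution time * 10^6)."""
    return sum(counts.values()) / (execution_time(counts) * 1e6)


for name, counts in COUNTS.items():
    print(f"{name}: {execution_time(counts):.0f} s, {mips(counts):.0f} MIPS")
```

The script reports 20 s and 350 MIPS for compiler 1 against 30 s and 400 MIPS for compiler 2, so the ranking by MIPS is the reverse of the ranking by execution time.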
Exercise 2
The PowerPC G4 can execute 4 instructions per cycle at a clock rate of 867 MHz. What is the MIPS rate?
Answer:
CPI = 1/4
MIPS = $1 / (\text{CPI} \times \text{Cycle time} \times 10^6)$
MIPS = $\text{Clock rate} / (\text{CPI} \times 10^6)$
MIPS = $4 \times 867 = 3468$
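As a quick check of Exercise 2, the formula MIPS = clock rate / (CPI × 10^6) can be evaluated directly. The helper name `mips_from_clock` is hypothetical, and the result is a peak rate, since it assumes all 4 instructions issue every cycle.

```python
def mips_from_clock(clock_rate_hz, cpi):
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


# PowerPC G4 figures from the exercise: 4 instructions per cycle at 867 MHz.
peak_mips = mips_from_clock(867e6, cpi=1 / 4)
print(peak_mips)  # 3468.0
```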
Speedup - Amdahl's Law

$$\text{Speedup} = \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}} = \frac{T_{\text{before improved}}}{T_{\text{improved}}}$$
$$\text{Speedup} = \frac{1}{(1 - f_{\text{enhanced}}) + \dfrac{f_{\text{enhanced}}}{n}}, \quad n \text{ is the improvement factor}$$
$$f_{\text{enhanced}} = \frac{T_{\text{affected}}}{T_{\text{before improved}}}, \quad T_{\text{before improved}} = T_{\text{affected}} + T_{\text{unaffected}}$$
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$
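The two ways of computing speedup, from the fraction enhanced and from the raw times, can be compared in a few lines. This is a minimal sketch with hypothetical function names, and the 80 s / 20 s split in the usage lines is an invented example, not data from the slides.

```python
def amdahl_speedup(f_enhanced, n):
    """Speedup = 1 / ((1 - f_enhanced) + f_enhanced / n)."""
    if not 0.0 <= f_enhanced <= 1.0:
        raise ValueError("f_enhanced must lie in [0, 1]")
    if n <= 0:
        raise ValueError("improvement factor n must be positive")
    return 1.0 / ((1.0 - f_enhanced) + f_enhanced / n)


def speedup_from_times(t_affected, t_unaffected, n):
    """Speedup = T_before_improved / T_improved, from the raw times."""
    t_before = t_affected + t_unaffected
    t_improved = t_affected / n + t_unaffected
    return t_before / t_improved


# Hypothetical example: 80 s of a 100 s run is affected, and sped up 4 times.
print(f"{amdahl_speedup(0.8, 4):.2f}")
print(f"{speedup_from_times(80.0, 20.0, 4):.2f}")
```

Both calls print the same value because f_enhanced = 80/100 = 0.8 in the second call.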
Exercise 3
Consider an implementation of an ISA. The clock rate is 2 GHz. What is the execution time of the following program?

                Arith    Store    Load    Branch
Instructions    500      50       100     50
CPI             1        5        5       2

Execution time = $(500 \times 1 + 50 \times 5 + 100 \times 5 + 50 \times 2) \times 0.5 \times 10^{-9}$ s = 675 ns

If the number of load instructions can be reduced by one-half, what is the speedup and the CPI?

Execution time = $(500 \times 1 + 50 \times 5 + 50 \times 5 + 50 \times 2) \times 0.5 \times 10^{-9}$ s = 550 ns
Speedup = $675 / 550 \approx 1.23$
CPI = Execution time × Clock rate / Instruction count
Instruction count after the change = $500 + 50 + 50 + 50 = 650$
CPI = $550 \times 10^{-9} \times 2 \times 10^9 / 650 \approx 1.69$
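The Exercise 3 figures can be reproduced with the sketch below. The instruction mix, CPI values and clock rate are from the exercise; the names are illustrative. Note that the average CPI here divides by the 650 instructions that remain once the loads are halved.

```python
CLOCK_RATE_HZ = 2e9
CPI = {"arith": 1, "store": 5, "load": 5, "branch": 2}

# Instruction mix from the exercise, and the same mix with half the loads.
original = {"arith": 500, "store": 50, "load": 100, "branch": 50}
reduced = dict(original, load=original["load"] // 2)


def cycles(mix):
    """Total clock cycles = sum of CPI_i * count_i."""
    return sum(CPI[kind] * n for kind, n in mix.items())


def exec_time_ns(mix):
    """Execution time in nanoseconds."""
    return cycles(mix) * 1e9 / CLOCK_RATE_HZ


def average_cpi(mix):
    """Average CPI = total cycles / total instruction count."""
    return cycles(mix) / sum(mix.values())


speedup = exec_time_ns(original) / exec_time_ns(reduced)
print(f"{exec_time_ns(original):.0f} ns -> {exec_time_ns(reduced):.0f} ns")
print(f"speedup {speedup:.2f}, CPI {average_cpi(reduced):.2f}")
```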
Exercise 4
$$\text{Speedup} = \frac{\text{Execution time}_{\text{old}}}{\text{Execution time}_{\text{new}}} = \frac{T_{\text{before improved}}}{T_{\text{improved}}}$$
Prove that
$$\text{Speedup} = \frac{1}{(1 - f_{\text{enhanced}}) + \dfrac{f_{\text{enhanced}}}{n}}$$
where $n$ is the improvement factor and
$$f_{\text{enhanced}} = \frac{T_{\text{affected}}}{T_{\text{before improved}}}, \quad T_{\text{before improved}} = T_{\text{affected}} + T_{\text{unaffected}}$$
$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$
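One possible derivation, sketched using only the definitions stated in this exercise (with $n$ as the improvement factor), divides numerator and denominator by $T_{\text{before improved}}$:

```latex
\begin{align*}
\text{Speedup}
  &= \frac{T_{\text{before improved}}}{T_{\text{improved}}}
   = \frac{T_{\text{before improved}}}
          {\dfrac{T_{\text{affected}}}{n} + T_{\text{unaffected}}} \\
  &= \frac{1}
          {\dfrac{1}{n}\cdot\dfrac{T_{\text{affected}}}{T_{\text{before improved}}}
           + \dfrac{T_{\text{unaffected}}}{T_{\text{before improved}}}} \\
  &= \frac{1}
          {\dfrac{f_{\text{enhanced}}}{n}
           + \dfrac{T_{\text{before improved}} - T_{\text{affected}}}{T_{\text{before improved}}}} \\
  &= \frac{1}{(1 - f_{\text{enhanced}}) + \dfrac{f_{\text{enhanced}}}{n}}
\end{align*}
```

The third line uses $T_{\text{unaffected}} = T_{\text{before improved}} - T_{\text{affected}}$, so the unaffected share of the original time is $1 - f_{\text{enhanced}}$.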

Exercise 5
A common transformation required in graphics processors is square root. Implementations of floating-point (FP) square root vary significantly in performance, especially among processors designed for graphics. Suppose FP square root (FPSQR) is responsible for 20% of the execution time of a critical graphics benchmark. One proposal is to enhance the FPSQR hardware and speed up this operation by a factor of 10. The other alternative is just to try to make all FP instructions in the graphics processor run faster by a factor of 1.6; FP instructions are responsible for half of the execution time for the application. Compare these two design alternatives.
Exercise 5
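$$\text{Speedup}_{\text{FPSQR}} = \frac{1}{(1 - 0.2) + \dfrac{0.2}{10}} = \frac{1}{0.82} = 1.22$$
$$\text{Speedup}_{\text{FP}} = \frac{1}{(1 - 0.5) + \dfrac{0.5}{1.6}} = \frac{1}{0.8125} = 1.23$$
Improving all FP instructions is slightly better, because they account for a larger share of the execution time.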
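The two alternatives of Exercise 5 can be compared numerically. This sketch plugs the fractions and factors from the exercise into the speedup formula; the function name is illustrative.

```python
def amdahl_speedup(f_enhanced, n):
    """Speedup = 1 / ((1 - f_enhanced) + f_enhanced / n)."""
    return 1.0 / ((1.0 - f_enhanced) + f_enhanced / n)


# FPSQR: 20% of execution time, sped up by a factor of 10.
speedup_fpsqr = amdahl_speedup(0.2, 10)
# All FP instructions: 50% of execution time, sped up by a factor of 1.6.
speedup_fp = amdahl_speedup(0.5, 1.6)

print(f"FPSQR: {speedup_fpsqr:.2f}, all FP: {speedup_fp:.2f}")
```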
Geometric Mean
$$\text{GM} = \sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$
The advantage of the geometric mean is that:
- It is independent of the running times of the individual programs
- It doesn't matter which computer is used for normalization

The drawback to using geometric means of execution times is that:
- They violate our fundamental principle of performance measurement: they do not predict execution time
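The normalization property can be illustrated numerically. In the sketch below all execution times and machine names (X, Y, REF1, REF2) are hypothetical, invented purely for the demonstration; the geometric mean is computed through logarithms.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values, computed via logs."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one positive value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical execution times (seconds) of three programs on four machines.
times = {
    "X": [2.0, 50.0, 8.0],
    "Y": [4.0, 25.0, 16.0],
    "REF1": [1.0, 100.0, 5.0],
    "REF2": [10.0, 10.0, 40.0],
}


def normalized_gm(machine, reference):
    """Geometric mean of execution time ratios relative to a reference."""
    ratios = (t / r for t, r in zip(times[machine], times[reference]))
    return geometric_mean(ratios)


# X compared with Y, once per reference machine.
ratio_ref1 = normalized_gm("X", "REF1") / normalized_gm("Y", "REF1")
ratio_ref2 = normalized_gm("X", "REF2") / normalized_gm("Y", "REF2")
print(math.isclose(ratio_ref1, ratio_ref2))  # True
```

The comparison of X with Y comes out the same for either reference machine, yet neither ratio says anything about the total execution time of the three programs.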
Geometric Mean
According to the geometric mean, A and B have the same performance, and the performance of C is 0.63 of A or B. Unfortunately, the total execution time of A is 10 times longer than that of B, and that of B in turn is about 3 times longer than that of C.
Exercise 6
Show that, for the geometric mean, it does not matter which computer is used for normalization.
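One way to approach this, sketched with notation that is not in the slides: let $T_{X,i}$ be the execution time of program $i$ on computer $X$, and let $R$ be the reference computer used for normalization. Comparing two computers $X$ and $Y$ by their geometric means gives:

```latex
\begin{align*}
\frac{\text{GM}_R(X)}{\text{GM}_R(Y)}
  &= \frac{\sqrt[n]{\prod_{i=1}^{n} \dfrac{T_{X,i}}{T_{R,i}}}}
          {\sqrt[n]{\prod_{i=1}^{n} \dfrac{T_{Y,i}}{T_{R,i}}}}
   = \sqrt[n]{\prod_{i=1}^{n} \frac{T_{X,i} / T_{R,i}}{T_{Y,i} / T_{R,i}}}
   = \sqrt[n]{\prod_{i=1}^{n} \frac{T_{X,i}}{T_{Y,i}}}
\end{align*}
```

Every $T_{R,i}$ cancels, so the result does not depend on the choice of $R$.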
Fallacy: MFLOPS
MFLOPS: Million Floating-point Operations Per Second
Fallacy: MFLOPS is a consistent and useful measure of performance.
For example, the Cray C90 has no divide instruction, while the Intel Pentium has divide, square root, sine, and cosine.
Exercise 7
Consider the programs in the following table, running on a processor with a clock rate of 3 GHz. Find the MFLOPS for programs a and b.
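Program a:
FP operations = $10^6 \times 0.4 = 4 \times 10^5$
FP clock cycles = CPI × No. of FP instructions = $4 \times 10^5$
$T_{fp} = 4 \times 10^5 \times 0.33 \times 10^{-9} = 1.32 \times 10^{-4}$ s, then MFLOPS = $3.03 \times 10^3$
Program b:
FP operations = $3 \times 10^6 \times 0.4 = 1.2 \times 10^6$
FP clock cycles = CPI × No. of FP instructions = $0.70 \times 1.2 \times 10^6 = 0.84 \times 10^6$
$T_{fp} = 0.84 \times 10^6 \times 0.33 \times 10^{-9} = 2.77 \times 10^{-4}$ s, then MFLOPS = $4.33 \times 10^3$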
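The MFLOPS calculation can be sketched as below. The instruction counts, the 0.4 FP fraction and the FP CPI of 0.70 for program b are read off the worked solution; the FP CPI of 1 for program a is inferred from its cycle count and is an assumption. The script uses the exact 3 GHz clock, so it gives about 3000 and 4286 MFLOPS; the slightly higher figures in the worked solution come from rounding the cycle time to 0.33 ns.

```python
CLOCK_RATE_HZ = 3e9

# (instruction count, FP fraction, FP CPI) per program.
# The FP CPI of 1.0 for program a is inferred, not stated explicitly.
programs = {
    "a": (1e6, 0.4, 1.0),
    "b": (3e6, 0.4, 0.70),
}


def mflops(instr_count, fp_fraction, fp_cpi):
    """MFLOPS = FP operations / (FP execution time * 10^6)."""
    fp_ops = instr_count * fp_fraction
    fp_time = fp_ops * fp_cpi / CLOCK_RATE_HZ  # seconds spent in FP work
    return fp_ops / (fp_time * 1e6)


for name, params in programs.items():
    print(f"program {name}: {mflops(*params):.0f} MFLOPS")
```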
