Parallel and Distributed Computing
Department of Computer Science and Engineering (DEI)
Instituto Superior Técnico
2011-11-14
Outline
Simple Example: Opportunities for Parallelism
Speedup and Overheads
Application Areas
Parallel Systems
No good?
$S = p$ ($t_{parallel} = t_{sequential} / p$)
Yes!
data transfers (or more generally, communication among tasks)
task startup / finalize
load balancing
inherent sequential portions of computation
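The effect of these overheads on speedup can be illustrated with a toy timing model. The sketch below assumes the usual definition of speedup, S = t_sequential / t_parallel, and lumps all overheads into one additive term. The function names, the additive model, and the numbers are illustrative assumptions, not taken from the slides.

```python
def speedup(t_sequential, t_parallel):
    """Speedup S = t_sequential / t_parallel."""
    return t_sequential / t_parallel


def parallel_time(t_sequential, p, t_overhead=0.0):
    """Hypothetical model: perfectly divisible work plus a fixed overhead."""
    return t_sequential / p + t_overhead


t_seq = 100.0  # seconds, made-up value
for p in (2, 4, 8, 16):
    ideal = speedup(t_seq, parallel_time(t_seq, p))
    real = speedup(t_seq, parallel_time(t_seq, p, t_overhead=5.0))
    print(f"p={p:2d}  ideal S={ideal:5.2f}  with overhead S={real:5.2f}")
```

With zero overhead the model gives S = p. Any fixed overhead makes the gap between ideal and achieved speedup grow as p increases.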
$\lim_{p \to \infty} S(p, f) = \frac{1}{f}$
[Figure: speedup curves for f = 0%, 5%, 10%, and 20%]
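The limit above is consistent with the standard statement of Amdahl's Law, S(p, f) = 1 / (f + (1 - f) / p), assuming f is the inherently sequential fraction of the computation. The sketch below tabulates that formula for the same four values of f. The function name and the chosen processor counts are illustrative.

```python
def amdahl_speedup(p, f):
    """Amdahl's Law: speedup on p processors with sequential fraction f."""
    return 1.0 / (f + (1.0 - f) / p)


for f in (0.0, 0.05, 0.10, 0.20):
    row = "  ".join(f"{amdahl_speedup(p, f):7.2f}" for p in (1, 4, 16, 64, 1024))
    limit = "unbounded" if f == 0 else f"{1.0 / f:.0f}"
    print(f"f={f:4.0%}: {row}   limit = {limit}")
```

Even a small sequential fraction caps the achievable speedup: with f = 5% the speedup can never exceed 20, no matter how many processors are used.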
Application Areas
Why bother with parallel computation?
Continued demand for greater computational power from many different domains!
Two major classes of problems in parallel computation:
Global Environmental/Ecosystem Modeling
Biomechanics and biomedical imaging
Fluid dynamics
Molecular nanotechnology
Nuclear power and weapons simulations
Numerical weather forecasting
Computer graphics / animation
Basic Local Alignment Search Tool (BLAST) in bioinformatics
Monte-Carlo methods
Genetic algorithms
Atmospheric conditions (temperature, pressure, humidity, etc.) for each cell are computed as a function of the conditions in neighboring cells in the current and previous time intervals.
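This kind of neighbor-based update can be sketched as a simple grid stencil. The example below is a simplification under stated assumptions: a 2D grid with one scalar per cell, an update that depends only on the previous time step, and a plain five-point average standing in for the real atmospheric equations. The function name `step` is illustrative.

```python
def step(grid):
    """Return the next time step of a 2D grid (list of lists of floats).

    Each interior cell becomes the average of itself and its four
    neighbors; boundary cells are kept fixed. The averaging rule is a
    placeholder for the real atmospheric equations.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 5.0
    return new


grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 100.0  # a single hot cell
for _ in range(3):
    grid = step(grid)
```

Every new cell value reads only the old grid, so within one time step all cells can be updated independently. That independence is what makes the problem a natural fit for parallel execution.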
Each body has a given position, velocity, and acceleration that need to be computed for every time interval.
Each body attracts (and/or repels) every other body. For $n$ bodies, there are a total of $n^2$ interactions that need to be accounted for.
Example: a galaxy has more than $10^{11}$ stars, leading to more than $10^{22}$ floating point operations for each time interval!
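The quadratic cost comes from the double loop over bodies. A minimal direct-sum sketch follows, assuming purely gravitational attraction in two dimensions and bodies at distinct positions. The function name and data layout are illustrative choices, not from the slides.

```python
import math

G = 6.674e-11  # gravitational constant, SI units


def accelerations(positions, masses):
    """Direct-sum gravitational acceleration on each body (2D).

    Loops over all ordered pairs (i, j) with i != j, i.e. n * (n - 1)
    interactions. Assumes no two bodies share the same position.
    """
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            dist = math.hypot(dx, dy)
            a = G * masses[j] / dist ** 2  # magnitude of pull from body j
            acc[i][0] += a * dx / dist
            acc[i][1] += a * dy / dist
    return acc


acc = accelerations([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)], [1.0, 1.0, 1.0])
print(acc)
```

The outer loop iterations are independent of one another, so the bodies can be divided among processors, with each processor needing read access to all positions and masses.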
Processor Evolution
Supercomputer Evolution
First PetaFLOPS system available in 2009!
Estimate of the human brain's computational power: $10^{14}$ neural connections at 200 calculations per second ⇒ 20 PFLOPS
José Monteiro (DEI / IST), Parallel and Distributed Computing 2, 2011-11-14
Types of Supercomputers
Processor Arrays (SIMD)
Name associated with vector processing, very popular in early supercomputers.
Multicore (SMP)
Set of processors sharing a common main memory.
Clusters
Processors with individual main memory linked together using InfiniBand, Quadrics, Myrinet, or Gigabit Ethernet connections.
COW / NOW: Cluster / Network Of Workstations
Beowulf: cluster made of PCs running Linux using TCP/IP (COTS: Commodity-Off-The-Shelf)
Constellation
MPP / cluster where each node is a multicore.
Warehouse-size Computers
Multicores
Sample of today's multicore processors:
AMD
Opteron: dual, quad, hex, 8-, 12-cores
Phenom: dual, quad, hex cores
Intel
Core i7: six hyperthreaded cores
Dunnington (Xeon): six cores
Sun
Niagara: 8 cores; 8-way fine-grain multithreading per core
IBM
Power 7: dual, quad, hex, 8-core
Cell: 1 PPC core; 8 SPEs w/ SIMD parallelism
Next Class
levels of parallelism