
Exascale Computing:
Applications Performance and Energy Efficiency

Horst Simon
Lawrence Berkeley National Laboratory and UC Berkeley

Santa Barbara Summit on Energy Efficiency
April 26, 2011
Overview

•  Exascale Applications in Computational Science
•  From 1999 to 2009: Evolution from Teraflops to Petaflops Computing
•  From 2010 to 2020: Key Technology Changes towards Exaflops Computing

2
A decadal DOE plan for providing exascale
applications and technologies for DOE mission needs

Rick Stevens and Andy White, co-chairs

Pete Beckman, Ray Bair-ANL; Jim Hack, Jeff Nichols, Al Geist-ORNL;
Horst Simon, Kathy Yelick, John Shalf-LBNL; Steve Ashby, Moe
Khaleel-PNNL; Michel McCoy, Mark Seager, Brent Gorda-LLNL; John
Morrison, Cheryl Wampler-LANL; James Peery, Sudip Dosanjh, Jim
Ang-SNL; Jim Davenport, Tom Schlagel-BNL; Fred Johnson, Paul
Messina, ex officio

3
Process for identifying exascale applications and technology
for DOE missions ensures broad community input

•  Town Hall Meetings, April-June 2007
•  Scientific Grand Challenges Workshops, Nov. 2008 – Oct. 2009
   –  Climate Science (11/08)
   –  High Energy Physics (12/08)
   –  Nuclear Physics (1/09)
   –  Fusion Energy (3/09)
   –  Nuclear Energy (5/09)
   –  Biology (8/09)
   –  Material Science and Chemistry (8/09)
   –  National Security (10/09)
   –  Cross-cutting technologies (2/10)
•  Exascale Steering Committee
   –  “Denver” vendor NDA visits, 8/2009
   –  SC09 vendor feedback meetings
   –  Extreme Architecture and Technology Workshop, 12/2009
•  International Exascale Software Project
   –  Santa Fe, NM 4/2009; Paris, France 6/2009; Tsukuba, Japan 10/2009

[Slide labels: the workshops span MISSION IMPERATIVES and FUNDAMENTAL SCIENCE]

4
DOE mission imperatives require simulation and
analysis for policy and decision making

•  Climate Change: Understanding, mitigating and adapting to the
   effects of global warming
   –  Sea level rise
   –  Severe weather
   –  Regional climate change
   –  Geologic carbon sequestration
•  Energy: Reducing U.S. reliance on foreign energy sources and
   reducing the carbon footprint of energy production
   –  Reducing time and cost of reactor design and deployment
   –  Improving the efficiency of combustion energy systems
•  National Nuclear Security: Maintaining a safe, secure and
   reliable nuclear stockpile
   –  Stockpile certification
   –  Predictive scientific challenges
   –  Real-time evaluation of urban nuclear detonation

Accomplishing these missions requires exascale resources

5
Exascale simulation will enable fundamental
advances in basic science

•  High Energy & Nuclear Physics
   –  Dark energy and dark matter
   –  Fundamentals of fission and fusion reactions
•  Facility and experimental design
   –  Effective design of accelerators
   –  Probes of dark energy and dark matter
   –  ITER shot planning and device control
•  Materials / Chemistry
   –  Predictive multi-scale materials modeling: observation to control
   –  Effective, commercial technologies in renewable energy,
      catalysts, batteries and combustion
•  Life Sciences
   –  Better biofuels
   –  Sequence to structure to function

[Images: ITER, the ILC, a Hubble lensing image, the structure of nucleons]

These breakthrough scientific discoveries
and facilities require exascale applications and resources

6
Advanced Computation in Energy Science
at LBNL

•  Global Scale Reactive Transport Modeling of CH4 hydrates (M. Reagan)
•  Pore Scale Reactive Transport Modeling of CO2 sequestration (D. Trebotich)
•  Molecular Dynamics Simulations of Natural Nanofluids (I. Bourg)
•  First-Principles Calculations of the Materials Genome (K. Persson)

7
First-principles Methods Predictive Power

[Figure panels: ionic diffusion (exp.: DLi = 4.4 x 10-6 cm2 s-1),
electronic state, phase diagrams (glass forming region; stages I-III),
particle morphology (H2O-covered vs. O-covered surfaces), and
solubility in water]

K. Dokko et al., J. Mater. Chem. 17, 4803 (2007).
Tang et al., JACS 132, 596 (2010).

9
Leveraging the Information Age

The ‘Google’ of Materials Science (LBNL)

[Diagram: scalable infrastructure with a web front end, an automation
engine, a MongoDB database, and an API, backed by NERSC computing]

•  Invert the design problem
•  Computing ~200 compounds/day
•  Share the information

10
Already at Work for Li-ion Cathodes

No Li-containing carbonophosphates at all
are known in nature!

•  Completely new class of materials discovered through computations
•  New phosphate synthesized @MIT based on computational predictions

“Sidorenkites”: Li3M(CO3)(PO4) (M = Fe, Mn, Ni, Co)

Real battery data!

Courtesy of Ceder et al., MIT

11
Towards a Materials Genome

Computing all properties of all inorganic materials

LBNL EETD: Kristin Persson, Michael Kocher
MIT: Gerbrand Ceder, Anubhav Jain, Shyue Ping Ong, Geoffroy Hautier
LBNL NERSC: Kathy Yelick, David Skinner, Eli Dart, Shreyas Cholia,
Daniel Gunter, Annette Greiner, Zhenji Zhao
LBNL CRD: Juan Meza, David Bailey, Alex Kaiser, Maciej Haranczyk

12
Overview

•  Exascale Applications in Computational Science
•  From 1999 to 2009: Evolution from Teraflops to Petaflops Computing
•  From 2010 to 2020: Key Technology Changes towards Exaflops Computing

13
Extreme Scale Computing Today

Top500 list, November 2010:

| Machine    | Place    | Speed (max) | On list since |
| Jaguar     | ORNL     | 1.75 PF     | 2009 (#2)     |
| Hopper     | LBNL     | 1.05 PF     | 2010 (#5)     |
| Roadrunner | LANL     | 1.04 PF     | 2008 (#7)     |
| Cielo      | SNL/LANL | 0.81 PF     | 2010 (#10)    |
| Intrepid   | ANL      | 0.458 PF    | 2009 (#13)    |
| Dawn       | LLNL     | 0.415 PF    | 2009 (#16)    |

(NNSA Advanced Simulation and Computing: Dawn)

ASC and ASCR provide much more than machines:
•  Applications (Computational Science)
•  Algorithms (Applied Mathematics)
•  Systems (Computer Science)
•  Integration (SciDAC, Campaigns)

14
Jaguar: Most Powerful Computer in the U.S. in 2011,
Most Powerful Computer in the World in 2009
(#1 Nov. 2009, #2 Nov. 2010)

| Peak performance | 2.33 PF |
| System memory    | 300 TB  |
| Disk space       | 10 PB   |
| Processors       | 224K    |
| Power            | 6.95 MW |
15
ASCI Red: World’s Most Powerful
Computer in 1999
(#1 Nov. 1999)

| Peak performance | 3.15 TF  |
| System memory    | 1.212 TB |
| Disk space       | 12.5 TB  |
| Processors       | 9298     |
| Power            | 850 kW   |

16
Comparison:
Jaguar (2009) vs. ASCI Red (1999)

•  739x performance (LINPACK)
•  267x memory
•  800x disk
•  24x processors/cores
•  8.2x power

Parallelism and faster processors made about equal contributions to
the performance increase; the 8.2x power means a significant increase
in operations cost.

Essentially the same architecture and software environment.

17
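The ratios above can be recomputed in a few lines. This is a minimal sketch (in Python, not from the original slides), using published Jaguar and ASCI Red figures: LINPACK Rmax from the Top500 lists (1,759 TF and 2.379 TF respectively), with disk, processor, and power numbers matching the spec tables on the preceding slides.

```python
# Recompute the slide's Jaguar (2009) vs. ASCI Red (1999) ratios.
# Rmax values are from the Top500 lists; disk, cores, and power
# match the two spec tables.

jaguar   = {"rmax_tf": 1759.0, "disk_tb": 10000.0, "cores": 224256, "power_kw": 6950.0}
asci_red = {"rmax_tf": 2.379,  "disk_tb": 12.5,    "cores": 9298,   "power_kw": 850.0}

ratios = {k: jaguar[k] / asci_red[k] for k in jaguar}
print(f"performance (LINPACK): {ratios['rmax_tf']:.0f}x")   # ~739x
print(f"disk:                  {ratios['disk_tb']:.0f}x")   # 800x
print(f"processors/cores:      {ratios['cores']:.0f}x")     # ~24x
print(f"power:                 {ratios['power_kw']:.1f}x")  # ~8.2x
```

The performance, disk, processor, and power ratios reproduce the slide's numbers directly.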
Traditional Sources of Performance
Improvement are Flat-Lining (2004)

•  New constraint: 15 years of exponential clock rate growth has ended
•  Moore’s Law reinterpreted:
   –  How do we use all of those transistors to keep performance
      increasing at historical rates?
   –  Industry response: #cores per chip doubles every 18 months
      instead of clock frequency!

Figure courtesy of Kunle Olukotun, Lance Hammond, Herb Sutter,
and Burton Smith

18
Performance Development

[Chart: Top500 Sum, N=1, and N=500 performance over time, from
59.7 Gflop/s (N=1), 400 Mflop/s (N=500), and a 1.17 Tflop/s sum in
1993 to 2.57 Pflop/s (N=1), 31.12 Tflop/s (N=500), and a 43.66
Pflop/s sum in 2010; y-axis from 100 Mflop/s to 100 Pflop/s]

Projected Performance Development

[Chart: extrapolated Sum, N=1, and N=500 trend lines on the same axes]

Concurrency Levels

Jack‘s Notebook
Moore’s Law reinterpreted

•  Number of cores per chip will double every two years
•  Clock speed will not increase (possibly decrease)
•  Need to deal with systems with millions of concurrent threads
•  Need to deal with inter-chip parallelism as well as intra-chip
   parallelism

22
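To see where "millions of concurrent threads" comes from, here is a rough back-of-the-envelope sketch (an illustration, not from the slides): start from an assumed Jaguar-class system of ~224K cores in 2010 and apply the doubling every two years.

```python
# Illustrative projection (assumption: ~224K cores in 2010, core count
# doubling every two years per the reinterpreted Moore's Law).

cores = 224_256  # assumed 2010 starting point
for year in range(2012, 2021, 2):
    cores *= 2
    print(year, f"{cores:,} cores")
# Five doublings by 2020: 224,256 * 2**5 = 7,176,192, i.e. millions of
# concurrent threads even before counting multiple threads per core.
```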
[Charts: Total Power Levels (kW) for TOP500 Systems; Power Efficiency
(Mflops/Watt) for Different Processor Generations]
Koomey’s Law

•  Computations per kWh have improved by a factor of about 1.5
   per year
•  “Assessing Trends in Electrical Efficiency over Time”, see
   IEEE Spectrum, March 2010

25
Overview

•  Exascale Applications in Computational Science
•  From 1999 to 2009: Evolution from Teraflops to Petaflops Computing
•  From 2010 to 2020: Key Technology Changes towards Exaflops Computing

26
Trend Analysis

•  Processors and systems have become more energy efficient over time
   –  Koomey’s Law shows a factor of 1.5 per year improvement in
      computations/kWh
•  Supercomputers have become more powerful over time
   –  TOP500 data show a factor of 1.86 per year increase in
      computations/sec per system
•  Consequently, power per system increases by a factor of about
   1.24 per year
•  Based on these projections: a 495 Pflop/s Linpack-Rmax system
   drawing 60 MW in 2020

27
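The arithmetic behind these bullets can be checked in a few lines. This is a sketch; the ~1 Pflop/s, ~7 MW 2010 baseline is an assumption chosen to match the Jaguar-era starting point.

```python
# Trend projection from the per-year growth factors in the bullets above.
perf_growth = 1.86   # Top500: computations/sec per system, per year
eff_growth  = 1.5    # Koomey's Law: computations/kWh, per year

power_growth = perf_growth / eff_growth   # 1.24x per year
years = 10                                # 2010 -> 2020

print(f"power growth: {power_growth:.2f}x/year")
print(f"performance:  {1.0 * perf_growth ** years:.0f} Pflop/s")  # ~495
print(f"power:        {7.0 * power_growth ** years:.0f} MW")      # ~60
```

Compounding 1.86x for ten years from ~1 Pflop/s gives the ~495 Pflop/s figure, and 1.24x for ten years from ~7 MW gives the ~60 MW figure.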
We won’t reach Exaflops with the current approach

[Chart from Peter Kogge, DARPA Exascale Study]

28
… and the power costs will still be staggering

[Chart: projected System Power (MW), log scale from 1 to 1000 MW,
2005-2020]

From Peter Kogge, DARPA Exascale Study

29
Potential System Architecture Targets

| System attributes          | 2010     | “2015”             | “2018”                |
| System peak                | 2 Peta   | 200 Petaflop/sec   | 1 Exaflop/sec         |
| Power                      | 6 MW     | 15 MW              | 20 MW                 |
| System memory              | 0.3 PB   | 5 PB               | 32-64 PB              |
| Node performance           | 125 GF   | 0.5 TF or 7 TF     | 1 TF or 10 TF         |
| Node memory BW             | 25 GB/s  | 0.1 or 1 TB/sec    | 0.4 or 4 TB/sec       |
| Node concurrency           | 12       | O(100) or O(1,000) | O(1,000) or O(10,000) |
| System size (nodes)        | 18,700   | 50,000 or 5,000    | 1,000,000 or 100,000  |
| Total Node Interconnect BW | 1.5 GB/s | 20 GB/sec          | 200 GB/sec            |
| MTTI                       | days     | O(1 day)           | O(1 day)              |

30
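One consequence worth pulling out of these targets is the memory balance. As a sketch (numbers read off the targets, taking the low end of the "2018" memory range), PB of memory per Pflop/s of peak equals bytes per flop/s:

```python
# Memory balance implied by the architecture targets above:
# memory (PB) / peak (Pflop/s) = bytes per flop/s.
targets = {
    "2010":   (2.0,    0.3),    # (peak Pflop/s, memory PB)
    '"2015"': (200.0,  5.0),
    '"2018"': (1000.0, 32.0),   # low end of the 32-64 PB range
}
for name, (peak_pflops, memory_pb) in targets.items():
    print(f"{name}: {memory_pb / peak_pflops:.3f} bytes/flop")
# Falls from 0.150 (2010) to 0.025-0.032: exascale applications must
# live with roughly 5x less memory relative to compute.
```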
Comparison:
“2018” vs. Jaguar (2009)

•  500x performance (peak)
•  100x memory
•  5000x concurrency
•  3x power

All of the performance increase is based on more parallelism; the 3x
power keeps operating cost about the “same”.

Significantly different architecture and software environment.

31
DOE Exascale Technology Roadmap

•  Report from DOE Exascale Architecture and Technology Workshop,
   San Diego, Dec. 2009:
   http://extremecomputing.labworks.org/hardware/reports/FINAL_Arch&TechExtremeScale1-28-11.pdf

•  Paper by Dosanjh, Morrison, and Shalf at:
   http://www.nersc.gov/projects/reports/technical/ShalfVecpar2010.pdf

32
Where do we get 1000x performance
improvement for 10x power?

1.  Processors
2.  On-chip data movement
3.  System-wide data movement
4.  Memory Technology
5.  Resilience Mechanisms

33
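The goal in the title implies a 100x improvement in energy efficiency (flops/W). As a quick sketch (assuming, for illustration, a roughly 8-year window from a 2010 system to the "2018" target), compare that with the Koomey trend:

```python
# 1000x performance at 10x power requires 100x better flops/W.
required = 1000 / 10                  # 100x efficiency gain
per_year = required ** (1 / 8)        # over an assumed 8-year window
on_trend = 1.5 ** 8                   # Koomey's Law over the same window

print(f"needed:   {per_year:.2f}x per year")   # ~1.78x
print(f"on trend: {on_trend:.0f}x total")      # ~26x vs. 100x needed
# The shortfall is why all five areas above must contribute.
```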
Summary

•  Major challenges are ahead for extreme computing:
   –  Power
   –  Parallelism
   –  … and many others not discussed here
•  We will need completely new approaches and technologies to reach
   the Exascale level
•  This opens up many new opportunities for computer scientists and
   applied mathematicians

34