
Elmer: Open source finite element software for multiphysical problems

Elmer
Parallel computing with Elmer
Elmer Team
CSC IT Center for Science Ltd.
Elmer Course CSC, 9-10.1.2012
Outline
Parallel computing concepts
Parallel computing with Elmer
  Preprocessing with ElmerGrid
  Parallel solver ElmerSolver_mpi
  Postprocessing with ElmerGrid and ElmerPost
Introductory example: flow through a pipe junction
Parallel computing concepts
Parallel computation means executing tasks concurrently.
  A task encapsulates a sequential program and local data, and its interface to its environment.
  The data of the other tasks is remote.
Data dependency means that the computation of one task requires data from another task in order to proceed.
Parallel computers
Shared memory
  All cores can access the whole memory.
Distributed memory
  All cores have their own memory.
  Communication between the cores is needed in order to access the memory of other cores.
Current supercomputers combine the distributed and shared memory approaches.
Parallel programming models
Message passing (MPI, e.g. OpenMPI)
  Can be used both in distributed and shared memory computers.
  The programming model allows good parallel scalability.
  Programming is quite explicit.
Threads (pthreads, OpenMP)
  Can be used only in shared memory computers.
  Limited parallel scalability.
  Simpler or less explicit programming.

Execution model
A parallel program is launched as a set of independent, identical processes
  with the same program code and instructions.
  They can reside in different computation nodes, or even in different computers.
Current CPUs in your workstations
  Six cores (AMD Opteron Shanghai)
  Multi-threading: for example, OpenMP
High Performance Computing (HPC)
  Message passing: for example, OpenMPI
General remarks about parallel computing
Strong parallel scaling
  The size of the problem remains constant.
  Execution time decreases in proportion to the increase in the number of cores.
Weak parallel scaling
  The size of the problem is increased.
  Execution time remains constant when the number of cores increases in proportion to the problem size.
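Stated slightly more formally (the notation here is added for clarity, not from the slides): if T(N) is the wall-clock time on N cores, strong scaling is ideal when the speedup S(N) = T(1)/T(N) grows roughly like N for a fixed problem size, while weak scaling is ideal when T(N) stays roughly constant as the problem size and the core count are increased by the same factor.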
Parallel computing with Elmer
Domain decomposition
  An additional pre-processing step called mesh partitioning.
  Every domain runs its own ElmerSolver_mpi parallel process.
  Communication between the processes is handled with MPI.
  Recombination of the results into ElmerPost output.
Parallel computing with Elmer
Only selected linear algebra methods are available in parallel.
  Iterative solvers (Krylov subspace methods) are basically the same.
  Of the direct solvers, only MUMPS exists in parallel (uses OpenMP).
  With the additional HYPRE package, some linear methods are available only in parallel.
Parallel computing with Elmer
Preconditioners required by iterative methods are not the same as in serial.
  For example, ILUn; this may deteriorate parallel performance.
  The diagonal preconditioner is the same in parallel and hence exhibits exactly the same behavior in parallel as in serial.
Parallel computing with Elmer
Preprocessing
  Development of automated meshing algorithms.
Postprocessing
  Parallel postprocessing
  ParaView
Grand challenges
Parallel workflow
Scaling of wall clock time with dofs in the cavity lid case using GMRES+ILU0.
Simulation by Juha Ruokolainen, CSC, and visualization by Matti Gröhn, CSC.
Examples of parallel scaling of Elmer
(Plot: wall-clock time in seconds, 0 to 4000 s, versus the number of cores, 0 to 144.)
Serial mesh structure of Elmer

Header file contains the general dimensions: mesh.header
Nodes file contains the coordinates and ownership of the nodes: mesh.nodes
Elements file contains the composition of the bulk elements and their ownerships (bodies): mesh.elements
Boundary file contains the composition of the boundary elements, their ownerships (boundaries) and dependencies (parents): mesh.boundary
Parallel preprocessing with Elmer
Parallel mesh structure: "partitioning.N"
  Header files: part.1.header, part.2.header, ... part.N.header
  Nodes: part.1.nodes, part.2.nodes, ... part.N.nodes
  Elements: part.1.elements, part.2.elements, ... part.N.elements
  Boundary elements: part.1.boundary, part.2.boundary, ... part.N.boundary
  Shared nodes between partitions: part.1.shared, part.2.shared, ... part.N.shared
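For example, a mesh split into four partitions (following the naming above) is stored in a directory such as partitioning.4 containing:

  part.1.header  part.1.nodes  part.1.elements  part.1.boundary  part.1.shared
  part.2.header  part.2.nodes  part.2.elements  part.2.boundary  part.2.shared
  part.3.header  part.3.nodes  part.3.elements  part.3.boundary  part.3.shared
  part.4.header  part.4.nodes  part.4.elements  part.4.boundary  part.4.shared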
Parallel preprocessing with Elmer
The best way to partition
  Serial mesh -> ElmerGrid -> parallel mesh
General syntax
  ElmerGrid 2 2 existing.mesh [partoption]
Two principal partitioning techniques
  Along Cartesian axes (simple geometries or topologies)
  METIS library
Parallel preprocessing with Elmer
Ideal mesh partitioning
  Minimizes communication between computation nodes.
  Minimizes the relative number of mesh elements shared between computation nodes.
  Equal load between computation nodes.
Directional decomposition
  ElmerGrid 2 2 dir -partition Nx Ny Nz F
    -partition 2 2 1 0   (element-wise)
    -partition 2 2 1 1   (nodal)
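As a concrete sketch (the mesh directory name mesh is only a placeholder), the following call would create 2 x 2 x 1 = 4 partitions using element-wise division along x and y:

  ElmerGrid 2 2 mesh -partition 2 2 1 0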
Parallel computing with Elmer
Directional decomposition
  ElmerGrid 2 2 dir -partition Nx Ny Nz F -partorder nx ny nz
  Defines the ordering direction (the components of a vector).


Parallel computing with Elmer
Using the METIS library
  ElmerGrid 2 2 dir -metis N Method
    -metis 4 0   (PartMeshNodal)
    -metis 4 1   (PartMeshDual)
Parallel computing with Elmer
METIS
  ElmerGrid 2 2 dir -metis N Method
    -metis 4 2   (PartGraphRecursive)
    -metis 4 3   (PartGraphKway)
Parallel computing with Elmer
METIS
  ElmerGrid 2 2 dir -metis N Method
    -metis 4 4   (PartGraphPKway)
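Putting the METIS options together, a call such as the following (the mesh directory name is a placeholder) would create eight partitions with the PartGraphRecursive method:

  ElmerGrid 2 2 mesh -metis 8 2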
Parallel computing with Elmer
Halo elements
  ElmerGrid 2 2 dir -metis N Method -halo
  Necessary if using the discontinuous Galerkin method.
  Puts a "ghost cell" on each side of the partition boundary.
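For instance, a mesh intended for a discontinuous Galerkin run could be partitioned into four parts with halo elements as follows (the mesh directory name is a placeholder):

  ElmerGrid 2 2 mesh -metis 4 2 -halo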
Parallel computing with Elmer
More parallel options in ElmerGrid
  -indirect : create indirect connections.
  -periodic Fx Fy Fz : declare the periodic coordinate directions for parallel meshes.
  -partoptim : aggressive optimization of node sharing.
  -partbw : minimize the bandwidth of partition-partition couplings.
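These switches are simply appended to the partitioning call, for example (names are illustrative):

  ElmerGrid 2 2 mesh -metis 4 2 -partoptim -partbw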


Parallel computing with Elmer
mpirun -np N ElmerSolver_mpi
  Might change on other platforms.
  Might need a hostfile.
Needs an N-partitioned mesh.
Needs ELMERSOLVER_STARTINFO, which contains the name of the command file.
Optional libraries
  HYPRE
  MUMPS
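As a minimal sketch on a generic MPI platform (case name and partition count are illustrative):

  echo case.sif > ELMERSOLVER_STARTINFO
  mpirun -np 4 ElmerSolver_mpi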
Parallel version of ElmerSolver
Different behaviour of the ILU preconditioner:
  parts at the partition boundaries are not available.
  Sometimes works. If not, use HYPRE:
    Linear System Use Hypre = Logical True
Parallel version of ElmerSolver
Alternative preconditioners in HYPRE
  ParaSails (sparse approximate inverse preconditioner)
    Linear System Preconditioning = String "ParaSails"
  BoomerAMG (algebraic multigrid)
    Linear System Preconditioning = String "BoomerAMG"
Parallel version of ElmerSolver
Alternative solvers
  BoomerAMG (algebraic multigrid)
    Linear System Solver = "Iterative"
    Linear System Iterative Method = "BoomerAMG"
  Multifrontal parallel direct solver (MUMPS)
    Linear System Solver = "Direct"
    Linear System Direct Method = Mumps
Parallel version of ElmerSolver
Elmer writes the results partition-wise:
  name.0.ep, name.1.ep, ..., name.(n-1).ep
ElmerGrid fuses the result files into a single ElmerPost file:
  ElmerGrid 15 3 name
  Fuses all time steps (also non-existing ones) into a single file called name.ep (overwritten if it exists).
Special option for a partial fuse only (see the example below):
  -saveinterval start end step
  The first, last and (time)step for fusing the parallel data.


Parallel postprocessing
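For example, to fuse only every second time step between steps 1 and 10 of a case called name (the values are illustrative):

  ElmerGrid 15 3 name -saveinterval 1 10 2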
Introductory example: flow through a pipe junction
Flow through a pipe junction.
Boundary conditions:
  1. v_in = 1 cm/s
  2. None
  3. v_in = 1 cm/s
  4. and 5. No-slip (u_i = 0 m/s) on the walls.
Description of the problem
Partitioning of the mesh with ElmerGrid:
  ElmerGrid 2 2 mesh -out part_mesh -scale 0.01 0.01 0.01 -metis 4 2
  Scales the problem from cm to m.
  Creates a mesh with 4 partitions by using the PartGraphRecursive option of the METIS library.
Preprocessing
Header
  Mesh DB "." "flow"
End

Simulation
  Coordinate System = "Cartesian 3D"
  Simulation Type = "Steady"
  Output Intervals = Integer 1
  Post File = File "parallel_flow.ep"
  Output File = File "parallel_flow.result"
  Max Output Level = Integer 4
End
Solver input file
Solver 1
  Equation = "Navier-Stokes"
  Optimize Bandwidth = Logical True
  Linear System Solver = Iterative
  Linear System Direct Solver = Mumps
  Stabilization Method = Stabilized
  Nonlinear System Convergence Tolerance = Real 1.0E-03
  Nonlinear System Max Iterations = Integer 30
  Nonlinear System Newton After Iterations = Integer 1
  Nonlinear System Newton After Tolerance = Real 1.0E-03
End
Solver input file
Body 1
  Name = "fluid"
  Equation = 1
  Material = 1
  Body Force = 1
  Initial Condition = 1
End

Equation 1
  Active Solvers(1) = 1
  Convection = Computed
End
Solver input file
Initial Condition 1
  Velocity 1 = Real 0.0
  Velocity 2 = Real 0.0
  Velocity 3 = Real 0.0
  Pressure = Real 0.0
End

Body Force 1
  Flow BodyForce 1 = Real 0.0
  Flow BodyForce 2 = Real 0.0
  Flow BodyForce 3 = Real 0.0
End
Solver input file
Material 1
  Density = Real 1000.0
  Viscosity = Real 1.0
End

Boundary Condition 1
  Name = "largeinflow"
  Target Boundaries = 1
  Normal-Tangential Velocity = True
  Velocity 1 = Real -0.01
  Velocity 2 = Real 0.0
  Velocity 3 = Real 0.0
End
Solver input file
Outward pointing normal (this is why the normal velocity component at the inflows is negative).
Boundary Condition 2
  Name = "largeoutflow"
  Target Boundaries = 2
  Normal-Tangential Velocity = True
  Velocity 2 = Real 0.0
  Velocity 3 = Real 0.0
End

Boundary Condition 3
  Name = "smallinflow"
  Target Boundaries = 3
  Normal-Tangential Velocity = True
  Velocity 1 = Real -0.01
  Velocity 2 = Real 0.0
  Velocity 3 = Real 0.0
End
Solver input file
Boundary Condition 4
  Name = "pipewalls"
  Target Boundaries(2) = 4 5
  Normal-Tangential Velocity = False
  Velocity 1 = Real 0.0
  Velocity 2 = Real 0.0
  Velocity 3 = Real 0.0
End
Solver input file
Save the sif file with the name parallel_flow.sif.
Write the name of the sif file into ELMERSOLVER_STARTINFO.
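One simple way to do this from the shell (assuming the file name above):

  echo parallel_flow.sif > ELMERSOLVER_STARTINFO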
Commands for vuori.csc.fi:
Parallel run
module switch PrgEnv-pgi PrgEnv-gnu
module load elmer/latest
salloc -n 4 --ntasks-per-node=4 --mem-per-cpu=1000 -t 00:10:00 -p interactive
srun ElmerSolver_mpi

On a usual MPI platform:
mpirun -np 4 ElmerSolver_mpi
Change into the mesh directory.
Run ElmerGrid to combine the results:
  ElmerGrid 15 3 parallel_flow
Launch ElmerPost.
Load parallel_flow.ep.
Combining the results
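As a shell sketch of the whole sequence (the partitioned mesh directory part_mesh comes from the earlier ElmerGrid call; ElmerPost is then used interactively):

  cd part_mesh
  ElmerGrid 15 3 parallel_flow
  ElmerPost     # then load parallel_flow.ep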
Adding heat transfer
Solver 2
  Exec Solver = Always
  Equation = "Heat Equation"
  Procedure = "HeatSolve" "HeatSolver"
  Steady State Convergence Tolerance = Real 3.0E-03
  Nonlinear System Max Iterations = Integer 1
  Nonlinear System Convergence Tolerance = Real 1.0e-6
  Nonlinear System Newton After Iterations = Integer 1
  Nonlinear System Newton After Tolerance = Real 1.0e-2
  Linear System Solver = Iterative
  Linear System Max Iterations = Integer 500
  Linear System Convergence Tolerance = Real 1.0e-6
  Stabilization Method = Stabilized
Adding heat transfer
  Linear System Use Hypre = Logical True
  Linear System Iterative Method = BoomerAMG
  BoomerAMG Max Levels = Integer 25
  BoomerAMG Coarsen Type = Integer 0
  BoomerAMG Num Functions = Integer 1
  BoomerAMG Relax Type = Integer 3
  BoomerAMG Num Sweeps = Integer 1
  BoomerAMG Interpolation Type = Integer 0
  BoomerAMG Smooth Type = Integer 6
  BoomerAMG Cycle Type = Integer 1
End
Adding heat transfer
Material 1
  ...
  Heat Capacity = Real 1000.0
  Heat Conductivity = Real 1.0
End

Boundary Condition 1
  ...
  Temperature = Real 10.0
End
Boundary Condition 3
  ...
  Temperature = Real 70.0
End
ParaView is an open-source, multi-platform data analysis and visualization application by Kitware.
Distributed under the ParaView License Version 1.2.
Elmer's ResultOutputSolve module provides output as VTK UnstructuredGrid files:
  vtu
  pvtu (parallel)
Postprocessing with ParaView
Output for ParaView
Solver 3
  Equation = "Result Output"
  Procedure = "ResultOutputSolve" "ResultOutputSolver"
  Exec Solver = After Saving
  Output File Name = String "flowtemp"
  Output Format = Vtu
  Show Variables = Logical True
  Scalar Field 1 = Pressure
  Scalar Field 2 = Temperature
  Vector Field 1 = Velocity
End
ParaView
