

January 31, 2013

Open Foam on a
Ubuntu 12.04LTS MPI Cluster

Megat Harun Al Rashid bin Megat Ahmad*, Azraf Azman,


Mohd. Rizal Mamat @ Ibrahim, Hafizal Yazid and Anwar Abdul Rahman

*megatharun.[at].gmail.com
Once, real free men built clusters from scratch....

.... and then came along the password administrator


Introduction

This is not a book. This is kinda like an essayed tutorial, a tutorial in the making, not complete, which we will update from time to time. This tutorial stems from the experience of building our first basic Beowulf cluster almost three years ago, which was used to test-run the parallel Quantum Espresso program and which we later forgot how to build. We therefore take the liberty (which means we sacrificed our playing time) to build another one to recall our memory and skills, but this time with much more complete features (automatically mounted executable folder, system monitor, scheduler etc.) which we hope can easily be replicated by anyone who knows how to pronounce Linux (lnks) correctly. The narration of the tutorial mixes first and second person, so innocent readers may feel a little bit confused, but fear not: through self-training (the do-the-research, educate-oneself approach) we hope readers can build their own cluster for their own computational work (and become self-deprecating gurus, i.e. true experts). Now let's be serious and start following the tutorial.

This quick tutorial will show wannabe nerds how to build a Beowulf-type cluster computer in its simplest form, i.e. with only one master and one node. This is useful because the cluster can be used to test-run a program or software with parallel processing, so that problems or issues in running parallel computations can be sorted out before a real run of the program/software on a much larger cluster. The operating system used in this manual is Ubuntu 12.04LTS (Precise Pangolin or PP). The computers use the 64-bit x86 architecture. The master is a Compaq Presario CQ40 laptop with an AMD Turion X2 (a dual-core processor) and 2GB of RAM, named megatharun, and the node is an HP workstation (xw4300) with a Pentium 4HT (physically one core but two logical cores through hyperthreading) and 1GB of RAM, named nausicaa. The heterogeneous nature of the cluster is important, as it allows us to observe the effect a heterogeneous system has on parallel computation, especially when a Beowulf-type cluster is built from a variety of computers. A switch (D-Link 10/100MB) is used to connect these computers. One port of the switch is connected to the internet, therefore both computers can access the internet. OpenMPI is used as the message passing interface.

Installation of Ubuntu 12.04LTS operating system on master and nodes.

Ubuntu is based on the Debian Linux distribution. The image of the operating system can be downloaded from http://www.ubuntu.com/download/. In this manual we used the desktop AMD64 installer. The AMD64 term means it can be used on both 64-bit AMD and 64-bit Intel based computers. Installation can be done in a variety of ways; for this cluster, installation was carried out using a bootable USB.

After installation, the internet connection can be set up by connecting the LAN/network cable to the switch. PP usually detects this automatically (if PP does not connect automatically, open Edit Connections, set the wired connection to automatic DHCP in the IPv4 Settings tab, and also select Connect Automatically). These are shown in Figures 1a to 1c.

________________________________________________________________________________

Note:

The first cluster was named nausicaa by the first author, yes, the first author of this tutorial, i.e. me. The name came from a Japanese feature-length anime by Hayao Miyazaki titled Nausicaa of the Valley of the Wind, which is about a princess named Nausicaa who can subconsciously communicate with and partially control large herds of devastating creatures called Ohmu. Now you can see the relation: nausicaa (master) and ohmu (nodes). Of course, the first author likes the movie very very much and has seen it many many times. That is how the name nausicaa came into usage.
________________________________________________________________________________
Figure 1a

Figure 1b
Figure 1c

It is advisable to update PP using the Update Manager. The Update Manager can be opened by clicking the Dash home (Figures 2a-2c) and typing Update Manager. The Unity interface will show the Update Manager icon. Open it by clicking the icon. Select the fastest download server by clicking the Settings button and choosing the download server of choice in the Ubuntu Software tab (in our case we used the United States server). Click Check and later click the Install Updates button.
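The same update can also be done from a terminal, which is essentially equivalent to what the Update Manager does:

~$ sudo apt-get update

~$ sudo apt-get upgrade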
Figure 2a

Figure 2b
Figure 2c

Installation of necessary programs (compilers, networking interfaces)

Installation of the program packages that will be used for building the cluster can be done using the apt-get application, but before that we need to set the root (UNIX) password of PP. This can be done by opening a terminal by pressing Ctrl+Alt+t. Type the following after the shell prompt:

~$ sudo passwd

You will be asked for a password, which is the user password that was set during the installation of PP. Type the password, say 1234. You will later be asked to type the new UNIX password twice; this will be your root password.

Enter new UNIX password:


Retype new UNIX password:

This password is important when you want to access the other computer later on.

Using the apt-get application, the necessary program packages can be installed, and these packages must be the same version on master and nodes (yes, the packages need to be installed on both master and nodes. Is it possible to just install them on the master? Yes, that is possible, but we will refrain from discussing that now, so the innocents do not get thoroughly confused). To install the OpenMPI packages, type in the terminal:

~$ sudo apt-get install openmpi-bin openmpi-common libopenmpi1.3 libopenmpi-dev


This will install the OpenMPI libraries plus the MPI compilers (C, C++ and Fortran). It is also advisable to install the GNU compilers for C, C++ and Fortran. This can be done by installing build-essential:

~$ sudo apt-get install build-essential

Under PP, the package executables are usually installed in the /usr/bin/ folder and the libraries in /usr/lib/.
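A quick way to check that the MPI wrappers are in place is to query their locations and version (these should print paths and an Open MPI version string):

~$ which mpicc mpirun

~$ mpirun --version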

Each slave node must have an ssh server so that it can be controlled by the master, and this can be installed by

~$ sudo apt-get install openssh-server

whereas the master needs to be a client to each node (server), and this can be done by installing the ssh client on the master:

~$ sudo apt-get install openssh-client

(it is also possible to install both the ssh server and client on both master and nodes, which allows every computer to communicate with every other one individually)
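Whether the ssh server is actually running on a node can be checked from that node's terminal (PP manages it as the ssh service):

~$ sudo service ssh status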

Network configuration

The listing below shows how your hosts file may look. The file is located in the /etc/ folder and can be viewed in a terminal by typing (e.g. for the master):

~$ pico /etc/hosts

127.0.0.1 localhost
127.0.1.1 megatharun

# The following lines are desirable for IPv6 capable hosts


::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

in which megatharun is the name of the master computer in our case. This file needs to be edited, but before editing, the "inet addr" needs to be known. The "inet addr" can be viewed by typing:

~$ ifconfig | grep "inet addr"

which will give something like this:

inet addr:10.10.2.204 Bcast:10.10.3.255 Mask:255.255.254.0


inet addr:127.0.0.1 Mask:255.0.0.0
The IP address for the master's hostname in the hosts file therefore needs to be changed to 10.10.2.204, and the node's hostname needs to be added with its own IP address, obtained here by adding one to the last number, like below:

127.0.0.1 localhost
10.10.2.204 megatharun
10.10.2.205 nausicaa

# The following lines are desirable for IPv6 capable hosts


::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Subsequent node IP addresses would be something like 10.10.2.206 for node 2, 10.10.2.207 for node 3, etc. All nodes must be listed in this file. The hosts file on the nodes must be identical to the one on the master.
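One way to keep the files identical (a convenience sketch; it assumes the root password set earlier and that root login over ssh is permitted on the node) is simply to copy the master's file over:

~$ scp /etc/hosts root@nausicaa:/etc/hosts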

The interfaces file in the /etc/network/ folder also needs to be edited by adding the ethernet interface, to make the IP address static. The interface can first be checked using ifconfig in the terminal:

~$ ifconfig

which will show something like this

eth0 Link encap:Ethernet HWaddr 00:1e:ec:a3:1b:2b


inet addr:10.10.2.204 Bcast:10.10.3.255 Mask:255.255.254.0
inet6 addr: fe80::21e:ecff:fea3:1b2b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:12 errors:0 dropped:0 overruns:0 frame:0
TX packets:43 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1092 (1.0 KB) TX bytes:6808 (6.8 KB)
Interrupt:45 Base address:0xc000

lo Link encap:Local Loopback


inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:3445 errors:0 dropped:0 overruns:0 frame:0
TX packets:3445 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:581028 (581.0 KB) TX bytes:581028 (581.0 KB)

This means the interface is eth0, and as the inet addr is 10.10.2.204 (from the previous section), it needs to be explicitly added to the interfaces file under an eth0 section (the broadcast and gateway entries are optional):
~$ pico /etc/network/interfaces

auto lo
iface lo inet loopback

by adding the specifics below:

auto eth0
iface eth0 inet static
address 10.10.2.204
netmask 255.255.254.0
broadcast 10.10.3.255

After the file has been modified and saved, the network needs to be restarted:

~$ sudo /etc/init.d/networking restart

The interfaces files on the master and all nodes need to be configured as above and the network restarted (the addresses on the nodes will start from 10.10.2.205, 10.10.2.206, etc.).
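For example, on nausicaa the added block would look like this (a sketch, assuming the node's interface is also named eth0):

auto eth0
iface eth0 inet static
address 10.10.2.205
netmask 255.255.254.0
broadcast 10.10.3.255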

The network can be tested by pinging the other computers (nodes), e.g.

~$ ping 10.10.2.205 (ping from master's terminal )


~$ ping 10.10.2.204 (ping from nausicaa's terminal)

or by ssh:

~$ ssh megatharun (e.g. from nausicaa (node) to master or vice versa, as root user)

User creation and passwordless ssh

A common user name must be used on all computers when running programs with OpenMPI. Let us set the name of the common user as mpiuser (and its password) on master and nodes; in the terminal type:

~$ sudo useradd -d /home/mpiuser -m mpiuser

~$ sudo passwd mpiuser
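To switch to this account in the current terminal (just one simple way of doing it):

~$ su - mpiuser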

The mpiuser account is thus created with a home directory. While logged in as mpiuser on the master, an ssh public/private key pair with a passphrase (of your own choice) can be generated in the file /home/mpiuser/.ssh/id_dsa (by accepting all the prompts):

~$ ssh-keygen -t dsa

To make sure each machine knows that mpiuser is authorized to log into it, copy the generated public key as the authorized keys:

~$ cp /home/mpiuser/.ssh/id_dsa.pub /home/mpiuser/.ssh/authorized_keys

If the /home/mpiuser folder on the master is not shared, then each node should know that the mpiuser on the master is authorized to log into it; this is done by copying the generated public key to each node:

~$ scp /home/mpiuser/.ssh/id_dsa.pub mpiuser@nausicaa:.ssh/authorized_keys

File permissions need to be changed on the master (and on all nodes if the mpiuser folder is not shared):

~$ chmod 700 /home/mpiuser/.ssh

~$ chmod 600 /home/mpiuser/.ssh/authorized_keys
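An equivalent shortcut worth knowing (available on PP; it copies the key to the node and fixes its permissions there in one step):

~$ ssh-copy-id -i /home/mpiuser/.ssh/id_dsa.pub mpiuser@nausicaa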

Test whether ssh works with the password that has been set (without being root):

~$ ssh nausicaa (e.g. ssh from master to node)

~$ ssh megatharun (e.g. ssh from node to master)

OpenMPI uses ssh to connect between machines when running programs in parallel, and therefore a passwordless connection is required. We can use ssh-agent to remember the passphrase while logged in as mpiuser:

~$ eval `ssh-agent`

and inform ssh-agent of the passphrase for the ssh key:

~$ ssh-add ~/.ssh/id_dsa

While still logged in, test ssh to the node:

~$ ssh nausicaa

No password will be asked for. Do this on the node as well.
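Some people prefer to skip the agent altogether by generating the key with an empty passphrase in the first place (less secure, but a common choice on an isolated cluster; this is an alternative to the agent approach above, not part of it):

~$ ssh-keygen -t dsa -N "" -f /home/mpiuser/.ssh/id_dsa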

________________________________________________________________________________

Note:

1. If you are logged in as mpiuser, you may not be able to use the sudo command. mpiuser can be allowed to use the sudo command by logging in as root in a terminal and adding mpiuser as a user that can use sudo:

~$ adduser mpiuser sudo (this gives mpiuser permission to use sudo as stated in the /etc/sudoers file)

2. When logged in as mpiuser, if bash is not active, you will get just the $ sign in the terminal:

~$

bash can be activated by typing bash:

~$ bash

and this is what you will see (in master):

mpiuser@megatharun~$
________________________________________________________________________________________________
Running parallel OpenFoam using OpenMPI

Running OpenFoam in parallel requires installation of the OpenFoam program and the Paraview software. Installation steps for Ubuntu can be found on the http://www.openfoam.org/download/ubuntu.php website, which shows installation using the apt-get application, whereas manual installation steps can be found at http://www.openfoam.org/download/source.php. In this tutorial we will do the installation using the apt-get application.

The OpenFoam program and the Paraview software need to be installed on both master and all
nodes. Installation through terminal is as follows:

~$ VERS=$(lsb_release -cs)

~$ sudo sh -c "echo deb http://www.openfoam.org/download/ubuntu $VERS main > /etc/apt/sources.list.d/openfoam.list"

~$ sudo apt-get update

These commands update the apt-get package list to include the new download repository location for OpenFoam. Installation of OpenFoam and Paraview can now be carried out:

~$ sudo apt-get install openfoam211 (211 refers to version 2.1.1)

~$ sudo apt-get install paraviewopenfoam3120

Both OpenFoam and Paraview will be installed in the /opt directory. In order to use the installed OpenFOAM package, the line source /opt/openfoam211/etc/bashrc needs to be added as the last line of the .bashrc file in the mpiuser home folder.
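This can be done by editing the file with pico as shown next, or by appending the line in one go from the terminal (equivalent; it assumes you are logged in as mpiuser):

~$ echo "source /opt/openfoam211/etc/bashrc" >> ~/.bashrc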

~$ pico .bashrc

the line is added after the last word fi. The installation can be checked by closing the terminal, opening a new terminal and typing:

~$ icoFoam -help

If there are no error messages, just the usage message, then the installation is complete. OpenFoam can be tested by running any of the examples in the tutorials. In order to do that, a run directory needs to be created. Log in as mpiuser, create the project directory within the $HOME/OpenFoam directory named <USER>-2.1.1 (e.g. mpiuser-2.1.1 for user mpiuser and OpenFoam version 2.1.1), and create a directory named run within it; this can be done by typing:

~$ mkdir -p $FOAM_RUN

The $FOAM_RUN variable can be checked first by typing:

~$ echo $FOAM_RUN

and you will probably get something like:

/home/mpiuser/OpenFOAM/mpiuser-2.1.1/run
The tutorials can be copied from the $FOAM_TUTORIALS to the newly created project directory:

~$ cp -r $FOAM_TUTORIALS $FOAM_RUN

Again, you can see the path of the $FOAM_TUTORIALS directory by using the echo command. It is now possible to run one of the tutorials, e.g. the cavity tutorial:

~$ cd $FOAM_RUN/tutorials/incompressible/icoFoam/cavity (go to the cavity tutorial directory)


~$ blockMesh (building the mesh)
~$ icoFoam (running compiled executable)
~$ paraFoam (opening the paraview to view results)

The cavity results in Paraview may look like Figure 3 (after some viewing adjustments, of course).

Figure 3

These installation steps need to be carried out on both master and nodes (but Paraview does not need to be installed on the nodes). To run an OpenFoam executable in parallel, the file profile in the /etc/ folder and the files .bashrc and .profile in the mpiuser home folder need the line:

. /opt/openfoam211/etc/bashrc

added as the first line, on both master and nodes. Now we can try to test the damBreak tutorial by entering the damBreak/ tutorial folder and seeing what is inside:
~$ cd $FOAM_RUN/tutorials/multiphase/interFoam/ras/damBreak

~$ ls -a

. .. 0 Allrun constant system

In the system folder, there is a decomposeParDict file that will dictate the number of parallel
processes a user wants to run. Open the file:

~$ pico system/decomposeParDict

GNU nano 2.2.6 File: decomposeParDict

/*--------------------------------*- C++ -*----------------------------------*\


| ========= | |
| \\ / F ield | OpenFOAM: The Open Source CFD Toolbox |
| \\ / O peration | Version: 2.1.1 |
| \\ / A nd | Web: www.OpenFOAM.org |
| \\/ M anipulation | |
\*---------------------------------------------------------------------------*/
FoamFile
{
version 2.0;
format ascii;
class dictionary;
location "system";
object decomposeParDict;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

numberOfSubdomains 4; // This means running 4 parallel processes

method simple;

simpleCoeffs
{
n ( 2 2 1 ); // Multiplying these numbers should give 4
delta 0.001;
}

hierarchicalCoeffs
{
n ( 1 1 1 );
delta 0.001;
order xyz;
}

manualCoeffs
{
dataFile "";
}

distributed no;

roots ( );

// ************************************************************************* //
and edit the numberOfSubdomains entry, setting it to 4. In the simpleCoeffs section, set the n numbers to (2 2 1), which gives 4 when multiplied. Why 4? Well, we have 4 processors/cores and we want to utilize all of them (even though one is just virtual). Next we execute blockMesh and decomposePar in the terminal:

~$ blockMesh

~$ decomposePar

You will see some messages after running these, but most importantly you will get four processor folders (corresponding to the four processes you will run):

~$ ls -a

. .. 0 Allrun constant processor0 processor1 processor2 processor3 system

These steps need to be carried out on both master and nodes. Now it is possible to run the program in parallel from the master's terminal, but before that we need to create a machine file on the master that lists the hosts and the number of processors/cores to be used on each:

~$ pico machines

# The hostfile for Open MPI

# The master node; 'cpu=2' is used because it is a dual-core machine.
megatharun cpu=2
# The following slave node is physically a single-processor machine (but set to 2 because of hyperthreading).
nausicaa cpu=2
(Please remember that # indicates that the text after it is just a comment and will not be executed.) It is possible to use slots instead of cpu. We can also assign the specific cores that we want to use in a rankfile, but we need to see the processor architecture first. This can be done using the hwloc-ls command:

~$ hwloc-ls
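If the command is not found, it is provided by the hwloc utilities, which can be installed the usual way (the package name below is, to the best of our recollection, what the PP repositories call it):

~$ sudo apt-get install hwloc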

Figure 4a: megatharun's processor architecture
Figure 4b: nausicaa's processor architecture

Physically there is only one core in nausicaa's processor but with two addresses (PU P#0 and PU P#1), whereas in megatharun's processor the two are physically separate, indicating two cores. Therefore the rankfile can be created and formatted like this:

~$ pico rankfile

rank 0=megatharun slot=p0


rank 1=megatharun slot=p1
rank 2=nausicaa slot=p0
rank 3=nausicaa slot=p1

where p0 and p1 indicate the addresses of the cores (see Figures 4a and 4b). Parallel running can be started by issuing this typical command in the master's terminal:

~$ mpirun -np 4 --hostfile machines --rankfile rankfile interFoam -parallel > log &

It is also possible not to use the rankfile, as slotting will be carried out automatically; the rankfile is only needed when we want to specify exactly how the computation is distributed in the cluster:

~$ mpirun -np 4 --hostfile machines interFoam -parallel > log &
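Since the output is redirected to the file log and the job runs in the background, its progress can be followed while it runs with, for example:

~$ tail -f log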


The -np flag gives the number of processes that we want to run, as specified in the decomposeParDict file. If the number of processes is larger than the number of cores available, then the processes will queue. interFoam is the name of the damBreak executable that we ran. In our case, we took the liberty of doing some tests running on two to four processors, and here are the results:

Time (ET = execution time; CT = clock time; - core used):

master  ET 162.3s    ET 17.65s    ET 24.75s
node    CT 191s      CT 19s       CT 31s

master  ET 91.18s    ET* 41.16s   ET** 147.84s
node    CT 92s       CT* 46s      CT** 172s

* run using a 100MB/1000MB switch (TP-Link 100MB/1000MB).
** run using the rankfile.
________________________________________________________________________________

Note:

1. If you asked the first author what ET and CT are, he would probably answer that ET stands for extra-terrestrial and CT stands for crashed-test; well, obviously he does not know. So please ask the other authors.

2. You can see that using a lower number of processors gave a much better computation time. Why? Because...hhmmm...well...for a small problem like this one, the computation time is dominated by the communication rate between the computers, so adding processors does not help. If you need to run a program on a cluster, that program would probably be large enough that the computation itself dominates, and only then will you see a much lower time to compute. I do not think I explained it well, but I think you get the idea. In this tutorial, we just want to show how to build a cluster and run OpenFoam in parallel.
_______________________________________________________________________________

After the parallel run has finished, you will find that only the processor0 and processor1 folders on the master contain the result files, whereas only the processor2 and processor3 folders on the node contain them. This indicates that the work has been distributed. If you open the System Monitor program and click the Resources tab on both master and node, you will see all processors running at 90-100% capacity during runtime, with the network history showing a higher communication rate, about ~600 KiB/s in our case (please use Unity to locate System Monitor).

If there are many nodes, then it may not be practical to install the OpenMPI compilers and OpenFoam on all nodes. We can avoid that by mounting the OpenFoam folder of the master on all the nodes. Therefore, only the master needs the compilers and executables. This can be done using nfs. The nfs packages can be installed using the apt-get application. In the master's terminal:
~$ sudo apt-get install nfs-kernel-server nfs-common

and on the node's terminal:

~$ sudo apt-get install nfs-common

An empty OpenFoam folder also needs to be created on the node, with read and write access allowed for others. In the case of the master, it is advisable to also create an empty OpenFoam folder first, mount it on the node, and only later copy in all the OpenFoam files, folders and executables needed to run the program in parallel (please refer to the OpenFoam installation above). Thus, to follow this, you may want to first delete the OpenFoam folder and all its contents which you created earlier, on both master and node. In the node's terminal:

~$ mkdir -p OpenFoam

~$ chmod ugo+rwx OpenFoam (or chmod 777 OpenFoam)

So for the master, in the terminal:

~$ mkdir -p OpenFoam

~$ chmod ugo+rwx OpenFoam (or chmod 777 OpenFoam)

The permissions must be 77* so that the user and group can read and write; the last number (*) can be set at your convenience (as we cut the internet line once the cluster is fully operational, we do not mind setting the * to 7). After this, edit the /etc/exports file:

~$ sudo pico /etc/exports

by adding this after the last line

/home/mpiuser/OpenFoam nausicaa(rw,sync,fsid=0,crossmnt,no_subtree_check)

The nfs server needs to be restarted, in the terminal type:

~$ sudo /etc/init.d/nfs-kernel-server restart

Mounting at the OpenFoam folder at node can be done using this command at the node's terminal:

~$ sudo mount -t nfs megatharun:/home/mpiuser/OpenFoam /home/mpiuser/OpenFoam
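Whether the mount succeeded can be checked on the node with, for example:

~$ mount | grep OpenFoam

~$ df -h /home/mpiuser/OpenFoam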

Later on, please copy all the OpenFoam folders, files and executables into the empty OpenFoam directory on the master and change the permissions:

~$ cd OpenFoam

~$ mkdir -p mpiuser-2.1.1/run

The tutorials can be copied from $FOAM_TUTORIALS to the newly created project directory, and the permissions changed:

~$ cp -r $FOAM_TUTORIALS mpiuser-2.1.1/run/

~$ sudo chmod -R 777 ~/OpenFoam (or sudo chmod -R ugo+rwx ~/OpenFoam)

You will find all the copied folders, files and executables on the node under the OpenFoam directory. To unmount the folder on the node, just type:

~$ sudo umount OpenFoam/

It is actually more practical to mount the folder on the node automatically, and this can be done by editing the /etc/fstab file (on the nodes only):

~$ sudo pico /etc/fstab

by adding this after the last line

megatharun:/home/mpiuser/OpenFoam /home/mpiuser/OpenFoam nfs user,rw,auto 0 0
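The new entry can be tested on the node without rebooting by asking the system to mount everything listed in fstab:

~$ sudo mount -a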

The OpenFoam folder will then be mounted automatically on the node at boot time (provided the master is already running). The folder can only be unmounted by the root user. A parallel run can be carried out as before; that is, blockMesh and decomposePar need to be run first in the damBreak folder, and the same mpirun command can be used. You will find that all the processor folders contain the computation results, and here are the computational time results:

Time (ET = execution time; CT = clock time; - core used):

master  ET 71.67s   ET 4.12s   ET 7.17s   ET 37.5s
node    CT 91s      CT 5s      CT 27s     CT 38s

________________________________________________________________________________

Note:

1. If there are many nodes, it would be better to use IP addresses in the exports file, like:
/home/mpiuser/OpenFoam 10.10.2.205/209(rw,sync,fsid=0,crossmnt,no_subtree_check)
which means nodes 205 to 209 can mount the folder.

2. nfs and ssh both stand for no freaking system and sick shell; well, actually not, they are acronyms for network file system and secure shell.
________________________________________________________________________________
Conclusion

Well, we believe that is all for now. What we described here is the configuration of a working cluster based on our system. A different system (like those using .rpm or even other .deb distributions, or BSD) may need a different configuration, but some parts will definitely be similar, if not the same. In the future, we will work on the scheduler and on a larger number of nodes (and possibly a hybrid of OpenMPI and OpenMP). Now, for any agency or institute that wants to establish a cluster infrastructure, we can provide a quotation.

________________________________________________________________________________

Note:

If anyone builds his/her cluster based on our tutorial, and the cluster works, please acknowledge us by listing this tutorial as a reference; otherwise we would be very very very angry.
________________________________________________________________________________
