
Using the COMSOL Version 4.3 Cluster Computing Feature at the UB CCR


About This Document
This document provides a step-by-step tutorial for using the Cluster Computing feature of COMSOL
Multiphysics on the UB CCR rush cluster.
Why would I want to run the Cluster Computing feature?
The Cluster Computing feature of COMSOL allows users to run a model in parallel from within the user-friendly COMSOL GUI. This allows users to monitor the progress of COMSOL as it solves a computationally or memory-intensive model. This can be desirable when building a complex model, since it lets the user confirm that model development is proceeding successfully. A user can run an intermediate version of a model (e.g. after adding a new multi-physics option) and check whether the new addition is working as expected. If problems are observed, the user can halt the model via the GUI and make changes as needed.
Why wouldn't I want to run the Cluster Computing feature?
The Cluster Computing feature isn't appropriate for production-style model runs. These are model runs
or parameter sweeps for which the user is already very confident in the model that they have
developed. In short, if you are finished with model development and just need to run a complex model
or parameter sweep, then the Cluster Computing option is not appropriate. Instead, you should run
COMSOL in batch mode by submitting a SLURM job. Example SLURM scripts are available on the front-end.
The choice of script depends on which COMSOL license you are using, as given below in Table 1:
Table 1: COMSOL Group Information

COMSOL License    Example SLURM Script                       Module Name
natashal group    /util/slurm-scripts/slurmCOMSOL-natashal   comsol/4.3
comsol group      /util/slurm-scripts/slurmCOMSOL-ub         comsol-ub/4.3a
jmjornet group    /util/slurm-scripts/slurmCOMSOL-jmjornet   comsol-jmjornet/4.3b

What are some limitations of using the Cluster Computing option at CCR?
(1) UB CCR uses a front-end firewall to prevent its compute nodes from being directly exposed to the
outside world. As a result, the COMSOL Cluster Computing option can only be used by running the
COMSOL GUI from the front-end, from one of the CCR compute nodes, or from the remote visualization
node. If the GUI is run from an external machine (e.g. a user's personal laptop or desktop PC) it will not
be able to communicate with the compute nodes and the Cluster Computing feature will not work.
(2) An interactive SLURM job for launching the COMSOL server must be up and running prior to
launching the COMSOL GUI. This might mean that users will have to wait for some time before being
able to get to work, since the interactive job might be queued by the CCR resource manager depending
on the system load at the time the job is submitted. One way to avoid long waits is to submit the
interactive job to the debug partition. However, this will limit the number of nodes that can be
requested and will also limit the walltime for the Cluster Computing job to 1 hour.

How do I use the COMSOL Cluster Computing feature at UB CCR?


There are several steps involved. First, users should become familiar with logging into the UB CCR front-end machine (rush.ccr.buffalo.edu). Training is available at: http://ccr.buffalo.edu/support/UserGuide.html.
Alternatively, users may wish to launch COMSOL from the remote visualization (viz) node. Instructions
for accessing the node are given here: http://ccr.buffalo.edu/support/research_facilities/remote-visualization.html.
The rest of this guide assumes users are familiar with connecting to the CCR front-end or viz node and
are able to navigate a Linux command line interface (e.g. via commands like cd, ls, pwd, cat, etc.).
Step 1: Launch a COMSOL server on each compute node
Compute nodes are the machines that the COMSOL GUI will use to solve the model via parallel
processing. We need to request the desired number and type of compute nodes from the UB CCR
resource manager, then launch a COMSOL server on these nodes so that they can interface with the
COMSOL GUI.
Open a new ssh connection to the CCR front-end or launch a terminal from within the desktop of the
remote visualization node. From the command prompt, submit a request for an interactive job using the
fisbatch command. For example:
$ fisbatch --partition=debug --time=01:00:00 --nodes=2 --ntasks-per-node=12 --mem=48000

This will request two compute nodes from the debug partition, with 12 processors per node and 48 GB of
RAM per node. A total of 24 processors will be used in solving the model. Information on the types of
nodes available (including partitions, processor counts, amount of memory, and SLURM constraints) is
available via the snodes command. Type 'snodes help' at the command prompt for usage
information.
After entering the fisbatch command you will have to wait a bit (or possibly longer) for the scheduler to
process your request. Once the desired nodes become available, the scheduler will automatically log you
into one of the compute nodes (known as the 'head node') and you'll be provided with a
command line prompt. From this prompt, you'll launch the COMSOL server software on each of the
requested nodes. Do this by entering the following sequence of commands (replace
your_comsol_module with the appropriate module for your group; see Table 1):
$ cd $SLURM_SUBMIT_DIR
$ srun hostname | sort | uniq > nodes.comsol
$ module load your_comsol_module
$ comsol -nn 2 -np 12 -f nodes.comsol server
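As an aside, srun runs hostname once per allocated task, so the raw output contains one line per task; the sort | uniq pipeline collapses it to one line per node, which is the host-file format COMSOL expects. A quick illustration (the node names here are hypothetical stand-ins for real CCR hosts):

```shell
# Simulate the per-task output of `srun hostname` for a 2-node,
# 12-tasks-per-node job, then collapse it to one line per node:
{ for i in $(seq 12); do echo k16n13a; done
  for i in $(seq 12); do echo k16n12b; done; } | sort | uniq
# The result is two lines, one per node.
```

The resulting two-line file is what the -f nodes.comsol argument points the COMSOL server at.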
Be sure to match the value of the -nn (number of nodes) argument with the actual number of nodes
requested by the previous fisbatch command. Also be sure to match the -np (number of processors per
node) argument with the actual number of tasks per node (--ntasks-per-node) requested by the previous
fisbatch command. In this example, these values are 2 and 12, respectively. When you first run the
server you may be prompted for a username and password. If this happens, enter your UB CCR
username and password. Now the COMSOL server should launch and you should see output similar to the
following appear in the terminal:
Node 0 is running on host: k16n13a.ccr.buffalo.edu
Node 0 has address: k16n13a.ccr.buffalo.edu
Node 1 is running on host: k16n12b.ccr.buffalo.edu
Node 1 has address: k16n12b.ccr.buffalo.edu
COMSOL 4.3 (Build: 151) started listening on port 2036
Use the console command 'close' to exit the application

You may now minimize (but do not close) the comsol server terminal window. In the next step we will
run the COMSOL client GUI and connect it to these compute nodes.
Step 2: Launch the COMSOL GUI via comsol client
Open a second connection to the UB CCR front-end machine, or open a second terminal in the remote
visualization desktop. In the resulting terminal window, enter the following commands to launch the
COMSOL client GUI from the front-end. Replace your_comsol_module with the version that is
appropriate for your group (see Table 1, above):
$ module load your_comsol_module
$ comsol client
The COMSOL GUI splash screen will appear, followed by a dialog box that prompts for information about
the server node. An example is given below:

In the server text box, enter the name of the head compute node. This corresponds to the name of
Node 0 in the comsol server output (see above). For this example, the head node is
k16n13a.ccr.buffalo.edu. For the port number text box, enter the port number that the server is
listening on. This is provided in the output from the comsol server command (see above); in this example
the value is 2036. Enter your UB CCR username and password and click the OK button. Now the
COMSOL GUI will load and connect to the compute nodes behind the scenes.
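If the server window has scrolled, the head node name and port can be recovered from the saved server output with standard text tools. A small sketch, reusing the sample output shown in Step 1:

```shell
# Sample comsol server output lines (copied from Step 1 above):
out='Node 0 is running on host: k16n13a.ccr.buffalo.edu
COMSOL 4.3 (Build: 151) started listening on port 2036'

# Extract the head node name (the "Node 0" host) and the listening port:
echo "$out" | sed -n 's/^Node 0 is running on host: //p'
echo "$out" | sed -n 's/.*started listening on port //p'
```

These two values are exactly what the server and port text boxes in the dialog expect.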

Step 3: Open a model and add the Cluster Computing study option


The next step is to open your COMSOL model and add a Cluster Computing option to your desired
study. If you initially developed your COMSOL model on a laptop, desktop PC, or workstation, you
will need to transfer the corresponding .mph file over to the CCR storage area. This can be done,
for example, using FileZilla. Training for this is available at: http://ccr.buffalo.edu/support/UserGuide.html
This example uses the buoyancy_free model located at: /projects/ccrstaff/lsmatott/comsol/buoy. In the
COMSOL GUI click on the Show icon of the Model Builder. It is circled in red in the figure below:

After clicking the Model Builder Show icon a drop-down list will appear. In this list, make sure the
Advanced Study Options box is checked. It is circled in red in the figure below:

In the Model Builder area, highlight the name of the study that you'd like to run in parallel. Then
right-click and select Cluster Computing from the resulting drop-down list. This will add a Cluster
Computing node to the selected study, as shown on the following page.

After: Cluster Computing node is added to the study.

Click on the newly added Cluster Computing node. This will open the Cluster Computing tab to the
right of the Model Builder area, as shown below:

In the Batch Settings area of the Cluster Computing tab, do the following:
(1) Select General from the drop-down list of Cluster types
(2) Uncheck the MPD is running box
(3) In the Host file: text box type the full path to the location of the nodes.comsol file that was
created in Step 1 (see above)
(4) Leave the Bootstrap server textbox blank
(5) In the Rsh textbox, type /usr/bin/ssh
(6) In the Number of nodes textbox, enter the number of compute nodes requested in Step 1 (see
above). For this example, the value is 2.
(7) In the Filename box, enter the full path to the .mph model file that you have opened. In this
example, the value is: /projects/ccrstaff/lsmatott/comsol/buoy/buoyancy_free.mph

(8) In the Directory box, enter the full path to the directory where the .mph model file is located.
In this example, the value is: /projects/ccrstaff/lsmatott/comsol/buoy
(9) Uncheck the Specify external COMSOL batch directory path box
(10) Uncheck the Specify external COMSOL installation directory path box
(11) Uncheck the Use batch license box
When all fields are filled out correctly, click the Save button. It is circled in red in the figure below. For
this example, the completed Cluster Computing configuration tab is given on the following page.
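The full paths needed in items (7) and (8) can be read off with pwd on the front-end. A quick sketch, using a hypothetical model directory under /tmp in place of a real project directory:

```shell
# Create a stand-in model directory and file for illustration (in
# practice, just cd to the directory holding your real .mph file):
mkdir -p /tmp/buoy && touch /tmp/buoy/buoyancy_free.mph
cd /tmp/buoy
echo "$PWD/buoyancy_free.mph"   # value for the Filename box
echo "$PWD"                     # value for the Directory box
```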

Step 4: Run the model and monitor progress


Now that the model has been configured to use the cluster compute nodes, you can run the model by
clicking on the usual Compute icon for the selected study. Alternatively, you can press the F8 key.
However, the COMSOL solver will now run on the compute nodes instead of on the client node that is
displaying the GUI!
You can monitor the progress of the cluster computation in the same ways that you would monitor a
non-cluster computation. For example, you can click on the Progress tab in the lower area of the GUI
below the Graphics area. This is shown below for the buoyancy_free example.

Step 4a (optional): Monitor compute nodes using ccrusrviz


UB CCR provides a job visualization tool that can be used to monitor the parallel performance of the
COMSOL Cluster Computing solver. To launch this tool, open a new connection to the CCR front-end
and type the following command in the terminal:
$ /util/ccrjobvis/ccrusrviz
If you are running other jobs besides a COMSOL cluster computing job, you should use the following
command instead:
$ /util/ccrjobvis/slurmjobvis <job_id>
where <job_id> is the job number returned by the fisbatch command issued in Step 1.
These commands launch a GUI that monitors CPU, memory, and network utilization on the compute
nodes assigned to the COMSOL Cluster Computing feature. This tool should be launched prior to
running Compute for a given study. For the buoyancy_free example, the job visualization
graphic should look something like the figure given below once Compute is launched for the
study:
