User's Guide
Version 1.24
Sep. 14, 2015
Advanced Center for Computing and
Communication
RIKEN
Revision history

1.4 (2010.02.19)
  1.1 Outline of the System modified
  2.1.5 Access to the RIKEN network from RICC added
  2.2 Account and Authentication modified
  2.6 Login environment modified
  3.1 Available file area modified
  3.2.3 Local disk area (work area) modified
  4. How to create jobs modified
  5.1.1.3.1 Example of chain job added
  5.1.1.4 Major options for job submission modified
  5.1.1.6 Software resource modified
  5.1.1.10 Script file for Batch job modified
  5.1.2 Confirm job information modified
  5.1.3 Operate job modified
  5.2 Interactive Job modified
  6.6 FTL Syntax modified
  8 How to use Archive system modified
  9.2 RICC Mobile Portal modified
  10.1 Product manual modified
1.5 (2010.03.17)
1.6 (2010.04.01)
1.7 (2010.06.07)
1.8 (2010.07.30)
1.9 (2010.08.25)
1.10 (2010.11.19)
1.11 (2011.05.02)
1.12 (2011.08.17)
1.13 (2012.04.02)
1.14 (2012.07.13)
1.15 (2012.11.26)
1.16 (2013.01.11)
1.17 (2013.04.01)
1.18 (2013.05.14)
1.19 (2013.09.02)
1.20 (2013.10.16)
1.21 (2014.08.04)
1.22 (2015.04.01)
1.23 (2015.04.27)
1.24 (2015.09.14)
Contents
Introduction ............................................................................................................................................ 1
1. Outline of the System ........................................................................................................................ 2
1.1 Outline of the System..................................................................................................................... 2
1.2 Hardware outline ............................................................................................................................ 3
1.3 Software Overview ......................................................................................................................... 5
1.4 Maintenance .................................................................................................................................. 5
1.5 Usage categories ........................................................................................................................... 6
2. How to Access ................................................................................................................................... 7
2.1 Login Flow ...................................................................................................................................... 7
2.2 Account and Authentication ......................................................................................................... 20
2.3 Update Password......................................................................................................................... 21
2.4 Access RICC ................................................................................................................................ 24
2.5 Login environment ....................................................................................................................... 27
2.6 File transfer .................................................................................................................................. 28
3. File Area ............................................................................................................................................ 31
3.1 Available file area ......................................................................................................................... 31
3.2 Type of available file area ............................................................................................................ 32
4. How to create jobs ........................................................................................................................... 33
4.1 Outline of Compilation / Linkage .................................................................................................. 33
4.2 Compilation / Linkage for GPGPU program ................................................................................ 38
4.3 Library management .................................................................................................................... 40
4.4 Linkage of Math library................................................................................................................. 41
4.5 Job Freeze Function .................................................................................................................... 42
5. How to execute Job ......................................................................................................................... 45
5.1 Batch job / Interactive batch job ................................................................................................... 46
5.2 Interactive Job .............................................................................................................................. 84
6. FTL (File Transfer Language) ......................................................................................................... 85
6.1 Introduction .................................................................................................................................. 85
6.2 Transfer input file ......................................................................................................................... 86
6.3 Transfer input directory ................................................................................................................ 87
6.4 Transfer output file ....................................................................................................................... 88
6.5 FTL Basic Directory ..................................................................................................................... 90
6.6 FTL Syntax ................................................................................................................................... 92
Copyright (C) RIKEN, Japan. All rights reserved.
INTRODUCTION
This User's Guide explains the usage of the Supercomputer System (RICC, RIKEN Integrated Cluster of Clusters) installed at RIKEN. Please read this document before you start using the system. This User's Guide is available for reference and download on the following homepage. The contents of this User's Guide are subject to change.
https://ricc.riken.jp
Shell scripts and other examples in this User's Guide are available in the following directory on RICC.
ricc.riken.jp:/usr/local/example
Please send inquiries on programming consultation, such as usage, debugging, parallelizing or tuning programs, and any other questions about RICC to the following e-mail address.
Email: hpc@riken.jp
No portion of this document may be copied, reproduced, or distributed in any way, or by any means, without permission.
Users can edit, compile, and link programs, submit batch jobs, and obtain computed results through the Login Server (ricc.riken.jp). Each computing server can also run interactive jobs, which are necessary for users to debug their programs. In addition, users can access the system from outside the RIKEN network through VPN and use the system as if they were on the RIKEN network.
Users can log in to RICC on the RIKEN network by ssh, scp, etc. In addition, RICC provides a web portal site, RICC Portal, which allows users to access RICC with a web browser on their PC. Users can edit, compile, and link programs, submit batch jobs, and obtain computed results on RICC Portal.
In RICC, users' home directories are located on a high-speed magnetic disk device. Users can access files in their home directories from the Login Server and the Multi-purpose Parallel Cluster. When executing batch jobs on the Massively Parallel Cluster, users need to transfer the necessary files from their home directories to the local disks of the Massively Parallel Cluster and return the computed results back to their home directories. These operations can be performed easily by commands in the shell scripts used when submitting batch jobs.
All systems of RICC can be logged in to with the issued RICC user accounts, the RICC passwords, and the passphrases of the public-key authentication method. The passphrases can be generated on RICC Portal.
https://ricc.riken.jp/cgi-bin/hpcportal.2.2/index.cgi?LMENU=SYSTEM
[Table: software outline by system. Columns: Front End system, Massively Parallel Cluster (MPC), Multi-purpose Parallel Cluster (UPC), Cluster for single jobs using SSD (SSC). Rows: OS; Compiler (Fujitsu compiler; Intel Parallel Studio XE Composer Edition for Fortran and C++ Linux); Library; Application (GOLD/Hermes, Q-Chem, GaussView).]
1.4 Maintenance
RICC basically operates 24 hours a day, 7 days a week, but emergency maintenance is performed when needed. We make every effort to inform users of maintenance in advance.
The usage categories are General Use and Quick Use. For more information, please refer to "4. Usage Categories" in the RIKEN Supercomputer System Usage Policy, which is available at the following URL.
http://accc.riken.jp/en/supercom/application/usage-policy/
1.5.1 Available computation time
Available computation time differs by project. Use the listcpu command to check the allotted computation time, the used computation time, and the expiry date of the allotted computation time. When the used computation time reaches 100%, jobs cannot be submitted.

[username@ricc1:~] listcpu
[Q00100] Study of parallel programs
          Limit(h)    Used(h)   Use(%)   Date of expiry
---------------------------------------------------------------------
Total     402000.0    80400.0   20.0%    2016/03/31
 +- mpc    80000.0
 +- upc      400.0
 +- ssc        0.0

[explanation]
Limit(h)       : allotted computation time
Used(h)        : used computation time
Use(%)         : ratio of used to allotted computation time
Date of expiry : expiry date of the allotted computation time
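As a cross-check of the sample output above, Use(%) is simply Used(h) divided by Limit(h); a quick shell sketch (values taken from the sample listing):

```shell
# Recompute Use(%) from the sample listcpu values shown above
limit=402000.0
used=80400.0
awk -v l="$limit" -v u="$used" 'BEGIN { printf "%.1f%%\n", u / l * 100 }'
```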
2. How to Access
2.1 Login Flow
The login flow for the RICC system, from account application to login, is as follows:
When the account is issued, an e-mail with the client certificate attached is sent. After installing the client certificate on your PC, access the RICC Portal. You can log in to the front end servers via SSH by registering your SSH public key on the RICC Portal.
2.1.2.2 Mac
Install the client certificate ACCC sent you by e-mail:
1. Double-click the client certificate provided by ACCC.
2. Enter the password for the client certificate.

Access https://ricc.riken.jp and click [LOGIN]. Then, on the key generation window:
3. Enter a passphrase
4. Retype the same passphrase
5. Select [SSH-2 RSA]
6. Select the OS (Software) type
7. Click [Generate Key]

* Public-key pairs can be generated as many times as users want. Also, previously registered public keys are not deleted by the generation of new public-key pairs.
2.1.3.3 Generate a public key on the terminal (Mac, Linux, etc.) (for advanced users)
(*) If you generated a public key as described in 2.1.3.2 Generate public key and private key on RICC Portal, please skip this section.
(1) Use the ssh-keygen command on the terminal to generate a public-key pair. Enter a passphrase when prompted.
(2) Display the generated public key with the cat command.
Mac (OS X): Start Terminal and execute the cat command to display the public key.
UNIX / Linux: Start a terminal emulator and execute the cat command to display the public key.
(Note) If the ssh-keygen command is executed with no argument at step (1), the public key is stored in the ~/.ssh/id_rsa.pub file.
Command example: $ cat ~/.ssh/id_rsa.pub
(3) Register the public key on RICC Portal (https://ricc.riken.jp). Click [Setting], then:
1. Display the generated public key.
   $ cat "public-key-file"
2. Copy the content.
3. Click [Save].
To delete a registered public key, move to the [Delete Public Key] window on RICC Portal (https://ricc.riken.jp):
1. Click [Setting]
2. Click [Key Management]
3. Click [Delete Public Key]
After the operation, click [Logout].
Host to access by purpose:
ricc.riken.jp      : usual access
riccgv.riken.jp    : GaussView use (note 1)
greatwave.riken.jp : login via HOKUSAI-GreatWave

After logging in to HOKUSAI-GreatWave, log in to the RICC front end servers with SSH agent forwarding enabled (-A option).
[username@greatwave:~]$ ssh -A username@ricc.riken.jp
Service           Protocol          Account                  Password
RICC Portal       https             RICC user account        RICC password
HPSS              pftp (*1)         (specified in the        RICC password
                                    application form)
Virtual terminal  ssh (scp/sftp)    RICC user account        Public-key passphrase
                                                             (specified by user) (*2)

*1: pftp is a special command to transfer files between users' home directories and the Archive system. pftp is an enhanced version of ftp and can be used in the same way as ftp.
*2: The public-key passphrase is specified by a user when a pair of a public key and a private key is generated. A pair of a public key and a private key can be generated on RICC Portal. Please refer to 2.1.3 Generate / Register public key and private key.
2.3.2 Password updating procedure
(1) Log in to RICC Portal (https://ricc.riken.jp) and click [LOGIN].
(2) Click [Setting], then [Password Update].
Password conditions:
- At least 6 characters
- Not simple (e.g. not a dictionary word)
(3) Click [Logout] when finished.
2.4 Access RICC
2.4.1 Login
Use the ssh service to log in to RICC from a PC / WS. The ssh command for UNIX / Mac (OS X) and PuTTY for Windows are recommended. PuTTY is available on the following website.
http://www.chiark.greenend.org.uk/~sgtatham/putty/
A) For UNIX / Mac (OS X)
$ ssh username@greatwave.riken.jp
(A host-key confirmation message is displayed only at the first-time login.)
[username@ricc1:~]
B) For Windows
1. Start PuTTY and enter the host name.
2. Enter port 22.
3. For the first-time login, the following security alert window is shown. Click [Yes]. This alert is not shown at future logins.
4. Log in with your account.
2.4.2 Logout
Enter exit or logout at the prompt. The logout process might take a little time for post-processing (writing the history file).
Also, original skeleton files are available in the following directory of the Login Server.
ricc.riken.jp:/usr/local/example/skel
B) For Windows
Log in to RICC with WinSCP. Files can be transferred by drag & drop after login.
1. Start WinSCP and enter the host name and user name.
2. Click [Advanced], open [Authentication], and enter the following item:
   - Private key file
3. Click [OK] and log in.
During transfer, progress is displayed:
100% |***********************|   file-size
Also, files can be uploaded / downloaded on RICC Portal using a web browser. However, the upload / download function of RICC Portal cannot transfer multiple files.
3. File Area
3.1 Available file area
The available file areas are as follows.

Area                     Area name  Size
home (note 1)            /home      2.2PB
data                     /data      4TB/user, (4TB~52TB)/Project
local disk (work area)   /work      depends on cluster (note 2): 40GB/core, 10GB/core, or 30GB/core
archive                  /arc       2PB (ReadOnly from computing nodes)

Availability by system (O: available, -: not available):

Area                     Login Server  Massively Parallel Cluster  Multi-purpose Parallel Cluster
home (note 1)            O             O (for prestaging)          O
data                     O             -                           O
local disk (work area)   O             O (scratch area for job)    O (scratch area for job)
archive                  O             -                           -
Usage of the home area is limited to less than 2.2PB per user by quota.
The local disk area can be used by users' jobs, and its files are deleted when the jobs finish.
For the Massively Parallel Cluster, the area is limited to less than 40GB per core. The more cores a job uses, the more capacity the job can use. For example, a job using 4 cores can use up to 160GB.
For the Multi-purpose Parallel Cluster and the Cluster for single jobs using SSD, the area can be used as scratch space while jobs are running.
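The per-core scaling of the work area can be checked with simple shell arithmetic; a minimal sketch using the 40GB/core Massively Parallel Cluster figure from the text:

```shell
# Scratch capacity available to a job on the Massively Parallel Cluster
cores=4          # cores used by the job
per_core_gb=40   # work-area limit per core (from the text above)
echo "$((cores * per_core_gb))GB"
```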
Compilation / Linkage for GPGPU program is done on GPGPU Compile Server (accel). For more
information, please refer to 4.2 Compilation / Linkage for GPGPU program.
The compilation / linkage command format is:

  command  machine-option  [option]  file [...]

command        : compile / link command (xpfrt for XPFortran (parallel))
machine-option : -pc
option         : compiler options
file           : source files
Option  Meaning
-c      compile only; do not link
-g      generate debug information
-I      add directory to the include search path
-L      add directory to the list of directories in which the linker searches for libraries
-l      link the specified library
-o      specify the output file name
Option      Meaning                                            Fujitsu compiler  Intel compiler
-high (*1)  In addition to basic optimization, loop            -Kfast            -O2
            unrolling, change of loop configuration,
            multiple loops, etc. are performed.
-low        Basic optimization                                                   -O1
-none       No optimization                                                      -O0
Option               Meaning                                        Fujitsu compiler  Intel compiler
-auto_parallel       automatic parallelization                      -Kparallel        -parallel
-auto_parallel_info  output automatic parallelization information   -Kpmsg            -par-report
-omp                 OpenMP parallelization                         -KOMP             -openmp
Machine      Parallel library  Math library
PC Clusters  MPI, PVM          BLAS, LAPACK, ScaLAPACK, SSL II, IMSL
Table 4-5 Available library
If a machine to run modules is specified in the CLTK user configuration file (${HOME}/.cltkrc), the machine-option (-pc) can be omitted for compilation / linkage.
* An option on the command line has priority over one in the CLTK configuration file.

Value  Meaning
pc     generate modules for PC Clusters
There are cautions on the compilation / linkage of thread-parallel programs or MPI parallel programs. For more information, please refer to the product manuals. On how to refer to the product manuals, please refer to 10.1 Product manual.
Compile / link XPFortran programs:
[username@ricc1:~] xpfrt [option] file [...]

For more information on CUDA, see:
http://www.nvidia.com/object/cuda_home_new.html

Compile / link GPGPU programs (CUDA Fortran programs) (use of PGI compiler):
[username@upc0000 ~] f90 [-pgi] [OPTION] file [...]

Compile / link GPGPU programs (CUDA MPI Fortran programs) (use of PGI compiler):
[username@upc0000 ~] mpif90 [-pgi] -ta=nvidia -Mcuda file [...]

(*) Without a machine-option on accel, the PGI compiler is used by default.
Example of a GPGPU program (CUDA MPI Fortran program) job script file:
[username@ricc1:~] vi go.sh
#!/bin/sh
#------ qsub option --------#
#MJS: -accelex
#MJS: -proc 2
#MJS: -cwd
#---- Program execution ----#
mpirun -np 2 ./multi.exe

[note]
Jobs can be submitted on the Login Server (ricc1-4).
Jobs cannot be submitted on the GPGPU Compile Server (accel).
Specify accel as the hardware resource to submit jobs using GPGPU (CUDA programs).
When -accel is specified, the job consumes 1 CPU (4 cores) as resource.
Specify -accelex as the hardware resource if you want to use 1 node exclusively. In this case, each process consumes 8 cores of resource.
A job which uses 2 or more GPGPUs consumes 1 node (8 cores) per process.
4.4.1 BLAS
Specify the -blas option to link the BLAS library. Specify the -blas_t option to link the BLAS library for thread parallel.
4.4.2 LAPACK
Specify the -lapack option to link the LAPACK library. Specify the -lapack_t option to link the LAPACK library for thread parallel.
4.4.3 ScaLAPACK
Specify the -scalapack option to link the ScaLAPACK library. Specify the -scalapack_t option to link the ScaLAPACK library for thread parallel.
4.4.4 SSL II
For PC Clusters, SSL II and C-SSL II are available. Specify the -SSL2 option to link the SSL II or C-SSL II library.

Link an object compiled by Fortran77 for PC Clusters with the SSL II library for thread parallel:
[username@ricc1:~] f77 -pc -SSL2 -o ssl2thread.out ssl2thread_f.o

Link an object compiled by C for PC Clusters with the SSL II library for thread parallel:
[username@ricc1:~] cc -pc -SSL2 -o ssl2thread.out ssl2thread_c.o
4.5.2 Jobs excluded from the targets of the Job Freeze Function
The Job Freeze Function cannot always freeze all jobs. The jobs and their internal information described below are excluded from Job Freeze targets. An attempt to freeze or defrost such jobs may fail. Even if such a job has been frozen and defrosted successfully, its operation may be unpredictable.
- Shell scripts: when a job uses a script language (perl, python, etc.), the Job Freeze will fail.
- Interactive jobs
#---- Program execution ----#
start=1
end=100
if [ -f ${QSUB_REQID}_index ]; then
  start=`cat ${QSUB_REQID}_index`
fi
for (( i = $start; i <= $end; i = i + 1 )); do
  echo $i > ${QSUB_REQID}_index
  mpirun -stdinfile input.${i} ./a.out > output.${i}
  cp output.${i} input.$((i+1))
done
rm ${QSUB_REQID}_index
                       Batch job                      Interactive job
Occupancy of resource  Yes                            No (time sharing with other interactive jobs)
Start of execution     When resources are allocated   Immediate

Job type   Purpose                                           Procedure of submission
Bulk job   Execute jobs with the same script at the same time
Chain job  Coupled calculation jobs executed in submission order

(Example bulk job script: go.sh)
Example of chain jobs which use the output file of one job as the input file of the next job.

Prepare scripts
Prepare scripts which transfer the output file (output.x) of the previous job by FTL and use it as the input file of the next job (go1.sh, go2.sh, go3.sh).

go1.sh
#!/bin/sh
#MJS: -proc 8
#MJS: -time 1:00:00
#MJS: -eo
#MJS: -cwd
#BEFORE: a.out
#AFTER: output.1
mpirun ./a.out -o output.1

go2.sh
#!/bin/sh
#MJS: -proc 8
#MJS: -time 1:00:00
#MJS: -eo
#MJS: -cwd
#BEFORE: a.out
#BEFORE: output.1
#AFTER: output.2
mpirun ./a.out -i output.1 -o output.2

go3.sh
#!/bin/sh
#MJS: -proc 8
#MJS: -time 1:00:00
#MJS: -eo
#MJS: -cwd
#BEFORE: a.out
#BEFORE: output.2
#AFTER: output.3
mpirun ./a.out -i output.2 -o output.3

Submit chain job
Specify the prepared script files separated by commas without white space.
[username@ricc1:~] qsub go1.sh,go2.sh,go3.sh
Each subjob reads input.${MJS_BULKINDEX} and writes output.${MJS_BULKINDEX}. For the above-mentioned example, the bulk job is given bulk ID 145678 and each subjob is given Bulk Index ID 1, 2, 3. The environment variables, input file, and output file names of each subjob are the following.

Bulk ID  Bulk Index ID  Environment variables  Input file  Output file
145678   1              MJS_BULKID=145678      input.1     output.1
                        MJS_BULKINDEX=1
         2              MJS_BULKID=145678      input.2     output.2
                        MJS_BULKINDEX=2
         3              MJS_BULKID=145678      input.3     output.3
                        MJS_BULKINDEX=3
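A bulk subjob script can derive its own file names from MJS_BULKINDEX as in the table above; a minimal sketch (the variable is set by hand here, since the scheduler sets it only inside a real subjob):

```shell
# Sketch: per-subjob input/output names from the bulk index
MJS_BULKINDEX=2   # set by the job scheduler in a real subjob
input="input.${MJS_BULKINDEX}"
output="output.${MJS_BULKINDEX}"
echo "$input -> $output"
```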
Standard error output files are named Request-name.eXXXXX.jms or Request-name.eXXXXX.
Caution
On the PC cluster and MDGRAPE-3 Cluster, when batch jobs which write more than hundreds of megabytes of data to standard output/error finish at the same time, it takes a long time to transfer the standard output/error files because the load of the job scheduler becomes high. All users' job termination processes are delayed by this influence.
Therefore, if the size of the standard output/error is large, please redirect it to normal files as below.

! FTL command
# Redirect output to $MJS_REQID.log
! FTL command

Submit an interactive batch job with the -i option:
[username@ricc1:~] qsub -i
End job
If job submission is successful, the system returns a notification of submission completion to the prompt, and the job waits on the terminal until it starts. When resources are allocated, a notification of start of execution is displayed and the resources are available for use. If the user does not perform any operation on the terminal for 10 minutes, the user is logged out automatically.
If blank characters are included in the current directory path, an error message is displayed. In that case, please rename the directory containing the blank characters.
B) Hardware resources:
C) Software resources:
5.1.3.1
Option                         Meaning
-proc <PROCNO>                 Number of processes
-thread <THREADNO>             Number of threads
-mem[ory] <MEMSIZE>[kb|mb|gb]  Amount of memory per process
-hdd <HDDSIZE>[kb|mb|gb]       Local disk size per process (default unit: gb)
-time <hh:mm:ss>|<sssss>       Elapsed time limit
-eo                            Merge standard error output with standard output (merge) (*) Invalid for interactive batch job
-oi                            Output statistical information of the job to standard output (default: not output)
-me                            Send an email at the end of the job
-mu <email address>            Email address for notification
-r <REQNAME>                   Request name
-rerun [Y|N]                   Rerun the job
-chaindel [Y|N]                Delete subsequent jobs when a chain job ends abnormally
-cwd                           Execute the script in the directory where the job is submitted
-comp[iler] <COMPTYPE>         Compiler: fj (Fujitsu compiler), intel (Intel compiler), gcc (GNU compiler), pgi, nvidia
-para[llel] <PARALLEL>         Parallel library: fjmpi (Fujitsu MPI), xpf (XPFortran), mpt, pvm
-project <PRJ-ID>              Project number
-B <N>-<M>[:S]                 Submit as a bulk job
-bmb                           Send an email when any one bulk subjob starts (default: not send)
-bmab                          Send an email when all bulk subjobs start (default: not send)
-bme                           Send an email when any one bulk subjob ends (default: not send)
-bmae                          Send an email when all bulk subjobs end (default: not send)
[ftl | share]

Hardware resources:
-pc / -mpc / -upc / -accel / -accelex / -ssc

(*) If the -pc option is specified and no software resource (e.g. -g03, -adf and so on) is specified when submitting a job to the PC Clusters, the job is executed on either the Massively Parallel Cluster or the Multi-purpose Parallel Cluster.
However, because the home area (/home) and data area (/data) are not shared on the Massively Parallel Cluster, files are transferred by FTL (refer to 6 FTL (File Transfer Language)) at the start/end time. Therefore, the location of the output file differs between the Massively Parallel Cluster and the Multi-purpose Parallel Cluster.
Example
#!/bin/sh
#MJS: -pc
#MJS: -proc 1
#MJS: -eo
#MJS: -cwd
#FTLDIR: $MJS_CWD
srun ./a.out > output.log

The output.log will be created as follows:
* Executed on Massively Parallel Cluster:
$MJS_CWD/REQUEST-ID/output.log.0
The number of cores, amount of memory, and elapsed time depend on the hardware resource.

Table 5-5 Available hardware resource
(number of available cores to specify: Quick Use / General Use; max elapsed time to specify (*3); amount of memory to specify)

-pc (PC Clusters):
  1-128 / 1-128: 72 H;  129-256 / 129-512: 24 H;  513-3803: 6 H
  memory: default 2,600MB (max. 20,800MB); local disk: default 10GB (max. 80GB)
-mpc (Massively Parallel Cluster):
  2-128 / 2-128: 72 H;  129-256 / 129-512: 24 H;  513-8192: 6 H
  memory: default 1,200MB (max. 9,600MB); local disk: default 40GB (max. 320GB)
-upc/-accel/-accelex (Multi-purpose Parallel Cluster /GPGPU (*5)):
  1-128 / 1-128: 72 H;  129-256 / 129-512: 24 H;  513-800: 6 H
  memory: default 2,600MB (max. 20,800MB); local disk: default 10GB (max. 80GB)
Caution
(*1) One hardware resource must be specified. (Two or more hardware resources cannot be specified.)
(*3) If the -time option is omitted when a job is submitted, the maximum elapsed time is set according to the number of cores assigned to the job.
(*4) The amount of memory per process can be specified up to the maximum value for the hardware resource in the table. When an amount of memory larger than the default is specified, computation time is used based on the number of cores occupied according to the specified amount of memory.
(Example) In the case of a 2-core parallel job which specifies 30GB of memory per process on the Large Memory Capacity Server:
Specified amount of memory per process 30GB = default amount of memory 15GB x 2
This is equivalent to the amount of memory of 2 cores.
Computation time of 2 cores x 2-core parallel job = the job uses 4 cores' computation time
(*5) On GPGPU, when -accelex is specified, the job is executed exclusively, occupying a node per process. So, computation time for all cores of those nodes is used regardless of the specified number of cores.
When -accel is specified, the job occupies one CPU (4 cores) as resource.
(*6) This cluster has 12 cores per node, unlike the other clusters. Please take care when you specify a parallel number.
The Massively Parallel Cluster and Multi-purpose Parallel Cluster have 2 CPUs (4 cores / CPU) per computing node. Jobs using 1 core share a CPU. However, parallel jobs which specify two or more cores occupy a CPU (4 cores). Therefore, if a job occupies more cores than specified, computation time is used accordingly.
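The memory-based charging rule in the caution above amounts to rounding the requested memory up to a whole number of default-size cores; a sketch with the 30GB-per-process, 15GB-default example:

```shell
# Charged cores for a 2-process job requesting 30GB/process
# when the per-core default is 15GB (Large Memory Capacity Server example)
mem_per_proc=30
default_per_core=15
procs=2
# round the memory request up to whole default-size cores
equiv_cores=$(( (mem_per_proc + default_per_core - 1) / default_per_core ))
echo "charged cores: $((procs * equiv_cores))"
```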
Option       Execution software
-g03         Gaussian03
-g09         Gaussian09
-g03nbo      NBO 5.G
-g09nbo      NBO 5.9
-g09nbo6     NBO 6.0
-adf         ADF2013.01
-adf2010     ADF2010.02
-gamess      GAMESS (socket)
-gamess_mpi  GAMESS (MPI)
-amber8      Amber8
-amber10     Amber10
-amber11     Amber11
-amber12     Amber12
-amber14     Amber14
-ansys       ANSYS
-clustalw    ClustalW
-blast       BLAST
-hmmer       HMMER
-fasta       FASTA
-cluster3    CLUSTER 3.0
-qchem       Q-Chem 4.1
Based on the specified software resource, the number of processes or elapsed time to specify may differ from the hardware-resource values. The available resources to specify are as follows.
(** means the same value for the number of processes (*1), number of threads (*2), max. elapsed time, and amount of memory as for the hardware resource shown in Table 5-5 Available hardware resource.)

Software resource                                 Hardware resource to specify   Processes / threads / elapsed time / memory
-g03, -g09, -g03nbo, -g09nbo, -g09nbo6            -pc/-upc/-ssc                  **
-adf, -adf2010                                    -pc/-upc/-ssc                  **
-gamess, -gamess_mpi                              -pc/-mpc/-upc/-ssc             **
-amber10                                          -pc/-upc/-ssc                  **
-amber11, -amber12, -amber14                      -pc/-upc/-accel/-accelex/-ssc  **
-clustalw, -blast, -hmmer, -fasta, -cluster3      -pc/-upc/-ssc                  **
(*3) There is only one ANSYS Solver license. Therefore, only one job using ANSYS can be executed at a time.
Option       Meaning             How to specify
-pri         Priority of a job   Specify 0 - 65535. Default is 100.
-start_time  Start time of job   Example: #MJS: -start_time 2009/10/01-09:00

Caution
(*1) If the specified resources cannot be secured by the specified time, the status changes from WAIT (WIT) to TIME OVER (TOV). Jobs in TOV status can be deleted but do not start.
-proc <PROC-NO>
The number of cores specified by PROC-NO is allocated for the processes of a job. If this option is omitted, PROC-NO is set to 1. Specify the number of processes to execute a parallel job with interprocess communication, such as an MPI program or XPFortran program.
Caution
If PROC-NO x THREAD-NO of "-thread <THREAD-NO>" in the next section exceeds the maximum number of cores to specify, a job submission error occurs. Please specify a proper number of cores to submit a job.
-thread <THREAD-NO>
The number of cores specified by THREAD-NO is allocated for the threads of a job. If this option is omitted, THREAD-NO is set to 1. Specify the number of threads to execute a parallel job that generates threads.
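Because -proc and -thread multiply, it is easy to overshoot the core limit; a pre-submission sanity check might look like the following sketch (the limit value is illustrative, not an official figure — see Table 5-5 for real values):

```shell
# Check that proc x thread stays within the core limit before running qsub
proc=8
thread=4
max_cores=512   # illustrative limit; consult Table 5-5 for the actual value
total=$((proc * thread))
if [ "$total" -le "$max_cores" ]; then
  echo "OK: $total cores"
else
  echo "error: $total cores exceeds the limit"
fi
```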
-mem <MEMSIZE>[kb|mb|gb]
The amount of memory per process specified by MEMSIZE is secured for job execution. The units which can be specified are kb, mb, or gb (default: mb (PC Clusters / MDGRAPE-3 Cluster), gb (Large Memory Capacity Server)). Blank characters must not be put between MEMSIZE and the unit. If this option is omitted, the default amount of memory is set. (Please refer to Table 5-4 Hardware resource list.)
(Example 1) -mem 800mb --> 800MB of memory per process
(Example 2) -mem 8gb   --> 8GB of memory per process
-hdd <HDDSIZE>[kb|mb|gb]
The HDD size per process is specified by HDDSIZE for job execution. The units which can be specified are kb, mb, or gb. Blank characters must not be put between HDDSIZE and the unit. This option is for users who need a large local disk area. If this option is omitted, the default local disk size is set.
(Example 1) -hdd 2000mb --> 2000MB of local disk per process
-time <ELAPSETIME>
Execute the job within the specified elapsed time. When the job does not end within the elapsed time, the job is forcibly deleted. This prevents the job from wasting resources when it goes into an infinite loop, etc. If this option is omitted, the maximum elapsed time is set according to the number of cores assigned to the job (refer to 5.1.3.2 Hardware resources, Table 5-5 Available hardware resource). The elapsed time specified by ELAPSETIME is set in the format HH:MM:SS (HH: hour, MM: minute, SS: second) or SSSSS (SSSSS: second).
(Example 1) -time 24:10:10 --> 24 hours 10 minutes 10 seconds
(Example 2) -time 3600     --> 3600 seconds
(Example 3) -time 59:01    --> 59 minutes 1 second
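The HH:MM:SS and SSSSS notations are interchangeable; a small conversion sketch (assumes the three-field form):

```shell
# Convert an HH:MM:SS elapsed time to seconds
to_seconds() {
  IFS=: read -r h m s <<EOF
$1
EOF
  echo $(( h * 3600 + m * 60 + s ))
}
to_seconds 24:10:10   # the Example 1 value above
```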
[Backfill function]
The job scheduler determines the priorities of jobs based on the users' usage of resources and determines the job which starts next. However, by the backfill function, the job scheduler starts other low-priority jobs so long as they don't delay the highest-priority job. Therefore, a job may start earlier if a proper ELAPSETIME is specified for it.
-eo
Merge the standard error output file with the standard output file. If this option is omitted, the standard error output file and the standard output file are generated separately.
-mb
At the start of the job, an email is sent to the address in the application form.
-me
At the end of the job, an email is sent to the address in the application form.
-r <REQNAME>
Execute a job with request name REQNAME. If this option is omitted, the request name of the job is the script file name. Blank characters must not be included in the request name.
5.1.3.5.10 Specify if subsequent jobs are deleted when the chain job ends abnormally
-chaindel [Y|N]
Specify whether subsequent jobs are deleted when a chain job ends abnormally (Y: delete, N: do not delete).
Caution
A) If the option is omitted, the default is -chaindel Y (delete subsequent jobs). However, if -rerun Y (rerun the job) is specified, the default is -chaindel N (execute subsequent jobs).
B) Job submission with both -rerun Y and -chaindel Y (delete subsequent jobs) fails with the following error message.
qsub: ERROR: 0016: invalid options: cannot enable -chaindel Y and -rerun Y at the same time.
C) This option is valid for chain jobs. It is ignored for non-chain jobs.
D) It is possible to submit jobs which specify -chaindel and jobs which do not specify -chaindel as one chain job.
-cwd
A job executes the script in the home directory by default. Specify the -cwd option to execute the script in the directory where the job was submitted.
-project <PROJECT-NO>
Specify a project number for job execution. Project numbers are IDs issued when the applications are approved by the administrator. (Users who have one project number don't need this option. This option is for users who have two or more projects.)
A default project number can be specified by the variable MJS_QSUB_PROJECT in the .cltkrc (CLTK user configuration file) located in the home directory.
Example) Edit the .cltkrc file
[username@ricc1:~] vi $HOME/.cltkrc
MJS_QSUB_PROJECT = G00001
-B
<N>-<M>[:S]
Submit a batch job as a bulk job. Specify the range of Bulk Index IDs with <N>-<M>. The step of the Bulk Index ID can be specified with S. Please refer to 5.1.1.3 Submit bulk job.
Example) Submit 50 sub jobs as a bulk job
[username@ricc1:~] qsub -B 1-50 go.sh
Bulk Request 5381290.jms Submitted to MJS.
[username@ricc1:~] qstat
REQID              NAME   STAT  ELAPSE  START-TIME  CORE
--------------------------------------------------------------------
5381290[1].jms     go.sh  RUN
5381290[2].jms     go.sh  RUN
5381290[3].jms     go.sh  RUN
5381290[4].jms     go.sh  RUN
5381290[5].jms     go.sh  RUN
5381290[6].jms     go.sh  RUN
5381290[7].jms     go.sh  RUN
5381290[8-50].jms  go.sh  QUE

With a step of 2, only the specified Bulk Index IDs are submitted:
REQID              NAME   STAT  ELAPSE  START-TIME  CORE
--------------------------------------------------------------------
5381341[1].jms     go.sh  RUN
5381341[3].jms     go.sh  RUN
5381341[5].jms     go.sh  RUN
5381341[7].jms     go.sh  RUN
5381341[9].jms     go.sh  RUN
5381341[11].jms    go.sh  RUN
5381341[13].jms    go.sh  RUN
5381341[15,17,19,21,23,25].jms  go.sh  QUE
-bmb
When any one bulk subjob starts, an email is sent to the address in the application form.
-bmab
When all bulk subjobs have started, an email is sent to the address in the application form.
-bme
When any one bulk subjob ends, an email is sent to the address in the application form.
-bmae
When all bulk subjobs have ended, an email is sent to the address in the application form.
Command  Meaning
srun     Serial program / thread parallel program (maximum number of threads: 8)
mpirun   MPI parallel program
xpfrun   XPFortran parallel program
---- Note ----
(*1) Options of the execution commands are not necessary, since the number of cores (MPI parallel or thread parallel) and the resources to use are already specified at job submission.
Example 1)  srun ./a.out
Example 2)  mpirun ./a.out
Example 3)  xpfrun ./a.out
Command  Meaning
srun     Serial program / thread parallel program (maximum number of threads: 32)
mpirun   MPI parallel program
---- Note ----
(*1) Options of the execution commands are not necessary, since the number of cores (MPI parallel or thread parallel) and the resources to use are already specified at job submission.
Example 1)  srun ./a.out
Example 2)  mpirun ./a.out
Example) Submit a job with the following requirements:
- Number of cores   : 8
- Amount of memory  : 1200MB
- Elapsed time      : 10 H
- Merge standard error into standard output : Yes
- Rerun                                     : Yes
- Execute in the submitted directory        : Yes
[username@ricc1:~] vi go-pc.sh
#!/bin/sh
#------ qsub option --------#
#MJS: -pc
#MJS: -proc 8
#MJS: -eo
#MJS: -rerun Y
#MJS: -cwd
(Execute job)
Note) Files in the directory where the job is submitted are automatically transferred to the computing nodes' local disk area before job execution, and files in that directory on the computing nodes are collected after job execution. This command is ignored except on the Massively Parallel Cluster.
When FTLDIR is used, unnecessary files may be transferred. Also, the existence of files is checked after job execution even when there are no files to be transferred. For large-scale parallel jobs the cost of this may be high, so please use BEFORE and AFTER instead of FTLDIR.
For more information on BEFORE and AFTER, please refer to 6 FTL (File Transfer Language).
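As a sketch of replacing FTLDIR with explicit transfers (the file names a.out and output are placeholders), a job script can name only the files that are actually needed:
#!/bin/sh
#MJS: -pc
#MJS: -proc 8
#MJS: -cwd
#BEFORE: a.out
#AFTER: output
mpirun ./a.out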
: 8 cores
- Amount of memory
: 30GB
- Elapsed time
: 10 H
: Yes
: Yes
: Yes
[username@ricc1:~] vi go-ax.sh
#!/bin/sh
#------ qsub option --------#
#MJS: -ax
#MJS: -proc 8
#MJS: -eo
#MJS: -rerun Y
#MJS: -cwd
Execute job
Meaning
(none)
-d
-m
-p
-e
-w
-project
Display jobs of the specified project (for users who have two or more projects)
REQID            NAME     STAT  ELAPSE  START-TIME  CORE
-------------------------------------------------------------------
12342.jms        go.sh    RUN
12348.jms        go.sh    QUE
12412[1].jms     bulk.sh  RUN
12412[2].jms     bulk.sh  RUN
12412[3-10].jms  bulk.sh  QUE

REQID:       REQUEST-ID
             * Bulk job (running): "Bulk ID"."Bulk Index ID"
             * Bulk job (waiting): "Bulk ID".["start of Bulk Index ID"-"end of Bulk Index ID"]
NAME:
STAT:
ELAPSE:
START-TIME:
CORE:
REQID            NAME     STATUS  ELAPSE  START-TIME  CORE  SUBMIT-DIR
--------------------------------------------------------------------
12342.jms        go.sh    RUN                               $HOME/JOB
12348.jms        go.sh    QUE                               $HOME/JOB
12412[1].jms     bulk.sh  RUN                         1     $HOME/JOB
12412[2-10].jms  bulk.sh  QUE                               $HOME/JOB
REQID      NAME   STATUS  ELAPSE  START-TIME  CORE  MEMORY
--------------------------------------------------------------------
12342.jms  go.sh  RUN                               500M
12348.jms  go.sh  QUE                               --
REQID      NAME   STAT  ELAPSE  START-TIME  CORE  PRI
--------------------------------------------------------------------
12342.jms  go.sh  RUN                             100
12348.jms  go.sh  QUE                             100
PRI: Job priority
REQID      NAME   STARTTIME  ENDTIME  CORE  MEM(*)  SUBMIT-DIR
--------------------------------------------------------------------
12321.jms  go.sh                      896   500M    $HOME/JOB1
4649.ax    go.sh                            20G     $HOME/axjob
12324.jms  go.sh                      896   500M    $HOME/JOB
REQID      NAME     STAT  CORE  MEM  ESTIMATE  REASON
--------------------------------------------------------------------
13574.jms  go.sh    QUE   1024  --   < 6hrs    Insufficient cores
13575.jms  sim.sh   QUE         --   > 24hrs   Insufficient license
4695.ax    go.sh    QUE   16    --   < 12hrs   Insufficient memory
13577.jms  go.sh    QUE   1024  --   > 24hrs   Insufficient cores
13577.jms  go-1.sh  QUE   1024  --   > 3days   Insufficient cores
13577.jms  go-2.sh  QUE   1024  --   < 3days   Chain job
*****************************************************************************
The estimated time changes as jobs are submitted or executed.
*****************************************************************************
ESTIMATE: estimated waiting time; one of < 6hrs, < 12hrs, < 24hrs, < 3days, > 3days
REASON:
  Insufficient cores    Cores are insufficient
  Insufficient memory   Memory is insufficient
  Insufficient license  License is insufficient
  Chain job             Waiting as a chain job
REQID      NAME   STAT  ELAPSE  START-TIME   CORE
--------------------------------------------------------------------
12342.jms  go.sh  RUN   12:34   07/28 12:00
12348.jms  go.sh  QUE   --:--   --/-- --:--
qcat [-o|-e|-s] REQID
Option   Meaning
(none)
-e
-o
-s

qstat [-x|-uc|-um]
Option   Meaning
-x
-uc
-um
Item                     Meaning
H_RESOURCE               Hardware resource
MAX_CORE/J
MAX_CORE/P
SUBMIT
ELAPSE
MEMORY
RUN
QUEUED
S_RESOURCE[pc(mpc)]      Available software resources for Multi-purpose Parallel Cluster
S_RESOURCE[pc(upc)]      Available software resources for Multi-purpose Parallel Cluster
S_RESOURCE[pc(accel)]    Available software resources for Multi-purpose Parallel Cluster (GPU)
S_RESOURCE[pc(accelex)]  Available software resources for Multi-purpose Parallel Cluster (GPU)
MAX_PROC/J
MAX_THREAD/J
Application_NAME
USE/MAX

          RATIO(USED/ALL)
------------------------------------------------------------------------
mpc  *********************************-------   84.9%(3232/3888)
upc  ****************************************  100.0%(0800/0800)
ssc  *********************-------------------   54.6%(0118/0216)
Display the current usage of cores in each system. RATIO(USED/ALL) shows the ratio of use (%), the number of cores in use, and the maximum number of cores.
prstat [-r]
Option   Meaning
-r       Sort by request ID

USER   REQID        NAME      STAT  ELAPSE  START-TIME   CORE
---------------------------------------------------------------------
userA  1234567.jms  go.sh     RUN   01:40   02/15 13:40  128
userB  1234599.jms  test.sh   RUN   01:46   02/15 13:34
userC  1234600.jms  run.sh    QUE   --:--   --/-- --:--  256
userC  1234601.jms  run.sh    QUE   --:--   --/-- --:--  512
userD  1234301.jms  run-d.sh  RUN   49:40   02/13 13:34  32
Item  Meaning
USER  User name

REQID        NAME      USER   STAT  ELAPSE  START-TIME   CORE
---------------------------------------------------------------------
1234301.jms  run-d.sh  userD  RUN   49:40   02/13 13:34  32
1234567.jms  go.sh     userA  RUN   01:40   02/15 13:40  128
1234599.jms  test.sh   userB  RUN   01:46   02/15 13:34
1234600.jms  run.sh    userC  QUE   --:--   --/-- --:--  256
1234601.jms  run.sh    userC  QUE   --:--   --/-- --:--  512
qls REQID[@RankNO] [OPTION]
qget REQID[@RankNO] SRC .. [DEST]
From the Login Server, copy files of the specified REQUEST-ID and process (@RankNO) from the computing nodes' local disk area to the home area.
Example) Get a file (resultfile) of rank 0 of a running job (REQUEST-ID 13579.jms)
[username@ricc ~]$ qls 13579.jms -l
total 48
-rwxr-xr-x 1 username group 46555 Jul 22 13:45 resultfile
[username@ricc ~]$ qget 13579.jms@0 resultfile /tmp/result
Option
Meaning
-del
mpirun ./b.out
qdel [-K|-collect] REQID
qd [-K|-collect]
Option    Meaning
(none)    Cancel a job.
-K        Cancel a job and delete the standard output / standard error output files (except for Large Memory Capacity Server).
-collect  Cancel a job and collect files on the computing nodes (except for Massively Parallel Cluster).
REQID            NAME     STAT  ELAPSE  START-TIME   CORES
--------------------------------------------------------------------
12342.jms        go.sh    RUN   12:34   07/28 12:00
12348.jms        go.sh    RUN    0:20   07/28 00:14
12356.jms        go.sh    RUN    0:05   07/28 00:29
12412[1].jms     bulk.sh  RUN    0:05   07/28 00:29
12412[2].jms     bulk.sh  RUN    0:05   07/28 00:29
12348[3-10].jms  bulk.sh  QUE   --:--   --/-- --:--

[username@ricc ~] qdel 12348.jms 5963.ax
[username@ricc ~] qdel -K 12342.jms
Specify the -collect option to cancel the job and collect the files the job generated on the computing nodes of the PC Clusters.
[username@ricc ~] qdel -collect 12356.jms
For bulk jobs, specify the whole bulk job (12412.jms), one subjob (12412[1].jms), or several subjobs (12412[1,3,5-10].jms).
NO  REQID            NAME     STAT  ELAPSE  START-TIME   CORES
--------------------------------------------------------------------
1   12342.jms        go.sh    RUN   12:34   07/28 12:00
2   12348.jms        go.sh    RUN    0:20   07/28 00:14
3   12412[1].jms     bulk.sh  RUN    0:05   07/28 00:29
4   12412[2].jms     bulk.sh  RUN    0:05   07/28 00:29
5   12348[3-10].jms  bulk.sh  QUE   --:--   --/-- --:--
Enter the NO of the job to cancel. If two or more jobs are to be cancelled, specify them separated by a comma (,) or with a hyphen (-). To cancel all jobs, enter "all". Enter "q" or "quit" to quit the qd command.
qd: input NO: 2
qdel -e REQID
qd -e
REQID      REQNAME  START-TIME   END-TIME     CORES  MEM   SUBMIT-DIR
--------------------------------------------------------------------
12321.jms  go.sh    07/21 08:00  07/28 14:21  896    500M  $HOME/JOB1
12324.jms  go.sh    07/28 12:00  07/28 13:09  896    500M  $HOME/JOB
[username@ricc ~] qdel -e 12321.jms
NO  REQID      REQNAME  START-TIME   END-TIME     CORES  MEM   SUBMIT-DIR
-----------------------------------------------------------------------
1   12321.jms  go.sh    07/21 08:00  07/28 14:21  896    500M  $HOME/JOB1
2   12322.jms  go.sh    07/25 12:40  07/28 16:04  8      20G   $HOME/JOB2
3   12324.jms  go.sh    07/28 12:00  07/28 13:09  128    1.2G  $HOME/JOB3
Enter the NO of the job to cancel. If two or more jobs are to be cancelled, specify them separated by a comma (,), a blank character, or a hyphen (-). To cancel all jobs, enter "all". Enter "q" or "quit" to quit the qd command.
qalter -p <PRIORITY> <REQID>
Example) Change the priority of job 12343.jms to 200
[username@ricc ~] qalter -p 200 12343.jms
PC Clusters (limits for interactive jobs):
- Number of processes / threads to specify : 32
- Max. elapsed time to specify             : 4 hours
- Amount of memory to specify              : 2GB
Command  Meaning
srun     Serial program / thread parallel program (max. number of threads: 8)
mpirun   MPI parallel program
xpfrun   XPFortran parallel program
If programs are executed without the above commands, they run on the Login Server, which may adversely affect the system. The above commands must be used when executing interactive jobs. Also, to execute script-language programs such as Perl or Python as interactive jobs, please specify the -pc option. Furthermore, if a job requires keyboard input, please specify -pty to turn off buffering of standard output.
example 1)
Input files
Specify the names of the files to transfer as "Input files", the ranks (0 to number of processes - 1) as "RANK-LIST", and the name of the computing node's directory as "Computing node's directory".
If "RANK-LIST" and "Computing node's directory" are not specified, the specified input files are transferred to the computing nodes with the same directory configuration as on the Login Server.
Fig. 6-1 Transfer input files from Login Server to computing nodes
Input directories
Specify the names of the directories to transfer as "Input directories", the ranks (0 to number of processes - 1) as "RANK-LIST", and the name of the computing node's directory as "Computing node's directory".
All files in the specified input directories are transferred.
If "RANK-LIST" and "Computing node's directory" are not specified, the specified input directories are transferred to the computing nodes with the same directory configuration as on the Login Server.
Fig. 6-2 Transfer input directory from Login Server to computing nodes
Output files
Specify the names of the output files to transfer as "Output files", the ranks (0 to number of processes - 1) as "RANK-LIST", and the name of the Login Server's directory as "Login Server's directory".
If "RANK-LIST" and "Login Server's directory" are not specified, the specified output files are transferred to the Login Server with the same directory configuration as on the computing nodes.
Fig. 6-3 Transfer output files from computing nodes to Login Server
If output files have the same name on different computing nodes, overwriting can be avoided by adding a rank number (the first rank number in each node) to the output file name.
Files (not including directories) in the FTL basic directory on the Login Server
--> recognized as input files and transferred to the computing nodes
Files (not including directories) in the FTL basic directory on the computing nodes
--> recognized as output files and transferred to the Login Server
Multi-line mode
#<FTL command>
# [RANK-LIST[@directory]:] files[, files... ]
# [RANK-LIST[@directory]:] files[, files... ]
#</FTL command>
files
  Use the FTL variables $MJS_HOME or $MJS_DATA when specifying an absolute path under /home or /data. For FTL variables, please refer to 6.6.12.5 FTL variable.
RANK-LIST
  Specify the destination of input files and the destination of output files with RANK-LIST. For more information on RANK-LIST, please refer to 6.6.12.6 RANK-LIST.
directory
  Use the FTL variables $MJS_HOME or $MJS_DATA when specifying an absolute path under /home or /data. For FTL variables, please refer to 6.6.12.5 FTL variable.
Files that are not under /home or /data cannot be specified in FTL commands.
A batch job's standard / error output files (extension: .jms) and swap files (extension: .swp) are not transferred.
Use the ftlchk command to check FTL syntax and file existence. For more information, please see ftlchk --man.
Example:
[username@ricc1 ~]$ ftlchk go.sh
=====================
 FTL Analysis Result
=====================
Line  Type    TargetRank  Stat  SourcePath[Login]  DestinationDir[Calc]
11    BEFORE  0-15              $CWD/a.out         --> $CWD
RANK-LIST and directory are optional. If "RANK-LIST" and "Computing node's directory" are not specified, the specified input files are transferred to the computing nodes with the same directory configuration as on the Login Server.
#<BEFORE>
#[RANK-LIST[@computing node's directory]:] input file [...]
#</BEFORE>
RANK-LIST and the computing node's directory are optional. If "RANK-LIST" and "Computing node's directory" are not specified, the specified input directories are transferred to the computing nodes with the same directory configuration as on the Login Server.
#<BEFORE_R>
#[RANK-LIST[@computing node's directory]:] input directory [...]
#</BEFORE_R>
The destination of output files is determined by the pair of RANK-LIST and the Login Server's directory. RANK-LIST and the Login Server's directory are optional. If "RANK-LIST" and "Login Server's directory" are not specified, the specified output files are transferred to the Login Server with the same directory configuration as on the computing nodes.
#<AFTER>
#[RANK-LIST[@Login Server's directory]:] output file [...]
#</AFTER>
This is valid for files (collected from 2 or more computing nodes) specified by the AFTER
command.
#FTL_SUFFIX: flag
Item  Value  Meaning
flag  on     Add rank number to output files
      off    Do not add rank number to output files
Use FTL variables $MJS_HOME or $MJS_DATA when specifying absolute path from /home or
/data. On FTL variable, please refer to 6.6.12.5 FTL variable.
(note) When FTLDIR is used, unnecessary files may be transferred. Also, existence of files is checked
after job execution even though there is no file to be transferred. For large scale parallel jobs, as the
costs may be high, please use BEFORE and AFTER instead of FTLDIR.
If this command is not specified, "new" is set as the file collect type.
Item  Value  Meaning
      new
6.6.7 FTL Syntax (avoid adding rank numbers in the FTL basic directory)
Use the following syntax to avoid adding a rank number (the first rank number in each node) to output files transferred by the FTLDIR command. Output files will be overwritten when output files on different computing nodes have the same name.
#FTL_NO_RANK: flag
Item  Value  Meaning
flag  on     Do not add rank number to output files
      off    Add rank number to output files
Specify the number of digits used when a rank number from RANK-LIST is added to file names. This is valid for FTL variables (*), the FTLDIR command, and the AFTER command (when FTL_SUFFIX is set to on).
This command is valid for the FTL commands (FTL variables, etc.) which have been specified before this command.
Item              Value  Meaning
number of digits  0-9

RANK-LIST  none specified  2 digits  3 digits
1          1               01        001
10         10              10        010
100        100             100       100
Use FTL variables $MJS_HOME or $MJS_DATA when specifying absolute path from /home or
/data. On FTL variable, please refer to 6.6.12.5 FTL variable.
#FTL_STAT: flag
Item  Value   Meaning
flag  off
      normal  Normal mode. Output statistic information to standard output.
      detail  Detail mode.
Output items
Item           Meaning
ELAPSE(s)
FILE_NUM
FILE_SIZE(KB)
Normal mode output:
#=========================================================#
         ELAPSE(s)  FILE_NUM  FILE_SIZE(KB)
------------------------------------------------------------------------
BEFORE
TOTAL    60         30        16384
------------------------------------------------------------------------
AFTER
TOTAL    10         30        16384
#=========================================================#

Detail mode output:
#=========================================================#
            ELAPSE(s)  FILE_NUM  FILE_SIZE(KB)
------------------------------------------------------------------------
BEFORE
TOTAL       60         10        100
RANK: 0-7   60         10        100
------------------------------------------------------------------------
AFTER
TOTAL       60         10        10
RANK: 0-7   60
RANK: 8-15  60
#=========================================================#
#FTL_INFO: flag
Item  Value   Meaning
flag  off
      before
      after
      all
Output items
Item       Meaning
TIME
SIZE(KB)
FILE_NAME  File name
Output format
Output the file information of each rank. The output format with flag "all" is shown below. With flag "before", only the part marked (*1) is output; with flag "after", only the part marked (*2).
#===============  FTL FILE INFORMATION  ===============#
-------------------  BEFORE  ---------------------
[RANK: 0-7]
TIME          SIZE(KB)  FILE_NAME
--------------------------------------------------------
Jul 16 10:41  14246     /home/username/job/a.out
Jul 24 10:20  361       /home/username/job/input.1        (*1)
[RANK: 8-16]
TIME          SIZE(KB)  FILE_NAME
--------------------------------------------------------
Jul 16 10:41  14246     /home/username/job/a.out
Jul 24 10:20  361       /home/username/job/input.2
-------------------  AFTER  ----------------------
[RANK: 0-7]
TIME          SIZE(KB)  FILE_NAME
---------------------------------------------------------
Jul 16 10:41  14246     /home/username/job/a.out
Jul 24 10:20  361       /home/username/job/input.1
Jul 24 10:25  361       /home/username/job/output         (*2)
[RANK: 8-16]
TIME          SIZE(KB)  FILE_NAME
---------------------------------------------------------
Jul 16 10:41  14246     /home/username/job/a.out
Jul 24 10:20  361       /home/username/job/input.2
#=======================================================#
Example
#<BEFORE>
#! this line is comment
# a.out
! b.out
#</BEFORE>
Example
#<BEFORE>
#
# a.out
#</BEFORE>
#<BEFORE>
# a:b.out
#</BEFORE>
Meta character
Meaning
Example
#<BEFORE>
# input.?
# a*
# bin/exe*/a.out
#</BEFORE>
Variable        Meaning
$MJS_HOME       Home directory
$MJS_DATA       Data directory
$MJS_CWD        Job submission directory
$MJS_REQID      REQUEST-ID of the job
$MJS_REQNAME    Request name of the job
$MJS_BULKINDEX  Bulk Index ID
These are available in file names and directory names.
$MPI_RANK       MPI rank number
$XPF_RANK       XPFortran rank number
Example 1
#<AFTER>
# 0@$MJS_CWD: log/output
#</AFTER>
Example 2
#BEFORE: input.$MPI_RANK
6.6.12.6 RANK-LIST
Specify the destination of input files and the source of output files with the following descriptions.
If existent ranks and nonexistent ranks are specified at the same time, files are transferred to the existent ranks but not to the nonexistent ranks.
Item  Format                             Meaning
II    13                                 A single rank
III   1,3 / 1-3,5,7                      A list and/or range of ranks
IV    (combination of Items II and III)
VI    ALL                                All ranks
VII   MASTER                             Master rank

Job type                    Range of RANK-LIST
Serial job                  0
MPI parallel job            0 - (number of processes - 1)
OpenMP / auto parallel job  0
Hybrid job                  0 - (number of processes - 1)
ftlgen <option>
Option         Meaning
-chk
-o <filename>
FTL(POST): Are there any output file ?('y' or 'n'): y   <- if an output file exists
FTL(POST): Output file: output.log                      <- specified destination
FTL(POST): Enter more ?('y' or 'n'): n
The generated script (beginning with #!/bin/sh) is output as the result.
7. Development Environment
7.1 Endian conversion
7.1.1 Outline of endian
Endianness is the order in which a number consisting of multiple bytes is stored in memory. For example, when the number 1234 is stored, the method that stores 12 in the 1st byte and 34 in the 2nd byte is called big endian. On the other hand, the method that stores 34 in the 1st byte and 12 in the 2nd byte is called little endian.
System  Endian
RSCC    Big endian
RICC    Little endian
Table 7-1 Endian type of RSCC and RICC
7.2 Debugger
The debugger enables the user to run a program under debugger control to verify its processing logic.
The following types of operations can be performed for serial and MPI programs written in Fortran and C/C++, and for XPFortran programs.
-g
Produce debugging information. If this option is omitted, you cannot display the values of variables and so on.
-Ktl_trt
Link the tool runtime library. This option enables the use of the debug, profiling, and MPI trace functions when a program is executed. This option is effective by default.
Start debugger
fdb* list
  5      INTEGER i
         read(*,*) i
         go to (10,20,30), i
  10     ...
         go to 90
fdb* break 10
#1  Address 0x0000000100000ad0  Enable Yes  /home/username/sample.f
fdb* p i
Result = 123
fdb* c
Continue program: a.out
The program: a.out terminated.
8. Tuning
8.1 Tuning overview
Modifying a program so that it finishes execution faster is called tuning. Tuning a program is an iterative series of tasks: collecting tuning information, performance evaluation/analysis, modifying the source code (or changing compile options), and measuring the performance again.
First, find the part of the program where most of the execution time is spent. Generally, a large tuning effect is achieved by speeding up that part.
The following methods can be used to get execution time information.
Fig 8-1 Tuning overview (iteration of performance evaluation/analysis and tuning: change compile options, modify source code)
8.2.2 C program
The clock function returns an approximate value of the processing time.
Invoke the clock function
example 1)
#include <time.h>
#include <stdio.h>

clock_t start_time, stop_time;
start_time = clock();
... /* portion to be measured */
stop_time = clock();
printf("time %10.3f\n",
       (double)(stop_time - start_time) / CLOCKS_PER_SEC);
example 1)
example 2)
cat go.sh
#!/bin/sh
#MJS: -proc 8
#MJS: -cwd
#MJS: -eo
#MJS: -oi
#MJS: -time 1:00:00
#BEFORE: a.out
mpirun ./a.out
example 1)
[username@ricc1:~] cat go.sh.o2733417.jms
Allocated Resource                       <- allocated resources of the entire job
  Virtual Nodes              :           8 Node
  Before Free Memory
    Total Large Page Memory  :           0 Mbyte
    Total Normal Page Memory : 10737418240 Byte
  After Free Memory
    Total Large Page Memory  :           0 Mbyte
    Total Normal Page Memory : 10737418240 Byte
  CPUs                       :           8 CPU
  Inter-Node Barrier         :           0 Unit
  Execmode                   :  CHIP_SHare
  Elapse time limit          :    3600.000 sec
Used Resource                            <- used resources of the entire job
  Total System CPU Time      :         463 msec
  Total User CPU Time        :      470516 msec
  Total Large Page Memory    :           0 Mbyte
  Total Normal Page Memory   :   342716416 Byte
  CPUs                       :           8 CPU
  Inter-Node Barrier         :           0 Unit
---------------------------------------
Virtual Node Information : NODE : mpc0448    <- computing node
  Archi Information          :          PG
  Allocated Resource                     <- allocated resources per process
    Before Free Memory
      Large Page Memory      :           0 Mbyte
      Normal Page Memory     :  1342177280 Byte
    After Free Memory
      Large Page Memory      :           0 Mbyte
      Normal Page Memory     :  1342177280 Byte
    Free memory time         :           0 msec
    CPUs                     :           1 CPU
    CPU time limit           :   UNLIMITED
  Used Resource                          <- used resources per process
    Large Page Memory        :           0 Mbyte
    Normal Page Memory       :    50913280 Byte
    CPUs                     :           1 CPU
    CPU Time
      System time
        Max CPU Time         :         236 msec
        Total CPU Time       :         236 msec
      User time
        Max CPU Time         :       59426 msec
        Total CPU Time       :       59426 msec
    SBID  ChipID  CPUID  System time  User time
    0     0       0      236 msec     59426 msec
Elapsed time, breakdown of user CPU time / system CPU time, etc.
example 1)
example 2)
Process      Received + Get
             +--------------------------------------------------+
Process 0    |                                          ########|  16 %
             +--------------------------------------------------+
Percentage of time waiting for receive and get operations
_________________________________________________________________
Procedures profile
**************************************************************
                 Application - procedures
**************************************************************
Cost  %         Start  End
--------------------------------------------
83    100.0000  --     --   Application
--------------------------------------------
51    61.4458   --     --   __GI_memcpy
9     10.8434   369    397  IMB_ass_buf
7     8.4337    --     --   memcpy_nts_asm64a
3     3.6145    --     --   _LowLevel_MutexUnlock
2     2.4096    --     --   _LowLevel_Exchange4
1     1.2048    --     --   intra_Reduce
1     1.2048    --     --   mpigfc_
1     1.2048    --     --   PMPI_Sendrecv
1     1.2048    --     --   _GMP_StopSendTimer
1     1.2048    --     --   _GMP_Send
_________________________________________________________________
Loops profile
**************************************************************
                 Application - loops
example 3)
1. Specify the fpcoll command's options with the -profopt option to collect profiling data. The items of profiling data can be specified with the -I option. The profiling data is created in a directory specified by the -d option.
When executing an application on the Massively Parallel Cluster, transfer the profiling data to the Login Server with FTL.
$ cat go.sh
#!/bin/sh
#------- qsub option -------#
#MJS: -pc
#MJS: -proc 64
#MJS: -eo
#MJS: -time 10:00
#MJS: -cwd
#------- FTL command -------#
#BEFORE: a.out
#AFTER: ALL@${MJS_REQID}_prof:profile-data/*
2. Use the fprof command to display profiler information. The items of profiling data to display can be specified with the -I option. Specify the directory of the profiling data with the -d option.
$ fprof -Impi -d 1417379.jms_prof
[Figure: InfiniBand network of spine switches and leaf switches; each leaf switch connects 20 compute nodes.]
The RICC job scheduler allocates parallel jobs so as to minimize the number of leaf switches connecting the allocated computing nodes. However, when the system usage ratio is high, the allocated computing nodes may be distributed over more leaf switches, because the computing nodes allocated to the next job depend on the jobs that finished previously.
This difference in the allocation of computing nodes may have no impact on normal job execution, but it may affect jobs with a high communication load, such as network communication benchmark tests.
9.1 Configuration
If you use hsi or htar for the first time on RICC, or you use them after your RICC password has been updated, use the arc_keytab command to generate a Keytab file for authentication.
You do not need to generate the Keytab file again afterwards.
Example:
[username@ricc:~] arc_keytab
Getting a KEYTAB file for user: username
Please wait ....
...............
A KEYTAB file was generated successfully.
9.2 pftp
9.2.1 Get file
9.2.1.1 Login
[username@ricc:~] pftp arc
Using /opt/hpss/etc/HPSS.conf
Connected to arc.
220 hpcore FTP server (HPSS 7.1 PFTPD V1.1.1 Tue Jan 19 07:16:29
JST 2010) ready.
Parallel stripe width set to (1).
Name (arc:username):
Login completed
ftp> bye
221 Goodbye.
[username@ricc:~]
9.3 hsi
9.3.1 Get file
9.3.1.1 Login
[username@ricc:~] hsi
Username: username
Login completed
'/home/username/testdir/testfile1' : '/home/username/testdir/testfile1'
total 6144
-rw-------  1 username  groupname
-rw-------  1 username  groupname
-rw-------  1 username  groupname
A:[RICC]/home/username-> quit
Logout
[username@ricc:~]
9.4 htar
The following restrictions apply to the htar command.
When the path name of a member file is divided into a directory name and a file name, the directory name can be up to 154 characters and the file name up to 99 characters.
(Example) Path name : /home/username/dir1/dir2/test.data
          File name : test.data
[username@ricc:~] htar -tf test.tar
......
HTAR: -rw-r--r--  username/groupname  work/test1
HTAR: -rw-r--r--  username/groupname  work/test2
HTAR: -rw-r--r--  username/groupname  work/test3
[username@ricc:~] htar -xf test.tar
https://ricc.riken.jp
3. Click [LOGIN]
For the usage of RICC Portal, please click the help icons of each function or refer to the online manual on RICC Portal.
11. Manual
Access RICC Portal from a Web browser. For how to access RICC Portal, please refer to 10 RICC Portal.
After login, click the links under [Documentation] in the menu on the left to see the online manuals.
1. Click [MAIN]
2. Click [Documentation] -> [Product Manual]
-> Product manuals are available
11.1.2 Language
Fortran User's Guide
Fortran Language Reference
Fortran Compiler Messages
Fortran Runtime Messages
C User's Guide
C++ User's Guide
C++ Compiler Feature
XPFortran User's Guide
MPI User's Guide
Appendix
1. FTL Examples
Job execution scripts using FTL are introduced in this appendix. FTL commands are shown in bold.
The following environment variables are used in this appendix.
Environment variable  Value
$MJS_CWD              /home/username/job
$MJS_DATA             /data/username
$MJS_REQID            REQUEST-ID of the job
Execute execution module a.out. Transfer output file to job execution directory.
Item              Value   Remark
Execution module  a.out   $MJS_CWD
Input file        (none)
Output file       output  $MJS_CWD
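The job script itself is missing from this copy of the manual; a minimal sketch consistent with the table above (option and command names as in 5.1.1 and 6.6) might be:
#!/bin/sh
#MJS: -cwd
#AFTER: output
./a.out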
Item              Value   Remark
Execution module  a.out   $MJS_CWD
Input file        (none)
Output file       output  $MJS_DATA/data
Item                Value   Remark
Transfer directory  bin     $MJS_CWD
Input file          (none)
Output file         output  $MJS_CWD
Transfer the input file which is necessary for each rank. Execute the MPI execution module a.out as a 16-core parallel job. Transfer the output file of each rank to the job execution directory.
Item                        Value                 Remark
Execution module            a.out                 $MJS_CWD
Input file                  input.0 - input.15    $MJS_CWD
Output file                 output.0 - output.15
Destination of output file  $MJS_CWD
Use FTL variable ($MPI_RANK) for 1.2.1 sample 1 (16 cores in parallel job) case.
Item                     Value                  Remark
Job execution directory  $MJS_CWD
Execution module         a.out
Input file               input.0 - input.15
Output file              output.0 - output.15   transferred to $MJS_CWD
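The effect of $MPI_RANK can be pictured with an ordinary shell loop, where RANK stands in for the FTL variable (illustration only, not FTL syntax):

```shell
#!/bin/sh
# Sketch of per-rank file naming: each of the 16 ranks has its own
# input.<rank> and would produce its own output.<rank>. RANK stands
# in for FTL's $MPI_RANK; this is not FTL syntax.
cd "$(mktemp -d)"
RANK=0
while [ "$RANK" -le 15 ]; do
  echo "input for rank $RANK" > "input.$RANK"   # one staged file per rank
  RANK=$((RANK + 1))
done
```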
Execute the MPI execution module a.out as a 16-core parallel job, and transfer the identically named output files from each rank to the job execution directory without overwriting one another.
Item                     Value      Remark
Job execution directory  $MJS_CWD
Execution module         a.out
Input file               (none)
Output file              output     transferred to $MJS_CWD
(Figure: a.out is transferred from the login server to host node 0 (ranks 0 - 7) and host node 1 (ranks 8 - 15); each node's output file is renamed with a rank suffix and transferred back to $MJS_CWD.)
The rank number (the first rank on the node) is appended to the output file name before the file transfer.
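The renaming step can be sketched as follows; FIRST_RANK is a hypothetical stand-in for the node's first rank number, not an FTL name:

```shell
#!/bin/sh
# Sketch: before the transfer, each host node appends its first rank
# number to "output" so that identically named files from different
# nodes do not overwrite each other in $MJS_CWD. Not FTL syntax.
cd "$(mktemp -d)"
echo "result from ranks 8-15" > output   # file written on host node 1
FIRST_RANK=8                             # first rank on that node
mv output "output.$FIRST_RANK"           # output -> output.8
```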
Execute the MPI execution module a.out as a 16-core parallel job, and transfer each rank's output file (with a 3-digit rank number) to the job execution directory.
Item                     Value                     Remark
Job execution directory  $MJS_CWD
Execution module         a.out
Input file               (none)
Output file              output.000 - output.015   transferred to $MJS_CWD
This works in the same way as 1.2.3 sample 3 (transferring files of the same name while avoiding overwriting).
(Figure: host node 0 (ranks 0 - 7) produces output.000 - output.007 and host node 1 (ranks 8 - 15) produces output.008 - output.015; all are transferred back to $MJS_CWD.)
Transfer the input file required by each rank, execute the MPI execution module a.out as a 16-core parallel job, and transfer each rank's output file to the job execution directory.
Item                     Value                  Remark
Job execution directory  $MJS_CWD
Execution module         a.out
Input file               input
Output file              output.0 - output.15   transferred to $MJS_CWD
(Figure: a.out and input are transferred to each compute node; the output files of ranks 0 - 15 are staged via $MJS_CWD/$MJS_REQID and transferred back to $MJS_CWD as output.0 - output.15.)
Transfer the input file required by each rank, execute the MPI execution module a.out as a 16-core parallel job, and transfer each rank's output file to the job execution directory.
Item                     Value                  Remark
Job execution directory  $MJS_CWD
Execution module         a.out
Input file               input                  the input file is updated
Output file              output.0 - output.15   transferred to $MJS_CWD
(Figure: the updated input file is transferred between the job directory and the compute nodes; the output files of ranks 0 - 15 are staged via $MJS_CWD/$MJS_REQID and transferred back to $MJS_CWD as output.0 - output.15.)
1.4 Others
1.4.1 Execute job using temporary directory
Item                     Value      Remark
Job execution directory  $MJS_CWD
Execution module         a.out
Input file               (none)
Output file              output     transferred to $MJS_CWD
(Figure: a temporary directory tmp is created under $MJS_CWD on the compute node before the job runs.)
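The temporary-directory pattern of 1.4.1 can be sketched in plain shell (illustration only, not FTL syntax; the fixed name tmp follows the figure, while $MJS_REQID could be used to make the name unique per job):

```shell
#!/bin/sh
# Sketch: make a temporary directory under the job directory, run the
# job inside it, copy the output back, and clean up. Not FTL syntax.
MJS_CWD=$(mktemp -d)                          # stands in for the job directory
mkdir "$MJS_CWD/tmp"                          # make the temporary directory
( cd "$MJS_CWD/tmp" && echo done > output )   # the job writes its output there
cp "$MJS_CWD/tmp/output" "$MJS_CWD/"          # transfer output back
rm -rf "$MJS_CWD/tmp"                         # remove the temporary directory
```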
This works in the same way as 1.2.3 sample 3 (transferring files of the same name while avoiding overwriting).
Transfer the same input file to each rank, and execute the MPI execution module a.out as a 16-core parallel job.
Item                     Value                Remark
Job execution directory  $MJS_CWD
Execution module         a.out
Input file               input.0 - input.15   the same input file is transferred to each rank
Output file              output               transferred to $MJS_CWD
(Figure: a.out and input.0 - input.15 are transferred from the login server to each compute node; the output file is transferred back to $MJS_CWD.)