Getting started with the I2BC cluster
Bonus objective
Building on your PBS submission script with more PBS options & introduction to PBS queues.
About PBS queues
Each node on the cluster belongs to a group of nodes, and PBS uses a separate queue to direct jobs to each group.
The group of nodes everyone automatically has access to is the COMMON group, for which you have to use the common queue. It’s the default queue that OpenPBS uses if you don’t specify one.
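As a minimal sketch of choosing a queue explicitly, the snippet below writes a tiny job script that names the common queue with a #PBS -q directive, then submits it only when qsub is available. The file name queue_demo.sh is a placeholder chosen for this illustration, not something provided by the cluster.

```shell
#!/bin/bash
# Write a minimal job script that names its queue explicitly.
# queue_demo.sh is a hypothetical file name used only for this example.
cat > queue_demo.sh <<'EOF'
#!/bin/bash
#PBS -q common
echo "running on $(hostname)"
EOF

# Submit only when qsub is available (i.e. on the cluster itself).
if command -v qsub >/dev/null 2>&1; then
    qsub queue_demo.sh
else
    echo "qsub not found: run this on the cluster"
fi
```

Since common is the default queue, leaving out the -q line should send the job to the same place. The directive only becomes necessary for the other queues you have access to.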
Other useful options for qsub
There are many options you can use with qsub to control resources and other things, for example:
option | function |
---|---|
-N jobname | to specify a job name (no spaces or special characters please) |
-l walltime=HH+:MM:SS | to specify the maximum running time (default: 2hrs) |
-q queuename | to specify the queue to submit your job to |
-j oe | error messages and standard output are merged into a single file |
-o /path/to/output.log | name of the file to save standard output to |
-e /path/to/error.log | name of the file to save error messages to |
More options in the “cheat sheet” tab on the intranet and the official documentation of PBS.
You can also look through the qsub manual from the command line with: man qsub (use the arrow keys to scroll through the page and type q to quit).
For example:
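#!/bin/bash
# Job name shown by qstat
#PBS -N hello
# Resources requested: 100 MB of memory, 1 CPU, 10 minutes at most
#PBS -l mem=100Mb
#PBS -l ncpus=1
#PBS -l walltime=00:10:00
# Merge error messages into standard output and save both to one log file
#PBS -j oe
#PBS -o /home/john.doe/hello.log

echo "hello world"
sleep 60s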
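The same options can also be passed directly to qsub instead of being written as #PBS lines. The sketch below assumes the script above was saved as hello.sh (a hypothetical file name) and that the log should go to your home directory. It submits the job only when qsub and the script are both present, and otherwise just prints the command.

```shell
#!/bin/bash
# hello.sh is a hypothetical file name, assumed to contain the script shown above.
script=hello.sh
jobname=hello
walltime=00:10:00
logfile="$HOME/hello.log"   # assumption: keep the log in your home directory

# The same settings as the #PBS lines, written as command-line options.
cmd="qsub -N $jobname -l walltime=$walltime -j oe -o $logfile $script"

if command -v qsub >/dev/null 2>&1 && [ -f "$script" ]; then
    # On the cluster, with the script present: submit for real.
    qsub -N "$jobname" -l walltime="$walltime" -j oe -o "$logfile" "$script"
else
    # Anywhere else: only show the command that would be run.
    echo "$cmd"
fi
```

According to the PBS documentation, options given on the command line take precedence over the matching #PBS lines in the script, which is handy for one-off changes such as a longer walltime.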
Other useful options for qstat
There are many options you can use with qstat too, including:
option | function |
---|---|
-u username | show jobs for only this user |
-n | show node information & resources |
-w | show wider columns |
-1 | print each job on a single line |
-x | show finished jobs too |
-f | show the full, long-format details of each job |
More options in the “cheat sheet” tab on the intranet and the official documentation of PBS.
You can also look through the qstat manual from the command line with: man qstat (use the arrow keys to scroll through the page and type q to quit).
My personal favorite is qstat -u $USER -nw1.
For example:
john.doe@cluster-i2bc:/home/john.doe$ qstat -nw1
pbsserver:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
------------------------------ --------------- --------------- --------------- -------- ---- ----- ------ ----- - -----
321295.pbsserver luke.skywalker chrody run_CHIC.sh -- 1 16 64gb 27777 Q -- --
341678.pbsserver luke.skywalker chrody qsubHiCPro.sh -- 19 380 125gb 128:0 Q -- --
341679.pbsserver luke.skywalker chrody qsubHiCPro.sh -- 17 340 125gb 128:0 Q -- --
439857.pbsserver chewbacca lowprio Pilio 1469 1 20 120gb 240:0 R -- node16/0*20
565248.pbsserver srv.cryosparc cryoem cryosparc_P74_* -- 1 -- 0gb 96:00 H -- --
565387.pbsserver princess.leia chrody HiC_Pro_Captur* -- 1 30 200gb 27777 Q -- --
661056.pbsserver luke.skywalker chrody qsub_notebook_* 188674 1 10 64gb 27777 R 1039: node19/1*10
663563.pbsserver luke.skywalker chrody run_Repli.sh -- 1 32 64gb 27777 Q -- --
887864.pbsserver han.solo chrody qsub_notebook.* 3428691 1 10 100gb 27777 R 319:1 node20/0*10
912766.pbsserver root sics incr_CEPH_BIM 1581793 1 10 20gb 1920: R 155:2 node01/0*10
916251.pbsserver obi-wan.kenobi ssfa STDIN 1811155 1 1 2gb 60:00 R 37:33 node29/0
916253.pbsserver obi-wan.kenobi ssfa STDIN 1811649 1 1 2gb 60:00 R 37:31 node29/3
916291.pbsserver obi-wan.kenobi ssfa STDIN 1829084 1 1 2gb 60:00 R 36:35 node29/2
916559.pbsserver padme.amidala common STDIN 4144468 1 1 2gb 48:00 R 12:57 node06/0
916683.pbsserver padme.amidala common STDIN 42294 1 4 50gb 48:00 R 08:42 node06/1*4
As you can see, the output now shows which node each job is running on (last column) and how much memory was requested (Req'd Memory).
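Because -1 puts each job on a single line, this output is easy to post-process. As a rough sketch, the helper below counts jobs per state (R, Q, H, ...). It assumes the column layout shown in the listing above, where the state is the 10th field, and job names without spaces. The name count_states is made up for this example.

```shell
#!/bin/bash
# Count jobs per state, reading `qstat -nw1` style output on standard input.
# Job lines are recognised by a leading digit (the job ID); headers are skipped.
count_states() {
    awk '$1 ~ /^[0-9]/ { n[$10]++ } END { for (s in n) print s, n[s] }' | sort
}

if command -v qstat >/dev/null 2>&1; then
    # On the cluster: summarise the live job list.
    qstat -nw1 | count_states
else
    # Off the cluster: use three rows copied from the listing above.
    count_states <<'EOF'
565248.pbsserver srv.cryosparc cryoem cryosparc_P74_* -- 1 -- 0gb 96:00 H -- --
663563.pbsserver luke.skywalker chrody run_Repli.sh -- 1 32 64gb 27777 Q -- --
916559.pbsserver padme.amidala common STDIN 4144468 1 1 2gb 48:00 R 12:57 node06/0
EOF
fi
```

On the three sample rows, this prints one line per state (H, Q and R), each with a count of 1.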
Take home message
1) You can adjust the behaviour of the qsub and qstat commands with various options. For more information, read the built-in manual pages with the man command.
2) Nodes belong to different groups, and each group is reached through its own queue.