Last updated: 2025-03-13
2025-01-31
Source: adapted from Emilie Drouineau’s presentation
Material under CC-BY-SA licence
⇨ Introduction ⇦
Storage system
Computational resources
Jobs
Software
A few tips
A computer cluster:
In summary, a computer cluster is:
Using a computer cluster is:
NB: you can use a cluster but you don’t have to; it depends on the task at hand.
# Step 1 - passerelle
ssh john.doe@passerelle.i2bc.paris-saclay.fr
# Step 2 - cluster (Slurm)
ssh john.doe@slurmlogin.calcul.i2bc.paris-saclay.fr
NB: going through the passerelle is only necessary if you are outside of the I2BC network
=> You’re now on the master node of the cluster!
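If you connect from outside the I2BC network regularly, it may be convenient to declare the jump in your `~/.ssh/config` (a minimal sketch, assuming OpenSSH; the `Host` aliases are arbitrary names):

```
# ~/.ssh/config
Host i2bc-passerelle
    HostName passerelle.i2bc.paris-saclay.fr
    User john.doe

Host i2bc-cluster
    HostName slurmlogin.calcul.i2bc.paris-saclay.fr
    User john.doe
    ProxyJump i2bc-passerelle
```

With this in place, `ssh i2bc-cluster` goes through the passerelle automatically.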
Introduction
⇨ Storage system ⇦
Computational resources
Jobs
Software
A few tips
All nodes have access to the same file systems (with a few exceptions)
From the cluster, you have access to:

For mid-term & safe storage:
- /store/EQUIPES/<team acronym>/ : team space
- /store/USERS/<your login>/ : personal space
- /store/plateformes/<project name>/ : project space (on demand)

=> all 3 are regularly backed up (i.e. ls .snapshots/ ; see intranet under the “restauration de fichiers” tab)
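For instance, to recover a deleted file from the backups, you can browse the hidden snapshot folder (a sketch, assuming .snapshots/ sits at the root of each backed-up space; the snapshot and file names are placeholders):

```bash
# list the snapshots available for your team space
ls /store/EQUIPES/<team acronym>/.snapshots/
# copy a file back from one of them into the live space
cp /store/EQUIPES/<team acronym>/.snapshots/<snapshot>/my_file.txt /store/EQUIPES/<team acronym>/
```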
From the cluster, you also have access to:

Not for storage; specific to the cluster:
- /home/<your login>/ : your “home” (config files, .local folder, etc.)
- /data/work/I2BC/ : collective temporary space
- /scratchlocal/ : temporary spaces which are specific to each node
Space | What goes there |
---|---|
➜ /store/EQUIPES | raw data & final processed data (for safe keeping); all data & protocols that should be accessible by the team and that cannot be easily re-generated |
➜ /store/USERS | personal but professional data (e.g. course material for PhD students or MDC, work contracts, etc.) |
➜ /home | login & config files (e.g. .bashrc, .condarc, etc.); installed programmes (i.e. in .local, mamba/conda, etc.) |
➜ /data/work/I2BC | temporary data that needs to be accessible from all nodes (e.g. a copy of a small database fetched from the internet, non-dividable files, etc.); data that is copied from somewhere else (e.g. for sharing with other teams) |
➜ /scratchlocal | temporary data that doesn’t have to be accessible from all nodes (e.g. intermediate files generated by a tool) |
/!\ for temporary spaces, remember to delete your data when you’ve finished
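As an illustration, a job can stage its intermediate files on the node-local scratch space and clean up afterwards (a sketch only; the folder layout and file names are placeholders):

```bash
# work in a dedicated folder on the node-local scratch space
WORKDIR=/scratchlocal/$USER/my_analysis
mkdir -p "$WORKDIR" && cd "$WORKDIR"

# ... run your tool here, writing intermediate files into $WORKDIR ...

# keep only the final results, on a backed-up space
cp results.txt /store/EQUIPES/<team acronym>/
# and delete your temporary data when you've finished
rm -rf "$WORKDIR"
```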
Coming soon… with the migration from Compellent to Ceph storage in 2025.
In the meantime, you can use traditional Linux commands such as:
- df -Th : free and used space on each file system (with its type)
- du -ch --max-depth=1 <folder name> : size of each sub-folder, with a grand total
Introduction
Storage system
⇨ Computational resources ⇦
Jobs
Software
A few tips
different partitions = different properties
- common : access to all “shared” or “common” nodes (default)
- runXX (XX = 2, 4, 8, 16, 32) : when submitting several jobs, automatically limits you to XX jobs running at a time
- <group-specific> : priority (or exclusive usage) for the group members who financed these nodes
- smallgpu : access to nodes with small-memory GPUs
- lowprio & lowpriogpu : access to more nodes, but jobs may be suspended for a while or stopped/restarted, respectively

Introduction
Storage system
Computational resources
⇨ Jobs ⇦
Software
A few tips
=> On a cluster, you run jobs through the scheduler
What? | SLURM |
---|---|
submit a job | sbatch & srun |
list all running or queuing jobs | squeue |
delete a job | scancel |
sbatch & srun to reserve resources
=> you’re given a “job id”

- sbatch [options] myscript.sh [script arguments] : “letter in postbox”
- srun [options] --pty {bash,python,...} : interactive session

Common options:
- --cpus-per-task X : how many CPUs
- --mem X[M,G,T] : how much memory (e.g. 2G = 2 GB)
- --partition X : which partition (e.g. common, lowprio)
- --time X : maximum allocated time (DD-HH:MM:SS)
- --job-name X : personalised job name
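Putting these options together, a minimal submission script could look like the sketch below (partition, resource values and the workload are placeholders to adapt):

```bash
#!/bin/bash
#SBATCH --job-name=my_analysis      # personalised job name
#SBATCH --partition=common          # which partition (common is the default)
#SBATCH --cpus-per-task=4           # how many CPUs
#SBATCH --mem=8G                    # how much memory
#SBATCH --time=0-02:00:00           # maximum allocated time (DD-HH:MM:SS)

# placeholder workload: replace with your actual commands
echo "Running on $(hostname) with $SLURM_CPUS_PER_TASK CPUs"
```

You would then submit it with `sbatch myscript.sh`, or open an interactive session instead with, for example, `srun --cpus-per-task=2 --mem=4G --pty bash`.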
squeue to follow jobs
=> lists all jobs, resources required & used, and their status
Status | Description |
---|---|
(PD) PENDING | Waiting for resources |
(R) RUNNING | Job is running |
(CG) COMPLETING | Job done and finalising |
(CD) COMPLETED | Job completed successfully |
(F) FAILED | Job stopped with non-zero error code |
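For day-to-day monitoring, a couple of typical invocations (the job id is a placeholder):

```bash
squeue                # all jobs currently running or queuing on the cluster
squeue -u $USER       # only your own jobs
squeue -j 123456      # one specific job (123456 is a placeholder job id)
```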
scancel to cancel a job (if you changed your mind; jobs terminate on their own otherwise)
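For example (the job id and job name below are placeholders):

```bash
scancel 123456               # cancel one job by its job id
scancel -u $USER             # cancel all of your own jobs
scancel --name my_analysis   # cancel jobs by their job name
```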
A set of scripts from slurm-tools is installed on the master node:

About jobs:
- jobqueue : get the list of jobs in queue or running
- jobhist : get the list of all jobs that have run or are running or queuing
- jobinfo <jobid> : get detailed information on a single job (that has run or is running)

About available resources:
- nodeinfo : list of all partitions and their capacities (number of nodes, CPUs, memory, GPU model if any, compact node list and state)
- noderes : list of all nodes with accessory information, e.g. the partitions you can use to access them, their state and free resources

There’s also a more interactive view with tsqueue to visualise currently running or queueing jobs.
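One possible way to combine these helpers (the job id is a placeholder; output formats depend on the local installation):

```bash
nodeinfo           # which partitions exist and what resources they offer
noderes            # which nodes are free right now, and through which partitions
jobinfo 123456     # detailed report on one of your jobs (123456 is a placeholder)
```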
Some programmes are already installed on the nodes…
Use module to check for them and load them:
- module avail : to list all available software
- module load <module name> : to load a specific programme

/!\ module load will only work on the slave nodes!
Never run heavy programmes on the master node…
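For instance, you could open an interactive session on a compute node and load a tool there (the module name below is only an example; check module avail for the actual list):

```bash
srun --partition=common --cpus-per-task=1 --mem=2G --pty bash
module avail             # list the software available on the node
module load samtools     # example module name; pick one from the list above
samtools --version       # the tool is now on your PATH
```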
If your programme isn’t in the list: you can install it yourself (e.g. with pip install --user, or with conda/mamba)
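A sketch of such user-level installs (the package and environment names are only examples):

```bash
pip install --user snakemake                 # installs into ~/.local
conda create -n myenv -c bioconda samtools   # or: mamba create -n myenv ...
conda activate myenv
```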
❝ All terminals look the same ❞
The terminal prompt is your friend!
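A small illustration (the hostnames shown are made up): the hostname in the prompt tells you whether you are on the master node or on a compute node:

```bash
# On the master node, the prompt might look like:
#   john.doe@slurmlogin:~$
# After starting an interactive job (srun --pty bash), it shows the compute node instead:
#   john.doe@node06:~$        (illustrative hostname)
hostname    # prints the name of the machine you are currently on
```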
Need help with your computer, internet, some software, the cluster…?
Questions on specific bioinformatics tools & practices? Help on setting up an analysis pipeline? Etc.
FAIR: Findable, Accessible, Interoperable, Reusable

For example: