Last updated: 2025-10-16
2025-01-31
Source: adapted from Emilie Drouineau’s presentation
Material under CC-BY-SA licence
⇨ Introduction ⇦
Storage system
Computational resources
Jobs
Software
A few tips
A computer cluster:
In summary, a computer cluster is:
Using a computer cluster is:
NB: you can use a cluster but you don’t have to; it depends on your tasks.
# Step 1 - passerelle
ssh john.doe@passerelle.i2bc.paris-saclay.fr
# Step 2 - cluster (Slurm)
ssh john.doe@slurmlogin.calcul.i2bc.paris-saclay.fr
NB: going through the passerelle is only necessary if you are outside of the I2BC network
=> You’re now on the master node of the cluster!
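The two-step connection can be automated by your SSH client; a minimal sketch for ~/.ssh/config (the host aliases are illustrative, and the jump is only needed from outside the I2BC network):

```
# ~/.ssh/config — jump through the passerelle automatically
Host i2bc-passerelle
    HostName passerelle.i2bc.paris-saclay.fr
    User john.doe

Host i2bc-cluster
    HostName slurmlogin.calcul.i2bc.paris-saclay.fr
    User john.doe
    ProxyJump i2bc-passerelle
```

With this in place, ssh i2bc-cluster connects in a single command.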
Introduction
⇨ Storage system ⇦
Computational resources
Jobs
Software
A few tips
All nodes have access to the same file systems (with a few exceptions)
NB: /store will turn into /stockage once the new storage system (CEPH) is in place
From the cluster, you have access to:
For mid-term & safe storage
- /store/EQUIPES/<team acronym>/: team space
- /store/USERS/<your login>/: personal space
- /store/plateformes/<project name>/: project space (on demand)
=> all 3 are regularly backed up (i.e. ls .snapshots/; see intranet under the “restauration de fichiers” tab)
From the cluster, you have access to:
Not for storage; specific to the cluster
- /home/<your login>/ => your “home” (config files, .local folder, etc.)
- /data/work/I2BC/ => collective temporary space
- /scratchlocal/ => node-specific temporary spaces (one per node)
NB: /data/work/I2BC will have a change in data retention policy once the new storage system (CEPH) is in place (automatic deletion of files older than 3 months)
| Space | What to put there |
|---|---|
| /store/EQUIPES | raw data & final processed data (for safe keeping); all data & protocols that should be accessible by the team and that cannot be easily re-generated |
| /store/USERS | personal but professional data (e.g. course material for PhD students or MDC, work contracts, etc.) |
| /home | login & config files (e.g. .bashrc, .condarc, etc.); installed programmes (i.e. in .local, mamba/conda, etc.) |
| /data/work/I2BC | temporary data that needs to be accessible from all nodes (e.g. a copy of a small database fetched from the internet, non-dividable files, etc.); data that is copied from somewhere else (e.g. for sharing with other teams) |
| /scratchlocal | temporary data that doesn’t have to be accessible from all nodes (e.g. intermediate files generated by a tool) |
/!\ for temporary spaces, remember to delete your data when you’ve finished
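One way to keep on top of this is to list old files before removing them; a minimal sketch (the path and the 30-day threshold are illustrative):

```shell
# path to your temporary space (illustrative; adapt to your own folder)
tmp_space="${tmp_space:-/scratchlocal/$USER}"
# list regular files older than 30 days (add -delete only once you are sure)
find "$tmp_space" -type f -mtime +30 2>/dev/null || true
```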
Coming soon… a web site (“VisuCeph”) to follow the overall consumption of your spaces, plus dedicated command lines to quickly check your quota & consumption (depends on CEPH).
In the meantime, you can use traditional Linux commands such as:
df -Th
du -ch --max-depth=1 <folder name>
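For example, to spot which subfolder is filling up a space (the folder is illustrative):

```shell
# show all mounted filesystems with their type, size and usage
df -Th
# summarise the size of each first-level subfolder, plus a grand total
du -ch --max-depth=1 "$HOME"
```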
Introduction
Storage system
⇨ Computational resources ⇦
Jobs
Software
A few tips
different partitions = different properties
- common: access to all “shared” or “common” nodes (default)
- runXX (XX=2, 4, 8, 16, 32): when submitting several jobs, automatic management to limit to XX jobs running at a time
- <group-specific>: priority (or exclusive usage) for group members who financed these nodes
- smallgpu: access to nodes with small-memory GPUs
- lowprio & lowpriogpu: access to more nodes, but jobs could be suspended for a while or stopped/restarted, respectively

Introduction
Storage system
Computational resources
⇨ Jobs ⇦
Software
A few tips
=> On a cluster, you run jobs through the scheduler
| What? | Slurm |
|---|---|
| submit a job | sbatch & srun |
| list all running or queuing jobs | squeue |
| delete a job | scancel |
sbatch & srun to reserve resources
=> you’re given a “job id”
sbatch [options] myscript.sh [script arguments]: “letter in postbox”
srun [options] --pty {bash,python,...}: interactive session
Common options:
- --cpus-per-task X: how many CPUs
- --mem X[M,G,T]: how much memory (e.g. 2G = 2 GB)
- --partition X: which partition (e.g. common, lowprio)
- --time X: maximum allocated time (DD-HH:MM:SS)
- --job-name X: personalised job name
squeue to follow jobs
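The submission options can also be written inside the script itself as #SBATCH directives; a minimal sketch (the job name, partition and resource values are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=demo        # personalised job name (illustrative)
#SBATCH --partition=common     # default shared partition
#SBATCH --cpus-per-task=4      # how many CPUs
#SBATCH --mem=2G               # how much memory
#SBATCH --time=0-01:00:00      # maximum allocated time (DD-HH:MM:SS)

# Slurm exports the allocation as environment variables;
# default to 1 CPU when running the script outside a Slurm job
cpus=${SLURM_CPUS_PER_TASK:-1}
echo "Running on $(hostname) with ${cpus} CPU(s)"
```

Submit it with sbatch myscript.sh; options given on the command line override the #SBATCH directives.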
=> lists all jobs, resources required & used, and their status
| Status | Description |
|---|---|
| (PD) PENDING | Waiting for resources |
| (R) RUNNING | Job is running |
| (CG) COMPLETING | Job done and finalising |
| (CD) COMPLETED | Job completed successfully |
| (F) FAILED | Job stopped with a non-zero exit code |
scancel to cancel a job (if you changed your mind - jobs terminate on their own otherwise)
A set of scripts from slurm-tools is installed on the master node:
About jobs:
- jobqueue: get the list of jobs in queue or running
- jobhist: get the list of all jobs that have run or are running or queuing
- jobinfo <jobid>: get detailed information for a single job (that has run or is running)
About available resources:
- nodeinfo: list of all partitions and their capacities (number of nodes, cpus, memory, gpu model if any, compact node list and state)
- noderes: list of all nodes with accessory information, e.g. partitions you can use to access them, state and free resources
There’s also a more interactive view with tsqueue to visualise currently running or queueing jobs.
Some programmes are already installed on the nodes…
Use module to check for them and load them:
- module avail: to list all available software
- module load <module name>: to load a specific programme
/!\ module load will only work on the slave nodes!
Never run heavy programmes on the master node…
If your programme isn’t in the list: install it yourself in user space (e.g. with pip install --user or with conda/mamba)

❝ All terminals look the same ❞
The terminal prompt is your friend!
Need help with your computer, internet, some software, the cluster…?
Questions on specific bioinformatics tools & practices? Help on setting up an analysis pipeline? Etc.
Findable
Accessible
Interoperable
Reusable
For example:
At the national level:
At the “region” level, mesocentres are institutional resources made accessible to other institutions in the same area
For more information on these resources, see the dedicated wiki page on the Forge
Go to https://bioi2.i2bc.paris-saclay.fr/training/i2bc-cluster/exercises/ and work through the exercises!