Introduction to the I2BC cluster

Last updated: 2025-03-13

Cluster training session

2025-01-31

Past & present contributors:

  • Jérôme Leconte, I2BC/CNRS
  • Emilie Drouineau, I2BC/CEA
  • Fadwa El-Khaddar, I2BC/Univ. Paris-Saclay
  • Chloe Quignot, I2BC/CEA
  • Baptiste Roelens, I2BC/CEA
  • Claire Toffano-Nioche - I2BC/CNRS


Source: adapted from Emilie Drouineau’s presentation

Material under CC-BY-SA licence

Programme

  • ⇨ Introduction ⇦

    • What is a cluster?
    • Why use a cluster?
    • How to connect?
  • Storage system

  • Computational resources

  • Jobs

  • Software

  • A few tips

What is a cluster?

A computer cluster:

  • is a set of computer servers (=nodes) connected together
  • communication between nodes goes through a centralised scheduler
  • there are master node(s) (=main nodes) and slave nodes (=workers)
  • all nodes have access to the storage system

In summary, a computer cluster is:

Cluster diagram
Cluster in real life

Why use a cluster?

Using a computer cluster is:

  • more powerful
  • faster (allows parallelisation of tasks over several nodes)
  • more efficient with heavy data
  • frees up your personal computer so you can use it for other stuff ;-)

NB: you can use a cluster but you don’t have to. It will depend on the tasks.

How to connect?

  • Tool: terminal/Windows PowerShell or other third-party tools
  • Command:
# Step 1 - passerelle
ssh john.doe@passerelle.i2bc.paris-saclay.fr

# Step 2 - cluster (Slurm)
ssh john.doe@slurmlogin.calcul.i2bc.paris-saclay.fr

NB: going through the passerelle (gateway) is only necessary if you are outside of the I2BC network.
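As a convenience (not part of the official instructions above), OpenSSH users working from outside the network can chain the two hops with a ProxyJump entry in ~/.ssh/config; the alias i2bc-cluster below is arbitrary:

# ~/.ssh/config
Host i2bc-cluster
    HostName slurmlogin.calcul.i2bc.paris-saclay.fr
    User john.doe
    ProxyJump john.doe@passerelle.i2bc.paris-saclay.fr

A single ssh i2bc-cluster then performs both steps in one go.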

How to connect?

=> You’re now on the master node of the cluster!

SLURM welcome message

What did we just do?

Programme

  • Introduction

  • ⇨ Storage system ⇦

    • “Partages”
    • “Computing spaces”
    • Quotas
  • Computational resources

  • Jobs

  • Software

  • A few tips

Storage system

All nodes have access to the same file systems (with a few exceptions)

Storage accessibility on the cluster

Storage system

From the cluster, you have access to:

➤ “Partages” (aka “store”)

For mid-term & safe storage

  • /store/EQUIPES/<team acronym>/: team space
  • /store/USERS/<your login>/: personal space
  • /store/plateformes/<project name>/: project space (on demand)

=> all 3 are regularly backed up (snapshots are available via ls .snapshots/; see the intranet under the “restauration de fichiers” tab)
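For example, to recover a file from a backup of your team space (snapshot names depend on the backup schedule, so the path below is only a sketch):

# list the available snapshots
ls /store/EQUIPES/<team acronym>/.snapshots/

# copy a lost file back from one of them
cp /store/EQUIPES/<team acronym>/.snapshots/<snapshot name>/lost_file.txt .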

Storage system

From the cluster, you have access to:

➤ 3 “computing spaces”

Not for storage; specific to the cluster

  • /home/<your login>/ => your “home” (config)
  • /data/work/I2BC/ => collective temporary space
    • it’s collective (read/write rights for everyone)
    • for intermediate/temporary files
    • max 20 TB total
    • no automatic cleaning so it’s up to you!
  • /scratchlocal => node-specific temporary space
    • only you can read/write your stuff
    • for intermediate/temporary files (ideal when there are lots of I/O operations)
    • variable quota (depends on the node)
    • no automatic cleaning so it’s up to you!
  • Good habit: work within a folder named after your login (see the example below)
Storage system

In summary

  • all these spaces are accessible from all nodes (including the master node)
  • except for the /scratchlocal/ spaces which are specific to each node
Storage accessibility on the cluster

/!\ when using temporary spaces, please:

  • work within a folder containing your login
  • clean up after yourselves!

Storage system

Where to save what?

  • /store/EQUIPES: raw data & final processed data (for safe keeping); all data & protocols that should be accessible by the team and that cannot be easily re-generated
  • /store/USERS: personal but professional data (e.g. course material for PhD students or MDC, work contracts, etc.)
  • /home: login & config files (e.g. .bashrc, .condarc, etc.); installed programmes (i.e. in .local, mamba/conda, etc.)
  • /data/work/I2BC: temporary data that needs to be accessible from all nodes (e.g. a copy of a small database fetched from the internet, non-dividable files, etc.); data that is copied from somewhere else (e.g. for sharing with other teams)
  • /scratchlocal: temporary data that doesn’t have to be accessible from all nodes (e.g. intermediate files generated by a tool)

/!\ for temporary spaces, remember to delete your data when you’ve finished

Storage system

Quotas

Coming soon… with the migration from Compellent to CEPH storage in 2025

In the meantime, you can use traditional Linux commands such as:

  • df -Th
  • du -ch --max-depth=1 <folder name>

e.g.

$ du -ch --max-depth=1  /store/EQUIPES/BIOI2/
92G     /store/EQUIPES/BIOI2/MEMBERS
844M    /store/EQUIPES/BIOI2/ADMIN
3,3G    /store/EQUIPES/BIOI2/FORMATIONS
97G     total

Programme

  • Introduction

  • Storage system

  • ⇨ Computational resources ⇦

    • nodes
    • partitions
  • Jobs

  • Software

  • A few tips

Computational resources

Nodes

  • slave nodes are identical in terms of OS & software
  • BUT are different in terms of number of CPUs, memory, GPUs etc.
  • nodes belong to 1 or several partitions

NB: the partition names shown in the diagram are invented for illustrative purposes

Computational resources

Partitions on the I2BC cluster

different partitions = different properties (a quick way to list them is shown below)

  • common: access to all “shared” or “common” nodes (default)
  • runXX (XX = 2, 4, 8, 16, 32): when you submit several jobs, the scheduler automatically limits you to XX jobs running at a time
  • <group-specific>: priority (or exclusive usage) for the group members who financed these nodes
  • smallgpu: access to nodes with small-memory GPUs
  • lowprio & lowpriogpu: access to more nodes, but jobs may be suspended for a while (lowprio) or stopped and restarted (lowpriogpu)
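A quick way to list the partitions you can actually submit to is the native SLURM command sinfo (the exact column layout depends on the site configuration):

sinfo       # one line per partition and node state, with time limits and node lists
sinfo -s    # condensed summary, one line per partition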

Programme

  • Introduction

  • Storage system

  • Computational resources

  • ⇨ Jobs ⇦

    • SLURM scheduler
    • key commands
    • third-party commands
  • Software

  • A few tips

Jobs

Job = any task you want to run

Scheduler = resource manager = SLURM

  • manages resources
  • allocates jobs to the right nodes
  • monitors job status
  • reports

=> On a cluster, you run jobs through the scheduler


Key commands (native to the scheduler)

What?                                SLURM command
submit a job                         sbatch & srun
list all running or queuing jobs     squeue
delete a job                         scancel

Jobs

Key commands (native to the scheduler)

sbatch & srun to reserve resources => you’re given a “job id”

  • sbatch [options] myscript.sh [script arguments]: “letter in postbox”
  • srun [options] --pty {bash,python,...}: interactive session

Common options (an example submission follows this list):

  • --cpus-per-task X: how many CPUs
  • --mem X[M,G,T]: how much memory (e.g. 2G = 2 GB)
  • --partition X: which partition (e.g. common, lowprio)
  • --time X: maximum allocated time (DD-HH:MM:SS)
  • --job-name X: personalised job name
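For example, a submission script bundling these options could look like the sketch below (the resource values, the module and the tool my_command are placeholders, not an official I2BC recipe):

#!/bin/bash
#SBATCH --job-name=my_analysis
#SBATCH --partition=common
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=0-02:00:00

# load a tool installed on the nodes (hypothetical module name)
module load my_tool

# run it, reusing the number of CPUs requested above
my_command --threads "${SLURM_CPUS_PER_TASK}" input_file output_file

You would then submit it with sbatch my_analysis.sh, or open an interactive session instead with srun --cpus-per-task 2 --mem 4G --pty bash.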

Jobs

Key commands (native to the scheduler)

squeue to follow jobs

=> lists all jobs, resources required & used, and their status

Status            Description
(PD) PENDING      Waiting for resources
(R) RUNNING       Job is running
(CG) COMPLETING   Job done and finalising
(CD) COMPLETED    Job completed successfully
(F) FAILED        Job stopped with a non-zero exit code
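In practice you will usually restrict the output to your own jobs, for example:

squeue -u $USER              # only your jobs
squeue -u $USER -t RUNNING   # only your jobs that are currently running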

Jobs

Key commands (native to the scheduler)

scancel to cancel a job (if you changed your mind - jobs terminate on their own otherwise)
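For example (the job id comes from sbatch/srun or from squeue; 123456 is made up):

scancel 123456      # cancel one specific job
scancel -u $USER    # cancel all of your jobs at once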

Jobs

Third-party commands

A set of scripts from slurm-tools is installed on the master node:

About jobs:

  • jobqueue: get list of jobs in queue or running
  • jobhist: get list of all jobs that have run or are running or queuing
  • jobinfo <jobid>: get detailed information for a single job (that has run or is running)

About available resources:

  • nodeinfo: list of all partitions and their capacities (number of nodes, cpus, memory, gpu model if any, compact node list and state)
  • noderes: list of all nodes with accessory information e.g. partitions you can use to access them, state and free resources

There’s also a more interactive view with tsqueue to visualise currently running or queueing jobs.
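Typical calls might look like this (the job id is made up, and the exact output layout depends on the installed slurm-tools version):

nodeinfo          # overview of partitions, their capacities and node states
jobinfo 123456    # detailed report for a single (hypothetical) job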

Programme

  • Introduction
  • Storage system
  • Computational resources
  • Jobs
  • ⇨ Software ⇦
  • A few tips

Software

Some programmes are already installed on the nodes…

  • Use module to check for them and load them:

    • module avail: to list all available software
    • module load <module name>: to load a specific programme

/!\ module load will only work on the slave nodes!
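For example, inside a job script or an srun session on a worker node (the module name samtools is only an illustration; check module avail for the real list):

module avail            # list all available software
module load samtools    # hypothetical module name
samtools --version      # the tool is now on your PATH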

Never run heavy programmes on the master node…

  • If your programme isn’t in the list:

    1. (preferred option) request a general installation from the cluster administrators
    2. install it yourself in your home (e.g. pip install --user or with conda/mamba)

Programme

  • Introduction
  • Storage system
  • Computational resources
  • Jobs
  • Software
  • ⇨ A few tips ⇦

A few tips

Where am I??

All terminals look the same

The terminal prompt is your friend!
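For instance, the hostname after the @ in the prompt tells you where you are (the prompts below are illustrative; the exact format depends on your configuration):

john.doe@slurmlogin:~$    # on the master node of the cluster
john.doe@node06:~$        # on a worker node, e.g. inside an srun session (node name is made up)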

A few tips

Who to contact when you have problems?

  1. Need help with your computer, internet, some software, the cluster…?

  2. Questions on specific bioinformatics tools & practices? Help on setting up an analysis pipeline? Etc.

  3. Come to our next “Q&A in bioinformatics” session (look out for our emails) ;-)

A few tips

Stay FAIR!

  • Findable
  • Accessible
  • Interoperable
  • Reusable

For example:

  • organise your data into projects
  • describe folder contents in a “README” file saved within this folder
  • give meaningful names to your files & folders
  • keep raw data, processed data and scripts in separate (sub-)folders
  • keep track of software versions & parameters that you used
  • version your code with git (+ Forge I2BC or GitLab or GitHub…)
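For example, a minimal project skeleton following these tips (folder names are only a suggestion):

# create the project structure and start versioning the code
mkdir -p my_project/raw_data my_project/processed_data my_project/scripts
echo "Project description, contributors, dates, software versions" > my_project/README.md
cd my_project && git init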