Last updated: 2025-03-13
2025-01-31
Source: adapted from Emilie Drouineau’s presentation
Material under CC-BY-SA licence
⇨ Introduction ⇦
Storage system
Computational resources
Jobs
Software
A few tips
A computer cluster:
In summary, a computer cluster is:
Using a computer cluster is:
NB: you can use a cluster but you don’t have to; it depends on the task at hand.
# Step 1 - passerelle
ssh john.doe@passerelle.i2bc.paris-saclay.fr
# Step 2 - cluster (Slurm)
ssh john.doe@slurmlogin.calcul.i2bc.paris-saclay.fr
NB: going through the passerelle is only necessary if you are outside of the I2BC network
=> You’re now on the master node of the cluster!
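If you connect from outside the I2BC network regularly, it may be convenient to declare the jump in your `~/.ssh/config` (a minimal sketch, assuming OpenSSH; the `Host` aliases are arbitrary names):

```
# ~/.ssh/config
Host i2bc-passerelle
    HostName passerelle.i2bc.paris-saclay.fr
    User john.doe

Host i2bc-cluster
    HostName slurmlogin.calcul.i2bc.paris-saclay.fr
    User john.doe
    ProxyJump i2bc-passerelle
```

With this in place, `ssh i2bc-cluster` goes through the passerelle automatically.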
Introduction
⇨ Storage system ⇦
Computational resources
Jobs
Software
A few tips
All nodes have access to the same file systems (with a few exceptions)
From the cluster, you have access to:

For mid-term & safe storage:
- /store/EQUIPES/<team acronym>/ : team space
- /store/USERS/<your login>/ : personal space
- /store/plateformes/<project name>/ : project space (on demand)

=> all 3 are regularly backed up (i.e. ls .snapshots/ ; see intranet under the “restauration de fichiers” tab)
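For instance, to recover a deleted file from the backups, you can browse the hidden snapshot folder (a sketch, assuming .snapshots/ sits at the root of each backed-up space; the snapshot and file names are placeholders):

```bash
# list the snapshots available for your team space
ls /store/EQUIPES/<team acronym>/.snapshots/
# copy a file back from one of them into the live space
cp /store/EQUIPES/<team acronym>/.snapshots/<snapshot>/my_file.txt /store/EQUIPES/<team acronym>/
```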
From the cluster, you also have access to:

Not for storage; specific to the cluster:
- /home/<your login>/ : your “home” (config files, .local folder, etc.)
- /data/work/I2BC/ : collective temporary space
- /scratchlocal/ : temporary spaces which are specific to each node
Space | What goes there |
---|---|
➜ /store/EQUIPES | raw data & final processed data (for safe keeping); all data & protocols that should be accessible by the team and that cannot be easily re-generated |
➜ /store/USERS | personal but professional data (e.g. course material for PhD students or MDC, work contracts, etc.) |
➜ /home | login & config files (e.g. .bashrc, .condarc, etc.); installed programmes (i.e. in .local, mamba/conda, etc.) |
➜ /data/work/I2BC | temporary data that needs to be accessible from all nodes (e.g. a copy of a small database fetched from the internet, non-dividable files, etc.); data that is copied from somewhere else (e.g. for sharing with other teams) |
➜ /scratchlocal | temporary data that doesn’t have to be accessible from all nodes (e.g. intermediate files generated by a tool) |
/!\ for temporary spaces, remember to delete your data when you’ve finished
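As an illustration, a job can stage its intermediate files on the node-local scratch space and clean up afterwards (a sketch only; the folder layout and file names are placeholders):

```bash
# work in a dedicated folder on the node-local scratch space
WORKDIR=/scratchlocal/$USER/my_analysis
mkdir -p "$WORKDIR" && cd "$WORKDIR"

# ... run your tool here, writing intermediate files into $WORKDIR ...

# keep only the final results, on a backed-up space
cp results.txt /store/EQUIPES/<team acronym>/
# and delete your temporary data when you've finished
rm -rf "$WORKDIR"
```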
Coming soon… with the migration from Compellent to Ceph storage in 2025.
In the meantime, you can use traditional Linux commands such as:
- df -Th : free and used space on each file system (with its type)
- du -ch --max-depth=1 <folder name> : size of each sub-folder, with a grand total
Introduction
Storage system
⇨ Computational resources ⇦
Jobs
Software
A few tips
different partitions = different properties
- common : access to all “shared” or “common” nodes (default)
- runXX (XX = 2, 4, 8, 16, 32) : when submitting several jobs, automatically limits you to XX jobs running at a time
- <group-specific> : priority (or exclusive usage) for the group members who financed these nodes
- smallgpu : access to nodes with small-memory GPUs
- lowprio & lowpriogpu : access to more nodes, but jobs may be suspended for a while or stopped/restarted, respectively

Introduction
Storage system
Computational resources
⇨ Jobs ⇦
Software
A few tips
=> On a cluster, you run jobs through the scheduler
What? | SLURM |
---|---|
submit a job | sbatch & srun |
list all running or queuing jobs | squeue |
delete a job | scancel |
sbatch & srun to reserve resources
=> you’re given a “job id”

- sbatch [options] myscript.sh [script arguments] : “letter in postbox”
- srun [options] --pty {bash,python,...} : interactive session

Common options:
- --cpus-per-task X : how many CPUs
- --mem X[M,G,T] : how much memory (e.g. 2G = 2 GB)
- --partition X : which partition (e.g. common, lowprio)
- --time X : maximum allocated time (DD-HH:MM:SS)
- --job-name X : personalised job name
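Putting these options together, a minimal submission script could look like the sketch below (partition, resource values and the workload are placeholders to adapt):

```bash
#!/bin/bash
#SBATCH --job-name=my_analysis      # personalised job name
#SBATCH --partition=common          # which partition (common is the default)
#SBATCH --cpus-per-task=4           # how many CPUs
#SBATCH --mem=8G                    # how much memory
#SBATCH --time=0-02:00:00           # maximum allocated time (DD-HH:MM:SS)

# placeholder workload: replace with your actual commands
echo "Running on $(hostname) with $SLURM_CPUS_PER_TASK CPUs"
```

You would then submit it with `sbatch myscript.sh`, or open an interactive session instead with, for example, `srun --cpus-per-task=2 --mem=4G --pty bash`.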
squeue to follow jobs
=> lists all jobs, resources required & used, and their status
Status | Description |
---|---|
(PD) PENDING | Waiting for resources |
(R) RUNNING | Job is running |
(CG) COMPLETING | Job done and finalising |
(CD) COMPLETED | Job completed successfully |
(F) FAILED | Job stopped with non-zero error code |
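For day-to-day monitoring, a couple of typical invocations (the job id is a placeholder):

```bash
squeue                # all jobs currently running or queuing on the cluster
squeue -u $USER       # only your own jobs
squeue -j 123456      # one specific job (123456 is a placeholder job id)
```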
scancel to cancel a job (if you changed your mind; jobs terminate on their own otherwise)
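For example (the job id and job name below are placeholders):

```bash
scancel 123456               # cancel one job by its job id
scancel -u $USER             # cancel all of your own jobs
scancel --name my_analysis   # cancel jobs by their job name
```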
A set of scripts from slurm-tools is installed on the master node:

About jobs:
- jobqueue : get the list of jobs in queue or running
- jobhist : get the list of all jobs that have run or are running or queuing
- jobinfo <jobid> : get detailed information on a single job (that has run or is running)

About available resources:
- nodeinfo : list of all partitions and their capacities (number of nodes, CPUs, memory, GPU model if any, compact node list and state)
- noderes : list of all nodes with accessory information, e.g. the partitions you can use to access them, their state and free resources

There’s also a more interactive view with tsqueue to visualise currently running or queueing jobs.
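One possible way to combine these helpers (the job id is a placeholder; output formats depend on the local installation):

```bash
nodeinfo           # which partitions exist and what resources they offer
noderes            # which nodes are free right now, and through which partitions
jobinfo 123456     # detailed report on one of your jobs (123456 is a placeholder)
```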
Some programmes are already installed on the nodes…
Use module to check for them and load them:
- module avail : to list all available software
- module load <module name> : to load a specific programme

/!\ module load will only work on the slave nodes!
Never run heavy programmes on the master node…
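For instance, you could open an interactive session on a compute node and load a tool there (the module name below is only an example; check module avail for the actual list):

```bash
srun --partition=common --cpus-per-task=1 --mem=2G --pty bash
module avail             # list the software available on the node
module load samtools     # example module name; pick one from the list above
samtools --version       # the tool is now on your PATH
```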
If your programme isn’t in the list: you can install it yourself (e.g. with pip install --user, or with conda/mamba)
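A sketch of such user-level installs (the package and environment names are only examples):

```bash
pip install --user snakemake                 # installs into ~/.local
conda create -n myenv -c bioconda samtools   # or: mamba create -n myenv ...
conda activate myenv
```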
❝ All terminals look the same ❞
The terminal prompt is your friend!
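A small illustration (the hostnames shown are made up): the hostname in the prompt tells you whether you are on the master node or on a compute node:

```bash
# On the master node, the prompt might look like:
#   john.doe@slurmlogin:~$
# After starting an interactive job (srun --pty bash), it shows the compute node instead:
#   john.doe@node06:~$        (illustrative hostname)
hostname    # prints the name of the machine you are currently on
```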
Need help with your computer, internet, some software, the cluster…?
Questions on specific bioinformatics tools & practices? Help on setting up an analysis pipeline? Etc.
FAIR: Findable, Accessible, Interoperable, Reusable

For example: