Last updated: 2025-03-13
This exercise is identical to the previous 2 case studies (Exercises
2A & 2B) but with a different input and programme. It also has a few
extra steps at the end to introduce you to for loops and job arrays.
⁕ ⁕ ⁕ ⁕ ⁕ ⁕
We just ran AlphaFold to predict the structure of a protein (Protein transport protein SEC39) from its sequence. We downloaded several models and would like to see how different they are from the experimental structure. In the following example, we will try to run the TM-align programme to structurally align our protein structure models onto the experimental one and calculate the TM-score similarity measure between them (more about the TM-score). It’s a small programme that doesn’t require a lot of resources and it’s already installed on the I2BC cluster. In our example, we will first be aligning only two structures on each other, then we’ll have a look at solutions to align several in one go.
⁕ ⁕ ⁕ ⁕ ⁕ ⁕
In this case study, you will see the complete step-by-step process of using a given piece of software on the I2BC cluster: looking for it among the installed software, understanding how to use it, running it within a Slurm job and optimising the resources to reserve.
⁕ ⁕ ⁕ ⁕ ⁕ ⁕
It’s the same as for Exercise 0: you should be connected to the cluster and on
the master node (i.e. slurmlogin should be written in your terminal prefix).
It’s the same as for the previous case studies, Exercises 2A & 2B. You can skip this step if you’ve already done it.
If not:
You will need the example files available in Zenodo under this link, or on
the Forge Logicielle under this link for those who are familiar with git.
We’ll work in your home directory. Let’s move to it and fetch our
working files using wget:
john.doe@slurmlogin:~$ cd /home/john.doe
john.doe@slurmlogin:/home/john.doe$ wget "https://zenodo.org/records/15017630/files/cluster_usage_examples.tar.gz?download=1" -O cluster_usage_examples.tar.gz
john.doe@slurmlogin:/home/john.doe$ tar -zxf cluster_usage_examples.tar.gz
john.doe@slurmlogin:/home/john.doe$ ls cluster_usage_examples/
example_fastqc example_mafft example_tmalign
In the example_tmalign folder, you’ll see a set of protein structure files
in pdb format, containing the 3D coordinates of each atom in the protein.
8FTU.pdb corresponds to the coordinates of the experimental structure; the
other files correspond to AlphaFold’s predictions ranked from 001 to 005.
john.doe@slurmlogin:/home/john.doe$ ls cluster_usage_examples/example_tmalign/
8FTU.pdb SEC39_unrelaxed_rank_003_alphafold2_ptm_model_3_seed_96009.pdb
SEC39_unrelaxed_rank_001_alphafold2_ptm_model_4_seed_96009.pdb SEC39_unrelaxed_rank_004_alphafold2_ptm_model_2_seed_96009.pdb
SEC39_unrelaxed_rank_002_alphafold2_ptm_model_5_seed_96009.pdb SEC39_unrelaxed_rank_005_alphafold2_ptm_model_1_seed_96009.pdb
⁕ ⁕ ⁕ ⁕ ⁕ ⁕
The TM-align programme executable is called TMalign. Try to find it
using the module command.
The module command can be used from anywhere on the cluster. The main sub-commands are:
- module avail: to list all available software
- module load/unload <software name>: to load (or unload) specific software (for use)
- module list: to list currently loaded software
To get more details on options for these subcommands (e.g. to search
for a specific name), you can use the -h option to get the help page.
john.doe@slurmlogin:/home/john.doe$ module avail -C tm -i
-------------------------------------------------------- /usr/share/modules/modulefiles ---------------------------------------------------------
nodes/r-cran-statmod-1.4.35 TMalign/TMalign
In the above command, we used the -C option to specify a pattern to
search for (“tm”) and -i to make the search case-insensitive.
So all we have to do is use module load TMalign/TMalign in order to load TMalign.
TMalign executable
Let’s investigate how to use the TMalign executable: How do we specify
the inputs? What options or parameters can we use? What would the final
command line look like to run TMalign on your input?
Hints:
- for most programmes, including TMalign, you can often get a help message with usage examples by using the --help or -h option
- you will need to connect to a node first (srun --pty bash)
Programmes loaded with module aren’t available on the master node
(slurmlogin), so let’s first connect to a node and then load TMalign
with module:
john.doe@slurmlogin:/home/john.doe$ srun --pty bash
john.doe@node01:/home/john.doe$ module load TMalign/TMalign
To learn how to use a programme, you can usually try “man your_programme”
or “your_programme -h” or “your_programme --help”. Let’s see if we can
access the help menu for TMalign:
john.doe@node01:/home/john.doe$ TMalign -h
Brief instruction for running TM-align program:
(For detail: Zhang & Skolnick, Nucl. Acid. Res. 33: 2302-9, 2005)
1. Align 'chain_1.pdb' and 'chain_2.pdb':
>TMalign chain_1.pdb chain_2.pdb
[...]
According to the usage example in the help message, the basic usage of
TMalign in our case would look like this (the executable first, then both
input pdb files, then the output file after the redirection sign):
TMalign cluster_usage_examples/example_tmalign/8FTU.pdb cluster_usage_examples/example_tmalign/SEC39_AF_model_rank_001.pdb > cluster_usage_examples/example_tmalign/tmalign_exp_vs_rank1.txt
Note: by default, TMalign will just print the alignment and score
information to the screen (no option to specify the output file). In order
to capture this printed output into a file, we use the redirection sign
“>” followed by the name of the file we want to redirect the printed
output to.
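As a minimal sketch of how the redirection behaves, the snippet below uses printf as a stand-in for TMalign (an assumption made so that it can be run anywhere, including off the cluster) and a hypothetical file name, demo_output.txt. It shows that “>” sends what would have been printed on the screen into a file, replacing any previous content of that file.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing in your home directory is touched.
demo_dir=$(mktemp -d)

# printf stands in for TMalign here: anything it prints to the screen
# (standard output) is written to the file instead.
printf 'first run\n' > "$demo_dir/demo_output.txt"

# Running again with ">" replaces the previous content of the file.
printf 'second run\n' > "$demo_dir/demo_output.txt"

redirected=$(cat "$demo_dir/demo_output.txt")
echo "$redirected"

rm -rf "$demo_dir"
```

Only the result of the second command is left in the file, which is why each TMalign run needs its own output file name when using “>”.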
We now know how to load TMalign and how to execute it. We no longer need
to be connected to a node and can now free the resources that we’ve
blocked by disconnecting from it:
john.doe@node01:/home/john.doe$ exit 0
john.doe@slurmlogin:/home/john.doe$
Of note: as you can see, the terminal prompt changed again from node01 back to slurmlogin: we’ve returned to the master node (slurmlogin) of the cluster and the job we were running has terminated.
Let’s move to your example_tmalign subdirectory first and write the
slurm_script.sh in there.
john.doe@slurmlogin:/home/john.doe$ cd cluster_usage_examples/example_tmalign/
To write the script, we’ll use the nano text editor (but there are other
possibilities such as vi, vim or emacs for example). To use nano:
- nano slurm_script.sh will create and open the slurm_script.sh file.
- The shortcuts are listed at the bottom of the editor (^ = Ctrl, Ctrl+G to see the help message, Ctrl+X to exit), see nano cheat sheet.
- The script should have #SBATCH-prefixed lines at the head to specify slurm options for submission, see Slurm cheat sheet.
The Slurm submission script is written like a common bash script
(same language as the terminal). You write in this script all the
commands (1 per line) that you would usually type in your terminal. The
only particularity is the Slurm submission options, which you can add
directly to this script, commonly at the beginning (each parameter
should be preceded by #SBATCH):
#! /bin/bash
#SBATCH --job-name="my_jobname"
#SBATCH --partition=common
#SBATCH --cpus-per-task=1
### prefix start - create temporary directory for your job
export TMPDIR=$(mktemp -d)
### prefix end
module load TMalign/TMalign
# This is a comment line - it will be ignored when the script is executed
# Comment lines start with a "#" symbol and can be put anywhere you like
# You can also add a comment at the end of a line, as shown below
cd /home/john.doe/cluster_usage_examples/example_tmalign/ #|
#| These are your shell commands
TMalign 8FTU.pdb SEC39_AF_model_rank_001.pdb > tmalign_exp_vs_rank1.txt #|
### suffix start - delete created temporary directory
rm -rf $TMPDIR
### suffix end
Explanation of the content:
- #! /bin/bash: this is the “shebang”, it specifies the “language” of your script (in this case, the cluster understands that the syntax of this text file is “bash” and will execute it with the /bin/bash executable).
- #SBATCH: all lines starting with #SBATCH indicate to the Slurm job scheduler on the cluster that the following information is related to the job submission. This is where you specify the slurm options such as your job name with --job-name or the partition you want to submit your job to with -p (long form: --partition). There are many more options you can specify, see the “cheat sheet” tab on the intranet.
- module load will load the software you need (i.e. TMalign in this case).
- cd /path/to/your/folder: although Slurm places you in the same directory in which you’ve submitted the slurm script, it’s a good habit to deliberately move to your working directory within the submission script to avoid any nasty surprises… By moving to the directory which contains your input, you won’t need to specify the full path to the input, as you can see in the line of code that follows this statement.
- The prefix and suffix lines deal with the temporary folder $TMPDIR. This folder is sometimes overridden and dealt with automatically by the scheduler (automatic clean up) but it’s a good habit to do this yourself if you can (to avoid saturating the temporary disk when it’s not dealt with automatically). Here, we use the mktemp command to create a temporary folder with a random name at the beginning of the script and we use rm to delete it at the end of the script.
When you exit the nano text editor, you should see the file created in your current directory:
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ ls
8FTU.pdb SEC39_AF_model_rank_003.pdb
SEC39_AF_model_rank_001.pdb SEC39_AF_model_rank_004.pdb
SEC39_AF_model_rank_002.pdb SEC39_AF_model_rank_005.pdb
slurm_script.sh
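Coming back to the prefix and suffix lines of the script: the final rm is only reached if the script runs to its last line. A possible variation, sketched here and not part of the original example, is to register the clean-up with the bash builtin trap so that it also runs when the script stops early. The subshell, the record file and the file name scratch.txt are only there to make the demonstration self-contained.

```shell
#!/bin/bash
# A small file records the directory name so we can check on it afterwards.
record=$(mktemp)

(
    # Same prefix as in the job script: create a private temporary directory.
    export TMPDIR=$(mktemp -d)
    # Ask bash to delete it when this (sub)shell exits, however it exits.
    trap 'rm -rf "$TMPDIR"' EXIT
    echo "$TMPDIR" > "$record"
    touch "$TMPDIR/scratch.txt"      # hypothetical temporary file
    exit 1                           # simulate a command failing early
) || true

used_dir=$(cat "$record")
rm -f "$record"
if [ -d "$used_dir" ]; then
    cleanup_status="left behind"
else
    cleanup_status="removed"
fi
echo "temporary directory: $cleanup_status"
```

In a real submission script, the trap line would simply follow the export TMPDIR line, and the suffix block would no longer be needed.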
To submit a Slurm submission script, all you have to do is:
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ sbatch slurm_script.sh
Submitted batch job 287170
This will print your attributed job id on the screen (287170 in this case).
You can follow the progression of your job with squeue:
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ squeue -j 287170
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
287170 common my_jobname john.doe R 00:00:41 1 node06
(you can omit the -j option to show all the currently running jobs).
You can learn more about the options for squeue in the manual (type
man squeue, navigate with the up/down arrow keys and exit by typing q).
Hints:
- Check the slurm-xxx.out file (replace xxx with your job id) to see if there were any problems
- What files do we expect to see?
There should be 2 new files in total: the TMalign output file and the Slurm log file.
Your job shouldn’t take too long to finish, then you should be able to see the output files in your folder:
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ ls
8FTU.pdb SEC39_AF_model_rank_003.pdb
SEC39_AF_model_rank_001.pdb SEC39_AF_model_rank_004.pdb
SEC39_AF_model_rank_002.pdb SEC39_AF_model_rank_005.pdb
slurm-287170.out slurm_script.sh
tmalign_exp_vs_rank1.txt
The 2 new files are:
- tmalign_exp_vs_rank1.txt: the TMalign output that we redirected to a file
- slurm-287170.out: the Slurm log file
You can also have a look at your output, in which we see that the model of rank #1 has a TM-score of 0.84 with the reference:
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ cat tmalign_exp_vs_rank1.txt
**************************************************************************
* TM-align (Version 20190822) *
* An algorithm for protein structure alignment and comparison *
* Based on statistics: *
* 0.0 < TM-score < 0.30, random structural similarity *
* 0.5 < TM-score < 1.00, in about the same fold *
* Reference: Y Zhang and J Skolnick, Nucl Acids Res 33, 2302-9 (2005) *
* Please email your comments and suggestions to: zhng@umich.edu *
**************************************************************************
Name of Chain_1: 8FTU.pdb
Name of Chain_2: SEC39_AF_model_rank_001.pdb
Length of Chain_1: 627 residues
Length of Chain_2: 672 residues
Aligned length= 626, RMSD= 4.10, Seq_ID=n_identical/n_aligned= 0.954
TM-score= 0.83692 (if normalized by length of Chain_1)
TM-score= 0.78720 (if normalized by length of Chain_2)
(You should use TM-score normalized by length of the reference protein)
[...]
Having issues? If you don’t have the output files, then there might be a problem in the execution somewhere. In that case, you can have a look at the log file from Slurm:
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ cat slurm-287170.out
Note that the log file is generated by default in the directory in
which you ran the sbatch command. There is an option in sbatch
with which you can change this behaviour.
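For instance, the log location can be set with the --output option of sbatch; in the file name pattern, %j is replaced by the job id. The file name tmalign_%j.log below is only an illustration, and the fragment shows just the header lines of a submission script.

```shell
#!/bin/bash
#SBATCH --job-name="my_jobname"
# %j is replaced by the job id, giving e.g. tmalign_287170.log
#SBATCH --output=tmalign_%j.log
```

If the file name includes a directory, that directory should already exist when the job starts.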
Typical error messages are for example:
- “-bash: tmalign: command not found”: the command wasn’t found, for example because you didn’t spell TMalign correctly (tmalign instead of TMalign)
- “At line 293 of file TMalign.f (unit = 10) Fortran runtime error: Cannot open file '8FTT.pdb': No such file or directory”: TMalign cannot find the input that you gave it. In this case, it’s linked to a typo in the name (8FTT.pdb instead of 8FTU.pdb) but it could also have been because TMalign didn’t find your input in your current working directory (double-check you’re in the right working directory and double-check your file paths).
Analyse your actual resource consumption: How much memory did you effectively use while running the job? How long did your job take to finish? How much CPU percentage did you use?
This is useful to know in order to adapt the resources you ask for in future jobs with similar proceedings (e.g. other TMalign submissions).
Hints:
- Use the jobinfo command from the slurm-tools toolkit (warning: not a native command of Slurm) for this
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ jobinfo 287170
Job ID : 287170
Job name : my_jobname
User : john.doe
Account :
Working directory : /home/john.doe/cluster_usage_examples/example_tmalign
Cluster : cluster
Partition : common
Nodes : 1
Nodelist : node06
Tasks : 1
CPUs : 1
GPUs : 0
State : COMPLETED
Exit code : 0:0
Submit time : 2025-03-11T11:03:43
Start time : 2025-03-11T11:03:44
End time : 2025-03-11T11:03:51
Wait time : 00:00:01
Reserved walltime : 02:00:00
Used walltime : 00:00:07
Used CPU walltime : 00:00:07
Used CPU time : 00:00:04
CPU efficiency : 66.00%
% User (computation) : 94.89%
% System (I/O) : 5.11%
Reserved memory : 1000M/core
Max memory used : 2.85M (estimate)
Memory efficiency : 0.29%
Max disk write : 0.00
Max disk read : 0.00
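The lines you should look at:
- Reserved walltime vs Used walltime: 02:00:00 reserved, 00:00:07 used
- CPUs and CPU efficiency: 1 CPU reserved, 66.00% efficiency
- Reserved memory vs Max memory used: 1000M reserved, 2.85M used (0.29% memory efficiency)
Adjustments to make:
- reserve less memory with --mem (100M is already more than enough here)
- reserve less time with --time (10 minutes is already more than enough here)
Warning: it’s important to keep in mind that run times and resource usage also depend on the commands you run and on the size of your input.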
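If jobinfo is not available, similar figures can be obtained with Slurm’s own accounting command, sacct. The sketch below assumes that job accounting is enabled on the cluster; the selection of fields is one possible choice, and the guard only exists so that the snippet can be run harmlessly on a machine without Slurm.

```shell
#!/bin/bash
# Job id to inspect: first argument, defaulting to the job from the example.
job_id="${1:-287170}"

# Elapsed = wall time, TotalCPU = CPU time, MaxRSS = peak memory,
# ReqMem = reserved memory.
fields="JobID,JobName,Elapsed,TotalCPU,MaxRSS,ReqMem,State"

if command -v sacct >/dev/null 2>&1; then
    sacct -j "$job_id" --format="$fields"
else
    echo "sacct not found: run this on the cluster"
fi
```

Comparing Elapsed with the reserved time, and MaxRSS with ReqMem, gives the same kind of information as the walltime and memory lines of jobinfo.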
Now try optimising your script to reserve no more than the resources
you actually need to run TMalign. Your colleagues will be thankful ;-)
Hint: see Slurm cheat sheet for a list of all options for sbatch
Your script could look like this:
#! /bin/bash
#SBATCH --job-name="my_jobname"
#SBATCH --partition=common
#SBATCH --cpus-per-task=1
#SBATCH --mem=100M
#SBATCH --time=00:10:00
### prefix start - create temporary directory for your job
export TMPDIR=$(mktemp -d)
### prefix end
module load TMalign/TMalign
# This is a comment line - it will be ignored when the script is executed
cd /home/john.doe/cluster_usage_examples/example_tmalign/ #|
#| These are your shell commands
TMalign 8FTU.pdb SEC39_AF_model_rank_001.pdb > tmalign_exp_vs_rank1.txt #|
### suffix start - delete created temporary directory
rm -rf $TMPDIR
### suffix end
NB: only the #SBATCH lines were changed…
Can you see a way to adapt your job submission script to run TM-align on several pairs of structures without manually typing every single command line?
What we would like to do is to run TM-align on each model versus the experimental reference structure:
TMalign 8FTU.pdb SEC39_AF_model_rank_001.pdb > tmalign_exp_vs_rank1.txt
TMalign 8FTU.pdb SEC39_AF_model_rank_002.pdb > tmalign_exp_vs_rank2.txt
TMalign 8FTU.pdb SEC39_AF_model_rank_003.pdb > tmalign_exp_vs_rank3.txt
TMalign 8FTU.pdb SEC39_AF_model_rank_004.pdb > tmalign_exp_vs_rank4.txt
TMalign 8FTU.pdb SEC39_AF_model_rank_005.pdb > tmalign_exp_vs_rank5.txt
To submit 1 vs all, there are (at least) 2 solutions (other than running each command line individually):
- using a for loop within a single job
- using job arrays
If you are comfortable with programming, try implementing the for loop
on your own. Refer to Task 8 below if you need any tips. Job arrays will
be detailed in Task 9.
Here, we want to run one structure (8FTU) versus all others (the SEC39 models).
- There is a wildcard character “*” in bash that replaces any (set of) character(s), which we can use to easily get a list of all SEC39 models.
- Given a list $mylist, a for loop has the following syntax:
for myvariable in $mylist
do
# do some command(s) using the value "$myvariable" as input
done
in which $myvariable will successively take the values within $mylist.
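To see the wildcard and the loop working together without launching any alignment, here is a small sketch that creates empty stand-in files (placeholders that only mimic the names of the real structures) and prints the command that would be run for each of them.

```shell
#!/bin/bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Empty placeholder files standing in for the real structures.
touch 8FTU.pdb SEC39_AF_model_rank_001.pdb SEC39_AF_model_rank_002.pdb

matched=()
# The shell expands the pattern into the sorted list of matching file names.
for pdb in SEC39_AF_model_rank_*.pdb
do
    # Only print the command here; nothing is actually aligned.
    echo "TMalign 8FTU.pdb $pdb"
    matched+=("$pdb")
done

# Go back to where we started and remove the placeholders.
cd "$OLDPWD" || exit 1
rm -rf "$demo_dir"
```

Note that 8FTU.pdb is not picked up by the loop because it doesn’t match the pattern.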
Our adjusted job script slurm_script.sh could then look like this:
#! /bin/bash
#SBATCH --job-name="my_jobname"
#SBATCH --partition=common
#SBATCH --cpus-per-task=1
#SBATCH --mem=100M
#SBATCH --time=00:10:00
### prefix start - create temporary directory for your job
export TMPDIR=$(mktemp -d)
### prefix end
module load TMalign/TMalign
# This is a comment line - it will be ignored when the script is executed
cd /home/john.doe/cluster_usage_examples/example_tmalign/
for pdb in SEC39_AF_model_rank_*
do
TMalign 8FTU.pdb $pdb >> tma_1vall.txt
done
### suffix start - delete created temporary directory
rm -rf $TMPDIR
### suffix end
Explanations:
- $pdb is a variable in bash that will successively take the name of all files starting with “SEC39_AF_model_rank_” in the folder (note that we could’ve called it something else if we wanted)
- >> is a way of redirecting the output to a file in bash without overwriting what’s already in the file if it exists. This means that, in this case, all outputs will be written successively to the same file.
Submit the script using sbatch as previously. Once the job has finished,
you can have a look at the output to see if AlphaFold’s ranking is
coherent with the experimental structure (in terms of TM-score):
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ grep "Name of Chain_2\|TM-score=" tma_1vall.txt
Name of Chain_2: SEC39_AF_model_rank_001
TM-score= 0.83692 (if normalized by length of Chain_1)
TM-score= 0.78720 (if normalized by length of Chain_2)
Name of Chain_2: SEC39_AF_model_rank_002
TM-score= 0.84478 (if normalized by length of Chain_1)
TM-score= 0.79436 (if normalized by length of Chain_2)
Name of Chain_2: SEC39_AF_model_rank_003
TM-score= 0.93824 (if normalized by length of Chain_1)
TM-score= 0.87825 (if normalized by length of Chain_2)
Name of Chain_2: SEC39_AF_model_rank_004
TM-score= 0.85504 (if normalized by length of Chain_1)
TM-score= 0.80356 (if normalized by length of Chain_2)
Name of Chain_2: SEC39_AF_model_rank_005
TM-score= 0.94959 (if normalized by length of Chain_1)
TM-score= 0.88831 (if normalized by length of Chain_2)
Explanation: Above, we used the grep command in order to only print the
lines that interest us in the output file but you could also just use
cat tma_1vall.txt to show the whole lot if you’re not familiar with
this command.
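If you would rather have one line per model, the same information can be condensed with awk. This is only a sketch: it reads a shortened copy of the output shown above (two of the five models, pasted into a here-document so that the snippet is self-contained) and keeps the TM-score normalised by the length of Chain_1, the reference.

```shell
#!/bin/bash
# Shortened copy of the grep output above (2 of the 5 models).
sample=$(cat <<'EOF'
Name of Chain_2: SEC39_AF_model_rank_001
TM-score= 0.83692 (if normalized by length of Chain_1)
TM-score= 0.78720 (if normalized by length of Chain_2)
Name of Chain_2: SEC39_AF_model_rank_005
TM-score= 0.94959 (if normalized by length of Chain_1)
TM-score= 0.88831 (if normalized by length of Chain_2)
EOF
)

# Remember the model name, then print it next to the Chain_1-normalised score.
summary=$(echo "$sample" | awk '
    /^Name of Chain_2:/        { model = $4 }
    /^TM-score=/ && /Chain_1/  { print model, $2 }
')
echo "$summary"
```

On the cluster, the awk command could read the tma_1vall.txt file directly instead of the pasted sample.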
The interesting lines, here, are the TM-scores normalised by the length of Chain_1 (the experimental reference), i.e. the first TM-score line listed for each model. As you can see, the structure with the highest TM-score is actually the one that was ranked last. There could be many reasons for this apparent discrepancy, especially in this case where the structure is very oblong (if you want more information on the interpretation of AlphaFold results, please have a look at our AlphaFold training session).
First a little context:
What are job arrays?
In Slurm, you can use job arrays. An array of jobs is a set of jobs that
share the same parameters (e.g. number of CPUs, amount of memory etc.) but
each of them works on different inputs. A job array runs as a collection
of related yet separate basic jobs that might be distributed across
multiple hosts and might run concurrently (instead of sequentially).
Why use job arrays?
The advantage of job arrays in this case is their parallelism: we will be
running TMalign on each pair of structures in parallel, whereas in a for
loop the commands are run sequentially within a single job. This is
particularly useful when your individual commands take a while to run. In
this case, it’s not that much different from using a for loop because
TMalign is fast and we only have 5 comparisons to do. On the other hand,
the disadvantage of using job arrays is the constraints on the input
names, but this can be solved quite easily through work-arounds, as
you’ll see below.
2 important parameters in job arrays:
- the Slurm option --array=start-stop: a range of integers
- the variable “$SLURM_ARRAY_TASK_ID” which corresponds to the job index and which takes the values in the start-stop range defined above
For example, if you add:
parameter value | this will run | value of $SLURM_ARRAY_TASK_ID in these jobs |
---|---|---|
#SBATCH --array=1-3 | 3 individual jobs | 1 in the first job, 2 in the second job, 3 in the third job |
#SBATCH --array=6-9 | 4 individual jobs | 6 in the first job, 7 in the second job, 8 in the third job, 9 in the fourth job |
All jobs within a job array run the same Slurm submission script with
the same Slurm parameters (i.e. the same resources asked). It’s up to
you to play with the “$SLURM_ARRAY_TASK_ID” variable (= the job index)
to avoid each job doing exactly the same thing.
i.e. in the TM-align context, “$SLURM_ARRAY_TASK_ID” could reflect the
rank of our SEC39 models and would be used to specify different inputs
for each job within the job array.
More on job arrays in the Slurm documentation
Given this information, can you try adjusting your previous script to use job arrays instead?
The job script
For example, in our case, we would like to run 5 TMalign commands and we
would like each job in the job array to run TM-align on a different pair
of structures. Given the input names, we could take advantage of their
ranks to select different structures from one job to the other (1, 2, 3,
4 and 5) like this:
#! /bin/bash
#SBATCH --job-name="my_jobname"
#SBATCH --partition=common
#SBATCH --cpus-per-task=1
#SBATCH --mem=100M
#SBATCH --time=00:10:00
#SBATCH --array=1-5
### prefix start - create temporary directory for your job
export TMPDIR=$(mktemp -d)
### prefix end
module load TMalign/TMalign
# This is a comment line - it will be ignored when the script is executed
cd /home/john.doe/cluster_usage_examples/example_tmalign/
TMalign 8FTU.pdb SEC39_AF_model_rank_00"${SLURM_ARRAY_TASK_ID}".pdb >> tma_1vall_"${SLURM_ARRAY_TASK_ID}".txt
### suffix start - delete created temporary directory
rm -rf $TMPDIR
### suffix end
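Explanation:
- Each job of the array picks its own input through the rank in the file name, which is given by the job index ($SLURM_ARRAY_TASK_ID variable).
- The output file name also contains the job index ($SLURM_ARRAY_TASK_ID) in order to individualise the outputs so they don’t overwrite each other.
Submit the script with sbatch as previously. In the squeue output, each
job of the array is listed with its $SLURM_ARRAY_TASK_ID in the format
<jobid>_<arraytaskid>:
john.doe@slurmlogin:/home/john.doe/cluster_usage_examples/example_tmalign$ squeue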
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
16411_1 common my_jobna john.doe R 0:01 1 node24
16411_2 common my_jobna john.doe R 0:01 1 node24
16411_3 common my_jobna john.doe R 0:01 1 node24
16411_4 common my_jobna john.doe R 0:01 1 node24
16411_5 common my_jobna john.doe R 0:01 1 node24
NB: To cancel a specific sub-job, you can use the full id of your job
with scancel (e.g. scancel 16411_4), and to cancel all sub-jobs within
the job array, you can just use the root id (e.g. scancel 16411).
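Before submitting an array such as the one above, it can help to check what each job would run. In the sketch below, SLURM_ARRAY_TASK_ID is set by hand (inside a job array, Slurm sets it for you), and printf with %03d builds the zero-padded rank so that the same line would also work for ranks of 10 and above. The helper name model_for_task is hypothetical.

```shell
#!/bin/bash
# Build the model file name corresponding to a given job index.
model_for_task() {
    # %03d pads the index with zeros: 1 -> 001, 12 -> 012
    printf 'SEC39_AF_model_rank_%03d.pdb' "$1"
}

# Pretend to be each job of "--array=1-5" in turn.
for SLURM_ARRAY_TASK_ID in 1 2 3 4 5
do
    echo "job $SLURM_ARRAY_TASK_ID -> TMalign 8FTU.pdb $(model_for_task "$SLURM_ARRAY_TASK_ID")"
done
```

This prints the 5 command lines without running them, which is a cheap way to catch a wrong file name before reserving any resources.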
Using $SLURM_ARRAY_TASK_ID even when you don’t have numbers in your input names
As you realised above, the particularity with jobs within a job array
is that you have to try and individualise what they’re doing using only
the job index ($SLURM_ARRAY_TASK_ID variable).
There are several options to use this variable. For example, you can
store the input names in a bash array and use the job index to pick one
of them.
Your job submission script would then look like:
#! /bin/bash
#SBATCH --job-name="my_jobname"
#SBATCH --partition=common
#SBATCH --cpus-per-task=1
#SBATCH --mem=100M
#SBATCH --time=00:10:00
#SBATCH --array=1-5
### prefix start - create temporary directory for your job
export TMPDIR=$(mktemp -d)
### prefix end
module load TMalign/TMalign
# This is a comment line - it will be ignored when the script is executed
cd /home/john.doe/cluster_usage_examples/example_tmalign/
INPUTS=(SEC39_AF_model_rank_*.pdb) # get list of models; INPUTS is a bash array
# bash arrays are indexed from 0, whereas our job indices run from 1 to 5
TMalign 8FTU.pdb "${INPUTS[SLURM_ARRAY_TASK_ID-1]}" >> tma_1vall_"${SLURM_ARRAY_TASK_ID}".txt
### suffix start - delete created temporary directory
rm -rf $TMPDIR
### suffix end
Explanation:
We’re reading all files matching the SEC39_AF_model_rank_*.pdb pattern
in the working folder and saving them within a bash array named INPUTS.
Then we extract one element of the array with the ${array[i]} syntax.
Bash arrays are indexed from 0, so the job with index 1 has to read
element 0: “i” is therefore our job index ($SLURM_ARRAY_TASK_ID) minus 1.
Then, we run TMalign on the reference structure vs the selected element
of the list.
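One detail worth checking with this approach is the numbering: bash arrays are indexed from 0, whereas the range given to --array can start at any integer. The sketch below uses placeholder names instead of real files and sets SLURM_ARRAY_TASK_ID by hand to show how a job index starting at 1 can be mapped onto the array.

```shell
#!/bin/bash
# Placeholder list; in the job script it would come from a wildcard.
INPUTS=(model_a.pdb model_b.pdb model_c.pdb)

SLURM_ARRAY_TASK_ID=1   # set by hand here; Slurm sets it inside a job array

# Bash arrays start at 0, so job index 1 must point to element 0.
first_input="${INPUTS[SLURM_ARRAY_TASK_ID-1]}"
echo "job $SLURM_ARRAY_TASK_ID would process $first_input"

# The number of elements tells you the range to request: --array=1-3 here.
n_inputs="${#INPUTS[@]}"
echo "use --array=1-$n_inputs"
```

An equivalent choice would be to request --array=0-2 and use the job index directly as the array index.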
Another option is to test the value of $SLURM_ARRAY_TASK_ID.
Your job submission script could look like:
#! /bin/bash
#SBATCH --job-name="my_jobname"
#SBATCH --partition=common
#SBATCH --cpus-per-task=1
#SBATCH --mem=100M
#SBATCH --time=00:10:00
#SBATCH --array=1-5
### prefix start - create temporary directory for your job
export TMPDIR=$(mktemp -d)
### prefix end
module load TMalign/TMalign
# This is a comment line - it will be ignored when the script is executed
cd /home/john.doe/cluster_usage_examples/example_tmalign/
case "$SLURM_ARRAY_TASK_ID" in
1)
TMalign 8FTU.pdb SEC39_AF_model_rank_001.pdb >> tma_1vall_"${SLURM_ARRAY_TASK_ID}".txt
;;
2)
TMalign 8FTU.pdb SEC39_AF_model_rank_002.pdb >> tma_1vall_"${SLURM_ARRAY_TASK_ID}".txt
;;
3)
TMalign 8FTU.pdb SEC39_AF_model_rank_003.pdb >> tma_1vall_"${SLURM_ARRAY_TASK_ID}".txt
;;
4)
TMalign 8FTU.pdb SEC39_AF_model_rank_004.pdb >> tma_1vall_"${SLURM_ARRAY_TASK_ID}".txt
;;
5)
TMalign 8FTU.pdb SEC39_AF_model_rank_005.pdb >> tma_1vall_"${SLURM_ARRAY_TASK_ID}".txt
;;
*)
echo "Unknown job index"
;;
esac
### suffix start - delete created temporary directory
rm -rf $TMPDIR
### suffix end
Explanation:
In this script, we are using the “case” statement of bash: when the job
index has a certain value, we only execute a certain (set of) line(s) in
the submission script. As you might guess, using case isn’t well suited
here. For example, imagine you would like to add a few extra structures
to the list: this would mean re-adjusting the script. Also, imagine you
have more than just 5 structures to deal with… However, in some
situations it might be useful to have this sort of system, especially if
you want to vary the commands or options from one job to the next.
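A third possibility, not covered above and given here only as a sketch, is to keep the inputs in a text file with one name per line and let each job read the line matching its index with sed. The file name inputs.txt and the names in it are hypothetical, and SLURM_ARRAY_TASK_ID is set by hand so that the snippet can be tried outside a job.

```shell
#!/bin/bash
demo_dir=$(mktemp -d)

# One input per line; line N is used by the job with index N.
printf '%s\n' wild_type.pdb mutant_A.pdb mutant_B.pdb > "$demo_dir/inputs.txt"

SLURM_ARRAY_TASK_ID=2   # set by hand here; Slurm sets it inside a job array

# "sed -n Np" prints only line N of the file.
input=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "$demo_dir/inputs.txt")
echo "job $SLURM_ARRAY_TASK_ID would process $input"

rm -rf "$demo_dir"
```

Adding a structure then only means adding a line to the list and extending the --array range, without touching the commands in the script.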
Take home message
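When discovering a new tool and wanting to use it on the cluster:
- look for it among the installed software with module
- connect to a node to explore how to use it (help message)
- write a submission script with module load at the top and #SBATCH-prefixed lines (see Slurm cheat sheet for a list of all options)
- submit it with sbatch
- follow your job with squeue
- check the resources it actually used with jobinfo and adjust the #SBATCH options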