Getting started with Snakemake
Step 1 - connect to the I2BC cluster
We will be working on the I2BC cluster, on which all the tools that we need are already installed. To connect to the cluster you will need your Multipass login information. Please refer to Step 4 on the I2BC cluster training page for more details on how to proceed.
Step 2 - prepare your working environment
Once on the cluster, connect to a node (that’s where all the tools are installed):
john.doe@cluster-i2bc:~$ qsub -I -q common
qsub: waiting for job 659602.pbsserver to start
qsub: job 659602.pbsserver ready
john.doe@node06:~$
Then load the appropriate tools using the module command:
module load snakemake/snakemake-8.4.6
module load nodes/mafft-7.475
wget and cat are default commands that are usually automatically available in bash, so they don’t have to be loaded. Double-check that you’ve loaded the modules correctly, for example, by checking their version:
john.doe@node06:~$ snakemake --version
8.4.6
john.doe@node06:~$ mafft --version
v7.475 (2020/Nov/23)
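If you would rather check all the tools in one go, the following is a minimal sketch that looks each command up on your PATH. The helper name check_tools is hypothetical (it is not part of the cluster setup), and the sketch only tests that a command can be found, not which version it is.

```shell
# check_tools: hypothetical helper, not provided by the cluster.
# Prints one line per command and returns 1 if any command is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            # command -v prints the path (or name) the shell would run
            echo "OK      $tool -> $(command -v "$tool")"
        else
            echo "MISSING $tool (did you run 'module load'?)"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, to be run after loading the modules:
#   check_tools snakemake mafft wget cat
```

A line starting with MISSING suggests that the corresponding module load command was skipped or failed.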
Then create your chosen working space and move into it, for example:
WORKDIR="/data/work/I2BC/$USER/snakemake_tutorial"
mkdir -p "$WORKDIR"
cd "$WORKDIR"
Step 3 - fetch the Snakemake workflow
The example Snakemake workflow is accessible on the I2BC’s Forge (an equivalent to GitHub, but with files stored on our local servers instead). You can use the following command to download the repository:
git clone https://forge.i2bc.paris-saclay.fr/git/partage-bioinfo/snakemake_examples.git
Note: You will be asked for your username (usually firstname.lastname, the same one as for the I2BC cluster) and your password (the usual Multipass one).
You should now see, in your working space, a folder called snakemake_examples/ within which you have the Snakemake workflow for this exercise in exercise0/:
john.doe@node06:/data/work/I2BC/john.doe/snakemake_tutorial$ ls snakemake_examples/exercise0/
Snakefile readme_runSnake.txt
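As an optional sanity check, here is a small sketch that confirms the two files from the listing above are present. The helper name check_exercise is hypothetical, and the file names are simply those shown in the ls output. Adapt them if the repository content changes.

```shell
# check_exercise: hypothetical helper that verifies the expected files
# exist in the directory given as first argument.
check_exercise() {
    dir=$1
    status=0
    for f in Snakefile readme_runSnake.txt; do
        if [ -f "$dir/$f" ]; then
            echo "found   $dir/$f"
        else
            echo "missing $dir/$f"
            status=1
        fi
    done
    return "$status"
}

# Example call from your working space:
#   check_exercise snakemake_examples/exercise0
```

If a file is reported as missing, the clone may have failed partway through (for example, because of a mistyped password), in which case removing the snakemake_examples/ folder and cloning again is a reasonable first step.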