Getting started with Snakemake

Exercise 1B - improving your snakefile

Objective 3

Create a new snakefile named ex1b_o3.smk in which we redirect the standard output and error streams to log files.
Where to start?
  • About stdout/stderr: In Unix systems, the output of a command is usually sent to two separate streams: the regular output goes to standard output (stdout, redirected with “>” or “1>”), and error messages go to standard error (stderr, redirected with “2>”).
  • Add the log directive: redirect the stdout and stderr streams of the fastqc and multiqc rules to files by adding a “log:” directive (similar to the existing input: and output: directives) with two entries, std and err, so that each stream gets its own file.
  • Adapt the shell commands: add stdout and stderr redirections of the form 1> stdout.txt and 2> stderr.txt to the shell command lines of your rules. Instead of hard-coded file names, use placeholders that refer to the log directive (e.g. “1>{log.std} 2>{log.err}”).
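To see the two streams in isolation before touching the snakefile, here is a minimal shell sketch. The function demo_cmd and the temporary directory are hypothetical stand-ins for illustration only; they take the place of fastqc or multiqc and are not part of the exercise files.

```shell
# Hypothetical stand-in for a real tool: writes one line to each stream.
demo_cmd() {
  echo "regular output"       # goes to stdout (stream 1)
  echo "error message" >&2    # goes to stderr (stream 2)
}

# Work in a temporary directory so nothing in your project is overwritten.
demo_dir=$(mktemp -d)

# Send each stream to its own file, as the rules will do via the log directive.
demo_cmd 1> "$demo_dir/stdout.txt" 2> "$demo_dir/stderr.txt"

# Nothing was printed by demo_cmd itself; each file holds one stream.
cat "$demo_dir/stdout.txt"
cat "$demo_dir/stderr.txt"
```

In the snakefile, the two file names are replaced by placeholders that point to the entries of the log directive.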
Your code for ex1b_o3.smk should look like this:
SAMPLES, = glob_wildcards(config["dataDir"]+"/{sample}.fastq.gz")

rule all:
  input:
    expand("FastQC/{sample}_fastqc.html", sample=SAMPLES),
    "multiqc_report.html"

rule fastqc:
  input:
    config["dataDir"]+"/{sample}.fastq.gz"
  output:
    "FastQC/{sample}_fastqc.zip",
    "FastQC/{sample}_fastqc.html"
  log:
    "Logs/{sample}_fastqc.std",
    "Logs/{sample}_fastqc.err"
  shell: "fastqc --outdir FastQC {input} 1>{log[0]} 2>{log[1]}"

rule multiqc:
  input:
    expand("FastQC/{sample}_fastqc.zip", sample = SAMPLES)
  output:
    "multiqc_report.html",
    directory("multiqc_data")
  log:
    std="Logs/multiqc.std",
    err="Logs/multiqc.err"
  shell: "multiqc {input} 1>{log.std} 2>{log.err}"
As you can see, we specify the log files differently in the fastqc rule and in the multiqc rule (for demonstration purposes). In the multiqc rule, both log files are named (“std” and “err”) and are referenced in the shell directive by name: “{log.std}” and “{log.err}”. In the fastqc rule, the log files are not named, so we reference them by position using Python’s index syntax instead: “{log[0]}” for the first file and “{log[1]}” for the second.
Test the script

Next, run your pipeline with ex1b_o3.smk to check that it works as expected.

You should see something similar to the following output on your screen.

Observe the output
As you can see, the fastqc steps print less text to the screen than before, because their stdout and stderr streams now go to log files. Also, if you look at your working directory, you should now see a Logs folder containing the individual log files of each rule and input file.
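One quick way to inspect the Logs folder from the command line is sketched below. The helper name list_logs is hypothetical (it is not a Snakemake command), and the sketch only assumes that the logs are regular files inside the Logs directory defined in the snakefile.

```shell
# Hypothetical helper: report, for each file in a log directory,
# whether the tool wrote anything to it.
list_logs() {
  dir="${1:-Logs}"                 # default to the Logs folder used in the snakefile
  for f in "$dir"/*; do
    [ -f "$f" ] || continue        # skip if the directory is missing or empty
    if [ -s "$f" ]; then
      echo "$f: has content"
    else
      echo "$f: empty"
    fi
  done
}

# Run it on the Logs folder of the working directory.
list_logs Logs
```

An empty .std or .err file is not a problem in itself: it only means the tool wrote nothing to that stream.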