Commit 50170b46 authored by Lukas Jelonek

Rewrite exercise 3

parent d1a793b0
# Exercise: Container usage
# Exercise: Software availability (Conda, Docker, Singularity)
In this exercise you will start with a complete workflow that was created by
someone else on another system. Unfortunately the required software is not installed
on your current system. You will learn how to modify the workflow to avoid manual
installation of additional software.
In this exercise you will extend the given nextflow workflow `main.nf` to use
containers within the tasks instead of using the preinstalled tools. Therefore
@@ -6,53 +11,84 @@ the actual workflow will not be changed. Instead each process will be
configured separately inside the `nextflow.config` configuration file to use a
certain container.
## Identify suitable containers for fastqc and multiqc
Check the biocontainers homepage/git repo/quay.io repository for suitable
docker images for fastqc and multiqc.
## Add the containers to the configuration
## Configuration of processes without touching the workflow
A process can be configured inside the process configuration scope. You can
select a certain process with the withName selector. Configure both processes
A process can be configured inside the process configuration scope of the `nextflow.config` file.
You can select a certain process with the withName selector. Configure the processes
to use the containers you identified in the previous section.
~~~~
process {
withName:<processname> {
container = '<containername>'
<configuration directives>
}
}
~~~~
## Configure to use docker
## Install the tools with conda
Activate docker by adding the following to your config.
One way to install the required tools is via conda. To do so, add the `conda` directive to
the process configuration:
~~~~
docker {
enabled = true
process {
withName:fastqc {
conda = "fastqc"
}
}
~~~~
Start the workflow. In the background the docker images will be downloaded. As
soon as they are there, the analysis will run inside docker containers.
Add these directives for all missing software packages and rerun the workflow. Conda will
install a separate environment with all required software for each tool. As `fastqc` and
`multiqc` are already available on your system you only need to add the packages for the
index-preparation and mapping step.
## Configure to use singularity
> **_HINT_** Both required tools are part of the `bowtie2` conda package
Singularity is another container technology that is popular in the HPC
community as it has no problems with privilege escalation. It has an own image
format, but can also use docker images.
**(Optional)** Configure all processes to use conda in order to obtain full portability.
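As a minimal sketch of the hint above, the conda configuration for the two missing steps could look like the following. It assumes the process names `prepare_genome_index` and `mapping` from `main.nf`, and it assumes that conda can resolve the unpinned package name `bowtie2` from the channels already configured on your system, just as in the `fastqc` example.

~~~~
process {
    // Both steps call tools that ship with the bowtie2 conda package
    // (bowtie2-build for the index, bowtie2 for the mapping).
    withName:prepare_genome_index {
        conda = "bowtie2"
    }
    withName:mapping {
        conda = "bowtie2"
    }
}
~~~~

Each `withName` block only affects the named process, so `fastqc` and `multiqc` keep using the preinstalled tools unless you add entries for them as well.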
## Obtain software via docker
Another way to obtain the required tools is docker. To do so, add the `container` directive to
the process configuration and enable docker:
~~~~
process {
    withName:fastqc {
        container = "quay.io/biocontainers/fastqc:0.11.8--1"
    }
}
docker.enabled = true
~~~~
Add these directives for all missing software packages and rerun the workflow. Docker will download
the image for each tool and nextflow will run the task in the respective containers. As `fastqc` and
`multiqc` are already available on your system you only need to add the packages for the
index-preparation and mapping step. Search the biocontainers registry https://biocontainers.pro/#/registry
for recent images.
> **_HINT_** Both required tools are part of the `bowtie2` biocontainer
Deactivate docker. Either delete it or set the enabled option to false.
> **_NOTE_** Docker is not enabled on your training machine by default. To activate it run `sudo service docker start`.
Add the following snippet to your config file for singularity support:
**(Optional)** Configure all processes to use docker in order to obtain full portability.
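A possible sketch for the two missing steps is shown below, assuming the process names `prepare_genome_index` and `mapping` from `main.nf`. The `<tag>` part is a placeholder and not a real image tag; replace it with a tag you find in the biocontainers registry.

~~~~
process {
    // <tag> is a placeholder: look up a current bowtie2 tag in the registry.
    withName:prepare_genome_index {
        container = "quay.io/biocontainers/bowtie2:<tag>"
    }
    withName:mapping {
        container = "quay.io/biocontainers/bowtie2:<tag>"
    }
}
docker.enabled = true
~~~~

Using the same image for both processes means docker only has to download it once.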
## Obtain software via singularity
Singularity is another container technology that is popular in the HPC
community because containers run with the privileges of the invoking user, which
avoids privilege escalation problems. It has its own image format, but can also
use docker images.
Deactivate docker. Either delete it or set the enabled option to false. Then enable singularity with
~~~~
singularity {
    enabled = true
    autoMount = true
}
~~~~
Start the workflow. The images will be downloaded for singularity and the
analysis will be executed inside the singularity containers.
Nextflow will download the images from the docker registry and run them with singularity.
> **_NOTE_** Some images may not work due to packaging errors or incomplete image recipes.
You can file an issue in the bioconda GitHub project or try another version of the image.
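Put together, a config that switches from docker to singularity might look like the sketch below. It reuses the `fastqc` image from the docker section as an example; the `container` directives stay unchanged and only the container engine is swapped.

~~~~
process {
    // The docker image name is reused as-is; no singularity-specific image is needed.
    withName:fastqc {
        container = "quay.io/biocontainers/fastqc:0.11.8--1"
    }
}
// Only one container engine should be enabled at a time.
docker.enabled = false
singularity {
    enabled = true
    autoMount = true
}
~~~~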
\ No newline at end of file
params.input = 'data/*.fastq.gz'
params.output = 'results'
params.reference = 'data/GCF_000006765.1_ASM676v1_genomic.fna'
Channel.fromPath('data/*.fastq.gz').set{ch_fastqs}
Channel.fromPath(params.input).into{ch_fastqs_for_qc; ch_fastqs_for_mapping;}
Channel.fromPath(params.reference).set{ch_genome}
process fastqc {
publishDir results + '/fastqc', mode: 'copy'
publishDir params.output + '/fastqc', mode: 'copy'
input:
file f from ch_fastqs
file f from ch_fastqs_for_qc
output:
file '*.zip' into ch_fastqc_reports
@@ -22,7 +24,7 @@ process fastqc {
process multiqc {
publishDir results + '/multiqc', mode: 'copy'
publishDir params.output + '/multiqc', mode: 'copy'
input:
file fastqc_zips from ch_fastqc_reports.collect()
@@ -32,6 +34,37 @@ process multiqc {
script:
"""
# Language is not set right in conda execution, so we set it by hand here
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
multiqc .
"""
}
process prepare_genome_index {
    input:
    file genome from ch_genome
    output:
    set genomeName, file ("${genomeName}.*") into ch_index
    script:
    genomeName = genome.simpleName
    """
    bowtie2-build $genome ${genomeName}
    """
}
process mapping {
    input:
    set file(reads), index_name, file(index_files) from ch_fastqs_for_mapping.combine(ch_index)
    output:
    script:
    """
    bowtie2 -x ${index_name} ${reads} -S ${reads.simpleName}.sam
    """
}
\ No newline at end of file