Commit 708eebeb authored by Lukas Jelonek

Update documentation

parent 8b448cb6
......@@ -11,17 +11,20 @@ Vocabulary
----------
Module
    A module implements a bioinformatic tool and the corresponding JSON
    converter. It is defined in a module manifest.

Profile
    A profile is a set of modules that are executed during an execution of
    PSOT. Profiles can override default parameters of modules.

Repository
    A collection of profiles, modules, scripts and configurations.

Workflow
--------
1. Load all module manifests and profiles from all available repositories
2. Create an execution directory
3. Generate a Nextflow script for the chosen profile in the execution directory
4. Run the Nextflow script
......@@ -33,6 +36,30 @@ Structure of the Nextflow Script
1. Run all analyses in parallel
2. Convert all analyses in parallel
3. In live mode: generate a JSON document for each module and each sequence within the live directory
4. In retrieve mode: retrieve all information from the referenced databases
5. Join all JSON files into a single one containing all information
6. Split the large JSON file into separate files for each sequence
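
The join and split steps at the end of the list above can be pictured with
the following minimal Python sketch. The data layout (one dictionary per
module, keyed by sequence id) and the function names `join_results` and
`split_by_sequence` are assumptions made for illustration only; they are not
taken from PSOT, whose generated Nextflow script may organize these steps
differently.

.. code-block:: python

    import json
    from collections import defaultdict


    def join_results(module_documents):
        """Join per-module results into one document keyed by sequence id.

        Assumed input layout: {module_name: {sequence_id: result}}.
        """
        joined = defaultdict(dict)
        for module_name, per_sequence in module_documents.items():
            for sequence_id, result in per_sequence.items():
                joined[sequence_id][module_name] = result
        return dict(joined)


    def split_by_sequence(joined):
        """Return one JSON text per sequence, keyed by a hypothetical file name."""
        return {
            f"{sequence_id}.json": json.dumps(results, sort_keys=True)
            for sequence_id, results in joined.items()
        }

Keeping the join and the split as separate functions mirrors the two
separate steps of the list: the joined document contains everything, and the
per-sequence files are derived from it.
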
Loading of configuration artifacts
----------------------------------
Profiles, modules and configurations are organized in repositories. A
repository can contain the following elements:
* config.yaml (file)
* modules/ (directory with yaml files)
* profiles/ (directory with yaml files)
* scripts/ (directory with scripts for modules)
PSOT uses a repository search path to find bundled and user-defined
repositories. The search path can be set either by defining the environment
variable `PSOT_REPOSITORIES` as a ':'-separated list of paths or by passing
the repositories via one or more `-r` arguments.
The repositories are loaded in the following order. Later repositories
overwrite values from earlier repositories.
* default repository
* repositories listed in `PSOT_REPOSITORIES`
* repositories passed via `-r`
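
The loading order above can be illustrated with a short Python sketch. It is
not PSOT's implementation: the function names `repository_search_path` and
`merge_values` are hypothetical, and the sketch assumes that "overwriting"
means a simple key-by-key replacement of values, which the text does not
specify in detail.

.. code-block:: python

    import os


    def repository_search_path(default_repo, cli_repos, environ=None):
        """Build the ordered list of repositories: default, environment, -r."""
        environ = os.environ if environ is None else environ
        env_value = environ.get("PSOT_REPOSITORIES", "")
        # Split the ':'-separated list and skip empty entries.
        env_repos = [path for path in env_value.split(":") if path]
        return [default_repo, *env_repos, *cli_repos]


    def merge_values(per_repository_values):
        """Merge dictionaries in load order; later repositories win.

        Assumption: a shallow, key-by-key overwrite.
        """
        merged = {}
        for values in per_repository_values:
            merged.update(values)
        return merged

For example, with `PSOT_REPOSITORIES` set to ``/a:/b`` and a single
``-r /c`` argument, the sketch yields the default repository first, then
``/a`` and ``/b``, then ``/c``.
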
......@@ -13,6 +13,7 @@ Welcome to PSOT - protein sequence observation tool's documentation!
concepts
installation
modules
profiles
Indices and tables
......
......@@ -9,6 +9,7 @@ In order to run PSOT on your machine you need:
* git
* python >= 3
* nextflow
* dbxref library
* the bioinformatic tools you want to use
* blastp
......
.. highlight:: yaml
Profiles
========
A profile specifies a set of analyses (modules) that can be run within PSOT.
Each profile consists of a list of modules and the corresponding configuration overrides.
A profile manifest has the following structure::

    # The name of the profile
    name: 'fast'
    # Short description of the profile
    info: 'Profile that contains tools that give a fast result'
    # A list of modules that belong to the profile. They must exist in the same
    # repository.
    modules:
      ghostx_swissprot:
      signalp:
        # configuration override for the module
        organism: 'euk'
      tmhmm:

You can integrate your own profiles into PSOT by adding your own repositories.
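
How a profile's configuration overrides combine with module defaults can be
sketched in Python as follows. This is an illustration under stated
assumptions, not PSOT's code: the function name `resolve_module_parameters`
is hypothetical, the profile is shown as the dictionary a YAML parser would
typically produce for the manifest above (modules without overrides map to
``None``), and the default value for `organism` is invented for the example.

.. code-block:: python

    def resolve_module_parameters(module_defaults, profile):
        """Return the effective parameters for every module in a profile."""
        resolved = {}
        for module_name, overrides in profile["modules"].items():
            if module_name not in module_defaults:
                raise KeyError(f"unknown module in profile: {module_name}")
            # Start from the module defaults, then apply the profile overrides.
            parameters = dict(module_defaults[module_name])
            parameters.update(overrides or {})
            resolved[module_name] = parameters
        return resolved


    # The 'fast' profile from the manifest above, as a parsed dictionary.
    profile = {
        "name": "fast",
        "info": "Profile that contains tools that give a fast result",
        "modules": {
            "ghostx_swissprot": None,
            "signalp": {"organism": "euk"},
            "tmhmm": None,
        },
    }

    # Hypothetical module defaults; the real values live in module manifests.
    module_defaults = {
        "ghostx_swissprot": {},
        "signalp": {"organism": "gram-"},
        "tmhmm": {},
    }

    resolved = resolve_module_parameters(module_defaults, profile)

In this sketch only `signalp` ends up with a changed parameter, because it is
the only module for which the profile lists an override.
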