Commit 5e30edc8 authored by Lukas Jelonek

Add bachelor thesis code by Christian Fankep and start over

Reason: large databases were added to his repository and it is easier to start over at this point
Database Manager
================

This tool prepares databases for several bioinformatics tools.
The prepared databases can be used on the working computer or stored on a cloud server (Amazon Web Services S3)
so that other computers can download them from there. It is also possible to delete unwanted
databases from the working computer and/or from the cloud.

Supported databases with associated tools:

* Uniprot-Swissprot [Blast, Ghostx]
* CARD [Blast, Ghostx]
* Pfam [hmmer]
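
The pairing of databases and tools listed above can be pictured as a small lookup table. The following is only an illustrative sketch: ``RECIPES``, ``is_supported``, ``all_recipes`` and the lowercase keys are hypothetical names chosen for this example and are not taken from the dbman source.

```python
# Hypothetical sketch: database -> supported tools, mirroring the list above.
# The names and the lowercase spelling are assumptions, not dbman's actual data.
RECIPES = {
    "uniprot-swissprot": ["blast", "ghostx"],
    "card": ["blast", "ghostx"],
    "pfam": ["hmmer"],
}


def is_supported(database: str, tool: str) -> bool:
    """Return True if the (database, tool) pair appears in RECIPES."""
    return tool.lower() in RECIPES.get(database.lower(), [])


def all_recipes() -> list:
    """Return every (database, tool) pair in sorted order."""
    return sorted((db, tool) for db, tools in RECIPES.items() for tool in tools)
```

A lookup of this kind would let a command reject an unsupported combination such as Pfam with Blast before any download starts.
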

Installation
------------

Prerequisites:

* Python (Version >= 3.7)
* Git

Install for user::

    pip install git+https://git.computational.bio.uni-giessen.de/cfankep/psotdb.git

Install for developer::

    # check out the repository
    git clone https://git.computational.bio.uni-giessen.de/cfankep/psotdb.git
    cd psotdb
    # install in editable mode
    pip3 install -e .

Using Database Manager
----------------------

For the general help use::

    dbman --help

Check which databases are available::

    # in the S3 directory
    dbman list_remote_databases
    # in the local directory
    dbman list_local_databases

Check which databases with associated transformations are available::

    dbman list_recipes

Prepare databases::

    # check the available optional parameters
    dbman prepare -h
    # run the standard preparation
    dbman prepare example/database example/tool

Transfer databases from the working computer to the S3 cloud::

    # check the available optional parameters
    dbman upload -h
    # run the standard transfer
    dbman upload example/database example/tool*

Transfer databases from the S3 cloud to the working computer::

    # check the available optional parameters
    dbman download -h
    # run the standard download
    dbman download example/database example/tool*

Delete unwanted databases::

    # from the local directory
    dbman delete example/database example/tool* local
    # from the S3 cloud directory
    dbman delete example/database example/tool* s3

Replace the standard directories used to store the data::

    # change the local directory with the environment variable
    export DBMAN_DBDIR=example/path
    # change the remote directory with the environment variable
    export DBMAN_S3DIR=example/path

The standard directories (local and remote) can also be changed with optional parameters.

(*) To upload, download or delete the raw database, enter 'raw' instead of 'example/tool'.
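
The interplay between the environment variables and the optional parameters can be sketched as follows. This is a minimal illustration, not dbman's actual code: the function name ``resolve_directory``, the fallback value ``local_databases``, and the precedence order (command-line option first, then environment variable, then default) are all assumptions.

```python
import os

# Hypothetical fallback; the real default used by dbman is not stated in the README.
ASSUMED_DEFAULT_DBDIR = "local_databases"


def resolve_directory(cli_value, env_name, default, environ=None):
    """Pick a directory: CLI option first, then environment variable, then default.

    The precedence order is an assumption made for this sketch.
    """
    env = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    # Treat an empty or whitespace-only variable as unset.
    env_value = env.get(env_name, "").strip()
    return env_value or default
```

For example, ``resolve_directory(None, "DBMAN_DBDIR", ASSUMED_DEFAULT_DBDIR)`` would return the exported ``DBMAN_DBDIR`` value when the variable is set and the assumed default otherwise.
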
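#!/bin/bash
# Build a protein BLAST database from TOOL_FILE inside TOOL_DIRECTORY.
TOOL_DIRECTORY=$1
TOOL_FILE=$2
# Abort if the directory is missing so makeblastdb never runs elsewhere.
cd "$TOOL_DIRECTORY" || exit 1
makeblastdb -dbtype prot -in "$TOOL_FILE"
cd -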
\ No newline at end of file
#!/bin/bash
# Delete a single file from the S3 storage.
REMOTE_FILE=$1
s3cmd del "$REMOTE_FILE"
\ No newline at end of file
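#!/bin/bash
# Download a tar archive from S3 into the local database directory and unpack it.
REMOTE_TARFILE=$1
LOCAL_DATABASE_DIRECTORY=$2
TARFILE=$3
# Abort if the directory is missing so nothing is unpacked or removed elsewhere.
cd "$LOCAL_DATABASE_DIRECTORY" || exit 1
s3cmd get "$REMOTE_TARFILE"
tar -xzvf "$TARFILE"
rm "$TARFILE"
cd -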
\ No newline at end of file
#!/bin/bash
# Download a single file from S3 into the local database directory.
REMOTE_FILE=$1
LOCAL_DATABASE_DIRECTORY=$2
cd "$LOCAL_DATABASE_DIRECTORY" || exit 1
s3cmd get "$REMOTE_FILE"
cd -
\ No newline at end of file
#!/bin/bash
# Build a GHOSTX database named ghostx_db from RAW_FILE inside TOOL_DIRECTORY.
TOOL_DIRECTORY=$1
RAW_FILE=$2
cd "$TOOL_DIRECTORY" || exit 1
ghostx db -i "$RAW_FILE" -o ghostx_db
cd -
\ No newline at end of file
#!/bin/bash
# Prepare an HMM file for HMMER with hmmpress inside TOOL_DIRECTORY.
TOOL_DIRECTORY=$1
RAW_FILE=$2
cd "$TOOL_DIRECTORY" || exit 1
hmmpress "$RAW_FILE"
cd -
\ No newline at end of file
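#!/bin/bash
# Download the latest CARD data archive and unpack it in the local database directory.
LOCAL_DATABASE_DIRECTORY=$1
# Abort if the directory is missing so the archive is not unpacked or removed elsewhere.
cd "$LOCAL_DATABASE_DIRECTORY" || exit 1
wget --content-disposition https://card.mcmaster.ca/latest/data
tar xfa card-data.tar.bz2
rm card-data.tar.bz2
cd -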
\ No newline at end of file
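#!/bin/bash
# Download the current Pfam-A HMM file and decompress it in the local database directory.
LOCAL_DATABASE_DIRECTORY=$1
cd "$LOCAL_DATABASE_DIRECTORY" || exit 1
wget ftp://ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.hmm.gz
gunzip Pfam-A.hmm.gz
cd -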
\ No newline at end of file
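#!/bin/bash
# Download the UniProt/Swiss-Prot FASTA file and decompress it in the local database directory.
LOCAL_DATABASE_DIRECTORY=$1
cd "$LOCAL_DATABASE_DIRECTORY" || exit 1
wget ftp://ftp.uniprot.org/pub/databases/uniprot/knowledgebase/uniprot_sprot.fasta.gz
gunzip uniprot_sprot.fasta.gz
cd -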
\ No newline at end of file
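#!/bin/bash
# Pack a database directory into a tar archive, upload it to S3, and remove the local archive.
LOCAL_DATABASE_DIRECTORY=$1
DATABASE_TARFILE=$2
DATABASE_DIRECTORY=$3
REMOTE_DATABASE_DIRECTORY=$4
# Abort if the directory is missing so nothing is packed or removed elsewhere.
cd "$LOCAL_DATABASE_DIRECTORY" || exit 1
tar -czvf "$DATABASE_TARFILE" "$DATABASE_DIRECTORY"
s3cmd put "$DATABASE_TARFILE" "$REMOTE_DATABASE_DIRECTORY"
rm "$DATABASE_TARFILE"
cd -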
\ No newline at end of file
#!/bin/bash
# Upload a single local file to the remote S3 database directory.
LOCAL_DATABASE_DIRECTORY=$1
LOCAL_FILE=$2
REMOTE_DATABASE_DIRECTORY=$3
cd "$LOCAL_DATABASE_DIRECTORY" || exit 1
s3cmd put "$LOCAL_FILE" "$REMOTE_DATABASE_DIRECTORY"
cd -
\ No newline at end of file
wget
\ No newline at end of file
[metadata]
name = dbman
author = Rudel Fankep
author-email = Rudel.C.NKouamedjo-fankep@bio.uni-giessen.de
description = Download, convert and upload databases to cloud server
description-file = README.rst
project_urls =
    Source Code = https://git.computational.bio.uni-giessen.de/cfankep/psotdb.git
keywords = tools, databases
license = MIT

[files]
packages =
    dbman

[entry_points]
console_scripts =
    dbman = dbman.main:main
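from setuptools import setup

# this is only necessary when not using setuptools/distribute
# NOTE: this import requires Sphinx to be installed whenever setup.py runs,
# and cmdclass is not passed to setup() below, so the mapping is currently unused.
from sphinx.setup_command import BuildDoc
cmdclass = {'build_sphinx': BuildDoc}

setup(
    setup_requires=['pbr'],
    pbr=True,
    # ship the helper shell scripts together with the package
    package_data={'dbman': ['scripts/*']}
)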
#!/usr/bin/env python
import pkg_resources
print(pkg_resources.resource_filename(__name__, "blast_db.sh"))
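
How a packaged helper script might be invoked once its location is known can be sketched as below. This is only an assumption-laden illustration: ``script_command`` and ``run_script`` are hypothetical helpers, the scripts directory is passed in explicitly rather than looked up through ``pkg_resources``, and the positional arguments simply follow the ``$1``, ``$2`` convention of the shell scripts in this commit.

```python
import os
import subprocess


def script_command(scripts_dir, script_name, *args):
    """Build the argument list for running a helper script with bash.

    Hypothetical helper: positional arguments are passed through unchanged,
    so they arrive in the script as $1, $2, and so on.
    """
    return ["bash", os.path.join(scripts_dir, script_name), *map(str, args)]


def run_script(scripts_dir, script_name, *args):
    """Run the script and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(script_command(scripts_dir, script_name, *args), check=True)
```

Passing the arguments as a list avoids shell quoting problems for paths that contain spaces, and ``check=True`` surfaces a failed download or conversion as a Python exception instead of silently continuing.
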
Database repository ---> https://github.com/MGX-metagenomics/databases/blob/master/card.build
display ---> echo $VARIABLE
delete ---> unset VARIABLE
set ---> export VARIABLE=path
1- setup.py
2- dbman directory
3- __init__.py in dbman
pip install git+https://git.computational.bio.uni-giessen.de/cfankep/psotdb.git
list_remote_databases
\ No newline at end of file