BioMAJ download service
About
Microservice to manage the downloads of biomaj.
A protobuf interface is available in biomaj_download/message/downmessage_pb2.py to exchange messages between BioMAJ and the download service. Messages go through RabbitMQ, which must be installed separately.
Protobuf
To compile protobuf, in biomaj_download/message:
protoc --python_out=. downmessage.proto
Development
flake8 biomaj_download/*.py biomaj_download/download
Test
To run the test suite, use:
nosetests -a '!local_irods' tests/biomaj_tests.py
This command skips the tests that need a local iRODS server.
Some tests might fail because of network connection problems. You can skip them with:
nosetests -a '!network' tests/biomaj_tests.py
(To skip both the local iRODS tests and the network tests, use -a '!network,!local_irods'.)
Run
Message consumer:
export BIOMAJ_CONFIG=path_to_config.yml
python bin/biomaj_download_consumer.py
Web server
If the package is installed via pip, you need a file named gunicorn_conf.py somewhere on the local server, containing:
def worker_exit(server, worker):
    from prometheus_client import multiprocess
    multiprocess.mark_process_dead(worker.pid)
If you cloned the repository and installed it via python setup.py install, just refer to the gunicorn_conf.py in the cloned repository.
export BIOMAJ_CONFIG=path_to_config.yml
rm -rf ..path_to/prometheus-multiproc
mkdir -p ..path_to/prometheus-multiproc
export prometheus_multiproc_dir=..path_to/prometheus-multiproc
gunicorn -c gunicorn_conf.py biomaj_download.biomaj_download_web:app
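The directory preparation steps above can also be scripted. The following is a minimal sketch, not part of biomaj_download: the helper name `reset_multiproc_dir` is hypothetical, and it only mirrors the `rm -rf`, `mkdir -p`, and `export` commands. It would have to run in the process that later launches gunicorn, so that the workers inherit the variable.

```python
import os
import shutil
import tempfile


def reset_multiproc_dir(path):
    """Recreate the metrics directory and export its location.

    Hypothetical helper (not part of biomaj_download) mirroring the
    shell commands: rm -rf, mkdir -p, export prometheus_multiproc_dir.
    """
    # Remove stale metric files left by a previous run, if any.
    shutil.rmtree(path, ignore_errors=True)
    # Recreate the directory, including missing parents.
    os.makedirs(path, exist_ok=True)
    # Child processes (e.g. gunicorn workers) inherit this variable.
    os.environ["prometheus_multiproc_dir"] = path
    return path


if __name__ == "__main__":
    # Demonstration in a throwaway location instead of a real deployment path.
    demo = os.path.join(tempfile.mkdtemp(), "prometheus-multiproc")
    reset_multiproc_dir(demo)
    print(os.listdir(demo))
```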
Web processes should run behind a proxy or load balancer. The API base URL is /api/download.
Prometheus metrics are exposed at /metrics on the web server.
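As an illustration of these two paths, the sketch below builds the API and metrics URLs for a given front end and fetches the raw Prometheus metrics text. The address `http://localhost:8000` and the helper names `service_urls` and `fetch_metrics` are assumptions made for the example; only the `/api/download` and `/metrics` paths come from this README.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

API_BASE_PATH = "/api/download"  # API base URL stated in this README
METRICS_PATH = "/metrics"        # Prometheus endpoint stated in this README


def service_urls(server):
    """Return the API base URL and the metrics URL for a server root URL."""
    root = server.rstrip("/") + "/"
    return {
        "api": urljoin(root, API_BASE_PATH.lstrip("/")),
        "metrics": urljoin(root, METRICS_PATH.lstrip("/")),
    }


def fetch_metrics(server, timeout=10):
    """Fetch the raw Prometheus text exposition from the web server."""
    with urlopen(service_urls(server)["metrics"], timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical address; replace it with your proxy or load balancer.
    print(service_urls("http://localhost:8000"))
```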
Download options
Since version 3.0.26, you can use the set_options method to pass a dictionary of downloader-specific options.
The following list shows some options and their effect (the option to set is the key and the parameter is the associated value):
- skip_check_uncompress:
  - parameter: bool.
  - downloader(s): all.
  - effect: If true, don't test the archives after download.
  - default: false (i.e. test the archives).
- ssl_verifyhost:
  - parameter: bool.
  - downloader(s): CurlDownloader, DirectFTPDownload, DirectHTTPDownload.
  - effect: If false, don't check that the name of the remote server is the same as in the SSL certificate.
  - default: true (i.e. check host name).
  - note: It's generally a bad idea to disable this verification. However, some servers are badly configured. See the documentation of the corresponding cURL option.
- ssl_verifypeer:
  - parameter: bool.
  - downloader(s): CurlDownloader, DirectFTPDownload, DirectHTTPDownload.
  - effect: If false, don't check the authenticity of the peer's certificate.
  - default: true (i.e. check authenticity).
  - note: It's generally a bad idea to disable this verification. However, some servers are badly configured. See the documentation of the corresponding cURL option.
- ssl_server_cert:
  - parameter: filename of the certificate.
  - downloader(s): CurlDownloader, DirectFTPDownload, DirectHTTPDownload.
  - effect: Pass a file holding one or more certificates to verify the peer with.
  - default: use OS certificates.
  - note: See the documentation of the corresponding cURL option.
- tcp_keepalive:
  - parameter: int.
  - downloader(s): CurlDownloader, DirectFTPDownload, DirectHTTPDownload.
  - effect: Sets the interval, in seconds, that the operating system will wait between sending keepalive probes.
  - default: cURL default (60s at the time of this writing).
  - note: See the documentation of the corresponding cURL option.
- ftp_method:
  - parameter: one of default, multicwd, nocwd, singlecwd (case insensitive).
  - downloader(s): CurlDownloader, DirectFTPDownload, DirectHTTPDownload.
  - effect: Sets the method to use to reach a file on an FTP(S) server (nocwd and singlecwd are usually faster but not always supported).
  - default: default (which is multicwd at the time of this writing).
  - note: See the documentation of the corresponding cURL option.
Those options can be set in bank properties. See the file global.properties.example in the biomaj module.
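To make the option list more concrete, here is a minimal sketch of how such a dictionary could be assembled before being handed to set_options. The helper `build_options` is hypothetical and not part of biomaj_download; the keys and accepted values are taken from the list above, and the value types (bool, int, filename) are assumed to be passed as native Python values.

```python
# Values accepted by ftp_method according to the option list.
FTP_METHODS = {"default", "multicwd", "nocwd", "singlecwd"}


def build_options(skip_check_uncompress=False, ssl_verifyhost=True,
                  ssl_verifypeer=True, ssl_server_cert=None,
                  tcp_keepalive=None, ftp_method="default"):
    """Build an options dictionary using the documented keys.

    Hypothetical helper: it only assembles and checks the dictionary.
    """
    method = ftp_method.lower()  # the option is case insensitive
    if method not in FTP_METHODS:
        raise ValueError("unknown ftp_method: %r" % ftp_method)
    options = {
        "skip_check_uncompress": bool(skip_check_uncompress),
        "ssl_verifyhost": bool(ssl_verifyhost),
        "ssl_verifypeer": bool(ssl_verifypeer),
        "ftp_method": method,
    }
    # Leave these keys out so that the documented defaults apply.
    if ssl_server_cert is not None:
        options["ssl_server_cert"] = ssl_server_cert
    if tcp_keepalive is not None:
        options["tcp_keepalive"] = int(tcp_keepalive)
    return options


if __name__ == "__main__":
    options = build_options(ftp_method="NoCWD", tcp_keepalive=30)
    print(options)
    # With a downloader instance (its construction is not shown here):
    # downloader.set_options(options)
```

Omitting ssl_server_cert and tcp_keepalive when they are unset keeps the OS certificates and the cURL keepalive default in effect, as described in the list.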
3.1.2:
  #18 Add a protocol option to set CURLOPT_FTP_FILEMETHOD
  #19 Rename protocol options to options
  Fix copy of production files instead of download when files are in subdirectories
3.1.1:
  #17 Support MDTM command in directftp
3.1.0:
  #16 Don't change name after download in DirectHTTPDownloader
  PR #7 Refactor downloaders (WARNING breaks API)
3.0.27:
  Fix previous release broken with a bug in direct protocols
3.0.26:
  Change default download timeout to 1h
  #12 Allow FTPS protocol
  #14 Add mechanism for protocol specific options
3.0.25:
  Allow to use hardlinks in LocalDownload
3.0.24:
  Remove debug logs
3.0.23:
  Support spaces in remote file names
3.0.22:
  Fix */ remote.files parsing
3.0.21:
  Fix traefik labels
3.0.20:
  Update pika dependency release
  Add tags for traefik support
3.0.19:
  Check archives after download
  Fix python regexps syntax (deprecation)
3.0.18:
  Rename protobuf and use specific package to avoid conflicts
3.0.17:
  Regenerate protobuf message desc, failing on python3
3.0.16:
  Add missing req in setup.py
3.0.15:
  Fix progress download control where could have infinite loop
  Add irods download
3.0.14:
  Allow setup of local_endpoint per service, else use default local_endpoint
3.0.13:
  In rate limiting, add progress vs total of download
  Fix rate limiting submission
3.0.12:
  Add retry in case of session creation failure
  Disable web thread logging
3.0.11:
  Display progress of download by percent of downloads
  In case of contact error in downloadclient, retry connection
3.0.10:
  Feature #3: Add rate limiting option to limit number of parallel downloads for a client
3.0.9:
  Add host in prometheus stats
  Fix #2: allow setting http.group.file.size or http.group.file.date to -1 if not available in http(s) page for regexp
3.0.8:
  Fix prometheus stats
  Add consul supervision
3.0.7:
  Change size type to int64
3.0.6:
  Fix download_or_copy to avoid downloading a file existing in a previous production directory
3.0.4:
  Fixes on messages
3.0.3:
  Fix management of timeout leading to a crash when using biomaj.download parameter.
3.0.2:
  Set rabbitmq parameter optional
3.0.1:
  Add missing README etc. in package
3.0.0:
  Move download management out of biomaj main package