tools for comparing DNA sequences with MinHash sketches
Project description
sourmash
Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.
Usage:
sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
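To give a feel for what a MinHash signature is and how two of them can be compared, here is a minimal pure-Python sketch of a bottom-k MinHash. It is an illustration only, not sourmash's implementation: the function names (`kmers`, `hash_kmer`, `sketch`, `jaccard_estimate`), the use of MD5 as a stand-in hash, and the default `k` and `num` values are all assumptions made for this example, and reverse complements are ignored.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (MD5 is only a stand-in hash here)."""
    return int.from_bytes(hashlib.md5(kmer.encode("ascii")).digest()[:8], "big")


def sketch(seq, k=5, num=20):
    """Bottom-k sketch: the num smallest distinct k-mer hashes of seq."""
    hashes = {hash_kmer(km) for km in kmers(seq.upper(), k)}
    return sorted(hashes)[:num]


def jaccard_estimate(a, b, num=20):
    """Estimate Jaccard similarity from two bottom-k sketches of size num."""
    merged = sorted(set(a) | set(b))[:num]  # bottom-k of the union
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    s1 = sketch("ACGTACGTTAGCCGATAGCTAGGCTA")
    s2 = sketch("ACGTACGTTAGCCGATAGCTAGGCTT")
    print(jaccard_estimate(s1, s2))
```

Because only the smallest hashes are kept, the sketch stays small no matter how long the input sequence is, which is what makes comparing many samples cheap.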
sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).
The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)
Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).
sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.
Installation
We recommend using bioconda to install sourmash:
conda install -c conda-forge -c bioconda sourmash
This will install the latest stable version of sourmash 3.
You can also use pip to install sourmash:
pip install sourmash
A quickstart tutorial is available.
Requirements
sourmash runs under both Python 2.7.x and Python 3.5+. The base
requirements are screed and ijson, together with a Rust environment (for the
extension code). We suggest using rustup to install the Rust environment:
curl https://sh.rustup.rs -sSf | sh
The comparison code (sourmash compare) uses numpy, and the plotting
code uses matplotlib and scipy, but most of the code is usable without
these.
For search and gather you also need khmer version 2.1+.
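Since these packages are only needed for some subcommands, it can be handy to check which of them are importable in the current environment. The following is a small sketch using only the standard library; the helper name `optional_status` is made up for this example and is not part of sourmash.

```python
import importlib.util

# Optional packages named in the requirements above.
OPTIONAL_PACKAGES = ["numpy", "matplotlib", "scipy", "khmer"]


def optional_status(names):
    """Return {package name: True if it can be imported, else False}."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in optional_status(OPTIONAL_PACKAGES).items():
        print(f"{name}: {'found' if available else 'missing'}")
```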
Installation with conda
Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:
$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ source activate sourmash_env
$ sourmash compute -h
which will install the latest alpha release.
Support
Please ask questions and file issues on GitHub.
Development
Development happens on GitHub at dib-lab/sourmash.
After installation, sourmash is the main command-line entry point;
run it with python -m sourmash, or do pip install -e /path/to/repo to
do a developer install in a virtual environment.
The sourmash/ directory contains the Python library and command-line interface code.
The src/core/ directory contains the Rust library implementing core
functionality.
Tests require py.test and can be run with make test.
Please see the developer notes for more information.
CTB Jan 2020
Download files
Source Distribution
Built Distributions
File details
Details for the file sourmash-3.2.3.tar.gz.
File metadata
- Download URL: sourmash-3.2.3.tar.gz
- Upload date:
- Size: 7.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | bcb4c44bca22f3510fc2947b16501ffda9029dc95a9860ae51f2be771150e469
MD5 | c41f392dc39a066808f27f2cea180b2c
BLAKE2b-256 | 14c1cef2b0299f78e461622fcb9de4580eb40e7ea5e78932d4282c9c9c552314
File details
Details for the file sourmash-3.2.3-py2.py3-none-manylinux2010_x86_64.whl.
File metadata
- Download URL: sourmash-3.2.3-py2.py3-none-manylinux2010_x86_64.whl
- Upload date:
- Size: 1.1 MB
- Tags: Python 2, Python 3, manylinux: glibc 2.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0.post20200119 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 93429bf55cdc5b3e66af03a9f769da9f5e3eb78ffe6a0bc635f5fb87726a8695
MD5 | a9499a3d874cc8d4b7724ffdaa6d5fe3
BLAKE2b-256 | 4d64044e4584ca78ee9b2b958913828101eff45962b59d2bbcf5e2083014aa5e
File details
Details for the file sourmash-3.2.3-py2.py3-none-manylinux1_x86_64.whl.
File metadata
- Download URL: sourmash-3.2.3-py2.py3-none-manylinux1_x86_64.whl
- Upload date:
- Size: 1.1 MB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0.post20200119 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 04f0163973bd73a9aa7f02d4e1dcf6d600d9f0144647ce931dd36fa2e5f3372d
MD5 | 1debe4eab274a3a3c730f05cb3c09411
BLAKE2b-256 | a57854151fa901c6423f724b639979fc24cf40fe68a968039e2bc52a841408c0
File details
Details for the file sourmash-3.2.3-py2.py3-none-macosx_10_11_intel.whl.
File metadata
- Download URL: sourmash-3.2.3-py2.py3-none-macosx_10_11_intel.whl
- Upload date:
- Size: 473.8 kB
- Tags: Python 2, Python 3, macOS 10.11+ intel
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0.post20200119 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5cbffb83bb8241b2ce9fedf3a1226095548b0a8842ffedf74adbc75ec8040347
MD5 | 777f58db19f5b82e8091994ba5ca5eb8
BLAKE2b-256 | 75b176fd7773d55b0e6964946eb435507036e2889291b9cce646bfd5ea4f409d
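The SHA256 digests listed above can be used to check that a downloaded file arrived intact. Below is a minimal sketch using Python's standard hashlib module; the helper names (`sha256_of_file`, `matches_digest`) and the assumption that the file sits in the current directory are illustrative only.

```python
import hashlib

# SHA256 digest for sourmash-3.2.3.tar.gz, copied from the table above.
EXPECTED_SHA256 = "bcb4c44bca22f3510fc2947b16501ffda9029dc95a9860ae51f2be771150e469"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """True if the file's SHA256 equals expected_hex (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example (assumes the source distribution was downloaded to the current directory):
# matches_digest("sourmash-3.2.3.tar.gz", EXPECTED_SHA256)
```

Reading in chunks keeps memory use flat for the 7.3 MB source tarball and for larger files alike.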