tools for comparing DNA sequences with MinHash sketches

sourmash

License: 3-Clause BSD


Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.

Usage:

sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
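
For readers new to MinHash, the idea behind the commands above can be sketched in a few lines of plain Python. This is only an illustration of the general bottom-sketch technique, not sourmash's implementation: the function names, the truncated SHA-1 stand-in hash, and the k-mer size and sketch size defaults are all assumptions made for this example.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer.

    Truncated SHA-1 is a stand-in chosen so the example is deterministic
    and needs only the standard library (an assumption of this sketch).
    """
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(seq, k=5, num=20):
    """Return a bottom sketch: the `num` smallest distinct k-mer hashes."""
    hashes = {hash_kmer(km) for km in kmers(seq, k)}
    return sorted(hashes)[:num]


def jaccard_estimate(a, b, num=20):
    """Estimate Jaccard similarity from two bottom sketches.

    Take the `num` smallest hashes of the union of both sketches and
    count the fraction that appears in both.
    """
    merged = sorted(set(a) | set(b))[:num]
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    s1 = sketch("ACGTACGTTGCAACGTAGGCTA")
    s2 = sketch("ACGTACGTTGCAACGTAGGCTT")
    print(jaccard_estimate(s1, s2))
```

Comparing every pair of sketches this way gives a similarity matrix, which is conceptually what a comparison step produces and a plotting step visualizes.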

sourmash 1.0 is published on JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).


The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).

sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.

Installation

We recommend using bioconda to install sourmash:

conda install -c conda-forge -c bioconda sourmash

This will install the latest stable version of sourmash 3.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a Rust environment (for the extension code). We suggest using rustup to install the Rust environment:

curl https://sh.rustup.rs -sSf | sh

The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without these.
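
If you are unsure which of these optional packages are present in your environment, a small check like the following may help. It is a sketch using only the standard library; the `OPTIONAL` table and the `missing_modules` helper are hypothetical names for this example, and the feature-to-module mapping simply restates the sentence above.

```python
import importlib.util

# Optional modules per feature, as described in the paragraph above.
OPTIONAL = {
    "compare": ["numpy"],
    "plot": ["matplotlib", "scipy"],
}


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    for feature, modules in OPTIONAL.items():
        missing = missing_modules(modules)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(feature + ": " + status)
```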

For search and gather you also need khmer version 2.1+.

Installation with conda

Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ source activate sourmash_env
$ sourmash compute -h

which will install the latest alpha release.

Support

Please ask questions and file issues on GitHub.

Development

Development happens on GitHub at dib-lab/sourmash.

After installation, sourmash is the main command-line entry point. You can also run it with python -m sourmash, or do a developer install in a virtual environment with pip install -e /path/to/repo.

The sourmash/ directory contains the Python library and command-line interface code.

The src/core/ directory contains the Rust library implementing core functionality.

Tests require py.test and can be run with make test.

Please see the developer notes for more information.


CTB Jan 2020

Download files

Source Distribution

sourmash-3.1.0.tar.gz (7.3 MB; source)

Built Distributions

sourmash-3.1.0-py2.py3-none-manylinux2010_x86_64.whl (1.0 MB; Python 2, Python 3; manylinux: glibc 2.12+ x86-64)

sourmash-3.1.0-py2.py3-none-manylinux1_x86_64.whl (1.0 MB; Python 2, Python 3)

sourmash-3.1.0-py2.py3-none-macosx_10_6_intel.whl (451.5 kB; Python 2, Python 3; macOS 10.6+ intel)

File details

Details for the file sourmash-3.1.0.tar.gz.

File metadata

  • Download URL: sourmash-3.1.0.tar.gz
  • Upload date:
  • Size: 7.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.7

File hashes

Hashes for sourmash-3.1.0.tar.gz
Algorithm Hash digest
SHA256 29fa5525f6acd268966aeb3ee2ac20cfe488cba304b8ae0db1aecb14d2735b55
MD5 1530c581952a2d882c75cb8776fedbb0
BLAKE2b-256 6e790b3b5e9513a06fad42be57796cfa61feecd870a110966e0b6dcb932a1bcf
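
A downloaded file can be checked against the SHA256 digest listed above before installing it. The following is a minimal sketch using Python's standard hashlib module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local path you pass in depends on where you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published_digest(path, expected_hex):
    """True if the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, calling matches_published_digest on a local copy of sourmash-3.1.0.tar.gz with the SHA256 value from the table above should return True for an intact download.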

File details

Details for the file sourmash-3.1.0-py2.py3-none-manylinux2010_x86_64.whl.

File metadata

  • Download URL: sourmash-3.1.0-py2.py3-none-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 1.0 MB
  • Tags: Python 2, Python 3, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.7

File hashes

Hashes for sourmash-3.1.0-py2.py3-none-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c36a910ff96abb5fd6e74ae02149373c582edcce9ec498a136d4e83987eddd85
MD5 e9a8e5a3bf6fd064fd7d8fddc6e92c66
BLAKE2b-256 c6629d0512e87ca67cb5a564a34b2e892db815ee636eb66f66395965d4d2244d

File details

Details for the file sourmash-3.1.0-py2.py3-none-manylinux1_x86_64.whl.

File metadata

  • Download URL: sourmash-3.1.0-py2.py3-none-manylinux1_x86_64.whl
  • Upload date:
  • Size: 1.0 MB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.7

File hashes

Hashes for sourmash-3.1.0-py2.py3-none-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6f4dbb282408167a03dcad95ae3de7998f488267e3987b2009a5002825449da8
MD5 e55a2ad98d393dccca059209f92aa8c3
BLAKE2b-256 032daf6a144af8a364bf7bf45d876c3a1e7284ab4213fbd7e38a8637330eff03

File details

Details for the file sourmash-3.1.0-py2.py3-none-macosx_10_6_intel.whl.

File metadata

  • Download URL: sourmash-3.1.0-py2.py3-none-macosx_10_6_intel.whl
  • Upload date:
  • Size: 451.5 kB
  • Tags: Python 2, Python 3, macOS 10.6+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.7

File hashes

Hashes for sourmash-3.1.0-py2.py3-none-macosx_10_6_intel.whl
Algorithm Hash digest
SHA256 8c06e7c8b16c73a9e9b14664962f3a02a051c628cb22f81aa51fe31bd715012a
MD5 23e2d9af16aed5b175246cd39cfa4a5c
BLAKE2b-256 83369bf880979e925899e0c1fb6751b370b3536c533a8edcf79356d94739f683
