tools for comparing DNA sequences with MinHash sketches

Project description

sourmash

License: 3-Clause BSD


Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.

Usage:

sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
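
To give a feel for what a MinHash signature is, here is a minimal pure-Python sketch of a bottom-k MinHash over k-mers, with a Jaccard estimate computed from two sketches. It is an illustration only and not sourmash's implementation: the hash function (SHA-1 truncated to 64 bits), the function names (kmers, hash_kmer, minhash, jaccard_estimate) and the default values of k and num are all assumptions made for this example. It also ignores reverse complements and assumes Python 3.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer.

    SHA-1 is an illustrative choice here, not what sourmash uses.
    """
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash(seq, k=5, num=20):
    """Return the `num` smallest distinct k-mer hashes, sorted ascending."""
    hashes = {hash_kmer(km) for km in kmers(seq.upper(), k)}
    return sorted(hashes)[:num]


def jaccard_estimate(sketch_a, sketch_b, num=20):
    """Estimate Jaccard similarity from two bottom-`num` sketches.

    Take the `num` smallest hashes of the union, then count how many
    of them occur in both sketches.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:num]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    a = "ACGTACGTTAGCTAGCTAGGATCCGATCGATTACG"
    b = "ACGTACGTTAGCTAGCTAGGATCCGATCGATTTTT"
    print(jaccard_estimate(minhash(a), minhash(b)))
```

Comparing a sketch with itself gives 1.0, and two sequences with no k-mers in common give 0.0. Real signatures use far larger k and many more hashes than this toy example.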

sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).


The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).

sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.

Installation

We recommend using bioconda to install sourmash:

conda install -c conda-forge -c bioconda sourmash

This will install the latest stable version of sourmash 3.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a Rust environment (for the extension code). We suggest using rustup to install the Rust environment:

curl https://sh.rustup.rs -sSf | sh

The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without these.
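
Because numpy, matplotlib and scipy are optional, it can help to check which of them are importable before running compare or plot. The following is a small sketch using only the standard library; the helper name available is hypothetical and not part of sourmash.

```python
import importlib.util


def available(modules):
    """Map each top-level module name to True if it can be imported."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in modules
    }


if __name__ == "__main__":
    # Optional packages named in the paragraph above.
    for name, found in available(["numpy", "matplotlib", "scipy"]).items():
        print("{}: {}".format(name, "found" if found else "missing"))
```

The check only looks the module up and does not import it, so it stays fast even when the packages are large.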

For search and gather you also need khmer version 2.1+.

Installation with conda

Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ source activate sourmash_env
$ sourmash compute -h

which will install the latest alpha release.

Support

Please ask questions and file issues on GitHub.

Development

Development happens on GitHub at dib-lab/sourmash.

After installation, sourmash is the main command-line entry point; you can also run it with python -m sourmash. For a developer install in a virtual environment, run pip install -e /path/to/repo.

The sourmash/ directory contains the Python library and command-line interface code.

The src/core/ directory contains the Rust library implementing core functionality.

Tests require py.test and can be run with make test.

Please see the developer notes for more information.


CTB Jan 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sourmash-3.2.0.tar.gz (7.3 MB view details)

Uploaded Source

Built Distributions

sourmash-3.2.0-py2.py3-none-manylinux2010_x86_64.whl (1.0 MB view details)

Uploaded Python 2 Python 3 manylinux: glibc 2.12+ x86-64

sourmash-3.2.0-py2.py3-none-manylinux1_x86_64.whl (1.0 MB view details)

Uploaded Python 2 Python 3

sourmash-3.2.0-py2.py3-none-macosx_10_11_intel.whl (461.6 kB view details)

Uploaded Python 2 Python 3 macOS 10.11+ intel
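
If you are unsure which built distribution matches your platform, the wheel filename itself encodes the answer as name-version-python tag-ABI tag-platform tag. The sketch below splits a filename into those fields. The function name parse_wheel_filename is hypothetical, and the code assumes the common five-field form with no optional build tag.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its naming-convention fields.

    Assumes the five-field form (no optional build tag).
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[:-len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError("expected name-version-python-abi-platform")
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        # Compressed tag sets are joined with dots, e.g. "py2.py3".
        "python": python_tag.split("."),
        "abi": abi_tag.split("."),
        "platform": platform_tag.split("."),
    }


if __name__ == "__main__":
    info = parse_wheel_filename(
        "sourmash-3.2.0-py2.py3-none-manylinux2010_x86_64.whl"
    )
    print(info["python"], info["platform"])
```

In practice pip makes this choice for you; the parser is only meant to make the tags in the filenames above easier to read.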

File details

Details for the file sourmash-3.2.0.tar.gz.

File metadata

  • Download URL: sourmash-3.2.0.tar.gz
  • Upload date:
  • Size: 7.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.8.1

File hashes

Hashes for sourmash-3.2.0.tar.gz
Algorithm Hash digest
SHA256 b297d1857949a32dd8b4b509de5ef1f1c3b1d9e3ba69d7fb8384fe66e9da5c98
MD5 dd73802728ef49afb78e2506df7f3512
BLAKE2b-256 4ce3f337ee9284d229595bb64fe44e9f6884f330523f4c2792c26c51cba1cc41
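
A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's hashlib; the helper names sha256_of and matches are hypothetical, and the file is assumed to sit in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    expected = (
        "b297d1857949a32dd8b4b509de5ef1f1c3b1d9e3ba69d7fb8384fe66e9da5c98"
    )
    print(matches("sourmash-3.2.0.tar.gz", expected))
```

The same helpers work for the wheel files; pass the matching SHA256 value from the corresponding hash table.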


File details

Details for the file sourmash-3.2.0-py2.py3-none-manylinux2010_x86_64.whl.

File metadata

  • Download URL: sourmash-3.2.0-py2.py3-none-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 1.0 MB
  • Tags: Python 2, Python 3, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.8.1

File hashes

Hashes for sourmash-3.2.0-py2.py3-none-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 e2c2086b250ed14b13c778696bcfc40bcabf58806f48bc95067f3d80b25bbbee
MD5 cf722540a3feca13bddd2432d50c0cad
BLAKE2b-256 6d5ede790a526879cc005bc758a07aa4e919a5c72d24fe22afb7a92b7ad3beae


File details

Details for the file sourmash-3.2.0-py2.py3-none-manylinux1_x86_64.whl.

File metadata

  • Download URL: sourmash-3.2.0-py2.py3-none-manylinux1_x86_64.whl
  • Upload date:
  • Size: 1.0 MB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.8.1

File hashes

Hashes for sourmash-3.2.0-py2.py3-none-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 c66b5e703a714d6a434db9e931d7ef2ed757dd474ac45fe148ffca6a05260f12
MD5 79ab11b6c0191fa5b2263cbf4771ccfe
BLAKE2b-256 b9a4627ef3aab899ffe990fc140b4101c711da53d7c375517731f035082efa18


File details

Details for the file sourmash-3.2.0-py2.py3-none-macosx_10_11_intel.whl.

File metadata

  • Download URL: sourmash-3.2.0-py2.py3-none-macosx_10_11_intel.whl
  • Upload date:
  • Size: 461.6 kB
  • Tags: Python 2, Python 3, macOS 10.11+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.28.1 CPython/3.8.1

File hashes

Hashes for sourmash-3.2.0-py2.py3-none-macosx_10_11_intel.whl
Algorithm Hash digest
SHA256 9a2961518344891eca2e349aa1d4427ff302165d8aecda1865b8b6c0f73a2727
MD5 3145ffbf7a7c06a5f41731b89f71b27a
BLAKE2b-256 fba8eff1400c0b404cbffc7c69b332ee1e2a289621a435b5fbfaca1ba1a928ba

