
tools for comparing DNA sequences with MinHash sketches

Project description

sourmash

License: 3-Clause BSD


Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.

Usage:

sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
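
To give a rough idea of what computing and comparing MinHash signatures involves, here is a minimal bottom-k sketch in plain Python. It is an illustration only, not sourmash's implementation: the function names (kmers, minhash_sketch, jaccard_estimate), the use of SHA-1 from the standard library as the hash function, and the default values for k and num are all assumptions made for this example. It also ignores reverse complements, so its output is not comparable with real sourmash signatures.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def minhash_sketch(seq, k=21, num=500):
    """Return the `num` smallest 64-bit hash values over the k-mers of seq.

    SHA-1 is used here only because it ships with Python; it is an
    illustrative stand-in, not the hash function sourmash uses.
    """
    hashes = {
        int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")
        for kmer in kmers(seq.upper(), k)
    }
    return sorted(hashes)[:num]


def jaccard_estimate(sketch_a, sketch_b, num=500):
    """Estimate Jaccard similarity from two bottom-`num` sketches."""
    a, b = set(sketch_a), set(sketch_b)
    # Bottom-`num` sketch of the union of both k-mer sets.
    union = sorted(a | b)[:num]
    if not union:
        return 0.0
    shared = sum(1 for h in union if h in a and h in b)
    return shared / len(union)


# Toy sequences, invented for this example.
seq_a = "ACGTTGCAAGCTTAGGCTAACGTTAGC"
seq_b = "ACGTTGCAAGCTTAGGCTAACGTAAGC"
sk_a = minhash_sketch(seq_a, k=5, num=10)
sk_b = minhash_sketch(seq_b, k=5, num=10)
print(jaccard_estimate(sk_a, sk_b, num=10))
```

The sketch keeps only the smallest hash values, so two sequences can be compared by looking at a fixed-size summary rather than at every k-mer.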

sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).


The name is a riff on Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).

sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.

Installation

We recommend using bioconda to install sourmash:

conda install -c conda-forge -c bioconda sourmash

This will install the latest stable version of sourmash 3.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a Rust environment (for the extension code). We suggest using rustup to install the Rust environment:

curl https://sh.rustup.rs -sSf | sh

The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without these.

For search and gather you also need khmer version 2.1+.
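
One way to see which of these dependencies are present in the current environment is a short check like the following. This is a hedged sketch for Python 3 only (importlib.util.find_spec is not available on Python 2.7). The helper names (is_installed, report) are made up for this example, and the mapping of modules to subcommands simply restates the requirements text above.

```python
import importlib.util

# Module names and their roles are taken from the requirements text above.
REQUIRED = ["screed", "ijson"]
OPTIONAL = {
    "numpy": "sourmash compare",
    "matplotlib": "sourmash plot",
    "scipy": "sourmash plot",
    "khmer": "sourmash search and gather",
}


def is_installed(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def report():
    """Return one human-readable status line per dependency."""
    lines = []
    for name in REQUIRED:
        status = "ok" if is_installed(name) else "MISSING"
        lines.append("{} (required): {}".format(name, status))
    for name, used_by in OPTIONAL.items():
        status = "ok" if is_installed(name) else "not installed"
        lines.append("{} (needed for {}): {}".format(name, used_by, status))
    return lines


for line in report():
    print(line)
```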

Installation with conda

Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ source activate sourmash_env
$ sourmash compute -h

which will install the latest alpha release.

Support

Please ask questions and file issues on GitHub.

Development

Development happens on GitHub at dib-lab/sourmash.

After installation, sourmash is the main command-line entry point. You can also run it with python -m sourmash, or use pip install -e /path/to/repo to do a developer install in a virtual environment.

The sourmash/ directory contains the Python library and command-line interface code.

The src/core/ directory contains the Rust library implementing core functionality.

Tests require py.test and can be run with make test.

Please see the developer notes for more information.


CTB Jan 2020


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sourmash-3.2.3.tar.gz (7.3 MB view details)

Uploaded Source

Built Distributions

sourmash-3.2.3-py2.py3-none-manylinux2010_x86_64.whl (1.1 MB view details)

Uploaded Python 2 Python 3 manylinux: glibc 2.12+ x86-64

sourmash-3.2.3-py2.py3-none-manylinux1_x86_64.whl (1.1 MB view details)

Uploaded Python 2 Python 3

sourmash-3.2.3-py2.py3-none-macosx_10_11_intel.whl (473.8 kB view details)

Uploaded Python 2 Python 3 macOS 10.11+ intel

File details

Details for the file sourmash-3.2.3.tar.gz.

File metadata

  • Download URL: sourmash-3.2.3.tar.gz
  • Upload date:
  • Size: 7.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.6

File hashes

Hashes for sourmash-3.2.3.tar.gz
Algorithm Hash digest
SHA256 bcb4c44bca22f3510fc2947b16501ffda9029dc95a9860ae51f2be771150e469
MD5 c41f392dc39a066808f27f2cea180b2c
BLAKE2b-256 14c1cef2b0299f78e461622fcb9de4580eb40e7ea5e78932d4282c9c9c552314

See more details on using hashes here.
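
A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. This is a minimal sketch: the helper names (sha256_of, verify) are invented for this example, and the path in the usage comment assumes the source distribution was saved to the current directory under its original file name.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA256 digest matches `expected_hex`."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example usage (assumes the file is in the current directory):
#
#   expected = "bcb4c44bca22f3510fc2947b16501ffda9029dc95a9860ae51f2be771150e469"
#   if not verify("sourmash-3.2.3.tar.gz", expected):
#       raise SystemExit("SHA256 mismatch: do not install this file")
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 7.3 MB archive but costs nothing.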


File details

Details for the file sourmash-3.2.3-py2.py3-none-manylinux2010_x86_64.whl.

File metadata

  • Download URL: sourmash-3.2.3-py2.py3-none-manylinux2010_x86_64.whl
  • Upload date:
  • Size: 1.1 MB
  • Tags: Python 2, Python 3, manylinux: glibc 2.12+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0.post20200119 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6

File hashes

Hashes for sourmash-3.2.3-py2.py3-none-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 93429bf55cdc5b3e66af03a9f769da9f5e3eb78ffe6a0bc635f5fb87726a8695
MD5 a9499a3d874cc8d4b7724ffdaa6d5fe3
BLAKE2b-256 4d64044e4584ca78ee9b2b958913828101eff45962b59d2bbcf5e2083014aa5e


File details

Details for the file sourmash-3.2.3-py2.py3-none-manylinux1_x86_64.whl.

File metadata

  • Download URL: sourmash-3.2.3-py2.py3-none-manylinux1_x86_64.whl
  • Upload date:
  • Size: 1.1 MB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0.post20200119 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6

File hashes

Hashes for sourmash-3.2.3-py2.py3-none-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 04f0163973bd73a9aa7f02d4e1dcf6d600d9f0144647ce931dd36fa2e5f3372d
MD5 1debe4eab274a3a3c730f05cb3c09411
BLAKE2b-256 a57854151fa901c6423f724b639979fc24cf40fe68a968039e2bc52a841408c0


File details

Details for the file sourmash-3.2.3-py2.py3-none-macosx_10_11_intel.whl.

File metadata

  • Download URL: sourmash-3.2.3-py2.py3-none-macosx_10_11_intel.whl
  • Upload date:
  • Size: 473.8 kB
  • Tags: Python 2, Python 3, macOS 10.11+ intel
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0.post20200119 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.6

File hashes

Hashes for sourmash-3.2.3-py2.py3-none-macosx_10_11_intel.whl
Algorithm Hash digest
SHA256 5cbffb83bb8241b2ce9fedf3a1226095548b0a8842ffedf74adbc75ec8040347
MD5 777f58db19f5b82e8091994ba5ca5eb8
BLAKE2b-256 75b176fd7773d55b0e6964946eb435507036e2889291b9cce646bfd5ea4f409d

