tools for comparing biological sequences with k-mer sketches

sourmash

Quickly search, compare, and analyze genomic and metagenomic data sets.

Project status: Active (the project has reached a stable, usable state and is being actively developed). License: 3-Clause BSD. Supported Python versions: 3.10, 3.11, 3.12.

Usage:

sourmash sketch dna *.fq.gz
sourmash compare *.sig -o distances.cmp -k 31
sourmash plot distances.cmp

sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).

The latest major release is sourmash v4, which has several command-line and Python incompatibilities with previous versions. Please visit our migration guide to upgrade!


sourmash is a k-mer analysis multitool, and we aim to provide stable, robust programmatic and command-line APIs for a variety of sequence comparisons. Some of our special sauce includes:

  • FracMinHash sketching, which enables accurate comparisons (including ANI) between data sets of different sizes
  • sourmash gather, a combinatorial k-mer approach for more accurate metagenomic profiling
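To make the FracMinHash idea concrete, here is a minimal sketch in plain Python. It is a toy illustration, not sourmash's implementation: the blake2b hash, the `scaled=10` value, and all function names are assumptions chosen for the example, and it ignores reverse complements. The key point is that a k-mer is kept whenever its hash falls below a fixed threshold, so the sketch grows with the data set and sketches of different-sized data sets remain directly comparable.

```python
import hashlib
import random

MAX_HASH = 2**64  # size of the 64-bit hash space used in this toy example


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (blake2b is a stand-in hash)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash(seq, k=21, scaled=10):
    """Keep hashes below MAX_HASH / scaled: roughly 1/scaled of all k-mers."""
    threshold = MAX_HASH // scaled
    return {h for h in map(hash_kmer, kmers(seq, k)) if h < threshold}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Fraction of sketch a that is also present in sketch b."""
    return len(a & b) / len(a) if a else 0.0


# A random "genome" and a fragment that is one fifth of its length.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(5000))
fragment = genome[1000:2000]

big = frac_minhash(genome)
small = frac_minhash(fragment)

print("sketch sizes:", len(big), len(small))
print("jaccard:", round(jaccard(small, big), 3))
print("containment of fragment in genome:", containment(small, big))
```

Because the same threshold is applied to both inputs, every hash retained from the fragment is also retained from the full sequence. The fragment is therefore fully contained in the genome sketch even though the Jaccard similarity is low, which is why containment is the more informative measure when data sets differ in size.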

Please see the sourmash publications for details.
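The combinatorial idea behind gather can be pictured as a greedy set cover over sketch hashes: pick the reference that explains the most query hashes, remove those hashes from the query, and repeat. The sketch below is a toy illustration under that assumption, not sourmash's code; the `gather` function, the reference names, and the integer sets standing in for hashes are all hypothetical.

```python
def gather(query, references):
    """Greedy cover: repeatedly pick the reference that overlaps the most
    still-unexplained query hashes, then remove those hashes."""
    remaining = set(query)
    candidates = dict(references)
    results = []
    while remaining and candidates:
        # Find the candidate with the largest overlap with what is left.
        name, overlap = max(
            ((n, remaining & hashes) for n, hashes in candidates.items()),
            key=lambda item: len(item[1]),
        )
        if not overlap:
            break  # nothing left in the references matches the query
        # Report the fraction of the original query explained by this match.
        results.append((name, len(overlap) / len(query)))
        remaining -= overlap
        del candidates[name]
    return results, remaining


# Integer sets stand in for sketch hashes in this example.
query = set(range(0, 100))
references = {
    "genome_A": set(range(0, 60)),
    "genome_B": set(range(40, 90)),
    "genome_C": set(range(50, 70)),
    "genome_D": set(range(200, 250)),
}

results, unexplained = gather(query, references)
for name, fraction in results:
    print(name, fraction)
print("unexplained hashes:", len(unexplained))
```

Here genome_C overlaps the query but is never reported, because genome_A and genome_B together already account for every hash it shares with the query. Assigning each query hash to at most one reference in this way is what avoids double counting of shared k-mers.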

The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Maintainers: C. Titus Brown (@ctb), Luiz C. Irber, Jr (@luizirber), and N. Tessa Pierce-Ward (@bluegenes).

sourmash was initially developed by the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine, and now includes contributions from the global research and developer community.

Installation

We recommend using conda-forge to install sourmash:

conda install -c conda-forge sourmash-minimal

This will install the latest stable version of sourmash 4.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under Python 3.10 and later on Windows, macOS, and Linux. The base requirements are screed, cffi, numpy, matplotlib, and scipy. Conda will install everything necessary, and is our recommended installation method (see below).

Installation with conda

conda-forge is a community-maintained channel for the conda package manager. After installing conda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge sourmash-minimal
$ conda activate sourmash_env
$ sourmash --help

which will install the latest released version.

Support

For questions, please open an issue on GitHub, or ask in our chat.

Development

Development happens on GitHub at sourmash-bio/sourmash.

sourmash is developed in Python and Rust, and you will need a Rust environment to build it; see the developer notes for our suggested development setup.

After installation, sourmash is the main command-line entry point; you can also run it with python -m sourmash. For a developer install in a virtual environment, run pip install -e /path/to/repo.

The sourmash/ directory contains the Python library and command-line interface code.

The src/core/ directory contains the Rust library implementing core functionality.

Tests require pytest and can be run with make test.

Please see the developer notes for more information on getting set up with a development environment.

CTB Jan 2024
