sourmash plugin to calculate common hashes across multiple sketches.

sourmash_plugin_commonhash

If you have sketched many samples and want to remove "rare" k-mers (those present in only one or a few samples), this plugin is for you! Filtering them out helps reduce noise in Jaccard comparisons between samples.
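
As a toy illustration of why rare hashes matter, here is a minimal sketch using plain Python sets in place of real sketches. The sample contents, the `rare` set, and the `jaccard` helper are all made up for this example (they are not part of sourmash or this plugin); hash 4 is assumed to be shared with a third sample that is not shown, so only hashes 1 and 5 count as rare.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of hashes (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical hash sets standing in for two sketches.
sample_a = {1, 2, 3, 4}
sample_b = {2, 3, 5}

# Assumption: hashes 1 and 5 each occur in only one sample overall.
rare = {1, 5}

raw = jaccard(sample_a, sample_b)                     # 2 shared / 5 total
filtered = jaccard(sample_a - rare, sample_b - rare)  # 2 shared / 3 total

print(f"raw={raw:.3f} filtered={filtered:.3f}")
```

In this toy case the hashes unique to one sample only inflate the union, so dropping them raises the similarity; real data will of course behave less tidily.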

See sourmash#2383 for an extended discussion!

Thanks to Taylor Reiter and Jessica Lumian for all their work on this!

Installation

pip install sourmash_plugin_commonhash

Usage

sourmash scripts commonhash <multiple sketches> -o commonhashes.zip

commonhash will output one filtered sketch for each input sketch. You can then use the various sourmash sig commands to union these sketches, extract individual ones, etc.

Example

sourmash scripts commonhash examples/*.sig.gz -o commonhash.zip

should yield:

...

Selecting k=31, DNA
Loaded 10587 hashes from 3 sketches in 3 files.
Of 10587 hashes, keeping 2529 that are in 2 or more samples.
Saved 3 signatures to 'commonhash.zip'
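
The log above suggests the general idea: count how many samples each hash appears in, then keep only the hashes seen in enough samples. The following is a minimal sketch of that idea with plain Python sets, not the plugin's actual implementation. The function name `filter_common_hashes`, the `min_samples` parameter, and the sample data are hypothetical; the default of 2 is simply taken from the "in 2 or more samples" line in the example output.

```python
from collections import Counter


def filter_common_hashes(samples, min_samples=2):
    """Return, for each sample, only the hashes found in >= min_samples samples.

    `samples` maps a sample name to an iterable of hash values.
    """
    # Count the number of samples each hash occurs in (once per sample).
    counts = Counter()
    for hashes in samples.values():
        counts.update(set(hashes))

    keep = {h for h, n in counts.items() if n >= min_samples}

    # One filtered set per input sample, mirroring "one filtered sketch
    # for each input sketch".
    return {name: set(hashes) & keep for name, hashes in samples.items()}


# Hypothetical toy data: three samples with small integer "hashes".
samples = {
    "a": {1, 2, 3, 4},
    "b": {2, 3, 5},
    "c": {3, 4, 6},
}
filtered = filter_common_hashes(samples)
for name in sorted(filtered):
    print(name, sorted(filtered[name]))
```

Here hashes 1, 5, and 6 each occur in a single sample and are dropped, while every sample still gets its own (smaller) set back.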

Support

We suggest filing issues in the main sourmash issue tracker, as it receives more attention!

Dev docs

commonhash is developed at https://github.com/ctb/sourmash_plugin_commonhash.

Generating a release

Bump version number in pyproject.toml and push.

Make a new release on GitHub.

Then pull, and:

python -m build

followed by twine upload dist/....
