sourmash plugin for improved plotting/viz and cluster examination.
Project description
sourmash_plugin_betterplot
sourmash is a tool for biological sequence analysis and comparisons.
betterplot is a sourmash plugin that provides improved plotting/viz and cluster examination for sourmash-based sketch comparisons.
Why does this plugin exist?
sourmash compare and sourmash plot produce basic distance matrix plots that are useful for comparing and visualizing the relationships between dozens to hundreds of genomes. This is one of the most popular use cases for sourmash!
But! The visualization can be improved a lot beyond the basic viz that sourmash plot produces, and there are a lot of only slightly more complicated use cases for comparing, clustering, and visualizing many genomes!
This plugin will explore some of these use cases!
Specific goals:
- provide a variety of plotting and exploration commands that can be used with sourmash tools;
- provide both command-line functionality and functions that can be imported and used in Jupyter notebooks;
- (maybe) explore other backends than matplotlib;
and who knows what else??
Installation
pip install sourmash_plugin_betterplot
Usage
See the examples below.
Examples
The command lines below are executable in the examples/ subdirectory of the repository after installing the plugin.
Basic 3 sketches example: plot2
Compare 3 sketches, and cluster.
This command:
sourmash compare sketches/{2,47,63}.sig.zip -o 3sketches.cmp \
    --labels-to 3sketches.cmp.labels_to.csv
sourmash scripts plot2 3sketches.cmp 3sketches.cmp.labels_to.csv \
    -o examples/plot2.3sketches.cmp.png
produces this plot:
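For notebook use, the labels file written by --labels-to can be read with the standard library. This is a minimal sketch, not the plugin's own loader: it assumes the CSV has sort_order and label columns, which is not stated in this README, and the helper name load_labels and the sample rows are hypothetical.

```python
import csv
import io


def load_labels(fp):
    """Return labels ordered by the (assumed) 'sort_order' column."""
    rows = list(csv.DictReader(fp))
    rows.sort(key=lambda row: int(row["sort_order"]))
    return [row["label"] for row in rows]


# Hypothetical file contents, for illustration only; a real file would be
# opened with open("3sketches.cmp.labels_to.csv", newline="").
example = io.StringIO(
    "sort_order,label\n"
    "2,genome C\n"
    "0,genome A\n"
    "1,genome B\n"
)
labels = load_labels(example)
print(labels)
```

If the real file uses different column names, only the two dictionary keys need to change.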
3 sketches example with a cut line: plot2 --cut-point 1.2
Compare 3 sketches, cluster, and show a cut point.
This command:
sourmash compare sketches/{2,47,63}.sig.zip -o 3sketches.cmp \
    --labels-to 3sketches.cmp.labels_to.csv
sourmash scripts plot2 3sketches.cmp 3sketches.cmp.labels_to.csv \
    -o examples/plot2.cut.3sketches.cmp.png \
    --cut-point=1.2
produces this plot:
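To illustrate what a cut point means, here is a small stand-alone sketch. It is not the plugin's implementation: it assumes single-linkage clustering on Euclidean distances between matrix rows, which this README does not confirm, and cut_clusters and the toy matrix are hypothetical. Cutting a single-linkage dendrogram at a given height is equivalent to joining every pair of items whose distance is at or below that height, which is what the code does.

```python
import math


def cut_clusters(matrix, cut_point):
    """Group row indices that single linkage would merge at or below cut_point."""
    n = len(matrix)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: distance between two items is the Euclidean
            # distance between their rows of the comparison matrix.
            if math.dist(matrix[i], matrix[j]) <= cut_point:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy similarity matrix (made up): items 0 and 1 are similar, item 2 is not.
similarity = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
clusters = cut_clusters(similarity, cut_point=1.0)
print(clusters)
```

Raising the cut point merges more items into fewer clusters; lowering it splits them apart.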
Dendrogram of 10 sketches with a cut line + cluster extraction
Compare 10 sketches, cluster, and use a cut point to extract multiple clusters. Use --dendrogram-only to plot just the dendrogram.
This command:
sourmash compare sketches/{2,47,48,49,51,52,53,59,60}.sig.zip \
-o 10sketches.cmp \
--labels-to 10sketches.cmp.labels_to.csv
sourmash scripts plot2 10sketches.cmp 10sketches.cmp.labels_to.csv \
-o plot2.cut.dendro.10sketches.cmp.png \
--cut-point=1.35 --cluster-out --dendrogram-only
produces this plot:
as well as a set of 6 clusters written to 10sketches.cmp.*.csv.
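When driving these commands from a script or a Jupyter notebook, it can help to assemble the argument list in Python. The sketch below only builds the command line using the flags shown above; plot2_command is a hypothetical helper, not part of the plugin, and the plugin's own importable functions are not documented here.

```python
import shlex


def plot2_command(cmp_file, labels_csv, output_png, cut_point=None,
                  cluster_out=False, dendrogram_only=False):
    """Build the argument list for 'sourmash scripts plot2'."""
    cmd = ["sourmash", "scripts", "plot2", cmp_file, labels_csv,
           "-o", output_png]
    if cut_point is not None:
        cmd.append(f"--cut-point={cut_point}")
    if cluster_out:
        cmd.append("--cluster-out")
    if dendrogram_only:
        cmd.append("--dendrogram-only")
    return cmd


cmd = plot2_command(
    "10sketches.cmp",
    "10sketches.cmp.labels_to.csv",
    "plot2.cut.dendro.10sketches.cmp.png",
    cut_point=1.35,
    cluster_out=True,
    dendrogram_only=True,
)
print(shlex.join(cmd))
# To actually run it (requires sourmash and the plugin to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems with unusual file names.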
Multidimensional Scaling (MDS) plot of 10-sketch comparison
Use MDS to display a comparison.
This command:
sourmash compare sketches/{2,47,48,49,51,52,53,59,60}.sig.zip \
-o 10sketches.cmp \
--labels-to 10sketches.cmp.labels_to.csv
sourmash scripts mds 10sketches.cmp 10sketches.cmp.labels_to.csv \
-o mds.10sketches.cmp.png \
-C 10sketches-categories.csv
produces this plot:
Support
We suggest filing issues in the main sourmash issue tracker as that receives more attention!
Dev docs
betterplot is developed at https://github.com/sourmash-bio/sourmash_plugin_betterplot.
See environment.yml for the dependencies needed to develop betterplot.
Testing
Run:
make examples
to run the examples.
For now, the examples serve as the tests; eventually we will add unit tests.
Generating a release
Bump version number in pyproject.toml and push.
Make a new release on GitHub.
Then pull, and:
python -m build
followed by twine upload dist/... .
CTB May 2024
File details
Details for the file sourmash_plugin_betterplot-0.2.1.tar.gz.
File metadata
- Download URL: sourmash_plugin_betterplot-0.2.1.tar.gz
- Upload date:
- Size: 8.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | b55f03e403fc41ae9817e333aafaff5f683c62c17cb3cfe17042a9b8c3c009a4
MD5 | db4563b3f4c138569c14c8fdb0721d82
BLAKE2b-256 | ae268ba336cb5b64bed1d05278a60157cdfccd176715958b01f7be7709371d19
File details
Details for the file sourmash_plugin_betterplot-0.2.1-py3-none-any.whl.
File metadata
- Download URL: sourmash_plugin_betterplot-0.2.1-py3-none-any.whl
- Upload date:
- Size: 8.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | bf7f2ddc1b50d3ed9e3366bc504b134cdcdd07846958501130fd749a74eb16b0
MD5 | 1a048fb9c7bc92362ec17c64c69cef20
BLAKE2b-256 | 2363c0f8453758c326c6910d1d6cd66d49e940a73dbc4261c72e9780c242fa8d