Various integrations for ANN (Approximate Nearest Neighbours) libraries into scikit-learn.

sklearn-ann

sklearn-ann eases integration of approximate nearest neighbours libraries such as annoy, nmslib and faiss into your sklearn pipelines. It consists of:

  • Transformers conforming to the same interface as KNeighborsTransformer which can be used to transform feature matrices into sparse distance matrices for use by any estimator that can deal with sparse distance matrices. Many, but not all, of scikit-learn’s clustering and manifold learning algorithms can work with this kind of input.

  • RNN-DBSCAN: a variant of DBSCAN based on reverse nearest neighbours.
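The transformer workflow described in the first item can be sketched with scikit-learn's own exact `KNeighborsTransformer`, which is the interface the sklearn-ann transformers are said to follow. This is a minimal illustration under assumptions, not the project's own example: the toy dataset and the `n_neighbors`, `eps` and `min_samples` values are arbitrary choices, and an approximate transformer from sklearn-ann would take the place of the first pipeline step (consult the project documentation for the actual class names).

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsTransformer
from sklearn.pipeline import make_pipeline

# Toy data; sizes and parameters below are illustrative assumptions.
X, _ = make_blobs(n_samples=200, centers=3, cluster_std=0.5, random_state=0)

# Step 1 on its own: turn the feature matrix into a sparse matrix of
# distances to each sample's nearest neighbours.
knn = KNeighborsTransformer(n_neighbors=10, mode="distance")
graph = knn.fit_transform(X)
print(type(graph).__name__, graph.shape)

# Steps 1 and 2 together: the clustering estimator consumes the sparse
# distance matrix because it is configured with metric="precomputed".
pipeline = make_pipeline(
    KNeighborsTransformer(n_neighbors=10, mode="distance"),
    DBSCAN(eps=0.8, min_samples=5, metric="precomputed"),
)
labels = pipeline.fit_predict(X)
print(labels.shape)
```

Because the neighbour search is isolated in the first step, swapping the exact search for an approximate one should not require changes to the estimator that follows it.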

Installation

To install the latest release from PyPI, run:

pip install sklearn-ann

To install the latest development version from GitHub, run:

pip install git+https://github.com/scikit-learn-contrib/sklearn-ann.git#egg=sklearn-ann

Why? When do I want this?

The main scenario in which this is needed is clustering or manifold learning on high dimensional data. The reason is that, currently, the only neighbourhood algorithms built into scikit-learn are essentially the standard tree approaches to space partitioning: the ball tree and the K-D tree. These do not perform competitively in high dimensional spaces.

Development

This project is managed using Hatch and pre-commit. To get started, run pre-commit install and hatch env create. Run all commands using hatch run python <command>, which ensures the environment is kept up to date. Once installed, pre-commit runs on every git commit.

Consult pyproject.toml for which dependency groups and extras exist, and the Hatch help or user guide for more info on what they are.
