Alternate Python Bindings for NMSLIB

nmsbind
=======

[![Build Status](https://travis-ci.org/benfred/nmsbind.svg?branch=master)](https://travis-ci.org/benfred/nmsbind)
[![Windows Build status](https://ci.appveyor.com/api/projects/status/025rl7knj2m62hs5?svg=true)](https://ci.appveyor.com/project/benfred/nmsbind)

nmsbind: Alternate Python Bindings for NMSLIB

This project is a proof of concept of using [pybind11](https://github.com/pybind/pybind11) to
build Python bindings for [the Non-Metric Space Library (NMSLIB)](https://github.com/searchivarius/nmslib).

NMSLIB is a great library that provides many different methods for calculating approximate nearest neighbours.
Some of these methods are up to [10 times faster](https://raw.githubusercontent.com/searchivarius/nmslib/master/docs/figures/glove.png)
than those provided by libraries like [Annoy](https://github.com/spotify/annoy). However, Annoy is currently much
more popular, at least in part because it's easier to install and use from Python.

This project aims to fix some of the hassles of using NMSLIB in Python by using pybind11 to write alternate
bindings, and by making installation work seamlessly across different environments.

Some advantages of this approach are:
* Works with Python 3.5+ and Python 2.7+
* Works on Linux / OSX / Windows systems
* Easily installable with pip: `pip install nmsbind` will download from PyPI and install
* More natural Python API:
    * the index is a class with methods (instead of getting a memory location and using global functions to access)
    * methods have sensible default parameters
    * docstrings provide some basic documentation on how to call
    * no need to manually free memory with `nmslib.freeIndex`

To install:

```
pip install nmsbind
```

Basic usage:

```python
import nmsbind
import numpy

# create a random matrix to index
data = numpy.random.randn(10000, 100).astype(numpy.float32)

# initialize a new index, using a HNSW index on Cosine Similarity
index = nmsbind.init(data, method='hnsw', space_type='cosinesimil')
index.createIndex({'post': 2})

# query for the nearest neighbours of the first datapoint
ids, distances = index.knnQuery(data[0], k=10)
```
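Because the index returns approximate neighbours, it can be useful to compare a query against an exact brute-force scan. The sketch below is not part of nmsbind: `cosine_distance`, `brute_force_knn` and `recall` are hypothetical helper names, and it assumes that the `cosinesimil` space reports distances as one minus the cosine similarity. It uses only the standard library, so it is only practical for small datasets.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def brute_force_knn(data, query, k=10):
    """Exact k nearest neighbours, found by scanning every row of data."""
    scored = sorted((cosine_distance(row, query), i) for i, row in enumerate(data))
    top = scored[:k]
    # Same (ids, distances) shape as the knnQuery call in the usage example
    return [i for _, i in top], [d for d, _ in top]


def recall(approx_ids, exact_ids):
    """Fraction of the exact neighbours that also appear in the approximate result."""
    exact = set(exact_ids)
    return len(exact.intersection(approx_ids)) / float(len(exact))
```

With the variables from the usage example, `recall(ids, brute_force_knn(data, data[0], k=10)[0])` would give the fraction of true neighbours that the index found for that query.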

This library has been tested with Python 2.7, 3.5 and 3.6. Running `tox` will
build the package and run the unit tests on all of these versions.

This project is mainly a proof of concept and is currently lacking some of the features
in the default nmslib bindings.

TODO:
* incremental data addition (`addDataPoint` one row at a time, and `addDataPointBatch`)
* support for the missing data types (`SPARSE_VECTOR` / `STRING_AS_OBJECT`)
* batch querying via `knnQueryBatch`
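
Until batch querying is supported, one possible workaround is to call `knnQuery` once per query from Python. The helper below is a minimal sketch, not part of nmsbind: `knn_query_batch` is a hypothetical name, and it only assumes that the index exposes `knnQuery(query, k=...)` as in the usage example. It runs the queries sequentially, so it would not give the speed-up a native batch call could.

```python
def knn_query_batch(index, queries, k=10):
    """Hypothetical helper: run knnQuery for each query and collect the results."""
    results = []
    for query in queries:
        # Each call returns an (ids, distances) pair for a single query vector
        ids, distances = index.knnQuery(query, k=k)
        results.append((ids, distances))
    return results
```

For example, `knn_query_batch(index, data[:5], k=10)` would return a list of five `(ids, distances)` pairs, one per row.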
