DiskANN Python extension module

Project description

DiskANN

DiskANN is a suite of scalable, accurate and cost-effective approximate nearest neighbor search algorithms for large-scale vector search that support real-time changes and simple filters. This code is based on ideas from the DiskANN, Fresh-DiskANN and Filtered-DiskANN papers, with further improvements. It was forked from the code for the NSG algorithm.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

See guidelines for contributing to this project.

Linux build:

Install the following packages with apt:

sudo apt install make cmake g++ libaio-dev libgoogle-perftools-dev clang-format libboost-all-dev

Install Intel MKL

Ubuntu 20.04 or newer

sudo apt install libmkl-full-dev

Earlier versions of Ubuntu

Install Intel MKL either by downloading the oneAPI MKL installer or using apt (we tested with build 2019.4-070 and 2022.1.2.146).

# OneAPI MKL Installer
wget https://registrationcenter-download.intel.com/akdlm/irc_nas/18487/l_BaseKit_p_2022.1.2.146.sh
sudo sh l_BaseKit_p_2022.1.2.146.sh -a --components intel.oneapi.lin.mkl.devel --action install --eula accept -s

Build

mkdir build && cd build && cmake -DCMAKE_BUILD_TYPE=Release .. && make -j 
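
Before running the build, it can help to confirm that the command-line tools from the apt line above are actually on PATH. The following is a minimal sketch, not part of DiskANN: the helper name `missing_tools` and the tool list are assumptions made for illustration, and the check covers executables only (libraries such as libaio, Boost and MKL cannot be detected this way).

```python
import shutil

# Executables named in the apt install line above. Libraries (libaio-dev,
# libboost-all-dev, MKL) are not executables and are not checked here.
REQUIRED_TOOLS = ["make", "cmake", "g++", "clang-format"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH.

    `which` is injectable so the logic can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All listed build tools were found on PATH.")
```

An empty result only means the tools were found; it says nothing about their versions or about the development libraries.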

Windows build:

The Windows version has been tested with Enterprise editions of Visual Studio 2022, 2019 and 2017. It should work with the Community and Professional editions as well without any changes.

Prerequisites:

  • CMake 3.15+ (available in Visual Studio 2019+ or from https://cmake.org)
  • NuGet.exe (install from https://www.nuget.org/downloads)
    • The build script will use NuGet to get MKL, OpenMP and Boost packages.
  • DiskANN git repository checked out together with submodules. To check out submodules after git clone:
git submodule init
git submodule update
  • Environment variables:
    • [optional] If you would like to override the Boost library listed in windows/packages.config.in, set BOOST_ROOT to your Boost folder.
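
The CMake 3.15+ requirement above can be checked programmatically. This is a hedged sketch that assumes the usual first line of `cmake --version` output ("cmake version X.Y.Z"); the function names `parse_cmake_version` and `cmake_is_new_enough` are hypothetical helpers, not part of the DiskANN build scripts.

```python
import re
import subprocess

MIN_CMAKE = (3, 15)  # minimum version stated in the prerequisites above


def parse_cmake_version(text):
    """Extract (major, minor) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized cmake version output: {text!r}")
    return int(match.group(1)), int(match.group(2))


def cmake_is_new_enough(text, minimum=MIN_CMAKE):
    """Compare the parsed version against the minimum as integer tuples."""
    return parse_cmake_version(text) >= minimum


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["cmake", "--version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("cmake was not found or failed to run")
    else:
        ok = cmake_is_new_enough(result.stdout)
        print("cmake is new enough" if ok else "cmake is older than 3.15")
```

Comparing integer tuples rather than strings matters here: as strings, "3.9" would sort after "3.15".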

Build steps:

  • Open the "x64 Native Tools Command Prompt for VS 2019" (or corresponding version) and change to the DiskANN folder
  • Create a "build" directory inside it
  • Change to the "build" directory and run
cmake ..

OR for Visual Studio 2017 and earlier:

<full-path-to-installed-cmake>\cmake ..
  • This will create a diskann.sln solution. Open it from Visual Studio and build either the Release or Debug configuration.
    • Alternatively, use MSBuild:
msbuild.exe diskann.sln /m /nologo /t:Build /p:Configuration="Release" /property:Platform="x64"
    • This will also build the gperftools submodule for the libtcmalloc_minimal dependency.
  • Generated binaries are stored in the x64/Release or x64/Debug directories.

Usage:

Please see the following pages on using the compiled code:

Please cite this software in your work as:

@misc{diskann-github,
   author = {Simhadri, Harsha Vardhan and Krishnaswamy, Ravishankar and Srinivasa, Gopal and Subramanya, Suhas Jayaram and Antonijevic, Andrija and Pryce, Dax and Kaczynski, David and Williams, Shane and Gollapudi, Siddarth and Sivashankar, Varun and Karia, Neel and Singh, Aditi and Jaiswal, Shikhar and Mahapatro, Neelam and Adams, Philip and Tower, Bryan},
   title = {{DiskANN: Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search}},
   url = {https://github.com/Microsoft/DiskANN},
   version = {0.5},
   year = {2023}
}

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.

Built Distributions

  • diskannpy-0.5.0rc4-cp311-cp311-win_amd64.whl (4.5 MB): CPython 3.11, Windows x86-64
  • diskannpy-0.5.0rc4-cp311-cp311-manylinux_2_28_x86_64.whl (91.3 MB): CPython 3.11, manylinux: glibc 2.28+ x86-64
  • diskannpy-0.5.0rc4-cp310-cp310-win_amd64.whl (4.5 MB): CPython 3.10, Windows x86-64
  • diskannpy-0.5.0rc4-cp310-cp310-manylinux_2_28_x86_64.whl (91.3 MB): CPython 3.10, manylinux: glibc 2.28+ x86-64
  • diskannpy-0.5.0rc4-cp39-cp39-win_amd64.whl (4.5 MB): CPython 3.9, Windows x86-64
  • diskannpy-0.5.0rc4-cp39-cp39-manylinux_2_28_x86_64.whl (91.3 MB): CPython 3.9, manylinux: glibc 2.28+ x86-64
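
The wheel file names above encode the Python version, ABI and platform each build targets. The sketch below splits a file name into those fields and lists the wheels whose Python tag matches the running interpreter. It is illustrative only: `parse_wheel_filename` and `wheels_for_python_tag` are hypothetical helpers, the parsing assumes the standard `name-version[-build]-python-abi-platform.whl` layout, and pip performs its own, more complete compatibility check when installing.

```python
import sys

# File names copied from the list of built distributions above.
WHEELS = [
    "diskannpy-0.5.0rc4-cp311-cp311-win_amd64.whl",
    "diskannpy-0.5.0rc4-cp311-cp311-manylinux_2_28_x86_64.whl",
    "diskannpy-0.5.0rc4-cp310-cp310-win_amd64.whl",
    "diskannpy-0.5.0rc4-cp310-cp310-manylinux_2_28_x86_64.whl",
    "diskannpy-0.5.0rc4-cp39-cp39-win_amd64.whl",
    "diskannpy-0.5.0rc4-cp39-cp39-manylinux_2_28_x86_64.whl",
]


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
    elif len(parts) == 6:
        # The optional build tag sits between version and python tag.
        name, version, _build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return {
        "name": name,
        "version": version,
        "python": python,
        "abi": abi,
        "platform": platform,
    }


def wheels_for_python_tag(filenames, python_tag):
    """Keep only the wheels built for the given Python tag, e.g. 'cp311'."""
    return [f for f in filenames if parse_wheel_filename(f)["python"] == python_tag]


if __name__ == "__main__":
    tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    print(tag, wheels_for_python_tag(WHEELS, tag))
```

On an interpreter outside CPython 3.9 to 3.11 the list comes back empty, which is consistent with the page offering no source distribution to fall back on.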

File hashes

Hashes for diskannpy-0.5.0rc4-cp311-cp311-win_amd64.whl
SHA256 5a161cecdc4a3b3f3023ce966a5ec83e0a110c5a216fbd3f59e4ecf62305e82f
MD5 bf50f8e1b567412e8209c8e87664c327
BLAKE2b-256 b512792e445e767d48cc8180f638647ca4bca30e219ade734d4286b2936e37ad

Hashes for diskannpy-0.5.0rc4-cp311-cp311-manylinux_2_28_x86_64.whl
SHA256 e503be0d75cfad7c541e19141e5b3513d945199508fdea8dc96125645eda3027
MD5 a0c1dde18e04190f0c0459513087b49c
BLAKE2b-256 447d5d388fd00149e2ff9e3bf62bb7967ce2d3b89a956d222299ef0ea3242431

Hashes for diskannpy-0.5.0rc4-cp310-cp310-win_amd64.whl
SHA256 5a6e9d26983a7388fdf296cdf18c4a3855725f75943ba06b875a678330b3a97c
MD5 d6dc9606f2ffcfd15f59ce536b940361
BLAKE2b-256 94dcfeb1a9593d5d109e5cdc0e4a7e62e5eedad4ae2e4a2fc589d9ce7b2f9c1e

Hashes for diskannpy-0.5.0rc4-cp310-cp310-manylinux_2_28_x86_64.whl
SHA256 08390c8f7cc676121b7fd42239e265071ebe4273687316678102696cbb8413cb
MD5 da2046d009a50454f0d272a6c4b51b1c
BLAKE2b-256 5008f9b1a69fd75c68a49ba4d8f9f4f5962c6ca3842d5609b11d03c46fb9b8a3

Hashes for diskannpy-0.5.0rc4-cp39-cp39-win_amd64.whl
SHA256 56a04e81b4411d17d61190748ffc8f4016887330698ef8ad4c9852e6fa0441bc
MD5 8b295ea80bc3b3e63d455f5e8aefaf58
BLAKE2b-256 b0fa8a8d6e6fc3642fa1de59513916075801059b30f9e80388c1a4db37ba8766

Hashes for diskannpy-0.5.0rc4-cp39-cp39-manylinux_2_28_x86_64.whl
SHA256 1629b740b714a755b719b4babdcb5c93a580f22e1a9eb674a2b2a264beb60a76
MD5 ffceb9315b730a30deb937971117c205
BLAKE2b-256 c847c6442bfaa274d6e44a5a3035108c7b0df530a136251cc46e05917389ccec
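
A downloaded wheel can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the standard library; the function names and the local file path are assumptions for illustration, and the expected digest is the one listed for the CPython 3.11 Windows wheel.

```python
import hashlib


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_file(path):
    """Return the hex SHA256 digest of the file at `path`."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the wheel was downloaded.
    wheel_path = "diskannpy-0.5.0rc4-cp311-cp311-win_amd64.whl"
    # SHA256 listed above for this file.
    expected = "5a161cecdc4a3b3f3023ce966a5ec83e0a110c5a216fbd3f59e4ecf62305e82f"
    try:
        actual = sha256_of_file(wheel_path)
    except FileNotFoundError:
        print("wheel not found at", wheel_path)
    else:
        print("hash OK" if actual == expected else "hash MISMATCH")
```

Reading in chunks keeps memory use flat, which is relevant for the 91.3 MB manylinux wheels.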
