naan
What is Naan?
- Naan is a wrapper around FAISS indexes that provides metadata storage and retrieval for the vectors added to the index.
- Naan's job is to eliminate the tedious task of keeping track of the original content after it has been encoded and added to the index.
- Naan is NOT a vector database. All vector-search operations are delegated to FAISS.
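To make that bookkeeping concrete, here is a minimal sketch of what keeping the original content next to an index involves when done by hand. It is illustrative only: `ContentStore` and its methods are hypothetical names, it uses plain Python containers instead of FAISS, and it is not how Naan is implemented. It assumes vectors receive sequential ids in insertion order, as flat FAISS indexes do.

```python
from typing import Dict, List, Sequence


class ContentStore:
    """Hypothetical helper: remembers which text produced which vector id."""

    def __init__(self) -> None:
        self._next_id = 0
        self._content: Dict[int, str] = {}

    def add(self, vectors: Sequence[Sequence[float]], texts: Sequence[str]) -> List[int]:
        # Every vector must come with the text it was encoded from
        if len(vectors) != len(texts):
            raise ValueError("each vector needs exactly one text")
        ids = []
        for text in texts:
            # Assumption: ids are sequential, in insertion order
            self._content[self._next_id] = text
            ids.append(self._next_id)
            self._next_id += 1
        return ids

    def lookup(self, ids: Sequence[int]) -> List[str]:
        # Map the ids returned by a vector search back to the original texts
        return [self._content[i] for i in ids]


store = ContentStore()
ids = store.add([[0.0, 1.0], [1.0, 0.0]], ["first sentence", "second sentence"])
print(store.lookup([1, 0]))
```

Keeping such a side table in sync with the index, and persisting it, is the part a wrapper like Naan is meant to take off your hands.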
Installation
pip install naan
Index data
To see Naan in action, let's first get some data to embed:
import requests
# Download the example sentences used in the rest of this walkthrough
res = requests.get("https://raw.githubusercontent.com/masci/naan/main/example/sentences.json")
res.raise_for_status()
sentences = res.json()
Naan tries not to get in the way of how you manage your FAISS index, so the first step is always setting up the FAISS side of things:
from sentence_transformers import SentenceTransformer
import faiss
model = SentenceTransformer("bert-base-nli-mean-tokens")
sentence_embeddings = model.encode(sentences)
dim = sentence_embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
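For intuition about what the flat index does at query time: `faiss.IndexFlatL2` performs an exact, brute-force search and ranks the stored vectors by squared Euclidean distance to the query. The pure-Python sketch below mimics that ranking on toy data; `flat_l2_search` is a hypothetical helper written for this illustration, not a FAISS or Naan function.

```python
from typing import List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    # Squared Euclidean distance between two vectors of equal length
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(
    vectors: Sequence[Sequence[float]], query: Sequence[float], k: int
) -> Tuple[List[float], List[int]]:
    # Score every stored vector against the query, then keep the k closest
    scored = sorted((squared_l2(v, query), i) for i, v in enumerate(vectors))
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
distances, ids = flat_l2_search(vectors, [2.0, 2.0], k=2)
print(distances, ids)  # [2.0, 5.0] [1, 2]
```

The search yields only distances and ids; the text behind each id has to come from somewhere else.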
Now it's time to wrap the FAISS index with Naan and use it to index data:
from naan import NaanDB
# Create a Naan database from scratch
db = NaanDB("db.naan", index, force_recreate=True)
db.add(sentence_embeddings, sentences)
Naan will add the vector embeddings to the FAISS index, and will also store the original sentences. This way, a vector search will look like this:
# Reopen an existing Naan database
db = NaanDB("db.naan")
query_embeddings = model.encode(["The book is on the table"])
# Naan's search API is the same as FAISS's; let's get the 3 closest vectors
results = db.search(query_embeddings, 3)
for result in results:
    print(result)
# Document(vector_id=11451, content='A group of people sitting around a desk.', embeddings=None)
# Document(vector_id=2754, content='A close-up picture of a desk with a computer and papers on it.', embeddings=None)
# Document(vector_id=11853, content='A computer on a desk.', embeddings=None)
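For comparison, a plain FAISS search returns only arrays of distances and vector ids, so the caller has to join the ids back to the stored text. The sketch below shows that join on toy data. The `Document` dataclass here is a stand-in that mirrors the fields printed above, `to_documents` is a hypothetical helper, and the sentences in `contents` are made up for the example; none of it is taken from Naan's source.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class Document:
    # Stand-in with the same fields as the output above; not Naan's actual class
    vector_id: int
    content: str
    embeddings: Optional[List[float]] = None


def to_documents(ids: Sequence[int], contents: Dict[int, str]) -> List[Document]:
    # FAISS pads the id array with -1 when fewer than k neighbors are found; skip those
    return [Document(vector_id=i, content=contents[i]) for i in ids if i != -1]


# Toy id -> text table standing in for the stored sentences
contents = {0: "A computer on a desk.", 1: "A dog in a park."}
for doc in to_documents([0, -1], contents):
    print(doc)
```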
License
naan is distributed under the terms of the MIT license.