MicroLlama

The smallest possible LLM API. Build a question-and-answer interface to your own content in a few minutes. Uses OpenAI embeddings, gpt-3.5 and FAISS, via LangChain.

Usage

  1. Combine your source documents into a single JSON file called source.json. It should look like this:
[
    {
        "source": "Reference to the source of your content. This could be a URL or a title or a filename",
        "content": "Your content as a single string. If there's a title or summary, put these first, separated by new lines."
    }, 
    ...
]

See example.source.json for an example.
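Before indexing, it can be worth checking that your source.json matches the shape above. This is an illustrative sketch, not part of MicroLlama itself; the filename and the "source"/"content" keys come from the format described above, while the validation helper is an assumption:

```python
import json

def validate_sources(path="source.json"):
    """Check that the file is a JSON array of {"source", "content"} objects."""
    with open(path) as f:
        docs = json.load(f)
    if not isinstance(docs, list):
        raise ValueError("source.json must be a JSON array")
    for i, doc in enumerate(docs):
        for key in ("source", "content"):
            if not isinstance(doc.get(key), str):
                raise ValueError(f"document {i} is missing string field '{key}'")
    return docs
```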

  2. Install dependencies:
pip install langchain faiss-cpu openai fastapi "uvicorn[standard]"
  3. Get an OpenAI API key and add it to the environment, e.g. export OPENAI_API_KEY=sk-etc. Note that indexing and querying require OpenAI credits, which aren't free.

  4. Run your server with uvicorn serve:app. If the search index doesn't exist, it'll be created and stored.

  5. Query your documents at /api/ask?your question or use the simple front-end at /.
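The steps above can be exercised from Python as well. This sketch assumes the server is running locally on uvicorn's default port (8000) and that the question is passed directly as the query string, as in the URL above; the helper names are illustrative:

```python
from urllib.parse import quote
from urllib.request import urlopen

def ask_url(question, base_url="http://127.0.0.1:8000"):
    # The question is the whole query string, so URL-encode it
    # rather than passing it as a named parameter.
    return f"{base_url}/api/ask?{quote(question)}"

def ask(question, base_url="http://127.0.0.1:8000"):
    """Send the question to a running MicroLlama server and return the raw body."""
    with urlopen(ask_url(question, base_url)) as resp:
        return resp.read().decode()
```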

Deploying your API

On Fly.io

Sign up for a Fly.io account and install flyctl. Then:

fly launch # answer no to Postgres, Redis and deploying now 
fly secrets set OPENAI_API_KEY=sk-etc 
fly deploy

On Google Cloud Run

gcloud run deploy --source . --set-env-vars="OPENAI_API_KEY=sk-etc"

For Cloud Run and other serverless platforms you should probably generate the FAISS index at container build time, to reduce cold starts. See the two commented lines in Dockerfile.


TODO

  • Use a text splitter that generates more meaningful fragments, e.g. text_splitter = SpacyTextSplitter(chunk_size=700, chunk_overlap=200, separator=" ")
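To illustrate what chunk_size and chunk_overlap mean in that TODO, here is a plain character-based splitter with the same parameters. It is only a sketch: SpacyTextSplitter segments on sentence boundaries via spaCy, whereas this approximation splits on a separator and may cut the overlap mid-word:

```python
def chunk_text(text, chunk_size=700, chunk_overlap=200, separator=" "):
    """Split text into chunks of at most chunk_size characters,
    carrying chunk_overlap characters of context into the next chunk."""
    chunks, current = [], ""
    for word in text.split(separator):
        candidate = (current + separator + word) if current else word
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            # Start the next chunk from the overlap tail of this one.
            current = current[-chunk_overlap:] + separator + word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```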
