LLM unified service

Modelz LLM

Modelz LLM is an inference server that makes it easier to run open source large language models (LLMs), such as FastChat, LLaMA, and ChatGLM, in local or cloud-based environments through an OpenAI-compatible API.

Features

  • OpenAI-compatible API: Modelz LLM exposes an OpenAI-compatible API for LLMs, so you can use the OpenAI Python SDK to interact with the model.
  • Self-hosted: Modelz LLM can be deployed in local or cloud-based environments.
  • Open source LLMs: Modelz LLM supports open source LLMs such as FastChat, LLaMA, and ChatGLM.
  • Cloud native: Docker images are provided for different LLMs and can be deployed on Kubernetes or other cloud-based environments (e.g. Modelz).

Quick Start

Install

pip install "modelz-llm[gpu]"
# or install from source
pip install "modelz-llm[gpu] @ git+https://github.com/tensorchord/modelz-llm.git"

Run the self-hosted API server

First, start the self-hosted API server, passing the Hugging Face model name with -m:

modelz-llm -m "THUDM/chatglm-6b-int4"
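Loading a model can take a while, so a client may want to wait until the server answers before sending requests. The following is a minimal sketch, not part of Modelz LLM itself. It assumes the server listens on http://localhost:8000 (the address used in the SDK example in this README) and that the "/" ping route answers a GET with a 2xx status once the server is up. The helper name wait_until_ready is hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url="http://localhost:8000", timeout=60.0, interval=1.0):
    """Poll GET <base_url>/ until it answers with a 2xx status or the timeout expires.

    Returns True if the server answered in time, False otherwise.
    Assumption: the "/" route responds to GET with a 2xx status when the server is up.
    """
    deadline = time.monotonic() + timeout
    url = base_url.rstrip("/") + "/"
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            # Server is not accepting connections yet, or it returned an error status.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_until_ready() right after launching modelz-llm in another process gives a simple gate before the first API call. Adjust the timeout to the size of the model being loaded.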

Currently, we support the following models:

| Model Name           | Hugging Face Model              | Docker Image                 |
|----------------------|---------------------------------|------------------------------|
| FastChat T5          | lmsys/fastchat-t5-3b-v1.0       | modelzai/llm-fastchat-t5-3b  |
| Vicuna 7B Delta V1.1 | lmsys/vicuna-7b-delta-v1.1      | modelzai/llm-vicuna-7b       |
| LLaMA 7B             | decapoda-research/llama-7b-hf   | modelzai/llm-llama-7b        |
| ChatGLM 6B INT4      | THUDM/chatglm-6b-int4           | modelzai/llm-chatglm-6b-int4 |
| ChatGLM 6B           | THUDM/chatglm-6b                | modelzai/llm-chatglm-6b      |
| Bloomz 560M          | bigscience/bloomz-560m          | -                            |
| Bloomz 1.7B          | bigscience/bloomz-1b7           | -                            |
| Bloomz 3B            | bigscience/bloomz-3b            | -                            |
| Bloomz 7.1B          | bigscience/bloomz-7b1           | -                            |

Use the OpenAI Python SDK

Once the server is running, you can use the OpenAI Python SDK to interact with the model:

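import openai

# Written for the pre-1.0 openai package (openai.ChatCompletion was removed in 1.0).
# Point the SDK at the local server; the example uses a placeholder API key.
openai.api_base = "http://localhost:8000"
openai.api_key = "any"

# Create a chat completion.
chat_completion = openai.ChatCompletion.create(
    model="any",
    messages=[{"role": "user", "content": "Hello world"}],
)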
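To read the model's answer out of the result, a small helper like the one below may be convenient. It is only a sketch: it assumes the server returns the standard OpenAI chat completion shape (a "choices" list whose items carry a "message" with "content"), which the README implies but does not spell out. The name first_reply is hypothetical.

```python
def first_reply(chat_completion):
    """Return the assistant text from the first choice of a chat completion.

    Assumption: the response follows the OpenAI chat completion layout,
    i.e. response["choices"][0]["message"]["content"].
    """
    choices = chat_completion["choices"]
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]
```

With the SDK call above, print(first_reply(chat_completion)) would print the generated text. The helper uses dictionary-style access, which works for plain dicts and for the response objects of the pre-1.0 SDK.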

Supported APIs

    app.add_route("/", Ping())
    app.add_route("/completions", completion)
    app.add_route("/chat/completions", chat_completion)
    app.add_route("/embeddings", embeddings)
    app.add_route("/engines/{engine}/embeddings", embeddings)
    app.add_route("/v1/completions", completion)
    app.add_route("/v1/chat/completions", chat_completion)
    app.add_route("/v1/embeddings", embeddings)
    app.add_route("/v1/engines/{engine}/embeddings", embeddings)
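The routes above can also be called without the OpenAI SDK. The sketch below builds plain JSON POST requests with the Python standard library. It assumes the server listens on http://localhost:8000 (as in the SDK example) and that each route accepts the same JSON payload fields as the corresponding OpenAI endpoint; neither is confirmed here beyond the route names. The helpers build_request and post_json are hypothetical names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: same address as in the SDK example


def build_request(path, payload, base_url=BASE_URL):
    """Build a JSON POST request for one of the routes listed above."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, base_url=BASE_URL):
    """Send the request and decode the JSON response body (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Payload fields follow the OpenAI API (assumption); "any" mirrors the SDK example.
chat_req = build_request(
    "/v1/chat/completions",
    {"model": "any", "messages": [{"role": "user", "content": "Hello world"}]},
)
embed_req = build_request("/v1/embeddings", {"model": "any", "input": "Hello world"})
```

With a server running, post_json("/v1/embeddings", {"model": "any", "input": "Hello world"}) would send the embeddings request and return the decoded response. Building the request separately from sending it keeps the payloads easy to inspect and test offline.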
