LLM unified service

Modelz LLM

Modelz LLM is an inference server that makes it easy to run open source large language models (LLMs), such as FastChat, LLaMA, and ChatGLM, in local or cloud-based environments behind an OpenAI-compatible API.

Features

  • OpenAI-compatible API: Modelz LLM exposes an OpenAI-compatible API for LLMs, so you can use the OpenAI Python SDK to interact with the model.
  • Self-hosted: Modelz LLM can be deployed in local or cloud-based environments.
  • Open source LLMs: Modelz LLM supports open source LLMs, such as FastChat, LLaMA, and ChatGLM.
  • Cloud native: We provide Docker images for different LLMs, which can be deployed on Kubernetes or other cloud-based environments (e.g. Modelz).

Quick Start

Install

pip install "modelz-llm[gpu]"
# or install from source
pip install "modelz-llm[gpu] @ git+https://github.com/tensorchord/modelz-llm.git"

Run the self-hosted API server

First, start the self-hosted API server, passing the model to serve with -m:

modelz-llm -m "THUDM/chatglm-6b-int4"

Currently, we support the following models:

Model Name            Hugging Face Model              Docker Image
FastChat T5           lmsys/fastchat-t5-3b-v1.0       modelzai/llm-fastchat-t5-3b
Vicuna 7B Delta V1.1  lmsys/vicuna-7b-delta-v1.1      modelzai/llm-vicuna-7b
LLaMA 7B              decapoda-research/llama-7b-hf   modelzai/llm-llama-7b
ChatGLM 6B INT4       THUDM/chatglm-6b-int4           modelzai/llm-chatglm-6b-int4
ChatGLM 6B            THUDM/chatglm-6b                modelzai/llm-chatglm-6b
Bloomz 560M           bigscience/bloomz-560m          (none listed)
Bloomz 1.7B           bigscience/bloomz-1b7           (none listed)
Bloomz 3B             bigscience/bloomz-3b            (none listed)
Bloomz 7.1B           bigscience/bloomz-7b1           (none listed)
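
For scripted launches, the model list above can be mirrored in a small lookup. This is a minimal sketch that assumes only the `modelz-llm -m <model>` invocation shown earlier; the `SUPPORTED_MODELS` dict and the `launch_command` helper are hypothetical names introduced for illustration and are not part of the package.

```python
import shlex

# Hugging Face model IDs copied from the supported-models list, keyed by display name.
SUPPORTED_MODELS = {
    "FastChat T5": "lmsys/fastchat-t5-3b-v1.0",
    "Vicuna 7B Delta V1.1": "lmsys/vicuna-7b-delta-v1.1",
    "LLaMA 7B": "decapoda-research/llama-7b-hf",
    "ChatGLM 6B INT4": "THUDM/chatglm-6b-int4",
    "ChatGLM 6B": "THUDM/chatglm-6b",
    "Bloomz 560M": "bigscience/bloomz-560m",
    "Bloomz 1.7B": "bigscience/bloomz-1b7",
    "Bloomz 3B": "bigscience/bloomz-3b",
    "Bloomz 7.1B": "bigscience/bloomz-7b1",
}


def launch_command(model_name: str) -> list[str]:
    """Return the argv list that starts the server with a supported model."""
    try:
        hf_model = SUPPORTED_MODELS[model_name]
    except KeyError:
        raise ValueError(f"unsupported model: {model_name!r}") from None
    return ["modelz-llm", "-m", hf_model]


# Show the shell command without running it.
print(shlex.join(launch_command("ChatGLM 6B INT4")))
```

The returned list can be passed to `subprocess.run` to start the server from Python; the sketch only prints the command so that it has no side effects.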

Use the OpenAI Python SDK

Once the server is running, you can use the OpenAI Python SDK to interact with the model:

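import openai

# Point the SDK at the local Modelz LLM server instead of the OpenAI service.
# This uses the pre-1.0 openai SDK interface (module-level api_base, ChatCompletion).
openai.api_base = "http://localhost:8000"
openai.api_key = "any"  # placeholder value

# create a chat completion
chat_completion = openai.ChatCompletion.create(
    model="any",
    messages=[{"role": "user", "content": "Hello world"}],
)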
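
The `openai.ChatCompletion` interface used above belongs to the pre-1.0 releases of the OpenAI Python SDK. Where that SDK version is not available, the same request can be sent with the standard library alone. The sketch below assumes the server accepts an OpenAI-style JSON body on `/chat/completions` and answers with an OpenAI-style response (`choices[0].message.content`); neither detail is confirmed here beyond the "OpenAI compatible" claim. `build_chat_request`, `extract_reply`, and `chat` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same address as in the SDK example


def build_chat_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the /chat/completions route."""
    body = {
        "model": "any",  # placeholder, as in the SDK example
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response: dict) -> str:
    """Pull the assistant text out of an OpenAI-style response (assumed shape)."""
    return response["choices"][0]["message"]["content"]


def chat(prompt: str, base_url: str = BASE_URL) -> str:
    """Send the prompt to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        return extract_reply(json.load(resp))

# Usage (requires a running server): print(chat("Hello world"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.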

Supported APIs

    app.add_route("/", Ping())
    app.add_route("/completions", completion)
    app.add_route("/chat/completions", chat_completion)
    app.add_route("/embeddings", embeddings)
    app.add_route("/engines/{engine}/embeddings", embeddings)
    app.add_route("/v1/completions", completion)
    app.add_route("/v1/chat/completions", chat_completion)
    app.add_route("/v1/embeddings", embeddings)
    app.add_route("/v1/engines/{engine}/embeddings", embeddings)
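
The route list above registers each endpoint both with and without the `/v1` prefix, plus a `Ping` handler on `/`. A hedged sketch of how a client might use this: build endpoint URLs under either prefix, and poll `/` until the server answers before sending work. It assumes the ping route returns a 2xx status once the server is ready, which the route list alone does not confirm. `endpoint_url`, `http_ping`, and `wait_until_ready` are hypothetical names, and the `/engines/{engine}/embeddings` routes are left out for brevity.

```python
import time
import urllib.error
import urllib.request

ROUTES = ("completions", "chat/completions", "embeddings")


def endpoint_url(base_url: str, route: str, versioned: bool = False) -> str:
    """Return the URL for a route, optionally under the /v1 prefix."""
    if route not in ROUTES:
        raise ValueError(f"unknown route: {route!r}")
    prefix = "/v1" if versioned else ""
    return f"{base_url.rstrip('/')}{prefix}/{route}"


def http_ping(base_url: str) -> bool:
    """Return True if GET / answers with a 2xx status (assumed readiness signal)."""
    try:
        with urllib.request.urlopen(f"{base_url.rstrip('/')}/", timeout=2) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(base_url: str, ping=http_ping, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call ping up to `attempts` times, sleeping `delay` seconds after each failure."""
    for attempt in range(attempts):
        if ping(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the ping function as a parameter keeps the retry loop testable without a live server; a large model can take a while to load, so `attempts` and `delay` likely need tuning per deployment.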
