LangServe 🦜️🔗

Overview

LangServe helps developers deploy LangChain runnables and chains as a REST API.

This library is integrated with FastAPI and uses pydantic for data validation.

In addition, it provides a client that can be used to call runnables deployed on a server. A JavaScript client is available in LangChainJS.

Features

  • Input and Output schemas automatically inferred from your LangChain object, and enforced on every API call, with rich error messages
  • API docs page with JSONSchema and Swagger
  • Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
  • /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
  • Built-in (optional) tracing to LangSmith: just add your API key (see Instructions)
  • All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio
  • Use the client SDK to call a LangServe server as if it were a Runnable running locally (or call the HTTP API directly)

Limitations

  • Client callbacks are not yet supported for events that originate on the server
  • Does not work with pydantic v2 yet

LangChain CLI 🛠️

Use the LangChain CLI to bootstrap a LangServe project quickly.

To use the LangChain CLI, make sure you have a recent version of langchain installed, along with typer (pip install langchain typer or pip install "langchain[cli]").

langchain ../path/to/directory

Then follow the instructions.

Examples

For more examples, see the examples directory.

Server

Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain that uses the Anthropic model to tell a joke about a topic.

#!/usr/bin/env python
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes


app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple api server using Langchain's Runnable interfaces",
)

add_routes(
    app,
    ChatOpenAI(),
    path="/openai",
)

add_routes(
    app,
    ChatAnthropic(),
    path="/anthropic",
)

model = ChatAnthropic()
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
    app,
    prompt | model,
    path="/chain",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)

Docs

If you've deployed the server above, you can view the generated OpenAPI docs using:

curl localhost:8000/docs
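
The /docs page is an interactive Swagger UI, so it is easiest to read in a browser. FastAPI also serves the raw OpenAPI document as JSON, by default at /openapi.json. As a rough sketch, assuming the server above is running on localhost:8000 and that default has not been changed, the registered routes could be listed with the standard library alone. The helper names list_operations and fetch_openapi are illustrative and not part of LangServe.

```python
import json
from urllib.request import urlopen


def list_operations(schema):
    """Return sorted (METHOD, path) pairs found in an OpenAPI document."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method in http_methods:
                operations.append((method.upper(), path))
    return sorted(operations)


def fetch_openapi(base_url="http://localhost:8000"):
    """Download the OpenAPI JSON (assumes FastAPI's default /openapi.json URL)."""
    with urlopen(f"{base_url}/openapi.json") as response:
        return json.load(response)


# Example (needs the server above to be running):
# for method, path in list_operations(fetch_openapi()):
#     print(method, path)
```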

Client

Python SDK

from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/chain/")

joke_chain.invoke({"topic": "parrots"})

# or async (run this inside an async function, or in a notebook that allows top-level await)
await joke_chain.ainvoke({"topic": "parrots"})

prompt = [
    SystemMessage(content='Act like either a cat or a parrot.'), 
    HumanMessage(content='Hello!')
]

# Supports astream (this also needs an async context)
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)
    
prompt = ChatPromptTemplate.from_messages(
    [("system", "Tell me a long story about {topic}")]
)
    
# Can define custom chains
chain = prompt | RunnableMap({
    "openai": openai,
    "anthropic": anthropic,
})

chain.batch([{ "topic": "parrots" }, { "topic": "cats" }])

In TypeScript (requires LangChain.js version 0.0.166 or later):

import { RemoteRunnable } from "langchain/runnables/remote";

const chain = new RemoteRunnable({ url: `http://localhost:8000/chain/invoke/` });
const result = await chain.invoke({
  "topic": "cats", 
});

Python using requests:

import requests
response = requests.post(
    "http://localhost:8000/chain/invoke/",
    json={'input': {'topic': 'cats'}}
)
response.json()

You can also use curl:

curl --location --request POST 'http://localhost:8000/chain/invoke/' \
    --header 'Content-Type: application/json' \
    --data-raw '{
        "input": {
            "topic": "cats"
        }
    }'
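
The same pattern extends to the other POST endpoints. The sketch below builds the request bodies with the standard library only. The "input" wrapper for /invoke comes from the examples above; the "inputs" list for /batch is an assumption about the request shape, so check it against the generated docs page of your server before relying on it. The helper names are illustrative.

```python
import json
from urllib.request import Request, urlopen


def invoke_body(single_input):
    """Body for POST <path>/invoke: the input goes under the "input" key."""
    return {"input": single_input}


def batch_body(inputs):
    """Body for POST <path>/batch (assumed shape: a list under "inputs")."""
    return {"inputs": list(inputs)}


def post_json(url, body):
    """POST a JSON body and decode the JSON response."""
    request = Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)


# Example (needs the server above to be running):
# post_json(
#     "http://localhost:8000/chain/batch",
#     batch_body([{"topic": "cats"}, {"topic": "parrots"}]),
# )
```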

Endpoints

The following code:

...
add_routes(
  app,
  runnable,
  path="/my_runnable",
)

adds the following endpoints to the server:

  • POST /my_runnable/invoke - invoke the runnable on a single input
  • POST /my_runnable/batch - invoke the runnable on a batch of inputs
  • POST /my_runnable/stream - invoke on a single input and stream the output
  • POST /my_runnable/stream_log - invoke on a single input and stream the output, including output of intermediate steps as it's generated
  • GET /my_runnable/input_schema - json schema for input to the runnable
  • GET /my_runnable/output_schema - json schema for output of the runnable
  • GET /my_runnable/config_schema - json schema for config of the runnable
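
As a small illustration of the list above, the helper below derives the seven routes for any path passed to add_routes, and fetches one of the schema endpoints so a client can inspect what a runnable expects before calling it. This is a sketch, not part of LangServe: routes_for and fetch_schema are hypothetical names, and the base URL assumes the example server on localhost:8000.

```python
import json
from urllib.request import urlopen

# (HTTP method, endpoint name) pairs, mirroring the list above.
ENDPOINTS = [
    ("POST", "invoke"),
    ("POST", "batch"),
    ("POST", "stream"),
    ("POST", "stream_log"),
    ("GET", "input_schema"),
    ("GET", "output_schema"),
    ("GET", "config_schema"),
]


def routes_for(path):
    """List the routes created for a given add_routes path."""
    prefix = "/" + path.strip("/")
    return [f"{method} {prefix}/{name}" for method, name in ENDPOINTS]


def fetch_schema(path, kind="input_schema", base_url="http://localhost:8000"):
    """GET one of the *_schema endpoints and decode the JSON schema."""
    url = f"{base_url}/{path.strip('/')}/{kind}"
    with urlopen(url) as response:
        return json.load(response)


# Example (needs a running server):
# print(fetch_schema("/chain"))
```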

Installation

For both client and server:

pip install "langserve[all]"

Alternatively, use pip install "langserve[client]" for client code, or pip install "langserve[server]" for server code.

Legacy Chains

LangServe works with both Runnables (constructed via LangChain Expression Language) and legacy chains (inheriting from Chain). However, some of the input schemas for legacy chains may be incomplete/incorrect, leading to errors. This can be fixed by updating the input_schema property of those chains in LangChain. If you encounter any errors, please open an issue on THIS repo, and we will work to address it.

Handling Authentication

If you need to add authentication to your server, please refer to FastAPI's security documentation and middleware documentation.
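
One common FastAPI approach is an app-level dependency that checks a header on every request. The following is a minimal sketch under several assumptions, not a recommended production setup: the X-API-Key header, the LANGSERVE_DEMO_API_KEY environment variable, and the function names are all made up for illustration.

```python
import hmac
import os
from typing import Optional


def is_valid_key(provided: Optional[str], expected: Optional[str]) -> bool:
    """Compare keys in constant time; a missing key on either side is rejected."""
    if not provided or not expected:
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


def create_app():
    """Build a FastAPI app whose routes all require an X-API-Key header."""
    # Imported here so is_valid_key can be exercised without FastAPI installed.
    from fastapi import Depends, FastAPI, Header, HTTPException

    # Hypothetical variable name; use whatever secret store you prefer.
    expected = os.environ.get("LANGSERVE_DEMO_API_KEY")

    async def require_api_key(x_api_key: Optional[str] = Header(default=None)):
        # FastAPI maps the x_api_key parameter to the X-API-Key request header.
        if not is_valid_key(x_api_key, expected):
            raise HTTPException(status_code=401, detail="Invalid or missing API key")

    # App-level dependencies run for every route registered on the app.
    return FastAPI(dependencies=[Depends(require_api_key)])
```

Routes registered afterwards with add_routes(app, ...) should then require the header as well, though it is worth confirming this against the FastAPI and LangServe versions you have installed.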

Deployment

Deploy to GCP

You can deploy to GCP Cloud Run using the following command:

gcloud run deploy [your-service-name] --source . --port 8001 --allow-unauthenticated --region us-central1 --set-env-vars=OPENAI_API_KEY=your_key
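
Cloud Run passes the port the container should listen on through the PORT environment variable, and the container has to accept connections on all interfaces rather than only on localhost. The server example above hard-codes localhost:8000, so a sketch along the following lines could be used to choose the bind address instead. The bind_address helper is hypothetical, and falling back to localhost:8000 when PORT is unset is just a convenience for local runs.

```python
import os


def bind_address(environ=None):
    """Pick (host, port): PORT from the environment if set, else local defaults."""
    env = os.environ if environ is None else environ
    if "PORT" in env:
        # Running in a container platform such as Cloud Run: listen on all interfaces.
        return "0.0.0.0", int(env["PORT"])
    # Local development: same address as the server example above.
    return "localhost", 8000


# Example, replacing the last line of the server script:
# host, port = bind_address()
# uvicorn.run(app, host=host, port=port)
```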
