LangServe 🦜️🏓
Overview
LangServe helps developers deploy LangChain runnables and chains as a REST API.
This library is integrated with FastAPI and uses pydantic for data validation.
In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChainJS.
Features
- Input and Output schemas automatically inferred from your LangChain object, and enforced on every API call, with rich error messages
- API docs page with JSONSchema and Swagger (insert example link)
- Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
- /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
- Playground page at /playground with streaming output and intermediate steps
- Built-in (optional) tracing to LangSmith, just add your API key (see Instructions)
- All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio.
- Use the client SDK to call a LangServe server as if it was a Runnable running locally (or call the HTTP API directly)
Limitations
- Client callbacks are not yet supported for events that originate on the server
- Does not work with pydantic v2 yet
Security
- Vulnerability in Versions 0.0.13 - 0.0.15 -- playground endpoint allows accessing arbitrary files on server. Resolved in 0.0.16.
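If you are running an affected version, upgrading past 0.0.15 resolves the issue, for example:

pip install --upgrade "langserve>=0.0.16"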
LangChain CLI 🛠️
Use the LangChain CLI to bootstrap a LangServe project quickly.
To use the langchain CLI, make sure that you have a recent version of langchain installed, as well as typer (pip install langchain typer or pip install "langchain[cli]").
langchain ../path/to/directory
And follow the instructions...
Examples
For more examples, see the examples directory.
Server
Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain that uses the Anthropic model to tell a joke about a topic.
#!/usr/bin/env python
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes
app = FastAPI(
    title="LangChain Server",
    version="1.0",
    description="A simple api server using Langchain's Runnable interfaces",
)

add_routes(
    app,
    ChatOpenAI(),
    path="/openai",
)

add_routes(
    app,
    ChatAnthropic(),
    path="/anthropic",
)

model = ChatAnthropic()
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
    app,
    prompt | model,
    path="/chain",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
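To try it out, save the example to a file and run it with Python (the filename server.py below is just an illustrative choice); the app then listens on http://localhost:8000:

python server.py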
Docs
If you've deployed the server above, you can view the generated OpenAPI docs using:

curl localhost:8000/docs

Make sure to add the /docs suffix. The root URL below will return a 404 until you define a route with @app.get("/"):

localhost:8000
Client
Python SDK
from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable
openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/chain/")
joke_chain.invoke({"topic": "parrots"})
# or async
await joke_chain.ainvoke({"topic": "parrots"})
prompt = [
    SystemMessage(content='Act like either a cat or a parrot.'),
    HumanMessage(content='Hello!')
]

# Supports astream
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)

prompt = ChatPromptTemplate.from_messages(
    [("system", "Tell me a long story about {topic}")]
)

# Can define custom chains
chain = prompt | RunnableMap({
    "openai": openai,
    "anthropic": anthropic,
})

chain.batch([{ "topic": "parrots" }, { "topic": "cats" }])
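The /stream_log endpoint (see Endpoints below) can also be consumed through the client SDK. The following is a minimal sketch, assuming your installed langchain version exposes Runnable.astream_log:

# Sketch: stream intermediate steps from the remote chain, assuming
# Runnable.astream_log is available in your langchain version.
async for chunk in joke_chain.astream_log({"topic": "parrots"}):
    print(chunk)  # each chunk is a patch describing updates to the run log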
In TypeScript (requires LangChain.js version 0.0.166 or later):
import { RemoteRunnable } from "langchain/runnables/remote";

const chain = new RemoteRunnable({
  url: `http://localhost:8000/chain/`,
});
const result = await chain.invoke({
  topic: "cats",
});
Python, using requests:
import requests

response = requests.post(
    "http://localhost:8000/chain/invoke/",
    json={'input': {'topic': 'cats'}}
)
response.json()
You can also use curl:
curl --location --request POST 'http://localhost:8000/chain/invoke/' \
--header 'Content-Type: application/json' \
--data-raw '{
"input": {
"topic": "cats"
}
}'
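The /batch endpoint works the same way over HTTP. The sketch below assumes the request body carries the list of inputs under an "inputs" key; check the generated /docs page for the exact request schema of your deployment:

curl --location --request POST 'http://localhost:8000/chain/batch/' \
--header 'Content-Type: application/json' \
--data-raw '{
    "inputs": [
        {"topic": "cats"},
        {"topic": "parrots"}
    ]
}'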
Endpoints
The following code:
...
add_routes(
    app,
    runnable,
    path="/my_runnable",
)
adds these endpoints to the server:

- POST /my_runnable/invoke - invoke the runnable on a single input
- POST /my_runnable/batch - invoke the runnable on a batch of inputs
- POST /my_runnable/stream - invoke on a single input and stream the output
- POST /my_runnable/stream_log - invoke on a single input and stream the output, including output of intermediate steps as it's generated
- GET /my_runnable/input_schema - json schema for input to the runnable
- GET /my_runnable/output_schema - json schema for output of the runnable
- GET /my_runnable/config_schema - json schema for config of the runnable
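For example, to inspect the expected input for the runnable served above, you can fetch its JSON schema with a plain GET request:

curl http://localhost:8000/my_runnable/input_schema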
Playground
You can find a playground page for your runnable at /my_runnable/playground
. This exposes a simple UI to configure and invoke your runnable with streaming output and intermediate steps.
Installation
For both client and server:
pip install "langserve[all]"

or pip install "langserve[client]" for client code, and pip install "langserve[server]" for server code.
Legacy Chains
LangServe works with both Runnables (constructed via LangChain Expression Language) and legacy chains (inheriting from Chain).
However, some of the input schemas for legacy chains may be incomplete/incorrect, leading to errors.
This can be fixed by updating the input_schema property of those chains in LangChain.
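Depending on your langchain version, you may also be able to override the inferred schema yourself before serving the chain. The following is a rough sketch, assuming Runnable.with_types is available in your langchain version; MyInput and legacy_chain are illustrative names, not part of LangServe:

from pydantic import BaseModel

class MyInput(BaseModel):
    question: str

# Hypothetical workaround: attach an explicit input type to the chain
# before passing it to add_routes, if with_types is available.
add_routes(app, legacy_chain.with_types(input_type=MyInput), path="/legacy")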
If you encounter any errors, please open an issue on THIS repo, and we will work to address it.
Handling Authentication
If you need to add authentication to your server, please reference FastAPI's security documentation and middleware documentation.
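As a concrete starting point, FastAPI's dependency system can be applied globally to the app that hosts your routes. The sketch below is not part of LangServe itself; the header name and the hard-coded key are placeholder assumptions you would replace with your own verification logic:

from fastapi import Depends, FastAPI, Header, HTTPException
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes


async def verify_api_key(x_api_key: str = Header(...)) -> None:
    # Placeholder check; swap in a real lookup against your key store.
    if x_api_key != "my-secret-key":
        raise HTTPException(status_code=401, detail="Invalid API key")


# Applying the dependency at the app level protects every added route.
app = FastAPI(dependencies=[Depends(verify_api_key)])

add_routes(app, RunnableLambda(lambda x: x), path="/echo")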
Deployment
Deploy to GCP
You can deploy to GCP Cloud Run using the following command:
gcloud run deploy [your-service-name] --source . --port 8001 --allow-unauthenticated --region us-central1 --set-env-vars=OPENAI_API_KEY=your_key
Advanced
Files
LLM applications often deal with files. There are different architectures that can be used to implement file processing; at a high level:
- The file may be uploaded to the server via a dedicated endpoint and processed using a separate endpoint
- The file may be uploaded by either value (bytes of file) or reference (e.g., s3 url to file content)
- The processing endpoint may be blocking or non-blocking
- If significant processing is required, the processing may be offloaded to a dedicated process pool
You should determine what is the appropriate architecture for your application.
Currently, to upload files by value to a runnable, use base64 encoding for the file (multipart/form-data is not supported yet).
Here's an example that shows how to use base64 encoding to send a file to a remote runnable.
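For illustration, a client-side sketch might look like the following; the /process_file path and the "file" key are assumptions about a runnable you would have defined yourself on the server:

import base64

from langserve import RemoteRunnable

# Hypothetical runnable served at /process_file that expects a
# base64-encoded file under the "file" key.
process_file = RemoteRunnable("http://localhost:8000/process_file/")

with open("example.pdf", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# The server-side runnable is responsible for decoding the base64 payload.
result = process_file.invoke({"file": encoded})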
Remember, you can always upload files by reference (e.g., s3 url) or upload them as multipart/form-data to a dedicated endpoint.
Custom User Types
Inherit from CustomUserType if you want the data to deserialize into a pydantic model rather than the equivalent dict representation.

At the moment, this type only works server side and is used to specify desired decoding behavior. If you inherit from this type, the server will keep the decoded type as a pydantic model instead of converting it into a dict.
from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda
from langserve import add_routes
from langserve.schema import CustomUserType

app = FastAPI()

class Foo(CustomUserType):
    bar: int

def func(foo: Foo) -> int:
    """Sample function that expects a Foo type which is a pydantic model"""
    assert isinstance(foo, Foo)
    return foo.bar

add_routes(app, RunnableLambda(func), path="/foo")
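The client still sends plain JSON. For example, something like the following call (a sketch using the RemoteRunnable client shown earlier) should return 3, with the server decoding the payload into a Foo instance before calling func:

from langserve import RemoteRunnable

foo_client = RemoteRunnable("http://localhost:8000/foo/")
result = foo_client.invoke({"bar": 3})  # decoded server-side into Foo(bar=3)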
File details
Details for the file langserve-0.0.18.tar.gz.
File metadata
- Download URL: langserve-0.0.18.tar.gz
- Upload date:
- Size: 480.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | ecc9082add84cf9b3e20f744fca16f46f653c5680f01ee2543b8658451a04fd2
MD5 | 6ee1cbdac16c266ee4ace54c78799f5f
BLAKE2b-256 | f609711ea128b7042b6c03f2db7addda41a8fb3cecff0358fb514e94a85c8945
File details
Details for the file langserve-0.0.18-py3-none-any.whl.
File metadata
- Download URL: langserve-0.0.18-py3-none-any.whl
- Upload date:
- Size: 480.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | e84bab70613e102d44799acd32ea4f4510174d5efb1b1ed412659d5183126bbe
MD5 | ced3eff156106f3b3e0c8088f7ce9938
BLAKE2b-256 | b0e8cc34786621f9d1405ae0c4ff72e697b5fe6f7432690a2a8ff626906e9874