Model Serving made Efficient in the Cloud.
Introduction
Mosec is a high-performance and flexible model serving framework for building ML model-enabled backends and microservices. It bridges the gap between the machine learning models you have just trained and an efficient online service API.
- Highly performant: the web layer and task coordination are built with Rust 🦀, which offers high speed and efficient CPU utilization powered by async I/O
- Ease of use: the user interface is purely in Python 🐍, so users can serve their models in an ML framework-agnostic manner using the same code they use for offline testing
- Dynamic batching: aggregate requests from different users for batched inference and distribute results back
- Pipelined stages: spawn multiple processes for pipelined stages to handle CPU/GPU/IO mixed workloads
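To make the dynamic batching idea concrete, here is a minimal pure-Python sketch. It is not Mosec's implementation (the source says the coordination is built in Rust). The names `run_in_batches`, `batch_forward`, `max_batch_size`, and `double_all` are hypothetical and chosen only for illustration. The sketch also ignores timing, i.e. how long to wait for a batch to fill.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_in_batches(
    requests: Sequence[T],
    batch_forward: Callable[[List[T]], List[R]],
    max_batch_size: int,
) -> List[R]:
    """Group pending requests into batches and return one result per request."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    results: List[R] = []
    for start in range(0, len(requests), max_batch_size):
        # Aggregate up to max_batch_size requests into one batch.
        batch = list(requests[start : start + max_batch_size])
        outputs = batch_forward(batch)
        if len(outputs) != len(batch):
            raise RuntimeError("batch_forward must return one output per input")
        # Distribute results back in the original request order.
        results.extend(outputs)
    return results


def double_all(batch: List[float]) -> List[float]:
    # Hypothetical stand-in for a batched model call.
    return [2 * x for x in batch]


if __name__ == "__main__":
    print(run_in_batches([1.0, 2.0, 3.0, 4.0, 5.0], double_all, max_batch_size=2))
```

Each caller still receives exactly one result, while the model sees larger inputs, which is where batched inference usually gains its throughput.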
Installation
Mosec requires Python 3.6 or above. Install the latest PyPI package with:
pip install -U mosec
Usage
Write the server
Import the libraries and set up a basic logger to better observe what happens.
import logging

from mosec import Server, Worker
from mosec.errors import ValidationError

logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
formatter = logging.Formatter(
    "%(asctime)s - %(process)d - %(levelname)s - %(filename)s:%(lineno)s - %(message)s"
)
sh = logging.StreamHandler()
sh.setFormatter(formatter)
logger.addHandler(sh)
Then, we build an API to calculate the exponential with base e for a given number. To achieve that, we simply inherit from the Worker class and override the forward method. Note that the input req is by default a JSON-decoded object, e.g., a dictionary here (because we design it to receive data like {"x": 1}). We also enclose the input parsing part in a try...except... block to reject invalid input (e.g., no key named "x", or a field "x" that cannot be converted to float).
import math


class CalculateExp(Worker):
    def forward(self, req: dict) -> dict:
        try:
            x = float(req["x"])
        except KeyError:
            raise ValidationError("cannot find key 'x'")
        except ValueError:
            raise ValidationError("cannot convert 'x' value to float")
        y = math.exp(x)  # f(x) = e ^ x
        logger.debug(f"e ^ {x} = {y}")
        return {"y": y}
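To see which inputs this parsing accepts and rejects, the same logic can be tried offline without starting a server. The sketch below is an illustration only: it uses a local stand-in `ValidationError` class so that it runs without Mosec installed, and the helper names `calculate_exp` and `describe` are hypothetical.

```python
import math


class ValidationError(Exception):
    """Stand-in for mosec.errors.ValidationError so the sketch runs without Mosec."""


def calculate_exp(req: dict) -> dict:
    # Same parsing and computation as CalculateExp.forward above.
    try:
        x = float(req["x"])
    except KeyError:
        raise ValidationError("cannot find key 'x'")
    except ValueError:
        raise ValidationError("cannot convert 'x' value to float")
    return {"y": math.exp(x)}


def describe(req: dict) -> str:
    # Report either the result or the reason the input was rejected.
    try:
        return "ok: {}".format(calculate_exp(req))
    except ValidationError as err:
        return "rejected: {}".format(err)


if __name__ == "__main__":
    for payload in ({"x": 0}, {"x": "1.5"}, {}, {"x": "abc"}):
        print(payload, "->", describe(payload))
```

Numeric strings such as "1.5" are accepted because float() can parse them, while a missing key or a non-numeric string is reported as a validation error.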
Finally, we append the worker to the server to construct a single-stage workflow, specifying how many processes we want it to run in parallel. Then we run the server.
if __name__ == "__main__":
    server = Server()
    # We spawn two processes for parallel computing.
    server.append_worker(CalculateExp, num=2)
    server.run()
Run the server
After merging the snippets above into a file named server.py, we can first have a look at the supported arguments:
python server.py --help
Then let's start the server...
python server.py
and test it:
curl -X POST http://127.0.0.1:8000/inference -d '{"x": 2}'
That's it! You have just hosted your exponential-computing model as a server! 😉
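The same request can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the server is listening on the address from the curl command above and that the response body is the JSON-encoded return value of forward. The helper names `build_request` and `query_exp` are hypothetical.

```python
import json
import urllib.request

# Endpoint taken from the curl command above; adjust if the server runs elsewhere.
INFERENCE_URL = "http://127.0.0.1:8000/inference"


def build_request(x: float, url: str = INFERENCE_URL) -> urllib.request.Request:
    """Build the POST request with a JSON body such as {"x": 2}."""
    body = json.dumps({"x": x}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def query_exp(x: float, url: str = INFERENCE_URL, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response (assumed to be a JSON object)."""
    with urllib.request.urlopen(build_request(x, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server from server.py running locally, calling query_exp(2) should return a dictionary containing the key "y". Without a running server, the call raises a urllib.error.URLError.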
Example
More ready-to-use examples can be found in the Example section. They include:
- Multi-stage workflow
- Batch processing worker
- PyTorch deep learning models
  - sentiment analysis
  - image recognition
Contributing
We welcome any kind of contribution. Please give us feedback by raising issues, or contribute code directly through a pull request!