A minimum-lovable machine-learning pipeline, built on top of AWS SageMaker.

Project description

ML2P – or (ML)^2P – is the minimal lovable machine-learning pipeline and a friendlier interface to AWS SageMaker.

Design goals:

  • support the full machine learning lifecycle

  • support custom feature engineering

  • support building custom models in Python

  • provide reproducible training and deployment of models

  • support the use of customised base Docker images for training and deployment

Concretely, it provides a command line interface and a Python library to assist with:

  • S3:
    • Managing training data

  • SageMaker:
    • Launching training jobs

    • Deploying trained models

    • Creating notebook instances

  • On your local machine or in a SageMaker notebook:
    • Downloading training datasets from S3

    • Training models

    • Loading trained models from SageMaker / S3

Installing

Install ML2P with:

$ pip install ml2p

Overview

ML2P helps manage a machine learning project. You’ll define your project by writing a small YAML file named ml2p.yml:

project: "ml2p-tutorial"
s3folder: "s3://your-s3-bucket/"
models:
  bob: "models.RegressorModel"
defaults:
  image: "XXXXX.dkr.ecr.REGION.amazonaws.com/your-docker-image:X.Y.Z"
  role: "arn:aws:iam::XXXXX:role/your-role"
train:
  instance_type: "ml.m5.large"
deploy:
  instance_type: "ml.t2.medium"
  record_invokes: true

This specifies:

  • project: the name of your project

  • s3folder: the S3 bucket that will hold the models and data sets for your project

  • models: a mapping from model names to the Python classes that will be used to train the models and make predictions

  • defaults:

    • image: the Docker image that your project will use for training and prediction

    • role: the AWS role your project will run under

  • train:

    • instance_type: the AWS instance type that will be used when training your model

  • deploy:

    • instance_type: the AWS instance type that will be used when deploying your model

    • record_invokes: whether to record prediction requests in S3
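As a rough illustration of the layout described above, the following sketch checks that a configuration dictionary contains the keys shown in the example ml2p.yml. The helper missing_keys and the EXPECTED_KEYS table are hypothetical and are not part of ML2P; whether ML2P itself treats every one of these keys as mandatory is an assumption. The dictionary stands in for the parsed YAML file so that the example needs no third-party YAML parser.

```python
# Hypothetical helper, not part of ML2P: it only mirrors the keys that
# appear in the example ml2p.yml shown above.
EXPECTED_KEYS = {
    "project": (),
    "s3folder": (),
    "models": (),
    "defaults": ("image", "role"),
    "train": ("instance_type",),
    "deploy": ("instance_type", "record_invokes"),
}


def missing_keys(config):
    """Return a sorted list of dotted key paths that are absent from config."""
    missing = []
    for key, subkeys in EXPECTED_KEYS.items():
        if key not in config:
            missing.append(key)
            continue
        # Treat a non-mapping section as empty so its sub-keys are reported.
        section = config[key] if isinstance(config[key], dict) else {}
        for subkey in subkeys:
            if subkey not in section:
                missing.append(f"{key}.{subkey}")
    return sorted(missing)


# The example configuration from above, written as the dictionary a YAML
# parser would be expected to produce.
config = {
    "project": "ml2p-tutorial",
    "s3folder": "s3://your-s3-bucket/",
    "models": {"bob": "models.RegressorModel"},
    "defaults": {
        "image": "XXXXX.dkr.ecr.REGION.amazonaws.com/your-docker-image:X.Y.Z",
        "role": "arn:aws:iam::XXXXX:role/your-role",
    },
    "train": {"instance_type": "ml.m5.large"},
    "deploy": {"instance_type": "ml.t2.medium", "record_invokes": True},
}

print(missing_keys(config))  # prints []
```

A check like this can catch a misspelled or forgotten section before any AWS resources are requested.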

The name of your project functions as a prefix to the names of SageMaker training jobs, models and endpoints that ML2P creates (since these names are global within a SageMaker account).

ML2P also tags all of the AWS objects it creates with your project name.
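The prefixing idea can be sketched in a few lines. This is only an illustration of why a project prefix avoids name clashes within one SageMaker account: the function prefixed_name and the "-" separator are assumptions made for the example, not ML2P's actual naming code.

```python
# Hypothetical illustration of project-name prefixing; the function name
# and the "-" separator are assumptions, not ML2P's implementation.
def prefixed_name(project, name, separator="-"):
    """Build a resource name that is scoped to a project."""
    return f"{project}{separator}{name}"


# Two projects can both use a model called "bob" without colliding,
# because the full resource names differ.
tutorial_name = prefixed_name("ml2p-tutorial", "bob")
other_name = prefixed_name("another-project", "bob")

print(tutorial_name)  # prints ml2p-tutorial-bob
print(other_name)     # prints another-project-bob
```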

Tutorial

See https://ml2p.readthedocs.io/en/latest/tutorial/ for a step-by-step tutorial.

