
Code-generation for various ML models into native code.


m2cgen


m2cgen (Model 2 Code Generator) is a lightweight library that transpiles trained statistical models into native code (Python, C, Java, Go, JavaScript, Visual Basic, C#).

Installation

Python 3.4 or later is required.

pip install m2cgen

Supported Languages

  • C
  • C#
  • Go
  • Java
  • JavaScript
  • Python
  • Visual Basic

Supported Models

Linear
  • Classification: LogisticRegression, LogisticRegressionCV, RidgeClassifier, RidgeClassifierCV, SGDClassifier, PassiveAggressiveClassifier
  • Regression: LinearRegression, HuberRegressor, ElasticNet, ElasticNetCV, TheilSenRegressor, Lars, LarsCV, Lasso, LassoCV, LassoLars, LassoLarsIC, OrthogonalMatchingPursuit, OrthogonalMatchingPursuitCV, Ridge, RidgeCV, BayesianRidge, ARDRegression, SGDRegressor, PassiveAggressiveRegressor

SVM
  • Classification: SVC, NuSVC, LinearSVC
  • Regression: SVR, NuSVR, LinearSVR

Tree
  • Classification: DecisionTreeClassifier, ExtraTreeClassifier
  • Regression: DecisionTreeRegressor, ExtraTreeRegressor

Random Forest
  • Classification: RandomForestClassifier, ExtraTreesClassifier
  • Regression: RandomForestRegressor, ExtraTreesRegressor

Boosting
  • Classification: XGBClassifier (gbtree/dart booster only), LGBMClassifier (gbdt/dart booster only)
  • Regression: XGBRegressor (gbtree/dart booster only), LGBMRegressor (gbdt/dart booster only)

Classification Output

Linear/Linear SVM

Binary

Scalar value; signed distance of the sample to the hyperplane for the second class.

Multiclass

Vector value; signed distance of the sample to the hyperplane for each class.

Comment

The output is consistent with the output of LinearClassifierMixin.decision_function.
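
Because the generated code returns raw decision values rather than labels, the caller has to map them to a class. The following is a minimal sketch of that mapping, not part of m2cgen: the helper name linear_label and the classes argument (standing in for the estimator's classes_ attribute) are assumptions made for illustration.

```python
def linear_label(output, classes):
    """Map the output of a generated linear classifier to a class label.

    `output` is a scalar in the binary case and a list of per-class
    scores in the multiclass case. `classes` is a hypothetical stand-in
    for the fitted estimator's `classes_`.
    """
    if isinstance(output, (int, float)):
        # Binary: a positive signed distance favours the second class.
        return classes[1] if output > 0 else classes[0]
    # Multiclass: pick the class with the largest signed distance.
    best = max(range(len(output)), key=lambda i: output[i])
    return classes[best]


binary_label = linear_label(-0.7, ["neg", "pos"])
multiclass_label = linear_label([0.2, 1.5, -0.3], ["a", "b", "c"])
```

The binary branch thresholds at zero and the multiclass branch takes the arg-max, which is the usual way to turn decision_function values into predictions.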

SVM

Binary

Scalar value; signed distance of the sample to the hyperplane for the second class.

Multiclass

Vector value; one-vs-one score for each class, shape (n_samples, n_classes * (n_classes-1) / 2).

Comment

The output is consistent with the output of BaseSVC.decision_function when the decision_function_shape is set to ovo.
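
A one-vs-one score vector can be reduced to a single label by majority voting. The sketch below is illustrative only and rests on two assumptions that should be checked against the model in use: that the scores are ordered by class pair as (0, 1), (0, 2), ..., (1, 2), ..., and that a positive score favours the first class of the pair. The helper name ovo_label is hypothetical.

```python
from itertools import combinations


def ovo_label(scores, classes):
    """Reduce one-vs-one scores to a label by majority vote.

    Assumes pair order (0, 1), (0, 2), ..., (1, 2), ... and that a
    positive score is a vote for the first class of the pair.
    """
    n = len(classes)
    pairs = list(combinations(range(n), 2))
    if len(scores) != len(pairs):
        raise ValueError("expected %d scores, got %d" % (len(pairs), len(scores)))
    votes = [0] * n
    for (i, j), score in zip(pairs, scores):
        votes[i if score > 0 else j] += 1
    # Ties go to the lowest class index in this sketch.
    best = max(range(n), key=lambda k: votes[k])
    return classes[best]


label = ovo_label([1.0, 0.5, -2.0], ["a", "b", "c"])
```

With three classes there are 3 * 2 / 2 = 3 pairwise scores, which matches the vector length given above for a single sample.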

Tree/Random Forest/XGBoost/LightGBM

Binary

Vector value; class probabilities.

Multiclass

Vector value; class probabilities.

Comment

The output is consistent with the output of the predict_proba method of DecisionTreeClassifier/ForestClassifier/XGBClassifier/LGBMClassifier.

Usage

Here's a simple example of how a linear model trained in a Python environment can be represented in Java code:

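# Note: load_boston was removed in scikit-learn 1.2, so this example
# requires an older scikit-learn release.
from sklearn.datasets import load_boston
from sklearn import linear_model
import m2cgen as m2c

# Load the Boston housing data (13 features per sample).
boston = load_boston()
X, y = boston.data, boston.target

# Fit an ordinary least-squares model.
estimator = linear_model.LinearRegression()
estimator.fit(X, y)

# Transpile the fitted model into Java source code (returned as a string).
code = m2c.export_to_java(estimator)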

Generated Java code:

public class Model {

    public static double score(double[] input) {
        return (((((((((((((36.45948838508965) + ((input[0]) * (-0.10801135783679647))) + ((input[1]) * (0.04642045836688297))) + ((input[2]) * (0.020558626367073608))) + ((input[3]) * (2.6867338193449406))) + ((input[4]) * (-17.76661122830004))) + ((input[5]) * (3.8098652068092163))) + ((input[6]) * (0.0006922246403454562))) + ((input[7]) * (-1.475566845600257))) + ((input[8]) * (0.30604947898516943))) + ((input[9]) * (-0.012334593916574394))) + ((input[10]) * (-0.9527472317072884))) + ((input[11]) * (0.009311683273794044))) + ((input[12]) * (-0.5247583778554867));
    }
}
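
The nested expression in the generated code is the intercept plus a weighted sum of the 13 input features. As a rough illustration, the same computation can be written by hand in Python. The constants below are copied from the Java output above; the names INTERCEPT, COEFFICIENTS and score are chosen for this sketch and are not what m2cgen itself emits for Python.

```python
# Constants copied from the generated Java code above.
INTERCEPT = 36.45948838508965
COEFFICIENTS = [
    -0.10801135783679647,
    0.04642045836688297,
    0.020558626367073608,
    2.6867338193449406,
    -17.76661122830004,
    3.8098652068092163,
    0.0006922246403454562,
    -1.475566845600257,
    0.30604947898516943,
    -0.012334593916574394,
    -0.9527472317072884,
    0.009311683273794044,
    -0.5247583778554867,
]


def score(features):
    """Return intercept + sum(features[i] * COEFFICIENTS[i])."""
    if len(features) != len(COEFFICIENTS):
        raise ValueError("expected %d features" % len(COEFFICIENTS))
    result = INTERCEPT
    # Accumulate left to right, in the same order as the nested Java expression.
    for value, weight in zip(features, COEFFICIENTS):
        result += value * weight
    return result


baseline = score([0.0] * 13)
```

An all-zero input returns the intercept, which is a quick sanity check when comparing generated code against the original estimator.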

More examples of generated code for different models and languages are available in the project repository.

CLI

m2cgen can be used as a CLI tool to generate code using serialized model objects (pickle protocol):

$ m2cgen <pickle_file> --language <language> [--indent <indent>] [--class_name <class_name>]
         [--module_name <module_name>] [--package_name <package_name>] [--namespace <namespace>]
         [--recursion-limit <recursion_limit>]

Piping is also supported:

$ cat <pickle_file> | m2cgen --language <language>
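
The CLI expects a pickled model file as input. A minimal sketch of producing one with the standard pickle module follows. The helper name dump_model and the file name model.pickle are assumptions, and the dictionary is only a placeholder so that the snippet runs without scikit-learn; in practice a fitted estimator would be passed instead.

```python
import os
import pickle
import tempfile


def dump_model(model, path):
    """Serialize `model` to `path` using the pickle protocol."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


# Placeholder only: a real run would pass a fitted estimator here.
placeholder_model = {"coef": [0.5, -1.2], "intercept": 0.1}

pickle_path = os.path.join(tempfile.gettempdir(), "model.pickle")
dump_model(placeholder_model, pickle_path)

# Read the file back to confirm that it round-trips.
with open(pickle_path, "rb") as f:
    restored = pickle.load(f)
```

The resulting file is what would be passed as <pickle_file> in the commands above. The placeholder dictionary itself is not a model that m2cgen can transpile.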

FAQ

Q: Generation fails with RuntimeError: maximum recursion depth exceeded error.

A: If this error occurs while generating code from an ensemble model, try reducing the number of trained estimators in that model. Alternatively, you can raise the maximum recursion depth with sys.setrecursionlimit(<new_depth>) before generating code, or with the --recursion-limit option when using the CLI.
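
One way to apply the second suggestion without changing the limit for the rest of the program is a small context manager. This is a sketch using only the standard library; the name recursion_limit and the value 5000 are arbitrary choices for illustration.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit, then restore the old value."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below its current value.
    sys.setrecursionlimit(max(previous, limit))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


before = sys.getrecursionlimit()
with recursion_limit(5000):
    # Code generation (for example, m2c.export_to_java(estimator)) would go here.
    inside = sys.getrecursionlimit()
after = sys.getrecursionlimit()
```

A very high limit can crash the interpreter on deep recursion instead of raising an error, so it is safer to increase it gradually.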

Download files

Source Distribution: m2cgen-0.5.0.tar.gz (25.6 kB)

Built Distribution: m2cgen-0.5.0-py3-none-any.whl (39.2 kB, Python 3)
