Skip to main content

A fast and lightweight autoML system

Project description

PyPI version Build Python Version Downloads Join the chat at https://gitter.im/FLAMLer/community

FLAML - Fast and Lightweight AutoML


FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner. It is fast and economical. The simple and lightweight design makes it easy to extend, such as adding customized learners or metrics. FLAML is powered by a new, cost-effective hyperparameter optimization and learner selection method invented by Microsoft Research. FLAML leverages the structure of the search space to choose a search order optimized for both cost and error. For example, the system tends to propose cheap configurations at the beginning stage of the search, but quickly moves to configurations with high model complexity and large sample size when needed in the later stage of the search. For another example, it favors cheap learners in the beginning but penalizes them later if the error improvement is slow. The cost-bounded search and cost-based prioritization make a big difference in the search efficiency under budget constraints.

FLAML has a .NET implementation as well from ML.NET Model Builder. This ML.NET blog describes the improvement brought by FLAML.

Installation

FLAML requires Python version >= 3.6. It can be installed from pip:

pip install flaml

To run the notebook example, install flaml with the [notebook] option:

pip install flaml[notebook]

Quickstart

  • With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.
from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")
  • You can restrict the learners and use FLAML as a fast hyperparameter tuning tool for XGBoost, LightGBM, Random Forest etc. or a customized learner.
automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])
  • You can also run generic ray-tune style hyperparameter tuning for a custom function.
from flaml import tune
tune.run(train_with_config, config={…}, low_cost_partial_config={…}, time_budget_s=3600)

Advantages

  • For classification and regression tasks, find quality models with lower computational resources.
  • Users can choose their desired customizability: minimal customization (computational resource budget), medium customization (e.g., scikit-style learner, search space and metric), full customization (arbitrary training and evaluation code).
  • Allow human guidance in hyperparameter tuning to respect prior on certain subspaces but also able to explore other subspaces. Read more about the hyperparameter optimization methods in FLAML here. They can be used beyond the AutoML context. And they can be used in distributed HPO frameworks such as ray tune or nni.
  • Support online AutoML: automatic hyperparameter tuning for online learning algorithms. Read more about the online AutoML method in FLAML here.

Examples

A basic classification example.

from flaml import AutoML
from sklearn.datasets import load_iris
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'accuracy',
    "task": 'classification',
    "log_file_name": "test/iris.log",
}
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict_proba(X_train))
# Export the best model
print(automl.model)

A basic regression example.

from flaml import AutoML
from sklearn.datasets import load_boston
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'r2',
    "task": 'regression',
    "log_file_name": "test/boston.log",
}
X_train, y_train = load_boston(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict(X_train))
# Export the best model
print(automl.model)

More examples can be found in notebooks.

Documentation

Please find the API documentation here.

Please find demo and tutorials of FLAML here

For more technical details, please check our papers.

@inproceedings{wang2021flaml,
    title={FLAML: A Fast and Lightweight AutoML Library},
    author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
    year={2021},
    booktitle={MLSys},
}

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

If you are new to GitHub here is a detailed help source on getting involved with development on GitHub.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Developing

Setup

git clone https://github.com/microsoft/FLAML.git
pip install -e .[test,notebook]

Coverage

Any code you commit should generally not significantly impact coverage. To run all unit tests:

coverage run -m pytest test

Then you can see the coverage report by coverage report -m or coverage html. If all the tests are passed, please also test run notebook/flaml_automl to make sure your commit does not break the notebook example.

Authors

  • Chi Wang
  • Qingyun Wu

Contributors (alphabetical order): Sebastien Bubeck, Surajit Chaudhuri, Nadiia Chepurko, Ofer Dekel, Alex Deng, Anshuman Dutt, Nicolo Fusi, Jianfeng Gao, Johannes Gehrke, Silu Huang, Dongwoo Kim, Christian Konig, John Langford, Xueqing Liu, Paul Mineiro, Amin Saied, Neil Tenenholtz, Olga Vrousgou, Markus Weimer, Haozhe Zhang, Erkang Zhu.

License

MIT License

Project details


Release history Release notifications | RSS feed

This version

0.5.6

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

FLAML-0.5.6.tar.gz (145.9 kB view details)

Uploaded Source

Built Distribution

FLAML-0.5.6-py3-none-any.whl (194.2 kB view details)

Uploaded Python 3

File details

Details for the file FLAML-0.5.6.tar.gz.

File metadata

  • Download URL: FLAML-0.5.6.tar.gz
  • Upload date:
  • Size: 145.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.56.1 CPython/3.7.7

File hashes

Hashes for FLAML-0.5.6.tar.gz
Algorithm Hash digest
SHA256 e0e74faffa6e36f44298d9dfb9aaed5e7d521c5ad2bce4fc76601b29d59018dc
MD5 00a96de6337c65e2c93f6317cea0ca65
BLAKE2b-256 97d0eb397c68db210f89955fbd52b70dc189c4e57227739cb3cab6a17c0e0530

See more details on using hashes here.

File details

Details for the file FLAML-0.5.6-py3-none-any.whl.

File metadata

  • Download URL: FLAML-0.5.6-py3-none-any.whl
  • Upload date:
  • Size: 194.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.56.1 CPython/3.7.7

File hashes

Hashes for FLAML-0.5.6-py3-none-any.whl
Algorithm Hash digest
SHA256 ba3fdd3a03714791d914462ac8bb95d2e16a5fe048719c2de5b64bb33c86d39f
MD5 e6ac2b8097eaf4b64e066a3eb4a7166e
BLAKE2b-256 48d0e778093f39022d1102bab0c46c0387c985acadbaa2a91a5a5a4a6ceb4a51

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page