A fast and lightweight autoML system

Project description

FLAML - Fast and Lightweight AutoML

FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently and economically. It frees users from selecting learners and hyperparameters for each learner. Its simple and lightweight design makes it easy to extend, for example by adding customized learners or metrics.

FLAML is powered by a new, cost-effective hyperparameter optimization and learner selection method invented by Microsoft Research. It leverages the structure of the search space to choose a search order optimized for both cost and error. For example, the system tends to propose cheap configurations at the beginning of the search, but quickly moves to configurations with high model complexity and large sample size when they are needed later in the search. As another example, it favors cheap learners at the beginning but penalizes them later if their error improvement is slow. The cost-bounded search and cost-based prioritization make a big difference in search efficiency under budget constraints.
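The prioritization idea described above can be illustrated with a toy sketch. This is not FLAML's actual algorithm: the learner names, trial costs, error sequences and the scoring rule (trial cost divided by the latest error improvement) are all hypothetical, chosen only to show how a cheap learner can be favored early and deprioritized once its improvement slows.

```python
from dataclasses import dataclass


@dataclass
class Learner:
    name: str
    trial_cost: float        # hypothetical cost of one trial, in seconds
    errors: list             # hypothetical validation error after each trial
    trials_done: int = 0
    best_error: float = 1.0
    last_gain: float = 0.0   # error improvement achieved by the latest trial

    def run_trial(self):
        """Simulate one trial and record how much the best error improved."""
        new_error = self.errors[min(self.trials_done, len(self.errors) - 1)]
        self.last_gain = max(self.best_error - new_error, 0.0)
        self.best_error = min(self.best_error, new_error)
        self.trials_done += 1


def priority(learner):
    """Estimated cost per unit of error improvement (lower is tried first)."""
    if learner.trials_done == 0:
        return learner.trial_cost          # untried learners: cheapest first
    if learner.last_gain == 0.0:
        return float("inf")                # stalled learners go to the back
    return learner.trial_cost / learner.last_gain


def search(learners, budget):
    """Spend the budget one trial at a time on the highest-priority learner."""
    order, spent = [], 0.0
    while True:
        affordable = [l for l in learners if spent + l.trial_cost <= budget]
        if not affordable:
            break
        chosen = min(affordable, key=priority)
        chosen.run_trial()
        spent += chosen.trial_cost
        order.append(chosen.name)
    best = min(learners, key=lambda l: l.best_error)
    return order, best


# Made-up numbers: the cheap learner plateaus quickly, the expensive one keeps improving.
learners = [
    Learner("cheap", trial_cost=1.0, errors=[0.30, 0.28, 0.28]),
    Learner("expensive", trial_cost=5.0, errors=[0.20, 0.15, 0.12]),
]
order, best = search(learners, budget=20.0)
print(order)
print(best.name, best.best_error)
```

With these made-up numbers the search starts on the cheap learner and switches to the expensive one as soon as the cheap learner's improvement per unit cost drops below it.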

FLAML also has a .NET implementation in ML.NET Model Builder. An ML.NET blog post describes the improvement brought by FLAML.

Installation

FLAML requires Python 3.6 or later. It can be installed with pip:

pip install flaml

To run the notebook example, install flaml with the [notebook] option:

pip install "flaml[notebook]"

Quickstart

With three lines of code, you can start using this economical and fast AutoML engine as a scikit-learn style estimator.

from flaml import AutoML
automl = AutoML()
automl.fit(X_train, y_train, task="classification")

You can restrict the learners and use FLAML as a fast hyperparameter tuning tool for learners such as XGBoost, LightGBM or Random Forest, or for a customized learner.

automl.fit(X_train, y_train, task="classification", estimator_list=["lgbm"])

You can also run generic ray-tune style hyperparameter tuning for a custom function.

from flaml import tune
tune.run(train_with_config, config={…}, low_cost_partial_config={…}, time_budget_s=3600)
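The call above leaves `train_with_config` and both dictionaries elided. As a hedged sketch of what a complete call might look like, the block below defines a hypothetical objective and search space. The names `squared_distance` and `run_search`, the hyperparameters `x` and `y`, the metric name `metric2minimize`, and every bound are illustrative assumptions. The sketch also assumes that `flaml.tune` provides `report`, `randint` and `lograndint` and that `tune.run` accepts `metric`, `mode` and `num_samples`; check the API documentation for your installed version.

```python
def squared_distance(x, target=95000):
    """Hypothetical objective: squared distance of the rounded x from a target."""
    return (round(x) - target) ** 2


def train_with_config(config):
    """Evaluate one configuration and report its metric to the tuner."""
    from flaml import tune  # imported lazily so the helper above has no dependency

    tune.report(metric2minimize=squared_distance(config["x"]))


def run_search(time_budget_s=10):
    """Run a short search over a hypothetical two-parameter space."""
    from flaml import tune

    return tune.run(
        train_with_config,
        config={
            "x": tune.lograndint(lower=1, upper=1000000),
            "y": tune.randint(lower=1, upper=1000000),
        },
        # Assumed cheap starting point: the smallest x in the space.
        low_cost_partial_config={"x": 1},
        metric="metric2minimize",
        mode="min",
        num_samples=-1,  # no cap on trials; the time budget ends the search
        time_budget_s=time_budget_s,
    )
```

Calling `run_search()` requires FLAML to be installed. Note that `low_cost_partial_config` names only a subset of the hyperparameters in `config`.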

Advantages

For common machine learning tasks like classification and regression, FLAML finds quality models with small computational resources.
Users can choose their desired level of customization: minimal (computational resource budget), medium (e.g., scikit-style learner, search space and metric), or full (arbitrary training and evaluation code).
It allows human guidance in hyperparameter tuning, respecting priors on certain subspaces while remaining able to explore other subspaces. Read more about the hyperparameter optimization methods in FLAML here. They can be used beyond the AutoML context, including in distributed HPO frameworks such as ray tune or nni.
It supports online AutoML: automatic hyperparameter tuning for online learning algorithms. Read more about the online AutoML method in FLAML here.

Examples

A basic classification example.

from flaml import AutoML
from sklearn.datasets import load_iris
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'accuracy',
    "task": 'classification',
    "log_file_name": "iris.log",
}
X_train, y_train = load_iris(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict_proba(X_train))
# Print the best model
print(automl.model.estimator)

A basic regression example.

from flaml import AutoML
from sklearn.datasets import fetch_california_housing
# Initialize an AutoML instance
automl = AutoML()
# Specify automl goal and constraint
automl_settings = {
    "time_budget": 10,  # in seconds
    "metric": 'r2',
    "task": 'regression',
    "log_file_name": "california.log",
}
X_train, y_train = fetch_california_housing(return_X_y=True)
# Train with labeled input data
automl.fit(X_train=X_train, y_train=y_train,
           **automl_settings)
# Predict
print(automl.predict(X_train))
# Print the best model
print(automl.model.estimator)

A basic time series forecasting example.

# pip install "flaml[ts_forecast]"
import numpy as np
from flaml import AutoML
# 84 monthly timestamps: the first 72 are used for training, the last 12 for forecasting
X_train = np.arange('2014-01', '2021-01', dtype='datetime64[M]')
y_train = np.random.random(size=72)
automl = AutoML()
automl.fit(X_train=X_train[:72],  # a single column of timestamp
           y_train=y_train,  # value for each timestamp
           period=12,  # time horizon to forecast, e.g., 12 months
           task='ts_forecast', time_budget=15,  # time budget in seconds
           log_file_name="ts_forecast.log",
           )
# Forecast the 12 months that follow the training range
print(automl.predict(X_train[72:]))

Learning to rank.

from sklearn.datasets import fetch_openml
from flaml import AutoML
X_train, y_train = fetch_openml(name="credit-g", return_X_y=True, as_frame=False)
# not a real learning to rank dataset
groups = [200] * 4 + [100] * 2    # group counts
automl = AutoML()
automl.fit(
    X_train, y_train, groups=groups,
    task='rank', time_budget=10,    # in seconds
)
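In the ranking example, `groups` holds group counts: the number of consecutive rows that belong to each query group, so the counts must add up to the number of training rows (1000 for credit-g). The helper below is a hypothetical illustration, not part of FLAML, that checks this and expands the counts into one group id per row.

```python
def expand_group_counts(counts, n_rows):
    """Turn group counts into a per-row list of group ids, validating the total."""
    if sum(counts) != n_rows:
        raise ValueError(f"group counts sum to {sum(counts)}, expected {n_rows}")
    group_ids = []
    for group_id, count in enumerate(counts):
        group_ids.extend([group_id] * count)
    return group_ids


groups = [200] * 4 + [100] * 2   # same counts as in the ranking example
row_groups = expand_group_counts(groups, n_rows=1000)
print(len(row_groups), row_groups[0], row_groups[-1])  # prints: 1000 0 5
```

Running a check like this before calling `fit` catches a mismatch between the group counts and the number of rows early.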

Fine-tuning a language model.

# pip install "flaml[nlp]"
from flaml import AutoML
from datasets import load_dataset

train_dataset = load_dataset("glue", "mrpc", split="train").to_pandas()
dev_dataset = load_dataset("glue", "mrpc", split="validation").to_pandas()
test_dataset = load_dataset("glue", "mrpc", split="test").to_pandas()
custom_sent_keys = ["sentence1", "sentence2"]
label_key = "label"
X_train, y_train = train_dataset[custom_sent_keys], train_dataset[label_key]
X_val, y_val = dev_dataset[custom_sent_keys], dev_dataset[label_key]
X_test = test_dataset[custom_sent_keys]

automl = AutoML()
automl_settings = {
    "time_budget": 100,
    "task": "seq-classification",
    "custom_hpo_args": {"output_dir": "data/output/"},
    "gpu_per_trial": 1,  # set to 0 if no GPU is available
}
automl.fit(X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val, **automl_settings)
automl.predict(X_test)

More examples can be found in notebooks.

Documentation

Please find the API documentation here.

Please find demo and tutorials of FLAML here.

For more technical details, please check our papers.

FLAML: A Fast and Lightweight AutoML Library. Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu. MLSys 2021.

@inproceedings{wang2021flaml,
    title={FLAML: A Fast and Lightweight AutoML Library},
    author={Chi Wang and Qingyun Wu and Markus Weimer and Erkang Zhu},
    year={2021},
    booktitle={MLSys},
}

Frugal Optimization for Cost-related Hyperparameters. Qingyun Wu, Chi Wang, Silu Huang. AAAI 2021.
Economical Hyperparameter Optimization With Blended Search Strategy. Chi Wang, Qingyun Wu, Silu Huang, Amin Saied. ICLR 2021.
ChaCha for Online AutoML. Qingyun Wu, Chi Wang, John Langford, Paul Mineiro and Marco Rossi. ICML 2021.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

If you are new to GitHub, here is a detailed help source on getting involved with development on GitHub.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Developing

Setup

git clone https://github.com/microsoft/FLAML.git
cd FLAML
pip install -e ".[test,notebook]"

Docker

We provide a simple Dockerfile.

docker build https://github.com/microsoft/FLAML.git#main -t flaml-dev
docker run -it flaml-dev

Develop in Remote Container

If you use VS Code, you can open the FLAML folder in a container. We have provided the configuration in .devcontainer.

Pre-commit

Run pre-commit install to install pre-commit into your git hooks. Before you commit, run pre-commit run to check if you meet the pre-commit requirements. If you use Windows (without WSL) and can't commit after installing pre-commit, you can run pre-commit uninstall to uninstall the hook. In WSL or Linux this is supposed to work.

Coverage

Any code you commit should not decrease coverage. To run all unit tests:

coverage run -m pytest test

You can then view the coverage report with coverage report -m or coverage html. If all the tests pass, please also run notebook/flaml_automl to make sure your commit does not break the notebook example.

Authors

Chi Wang
Qingyun Wu

Contributors (alphabetical order): Amir Aghaei, Vijay Aski, Sebastien Bubeck, Surajit Chaudhuri, Nadiia Chepurko, Ofer Dekel, Alex Deng, Anshuman Dutt, Nicolo Fusi, Jianfeng Gao, Johannes Gehrke, Niklas Gustafsson, Silu Huang, Dongwoo Kim, Christian Konig, John Langford, Menghao Li, Mingqin Li, Zhe Liu, Naveen Gaur, Paul Mineiro, Vivek Narasayya, Jake Radzikowski, Marco Rossi, Amin Saied, Neil Tenenholtz, Olga Vrousgou, Markus Weimer, Yue Wang, Qingyun Wu, Qiufeng Yin, Haozhe Zhang, Minjia Zhang, XiaoYun Zhang, Eric Zhu, and open-source contributors.

License

MIT License

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

FLAML-0.8.1.tar.gz (125.0 kB)

Uploaded Nov 28, 2021 Source

Built Distribution

FLAML-0.8.1-py3-none-any.whl (130.3 kB)

Uploaded Nov 28, 2021 Python 3

File details

Details for the file FLAML-0.8.1.tar.gz.

File metadata

Download URL: FLAML-0.8.1.tar.gz
Upload date: Nov 28, 2021
Size: 125.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for FLAML-0.8.1.tar.gz
Algorithm	Hash digest
SHA256	`aa0ee164cb5cd8faf8f1aea6b48923e0603ce41a509017aedfeca3cbc03f0acb`
MD5	`42b343ff1b12f9b6ba9002857939fa35`
BLAKE2b-256	`1e92aadda3684012822e6431a8263ffb75c07739c848ea844027098838884b8e`

File details

Details for the file FLAML-0.8.1-py3-none-any.whl.

File metadata

Download URL: FLAML-0.8.1-py3-none-any.whl
Upload date: Nov 28, 2021
Size: 130.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.6.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for FLAML-0.8.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`457e8f499c114816cb390272457ecf9647c2ae3eb8b2e2506ed53372456a7cff`
MD5	`e0cf2f40ba10da1e4fe9d94e222508dd`
BLAKE2b-256	`87d43db47726bc27d6a6be7ca99038b73b7934a02e48299a3cc796cea49a8dde`
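
The SHA256 digests listed above can be used to check a downloaded file. The following is a minimal sketch using only Python's standard library; the function names `sha256_of_file` and `matches_published_digest` are assumptions for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's digest with a published one, ignoring letter case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Passing the path of a downloaded FLAML-0.8.1.tar.gz together with the SHA256 value from its table should return `True` for an intact download.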