Scalable machine learning based time series forecasting

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 4 - Beta
Intended Audience
- Developers
License
- OSI Approved :: Apache Software License
Natural Language
- English
Programming Language

Project description

mlforecast

Install

PyPI

pip install mlforecast

If you want to perform distributed training, you can instead use pip install "mlforecast[distributed]", which will also install dask. Note that you’ll also need to install either LightGBM or XGBoost.

conda-forge

conda install -c conda-forge mlforecast

Note that this installation comes with the required dependencies for the local interface. If you want to perform distributed training, you must install dask (conda install -c conda-forge dask) and either LightGBM or XGBoost.
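To see which of these packages are already present before choosing an install command, a small check like the following may help. It is only a sketch built on the standard library; the helper name installed is hypothetical and not part of mlforecast.

```python
from importlib.util import find_spec


def installed(packages):
    """Map each top-level package name to True if it can be imported here."""
    return {name: find_spec(name) is not None for name in packages}


# Core package, the dask dependency of the distributed extra,
# and the two boosting libraries mentioned above.
status = installed(["mlforecast", "dask", "lightgbm", "xgboost"])
for name, ok in status.items():
    print(f"{name:<12} {'found' if ok else 'missing'}")

if not (status["lightgbm"] or status["xgboost"]):
    print("Distributed training needs LightGBM or XGBoost; install one of them.")
```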

How to use

The following provides a very basic overview; for a more detailed description, see the documentation.

Data setup

Store your time series in a pandas dataframe in long format, that is, each row represents an observation for a specific series and timestamp.

from mlforecast.utils import generate_daily_series

series = generate_daily_series(
    n_series=20,
    max_length=100,
    n_static_features=1,
    static_as_categorical=False,
    with_trend=True
)
series.head()

	ds	y	static_0
unique_id
id_00	2000-01-01	1.751917	72
id_00	2000-01-02	9.196715	72
id_00	2000-01-03	18.577788	72
id_00	2000-01-04	24.520646	72
id_00	2000-01-05	33.418028	72
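If your data starts out in wide format (one column per series), it can be reshaped into the long layout shown above. The following is a minimal sketch with plain pandas; the table, the column names store_a and store_b, and the values are made up for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per date, one column per series.
wide = pd.DataFrame(
    {
        "ds": pd.date_range("2000-01-01", periods=3, freq="D"),
        "store_a": [10.0, 12.0, 11.0],
        "store_b": [5.0, 6.0, 7.0],
    }
)

# Stack the series columns so each row is one (series, timestamp) observation,
# then use the series identifier as the index, as in the example above.
long_df = (
    wide.melt(id_vars="ds", var_name="unique_id", value_name="y")
    .sort_values(["unique_id", "ds"])
    .set_index("unique_id")
)
print(long_df)
```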

Models

Next, define your models. For the local interface, this can be any regressor that follows the scikit-learn API. For distributed training there are LGBMForecast and XGBForecast.

import lightgbm as lgb
import xgboost as xgb
from sklearn.ensemble import RandomForestRegressor

models = [
    lgb.LGBMRegressor(),
    xgb.XGBRegressor(),
    RandomForestRegressor(random_state=0),
]
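The list above uses existing libraries, but a hand-written estimator can follow the same API. The sketch below assumes that "follows the scikit-learn API" means an estimator that can be cloned and that exposes fit and predict; the class LeastSquaresRegressor is hypothetical and only illustrates that shape.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone


class LeastSquaresRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical minimal regressor: ordinary least squares via NumPy."""

    def __init__(self, fit_intercept=True):
        # Constructor arguments are stored unchanged so that clone() can copy them.
        self.fit_intercept = fit_intercept

    def _design(self, X):
        X = np.asarray(X, dtype=float)
        if self.fit_intercept:
            X = np.column_stack([np.ones(len(X)), X])
        return X

    def fit(self, X, y):
        y = np.asarray(y, dtype=float)
        self.coef_, *_ = np.linalg.lstsq(self._design(X), y, rcond=None)
        return self

    def predict(self, X):
        return self._design(X) @ self.coef_


# Toy check on y = 2x + 1.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0
model = clone(LeastSquaresRegressor()).fit(X, y)
preds = model.predict(np.array([[10.0], [11.0]]))
print(preds)
```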

Forecast object

Now instantiate an MLForecast object with the models and the features that you want to use. The features can be lags, transformations on the lags, and date features. The lag transformations are defined as numba-jitted functions that transform an array. If they take additional arguments, you can either supply a tuple (transform_func, arg1, arg2, …) or define new functions that fix the arguments. You can also define differences to apply to the series before fitting; these are restored when predicting.

from mlforecast import MLForecast
from numba import njit
from window_ops.expanding import expanding_mean
from window_ops.rolling import rolling_mean


@njit
def rolling_mean_28(x):
    return rolling_mean(x, window_size=28)


fcst = MLForecast(
    models=models,
    freq='D',
    lags=[7, 14],
    lag_transforms={
        1: [expanding_mean],
        7: [rolling_mean_28]
    },
    date_features=['dayofweek'],
    differences=[1],
)
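To make the lags and lag_transforms arguments more concrete, here is a rough pure-Python sketch of what such features could look like for one series. It is an illustration only, not the library's implementation: the helpers lag, expanding_mean, rolling_mean and build_features are written here from scratch, and the name given to the tuple-style transform is an assumption.

```python
import math


def lag(values, k):
    """Shift the series by k steps; the first k positions have no history."""
    return ([math.nan] * k + list(values))[: len(values)]


def expanding_mean(values):
    """Mean of all non-missing values seen so far."""
    out, total, count = [], 0.0, 0
    for v in values:
        if not math.isnan(v):
            total += v
            count += 1
        out.append(total / count if count else math.nan)
    return out


def rolling_mean(values, window_size):
    """Mean over the last window_size values; NaN until the window is full."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - window_size + 1) : i + 1]
        full = len(window) == window_size and not any(math.isnan(v) for v in window)
        out.append(sum(window) / window_size if full else math.nan)
    return out


def build_features(values, lags, lag_transforms):
    """Plain lags plus transforms applied on top of a lagged copy of the series."""
    feats = {f"lag-{k}": lag(values, k) for k in lags}
    for k, transforms in lag_transforms.items():
        shifted = lag(values, k)
        for tfm in transforms:
            # A transform is either a function or a tuple (function, arg1, arg2, ...).
            func, *args = tfm if isinstance(tfm, tuple) else (tfm,)
            name = "_".join([func.__name__, *map(str, args), f"lag-{k}"])
            feats[name] = func(shifted, *args)
    return feats


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
feats = build_features(y, lags=[2], lag_transforms={1: [expanding_mean, (rolling_mean, 3)]})
for name, column in feats.items():
    print(name, column)
```

Each transform only sees values that are at least k steps old, so the feature for a given row never uses the target of that same row.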

Training

To compute the features and train the models, call fit on your MLForecast object. Here you have to specify the columns that:

- Identify each series (id_col). If the series identifier is the index, you can specify id_col='index'.
- Contain the timestamps (time_col). These can also be integers if your data doesn’t have timestamps.
- Are the series values (target_col).
- Are static (static_features). These are features that don’t change over time and can be repeated when predicting.

fcst.fit(series, id_col='index', time_col='ds', target_col='y', static_features=['static_0'])

MLForecast(models=[LGBMRegressor, XGBRegressor, RandomForestRegressor], freq=<Day>, lag_features=['lag-7', 'lag-14', 'expanding_mean_lag-1', 'rolling_mean_28_lag-7'], date_features=['dayofweek'], num_threads=1)
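Because static features do not change over time, the rows to predict can be built from each series' last timestamp plus its stored static values. The sketch below illustrates that idea with the standard library only; it is not the library's internal code, the helper future_rows is hypothetical, and the static value for id_19 is made up.

```python
from datetime import date, timedelta

# Hypothetical training summary: last observed date and static value per series.
last_seen = {"id_00": date(2000, 4, 3), "id_19": date(2000, 3, 22)}
static_0 = {"id_00": 72, "id_19": 97}  # 97 is an invented value


def future_rows(last_seen, static, horizon):
    """Build (series id, timestamp, static value) rows for the next `horizon` days."""
    rows = []
    for uid, last in last_seen.items():
        for step in range(1, horizon + 1):
            # The timestamp advances; the static value is simply repeated.
            rows.append((uid, last + timedelta(days=step), static[uid]))
    return rows


rows = future_rows(last_seen, static_0, horizon=14)
print(rows[0])
print(rows[14])
print(len(rows))
```

Each series continues from its own last timestamp, so series of different lengths get different forecast dates.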

Predicting

To get the forecasts for the next n days, call predict(n) on the forecast object. This automatically handles the feature updates using a recursive strategy.

predictions = fcst.predict(14)
predictions

	ds	LGBMRegressor	XGBRegressor	RandomForestRegressor
unique_id
id_00	2000-04-04	69.082830	67.761337	68.184016
id_00	2000-04-05	75.706024	74.588699	75.470680
id_00	2000-04-06	82.222473	81.058289	82.846249
id_00	2000-04-07	89.577638	88.735947	90.201271
id_00	2000-04-08	44.149095	44.981384	46.096322
...	...	...	...	...
id_19	2000-03-23	30.236012	31.949095	32.656369
id_19	2000-03-24	31.308269	32.765919	33.624488
id_19	2000-03-25	32.788550	33.628864	34.581486
id_19	2000-03-26	34.086976	34.508457	35.553173
id_19	2000-03-27	34.288968	35.411613	36.526505

280 rows × 4 columns
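The recursive strategy and the restoring of differences can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not mlforecast's implementation: predict_one is a stand-in for a trained model, only lag features are used, and both function names are hypothetical.

```python
def recursive_forecast(history, predict_one, lags, horizon):
    """Forecast `horizon` steps, feeding each prediction back as history."""
    values = list(history)
    out = []
    for _ in range(horizon):
        features = [values[-k] for k in lags]  # lag features for the next step
        y_hat = predict_one(features)
        out.append(y_hat)
        values.append(y_hat)  # the prediction becomes an input for later steps
    return out


def forecast_with_difference(history, predict_one, lags, horizon):
    """Model first differences recursively, then restore the original scale."""
    diffs = [b - a for a, b in zip(history, history[1:])]
    diff_preds = recursive_forecast(diffs, predict_one, lags, horizon)
    restored, level = [], history[-1]
    for d in diff_preds:
        level += d  # undo the differencing by cumulative summation
        restored.append(level)
    return restored


# Stand-in "model": linear extrapolation from lag 1 and lag 2.
def extrapolate(features):
    return 2 * features[0] - features[1]


history = [1.0, 2.0, 4.0, 7.0]  # first differences: 1, 2, 3
forecast = forecast_with_difference(history, extrapolate, lags=[1, 2], horizon=3)
print(forecast)  # [11.0, 16.0, 22.0]
```

The predicted differences here are 4, 5 and 6, so the restored forecasts continue from the last observed value 7 to 11, 16 and 22.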

Visualize results

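import os

import matplotlib.pyplot as plt
import pandas as pd

# Make sure the output directory exists before saving the figure.
os.makedirs('figs', exist_ok=True)

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(12, 6), gridspec_kw=dict(hspace=0.3))
# Plot the first four series: observed values followed by the forecasts of each model.
for i, (cat, axi) in enumerate(zip(series.index.categories, ax.flat)):
    history = series.loc[cat, ['ds', 'y']]
    pd.concat([history, predictions.loc[cat]]).set_index('ds').plot(ax=axi)
    axi.set(title=cat, xlabel=None)
    if i % 2 == 0:
        # Left column: hide the legend to avoid repeating it.
        axi.legend().remove()
    else:
        # Right column: place the legend outside the axes.
        axi.legend(bbox_to_anchor=(1.01, 1.0))
fig.savefig('figs/index.png', bbox_inches='tight')
plt.close(fig)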



Release history

Version	Released
0.14.0	Nov 11, 2024
0.13.6	Nov 8, 2024
0.13.5	Oct 10, 2024
0.13.4	Aug 23, 2024
0.13.3	Jul 25, 2024
0.13.2	Jul 17, 2024
0.13.1	Jul 1, 2024
0.13.0	May 9, 2024
0.12.1	Apr 8, 2024
0.12.0	Mar 4, 2024
0.11.8	Feb 16, 2024
0.11.7	Feb 15, 2024
0.11.6	Jan 19, 2024
0.11.5	Jan 8, 2024
0.11.4	Jan 2, 2024
0.11.3	Dec 14, 2023
0.11.2	Dec 7, 2023
0.11.1	Nov 24, 2023
0.11.0	Nov 6, 2023
0.10.0	Oct 3, 2023
0.9.3	Sep 12, 2023
0.9.2	Aug 29, 2023
0.9.1	Aug 15, 2023
0.9.0	Aug 1, 2023
0.8.1	Jul 21, 2023
0.8.0	Jul 20, 2023
0.7.4	Jul 5, 2023
0.7.3	May 23, 2023
0.7.2	May 16, 2023
0.7.1	Apr 27, 2023
0.7.0	Apr 11, 2023
0.6.0	Feb 3, 2023
0.5.0	Jan 31, 2023	(this version)
0.4.0	Nov 25, 2022
0.3.1	Nov 9, 2022
0.3.0	Nov 1, 2022
0.2.0	Aug 10, 2022
0.1.0	Jun 24, 2021
0.0.9	Jun 9, 2021
0.0.8	May 31, 2021
0.0.7	May 31, 2021
0.0.6	May 8, 2021
0.0.5	May 7, 2021
0.0.4.1	May 4, 2021
0.0.4	May 3, 2021
0.0.3	Apr 30, 2021
0.0.2	Apr 27, 2021
0.0.1	Apr 27, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mlforecast-0.5.0.tar.gz (33.3 kB)

Uploaded Jan 31, 2023 Source

Built Distribution

mlforecast-0.5.0-py3-none-any.whl (35.4 kB)

Uploaded Jan 31, 2023 Python 3

Hashes for mlforecast-0.5.0.tar.gz
Algorithm	Hash digest
SHA256	`03a4c8061e2d445deae091c2093f6c60394cd8b31a3ffb0f468672a732422ac6`
MD5	`09763de72282f5a9d27380c9fd17f509`
BLAKE2b-256	`122b96843f9a078e84ad5521ab680827c74640684333a94aeb2c442f95475030`

Hashes for mlforecast-0.5.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2e223709c0552a69de5e32747daf41b3d0f2f8d32629d7d1d11877c63869d9e4`
MD5	`25b4911c845dfceea7cff2d7792340b4`
BLAKE2b-256	`acb944f3382703a3455b60525dcd78926d38da97118a7bf969fd6a8926682626`