Scalable machine learning based time series forecasting
mlforecast
Install
PyPI
pip install mlforecast
If you want to perform distributed training, you can instead use pip install mlforecast[distributed], which will also install dask. Note that you'll also need to install either LightGBM or XGBoost.
conda-forge
conda install -c conda-forge mlforecast
Note that this installation comes with the required dependencies for the local interface. If you want to perform distributed training, you must install dask (conda install -c conda-forge dask) and either LightGBM or XGBoost.
How to use
The following provides a very basic overview; for a more detailed description, see the documentation.
Store your time series in a pandas dataframe in long format, that is, each row represents an observation for a specific series and timestamp.
from mlforecast.utils import generate_daily_series

series = generate_daily_series(
    n_series=20,
    max_length=100,
    n_static_features=1,
    static_as_categorical=False,
    with_trend=True
)
series.head()
unique_id (index) | ds | y | static_0 |
---|---|---|---|
id_00 | 2000-01-01 | 1.751917 | 72 |
id_00 | 2000-01-02 | 9.196715 | 72 |
id_00 | 2000-01-03 | 18.577788 | 72 |
id_00 | 2000-01-04 | 24.520646 | 72 |
id_00 | 2000-01-05 | 33.418028 | 72 |
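To make the long format concrete, here is a minimal sketch that reshapes a small wide table (one list of values per series) into long rows of (unique_id, ds, y). The toy data and the helper name wide_to_long are made up for illustration and are not part of mlforecast. Plain Python is used so the snippet has no dependencies; with pandas you would typically reach for DataFrame.melt instead.

```python
from datetime import date, timedelta


def wide_to_long(wide, start):
    """Turn {series_id: [y0, y1, ...]} into long-format rows.

    Each output row is (unique_id, ds, y): one observation of one
    series at one timestamp. `start` is the date of the first value.
    """
    rows = []
    for uid, values in wide.items():
        for offset, y in enumerate(values):
            rows.append((uid, start + timedelta(days=offset), y))
    return rows


# Hypothetical toy data: two daily series of different lengths.
wide = {"id_00": [1.0, 2.0, 3.0], "id_01": [10.0, 20.0]}
long_rows = wide_to_long(wide, start=date(2000, 1, 1))

for row in long_rows:
    print(row)
```

Series of different lengths need no padding in this layout, since each series simply contributes as many rows as it has observations.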
Next, define your models. If you want to use the local interface, this can be any regressor that follows the scikit-learn API. For distributed training there are LGBMForecast and XGBForecast.
import lightgbm as lgb
import xgboost as xgb
from sklearn.ensemble import RandomForestRegressor

models = [
    lgb.LGBMRegressor(),
    xgb.XGBRegressor(),
    RandomForestRegressor(random_state=0),
]
Now instantiate a Forecast object with the models and the features that you want to use. The features can be lags, transformations on the lags, and date features. The lag transformations are defined as numba jitted functions that transform an array. If they take additional arguments, you supply a tuple (transform_func, arg1, arg2, …).
from mlforecast import Forecast
from window_ops.expanding import expanding_mean
from window_ops.rolling import rolling_mean

fcst = Forecast(
    models=models,
    freq='D',
    lags=[7, 14],
    lag_transforms={
        1: [expanding_mean],
        7: [(rolling_mean, 7)]
    },
    date_features=['dayofweek'],
    differences=[1],
)
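As a rough illustration of what lag features and lag transformations mean, the sketch below recomputes them by hand on a toy series. It is not the library's implementation: the helpers are plain-Python stand-ins for the numba functions in window_ops. It also assumes that a transformation on lag k equals the statistic of the raw series delayed by k steps. Shorter lags than in the configuration above are used so that a ten-point series is enough.

```python
import math


def shift(values, k):
    """Delay a series by k steps, padding the start with NaN."""
    return ([math.nan] * k + list(values))[:len(values)]


def expanding_mean(values):
    """Mean of every value up to and including each position."""
    out, total = [], 0.0
    for n, v in enumerate(values, start=1):
        total += v
        out.append(total / n)
    return out


def rolling_mean(values, window_size):
    """Mean of the last `window_size` values; NaN until the window is full."""
    out = []
    for i in range(len(values)):
        if i + 1 < window_size:
            out.append(math.nan)
        else:
            window = values[i + 1 - window_size : i + 1]
            out.append(sum(window) / window_size)
    return out


y = [float(v) for v in range(1, 11)]  # toy series 1.0 .. 10.0

features = {
    "lag-2": shift(y, 2),
    "expanding_mean_lag-1": shift(expanding_mean(y), 1),
    "rolling_mean_lag-2_window_size-3": shift(rolling_mean(y, 3), 2),
}

# Feature row for the last timestamp (y = 10.0): only earlier values are used.
last_row = {name: col[-1] for name, col in features.items()}
print(last_row)
```

Because every feature is built from values strictly before the timestamp it describes, the same features can be computed at prediction time, when the target is still unknown.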
To compute the features and train the models, call fit on your Forecast object. Here you have to specify the columns that:
- Identify each series (id_col). If the series identifier is the index, you can specify id_col='index'.
- Contain the timestamps (time_col). These can also be integers if your data doesn’t have timestamps.
- Are the series values (target_col).
fcst.fit(series, id_col='index', time_col='ds', target_col='y', static_features=['static_0'])
Forecast(models=[LGBMRegressor, XGBRegressor, RandomForestRegressor], freq=<Day>, lag_features=['lag-7', 'lag-14', 'expanding_mean_lag-1', 'rolling_mean_lag-7_window_size-7'], date_features=['dayofweek'], num_threads=1)
To get the forecasts for the next 14 days, call predict(14) on the forecast object. This will automatically handle the updates required by the features using a recursive strategy.
predictions = fcst.predict(14)

import matplotlib.pyplot as plt
import pandas as pd

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(12, 6), gridspec_kw=dict(hspace=0.3))
for i, (cat, axi) in enumerate(zip(series.index.categories, ax.flat)):
    pd.concat([series.loc[cat, ['ds', 'y']], predictions.loc[cat]]).set_index('ds').plot(ax=axi)
    axi.set(title=cat, xlabel=None)
    if i % 2 == 0:
        axi.legend().remove()
    else:
        axi.legend(bbox_to_anchor=(1.01, 1.0))
fig.savefig('figs/index.png', bbox_inches='tight')
plt.close()
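The recursive strategy mentioned for predict can be sketched in a few lines: predict one step, append the prediction to the history, rebuild the lag features, and repeat. The function recursive_forecast and the stand-in model linear_extrapolation are hypothetical names used only for illustration. They do not reproduce how mlforecast updates its features internally.

```python
def recursive_forecast(history, predict_one, horizon, lags):
    """Forecast `horizon` steps by feeding each prediction back as history."""
    extended = list(history)  # copy so the caller's data is not modified
    forecasts = []
    for _ in range(horizon):
        # Lag features for the next, still unobserved, step.
        features = [extended[-k] for k in lags]
        y_hat = predict_one(features)
        forecasts.append(y_hat)
        extended.append(y_hat)  # later steps see this value as a lag
    return forecasts


def linear_extrapolation(features):
    """Hypothetical stand-in for a trained regressor: repeat the last change."""
    lag1, lag2 = features
    return lag1 + (lag1 - lag2)


history = [2.0, 4.0, 6.0, 8.0]
forecasts = recursive_forecast(history, linear_extrapolation, horizon=3, lags=[1, 2])
print(forecasts)  # [10.0, 12.0, 14.0]
```

Since later steps are computed from earlier predictions rather than observed values, errors can accumulate as the horizon grows.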
Hashes for mlforecast-0.3.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | fe182cae6952528e7a7eee48667877acc67a6711834a50cb469722bb8484aaa2 |
MD5 | a88b9c609e74cb22b2c87dab20b88f35 |
BLAKE2b-256 | 2de07201d75d9d10d2fa1ace78daed1c0566e538ed166a547cfc4fecdd50fc61 |