Mozilla simpleprophet forecast framework.

simpleprophet KPI forecasting models and tools

This directory contains the simpleprophet forecast models and accompanying tools. The simpleprophet models prioritize simplicity and stability, adding complexity and "bendiness" only when it significantly improves model performance on a holdback period.

Setup

You will need a Python environment with fbprophet and a few other dependencies installed. We provide a Docker image that can be pulled from GCR (Google Container Registry) and run interactively:

# Host path to your Google Cloud credentials JSON file
GOOGLE_APPLICATION_CREDENTIALS=/path/to/creds.json
docker run -it \
  -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/keys/key.json \
  -v "$GOOGLE_APPLICATION_CREDENTIALS":/tmp/keys/key.json:ro \
  --entrypoint python \
  gcr.io/moz-fx-data-forecasting/simpleprophet

Or you can make code updates and build the image locally:

# Assumes GOOGLE_APPLICATION_CREDENTIALS is already set to the host path of your credentials file
docker build . --tag simpleprophet
docker run -it \
  -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/keys/key.json \
  -v "$GOOGLE_APPLICATION_CREDENTIALS":/tmp/keys/key.json:ro \
  --entrypoint python \
  simpleprophet

If you want a local environment outside Docker, you can create an appropriate Conda environment via conda env create -f environment.yml. To create an environment without Conda, follow the fbprophet installation instructions and then run pip install -r requirements.txt to install the remaining dependencies.
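
After setting up a local environment, a quick check that the dependencies can be located may save some debugging time. The following is a minimal sketch, not part of simpleprophet: the helper name missing_modules is hypothetical, and fbprophet is listed only because it is the one dependency this README names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "fbprophet" is the only dependency named in this README; extend the list
# with whatever requirements.txt specifies in your checkout.
missing = missing_modules(["fbprophet"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

Note that find_spec only confirms a module can be located; it does not prove the module imports cleanly.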

Usage

The functions for running the forecasting pipeline are in pipeline.py. The update_table function can be run daily to add rows to the output table as necessary to incorporate newly available metrics data. The replace_table function will clear the output table and regenerate it from scratch.
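
The difference between an incremental update and a full rebuild can be illustrated with a small in-memory sketch. This is not the code in pipeline.py, and the real signatures of update_table and replace_table are not shown here: the dictionary standing in for the output table and the names forecast_for, update_rows, and replace_rows are all hypothetical.

```python
from datetime import date, timedelta


def forecast_for(day):
    """Hypothetical stand-in for a real model's output for one day."""
    return {"ds": day, "yhat": 100.0}


def missing_dates(table, first_day, last_day):
    """Return the dates in [first_day, last_day] that have no row in the table."""
    n_days = (last_day - first_day).days + 1
    all_days = [first_day + timedelta(days=i) for i in range(n_days)]
    return [d for d in all_days if d not in table]


def update_rows(table, first_day, last_day):
    """Incremental update: add rows only for dates that are not present yet."""
    added = missing_dates(table, first_day, last_day)
    for d in added:
        table[d] = forecast_for(d)
    return added


def replace_rows(table, first_day, last_day):
    """Full rebuild: clear the table, then regenerate every row."""
    table.clear()
    return update_rows(table, first_day, last_day)


table = {}
update_rows(table, date(2019, 1, 1), date(2019, 1, 3))
# A later run covering more days adds only the rows that are new.
added = update_rows(table, date(2019, 1, 1), date(2019, 1, 5))
```

The incremental path is cheap enough to schedule daily, while the rebuild path is the one to reach for after a model specification changes and old rows are no longer valid.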

For model building, the code in modeling.py may be useful. It includes a function that evaluates a model on a holdout set and provides some useful visualizations.
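
As a rough illustration of holdout evaluation, the sketch below splits a series, scores two toy models on the held-out tail, and picks the one with the lower error. It does not reproduce the function in modeling.py: the helper names, the toy models, and the made-up series are assumptions chosen to keep the example self-contained.

```python
def split_holdout(series, holdout_size):
    """Split a time-ordered list into (training, holdout) parts."""
    if not 0 < holdout_size < len(series):
        raise ValueError("holdout_size must be between 1 and len(series) - 1")
    return series[:-holdout_size], series[-holdout_size:]


def mape(actuals, forecasts):
    """Mean absolute percentage error as a fraction (0.05 means 5%)."""
    return sum(abs(f - a) / abs(a) for a, f in zip(actuals, forecasts)) / len(actuals)


def last_value_forecast(train, horizon):
    """Toy model: repeat the last training value."""
    return [train[-1]] * horizon


def mean_forecast(train, horizon):
    """Toy model: repeat the training mean."""
    return [sum(train) / len(train)] * horizon


# Made-up values for illustration only.
series = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0, 114.0]
train, holdout = split_holdout(series, holdout_size=2)
scores = {
    model.__name__: mape(holdout, model(train, len(holdout)))
    for model in (last_value_forecast, mean_forecast)
}
best = min(scores, key=scores.get)
```

The models never see the holdout values during fitting, so the scores approximate how each would have performed on genuinely future data.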

The validations.py file contains code that produces plots to evaluate model performance and to validate that the model's behavior over time is reasonable. It can also be used to compare multiple models.

Modeling Strategy

The models.py file contains the production model specifications. The models were developed by Jesse McCrosky using the fbprophet framework. The guiding philosophy was to favor simplicity and intuitive fit, while informing the modeling process with conventional quantitative evidence.

Evaluating a forecast is fundamentally multi-dimensional. Stability, accuracy, and lack of bias are competing objectives, and each of them can be evaluated on multiple time horizons. This makes a pure machine learning optimization approach extremely complex.
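
To make the multi-dimensional point concrete, here is a minimal sketch that reports accuracy and bias at more than one horizon for a single forecast. The metric definitions (mean absolute percentage error for accuracy, mean signed percentage error for bias) and the function names are assumptions for illustration, not necessarily the measures used for the production models. Stability, which requires comparing forecasts made at different times, is not covered.

```python
def percent_errors(actuals, forecasts):
    """Signed percentage errors: positive means the forecast was too high."""
    return [(f - a) / a for a, f in zip(actuals, forecasts)]


def summarize_by_horizon(actuals, forecasts, horizons):
    """Summarize errors over the first h forecast days for each horizon h.

    Horizons are assumed to lie between 1 and len(actuals).
    Returns {h: {"mape": ..., "bias": ...}}.
    """
    errors = percent_errors(actuals, forecasts)
    summary = {}
    for h in horizons:
        window = errors[:h]
        summary[h] = {
            "mape": sum(abs(e) for e in window) / len(window),
            "bias": sum(window) / len(window),
        }
    return summary


# Made-up values for illustration only.
actuals = [100.0, 100.0, 100.0, 100.0]
forecasts = [110.0, 90.0, 105.0, 105.0]
summary = summarize_by_horizon(actuals, forecasts, horizons=[2, 4])
```

In this toy data the forecast is unbiased but inaccurate over two days, and more accurate but biased upward over four, which is the kind of trade-off that resists a single optimization target.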

As an alternative, I chose to fit the models somewhat intuitively. Parameter sets were explored iteratively and each iteration was evaluated visually to see if the model components (seasonality and trend) seemed to fit the observed actuals. Once a reasonable parameter space was defined, the modeling process proceeded to evaluate holdout set metrics and other quantitative evaluations on a set of models. Simpler models were preferred and complexity was only added when clearly justified by the quantitative evidence.

A few relevant model characteristics:

  • Due to the "smoothed" nature of MAU as a metric, yearly seasonality was adequate to capture all holiday effects except for Easter, which was included as a model component.
  • We select a start date for the training data based on the point where the metric appears to have reached a somewhat steady state in its development. The first weeks of most metrics are quite atypical, and using them for training would not be helpful.
  • Similarly, some product metrics have "anomalies" - periods during which the metric value was highly atypical, usually due to a data problem. These periods were excluded from training data.
  • The appropriate start dates and anomaly periods were determined through manual examination of metric plots.
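
The data preparation described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code in models.py: the function names, the row format, and the dates are hypothetical. The Easter calculation uses the well-known anonymous Gregorian algorithm for Western Easter.

```python
from datetime import date


def easter_sunday(year):
    """Date of Western (Gregorian) Easter Sunday for the given year."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    ell = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * ell) // 451
    month, day = divmod(h + ell - 7 * m + 114, 31)
    return date(year, month, day + 1)


def select_training_rows(rows, start_date, anomaly_periods):
    """Keep rows on or after start_date that fall outside every anomaly period.

    rows: list of (date, value) tuples.
    anomaly_periods: list of (first_day, last_day) tuples, both ends inclusive.
    """
    def is_anomalous(day):
        return any(first <= day <= last for first, last in anomaly_periods)

    return [(day, value) for day, value in rows
            if day >= start_date and not is_anomalous(day)]


# Made-up values and dates for illustration only.
rows = [(date(2019, 1, d), 1000.0 + d) for d in range(1, 11)]
training = select_training_rows(
    rows,
    start_date=date(2019, 1, 3),
    anomaly_periods=[(date(2019, 1, 5), date(2019, 1, 6))],
)
easter_dates = [easter_sunday(y) for y in (2018, 2019, 2020)]
```

Prophet accepts custom holidays as a DataFrame with holiday and ds columns, so a list like easter_dates could be supplied that way; whether the production models do exactly this is not stated here.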

For more information, contact jmccrosky@mozilla.com
