
scikit-learn compatible toolbox for learning with time series/panel data

Project description


sktime

A scikit-learn compatible Python toolbox for learning with time series and panel data. Eventually, we would like to support:

  • Time series classification and regression,

  • Classical forecasting,

  • Supervised/panel forecasting,

  • Time series segmentation,

  • Time-to-event and event risk modelling,

  • Unsupervised tasks such as motif discovery, anomaly detection and diagnostic visualization,

  • Online and streaming tasks, e.g. as variations of the above.

For deep learning methods, we have a separate extension package: sktime-dl.

The package is under active development. Development takes place in the sktime repository on GitHub.

Currently, modular modelling workflows for forecasting and supervised learning with time series have been implemented. The next steps are supervised forecasting, integration of a modified pysf interface, and extensions to the existing frameworks.

Installation

The package is available via PyPI using:

pip install sktime

Note that the package is under active development and is not yet feature-stable.

Development version

To install the development version, follow these steps:

  1. Download the repository: git clone https://github.com/alan-turing-institute/sktime.git

  2. Move into the root directory of the repository: cd sktime

  3. Switch to development branch: git checkout dev

  4. Make sure your local version is up-to-date: git pull

  5. Install package: pip install .

You may currently need to install numpy and Cython first, using pip install numpy and pip install Cython.
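
Before running the install step, it can help to check which of these packages Python can already find. The following is a minimal sketch using only the standard library; the helper name check_packages is made up for illustration and is not part of sktime.

```python
from importlib.util import find_spec


def check_packages(names=("numpy", "Cython", "sktime")):
    """Return a dict mapping each top-level package name to True if importable.

    check_packages is a hypothetical helper, not an sktime function.
    """
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_packages().items():
        print(f"{name}: {'found' if available else 'missing'}")
```

If numpy or Cython is reported as missing, install it with pip before running pip install . in the repository root.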

Overview

Low-level interface

The low-level interface extends the standard scikit-learn API to handle time series and panel data. Currently, the package implements:

  • Various state-of-the-art approaches to supervised learning with time series features,

  • Transformation of time series, including series-to-series transforms (e.g. Fourier transform) and series-to-primitives transforms, also known as feature extractors (e.g. mean, variance), sub-divided into fittable transforms (fitted on the whole table) and row-wise transforms (applied to each row independently),

  • Pipelining, which allows chaining multiple transformers with a final estimator,

  • Meta-learning strategies including tuning and ensembling, accepting pipelines as the base estimator,

  • Off-the-shelf composite strategies, such as a fully customisable random forest for time series classification, with interval segmentation and feature extraction,

  • Classical forecasting algorithms and reduction strategies to solve forecasting tasks with time series regression algorithms.
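
To make the ideas of interval segmentation and row-wise feature extraction concrete, here is a minimal sketch in plain Python. It does not use sktime's own classes; the function names (segment, extract_features, interval_features) and the toy panel are assumptions made for illustration only.

```python
from statistics import mean, pvariance


def extract_features(series):
    """Series-to-primitives transform: one series -> [mean, variance]."""
    return [mean(series), pvariance(series)]


def segment(series, n_intervals):
    """Split one series into n_intervals contiguous chunks of near-equal size."""
    if not 1 <= n_intervals <= len(series):
        raise ValueError("n_intervals must be between 1 and len(series)")
    size, rest = divmod(len(series), n_intervals)
    chunks, start = [], 0
    for i in range(n_intervals):
        # The first `rest` chunks get one extra point each.
        end = start + size + (1 if i < rest else 0)
        chunks.append(series[start:end])
        start = end
    return chunks


def interval_features(panel, n_intervals=2):
    """Row-wise transform: a panel (list of series) -> a table of features."""
    table = []
    for series in panel:
        row = []
        for chunk in segment(series, n_intervals):
            row.extend(extract_features(chunk))
        table.append(row)
    return table


# Toy panel with two series of four time points each (made-up data).
panel = [[1.0, 1.0, 3.0, 5.0], [2.0, 4.0, 4.0, 4.0]]
features = interval_features(panel, n_intervals=2)
print(features)  # [[1.0, 0.0, 4.0, 1.0], [3.0, 1.0, 4.0, 0.0]]
```

Each row of the resulting table holds the mean and variance of every interval, so it can be passed to any tabular estimator as the final step of a pipeline.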

High-level interface

There are many different learning tasks related to time series data, for example:

  • Time series classification and regression,

  • Classical forecasting,

  • Supervised/panel forecasting,

  • Time series segmentation.

The sktime high-level interface aims to create a unified interface for these different learning tasks (partially inspired by the APIs of mlr and openML) through the following two objects:

  • Task object that encapsulates meta-data from a dataset and the necessary information about the particular supervised learning task, e.g. the instructions on how to derive the target/labels for classification from the data,

  • Strategy objects that wrap low-level estimators and provide fit and predict methods that take data and a task object.
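
The division of labour between the two objects can be sketched as follows. This is an illustration of the design only, not sktime's actual classes: ClassificationTask, Strategy, MajorityClassifier and the toy data are hypothetical names introduced here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassificationTask:
    """Hypothetical task object: records which fields are features and target."""
    target: str
    features: tuple


class MajorityClassifier:
    """Toy low-level estimator with a scikit-learn style fit/predict."""

    def fit(self, X, y):
        # Remember the most frequent label seen during training.
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class Strategy:
    """Hypothetical strategy object: wraps an estimator and uses the task
    to split the data into features and target."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, task, data):
        self.task_ = task
        X = [[row[name] for name in task.features] for row in data]
        y = [row[task.target] for row in data]
        self.estimator.fit(X, y)
        return self

    def predict(self, data):
        X = [[row[name] for name in self.task_.features] for row in data]
        return self.estimator.predict(X)


# Made-up training data: each row holds one series and its class label.
train = [
    {"series": [1.0, 2.0, 3.0], "label": "up"},
    {"series": [2.0, 3.0, 4.0], "label": "up"},
    {"series": [3.0, 2.0, 1.0], "label": "down"},
]
task = ClassificationTask(target="label", features=("series",))
strategy = Strategy(MajorityClassifier()).fit(task, train)
predictions = strategy.predict([{"series": [5.0, 6.0, 7.0]}])
print(predictions)  # ['up']
```

Because the task carries the instructions for deriving the target, the same data can be reused with a different task without changing the estimator or the strategy.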

Documentation

The full API documentation and an introduction can be found here. Tutorial notebooks for currently stable functionality are in the examples folder.

Development road map

  1. Functionality for the advanced time series tasks. For (supervised) forecasting, integration of a modified pysf interface. For time-to-event and event risk modelling, integration of an adapted pysf interface.

  2. Extension of high-level interface to classical and supervised/panel forecasting, to include reduction strategies in which forecasting or supervised forecasting tasks are reduced to tasks that can be solved with classical supervised learning algorithms or time series classification/regression,

  3. Integration of algorithms for classical forecasting (e.g. ARIMA), deep learning strategies, and third-party feature extraction tools,

  4. Design and implementation of specialised data-container for efficient handling of time series/panel data in a supervised learning workflow and separation of time series meta-data, re-utilising existing data-containers whenever possible,

  5. Automated benchmarking functionality including orchestration of experiments and post-hoc evaluation methods, based on the mlaut design.
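
The reduction idea in item 2 can be illustrated with a small sketch: a sliding window turns one series into a feature table and a target vector, any tabular regressor is fitted on that table, and forecasts are produced recursively. The names below (sliding_windows, DriftRegressor, recursive_forecast) are assumptions for illustration, and the toy regressor stands in for a real supervised learning algorithm.

```python
from statistics import mean


def sliding_windows(series, window_length):
    """Reduce a series to a tabular regression problem (X, y)."""
    X, y = [], []
    for i in range(len(series) - window_length):
        X.append(series[i:i + window_length])
        y.append(series[i + window_length])
    return X, y


class DriftRegressor:
    """Toy tabular regressor: predicts the last window value plus the
    average step learned from the training table."""

    def fit(self, X, y):
        self.offset_ = mean(target - row[-1] for row, target in zip(X, y))
        return self

    def predict(self, X):
        return [row[-1] + self.offset_ for row in X]


def recursive_forecast(series, window_length, regressor, horizon):
    """Forecast `horizon` steps ahead, feeding each prediction back in."""
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        window = history[-window_length:]
        value = regressor.predict([window])[0]
        forecasts.append(value)
        history.append(value)
    return forecasts


# Made-up series with a constant upward step of 1.0.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
X, y = sliding_windows(series, window_length=2)
regressor = DriftRegressor().fit(X, y)
forecasts = recursive_forecast(series, 2, regressor, horizon=3)
print(forecasts)  # [6.0, 7.0, 8.0]
```

Any estimator with the same fit and predict signature could replace the toy regressor, which is the point of the reduction.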

Contributors

Former and current active contributors are as follows.

Project management: Jason Lines (@jasonlines), Franz Király (@fkiraly)

Design: Anthony Bagnall (@TonyBagnall), Sajaysurya Ganesh (@sajaysurya), Jason Lines (@jasonlines), Viktor Kazakov (@viktorkaz), Franz Király (@fkiraly), Markus Löning (@mloning)

Coding: Sajaysurya Ganesh (@sajaysurya), Anthony Bagnall (@TonyBagnall), Jason Lines (@jasonlines), George Oastler (@goastler), Viktor Kazakov (@viktorkaz), Markus Löning (@mloning)

We are actively looking for contributors. Please contact @fkiraly or @jasonlines to volunteer or for information on paid opportunities, or simply raise an issue in the tracker.

This version

0.3.0
