Skip to main content

scikit-learn compatible toolbox for supervised learning with time-series/panel data

Project description

.. image:: https://travis-ci.com/alan-turing-institute/sktime.svg?token=kTo6WTfr4f458q1WzPCH&branch=master
:target: https://travis-ci.com/alan-turing-institute/sktime
sktime
======

A `scikit-learn <https://github.com/scikit-learn/scikit-learn>`_ compatible Python toolbox for learning with
time series and panel data. Eventually, we would like to support:

* Time series classification and regression,
* Classical forecasting,
* Supervised/panel forecasting,
* Time series segmentation,
* Time-to-event and event risk modelling,
* Unsupervised tasks such as motif discovery and anomaly detection, and diagnostic visualization,
* On-line and streaming tasks, e.g., in variation of the above.

The package is under active development. Development takes place in the `sktime <https://github.com/alan-turing-institute/sktime>`_ repository on Github.

Currently, modular modelling workflows for supervised learning with time series have been implemented.
As next steps, we will move to forecasting and integration of a modified `pysf <https://github.com/alan-turing-institute/pysf/>`_ interface for forecasting and supervised forecasting.


Installation
------------
The package is currently not feature stable, and thus not available directly via PyPI. In the interim, please follow these steps to install the development version:

1. Download the repository if you have not done so already: :code:`git clone https://github.com/alan-turing-institute/sktime.git`
2. Move into the root directory: :code:`cd sktime`
3. Make sure your local version is up-to-date: :code:`git pull`
4. Optionally, activate destination environment for package, e.g. with conda: :code:`conda activate <env>`
5. Install package: :code:`pip install .`


Overview
--------

High-level interface
~~~~~~~~~~~~~~~~~~~~
There are numerous different time series data related learning tasks, for example

* Time series classification and regression,
* Classical forecasting,
* Supervised/panel forecasting,
* Time series segmentation.

The sktime high-level interface aims to create a unified interface for these different learning tasks (partially inspired by the APIs of mlr and openML) through the following two objects:

* :code:`Task` object that encapsulates meta-data from a dataset and the necessary information about the particular supervised learning task, e.g. the instructions on how to derive the target/labels for classification from the data,
* :code:`Strategy` objects that wrap low-level estimators and allows to use :code:`fit` and :code:`predict` methods using data and a task object.


Low-level interface
~~~~~~~~~~~~~~~~~~~
The low-level interface extends the standard scikit-learn API to handle time series and panel data.
Currently, the package implements:

* Various state-of-the-art approaches to supervised learning with time series features,
* Transformation of time series, including series-to-series transforms (e.g. Fourier transform), series-to-primitives transforms aka feature extractors, (e.g. mean, variance), sub-divided into fittables (on table) and row-wise applicates,
* Pipelining, allowing to chain multiple transformers with a final estimator,
* Meta-learning strategies including tuning and ensembling, accepting pipelines as the base estimator,
* Off-shelf composites strategies, such as a fully customisable random forest for time-series classification, with interval segmentation and feature extraction.


Documentation
-------------
The full API documentation and an introduction can be found `here <https://alan-turing-institute.github.io/sktime/>`_.
Tutorial notebooks for currently stable functionality are `here <https://github.com/alan-turing-institute/sktime/tree/master/examples>`_.


Development road map
--------------------
1. Functionality for the advanced time series tasks. For (supervised) forecasting, integration of a modified `pysf <https://github.com/alan-turing-institute/pysf/>`_ interface. For time-to-event and event risk modell, integration of an adapted `pysf <https://github.com/alan-turing-institute/skpro/>`_ interface.
2. Extension of high-level interface to classical and supervised/panel forecasting, to include reduction strategies in which forecasting or supervised forecasting tasks are reduced to tasks that can be solved with classical supervised learning algorithms or time series classification/regression,
3. Integration of algorithms for classical forecasting (e.g. ARIMA), deep learning strategies, and third-party feature extraction tools,
4. Design and implementation of specialised data-container for efficient handling of time series/panel data in a supervised learning workflow and separation of time series meta-data, re-utilising existing data-containers whenever possible,
5. Automated benchmarking functionality including orchestration of experiments and post-hoc evaluation methods, based on the `mlaut <https://github.com/alan-turing-institute/pysf/>`_ design.

Contributors
------------
Former and current active contributors are as follows.

Project management: Jason Lines (@jasonlines), Franz J Kiraly (@fkiraly)

Design: Anthony Bagnall, Sajaysurya Ganesh (@sajaysurya), Jason Lines (@jasonlines), Viktor Kazakov (@viktorkaz), Franz J Kiraly (@fkiraly), Markus Löning (@mloning)

Coding: Sajaysurya Ganesh (@sajaysurya), Jason Lines (@jasonlines), Viktor Kazakov (@viktorkaz), Markus Löning (@mloning)

We are actively looking for contributors. Please contact @fkiraly or @jasonlines for volunteering or information on paid opportunities, or simply raise an issue in the tracker.


Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sktime-0.1.0.tar.gz (1.3 MB view details)

Uploaded Source

Built Distribution

sktime-0.1.0-py3-none-any.whl (548.6 kB view details)

Uploaded Python 3

File details

Details for the file sktime-0.1.0.tar.gz.

File metadata

  • Download URL: sktime-0.1.0.tar.gz
  • Upload date:
  • Size: 1.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for sktime-0.1.0.tar.gz
Algorithm Hash digest
SHA256 e284709b84f6e02ca24d831222df1dd949fbb83a5cb9d14b1ba541a509e94678
MD5 0b7c7b2e89a7ad94bbd2c9a681e7d8aa
BLAKE2b-256 4c463c58ca993091cda71bd1c8493b18493109510e0bf29d7e06b12b240e43e8

See more details on using hashes here.

File details

Details for the file sktime-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: sktime-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 548.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for sktime-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 64679594a610c41866f993ee1f39e769252ee1aba1bf5876774383788f607e35
MD5 ae8ce266457c808775d406624d7b98ea
BLAKE2b-256 1d0d479992cb0477403238c6248be39493cee9fc9b1c0c41e16e2db0afe279ca

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page