tsfresh extracts relevant characteristics from time series

Project description

tsfresh

This repository contains the TSFRESH Python package. The abbreviation stands for “Time Series Feature extraction based on scalable hypothesis tests”.

The package contains many feature extraction methods and a robust feature selection algorithm.

Spend less time on feature engineering

Data scientists often spend most of their time either cleaning data or building features. While we cannot change the former, the latter can be automated. TSFRESH extracts features automatically and so frees the time you would otherwise spend building them. Hence, you have more time to study the newest deep learning paper, read Hacker News or build better models.

Automatic extraction of 100s of features

TSFRESH automatically extracts 100s of features from time series. Those features describe basic characteristics of the time series, such as the number of peaks or the average or maximal value, as well as more complex characteristics such as the time reversal symmetry statistic.

The features extracted from an exemplary time series

The set of features can then be used to construct statistical or machine learning models on the time series, for example for regression or classification tasks.
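To make the kinds of features named above concrete, here is a minimal pure-Python sketch that maps one time series to a handful of values. It is not the TSFRESH implementation: the function names (`number_of_peaks`, `time_reversal_asymmetry`, `extract_basic_features`) are hypothetical, the peak count is simplified to strict local maxima, and the time reversal statistic is assumed to follow the common definition as the mean of x[t+2·lag]²·x[t+lag] − x[t+lag]·x[t]².

```python
from statistics import mean


def number_of_peaks(values):
    """Count strict local maxima: points larger than both direct neighbours."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )


def time_reversal_asymmetry(values, lag=1):
    """Mean of x[t+2*lag]**2 * x[t+lag] - x[t+lag] * x[t]**2 over all valid t.

    Returns 0.0 when the series is too short for the given lag.
    (Assumed definition, used here for illustration only.)
    """
    n = len(values) - 2 * lag
    if n <= 0:
        return 0.0
    total = sum(
        values[t + 2 * lag] ** 2 * values[t + lag]
        - values[t + lag] * values[t] ** 2
        for t in range(n)
    )
    return total / n


def extract_basic_features(values, lag=1):
    """Map one time series to a flat dictionary of feature name -> value."""
    return {
        "mean": mean(values),
        "maximum": max(values),
        "number_of_peaks": number_of_peaks(values),
        "time_reversal_asymmetry": time_reversal_asymmetry(values, lag),
    }


# Illustrative data, not taken from the package.
series = [1, 3, 2, 5, 4, 4, 6, 1]
features = extract_basic_features(series)
print(features)
```

Each series becomes one row of numbers, and that fixed-length row is what lets ordinary regression or classification models work on time series of different lengths.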

Forget irrelevant features

Time series often contain noise, redundancies or irrelevant information. As a result, most of the extracted features will not be useful for the machine learning task at hand.

To filter out irrelevant features, the TSFRESH package has a built-in filtering procedure. This procedure evaluates the explanatory power and importance of each feature for the regression or classification task at hand.

It is based on the well-developed theory of hypothesis testing and uses a multiple test procedure. As a result, the filtering process mathematically controls the percentage of irrelevant extracted features.
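As an illustration of how a multiple test procedure can bound the share of irrelevant features, the sketch below applies the Benjamini-Yekutieli procedure to a set of per-feature p-values. This is an assumption made for the example: the text above does not name the procedure used, and the function name `select_relevant_features`, the feature names and the p-values are all hypothetical. Each p-value is assumed to come from a test whose null hypothesis is "this feature is irrelevant for the target".

```python
def select_relevant_features(p_values, fdr_level=0.05):
    """Return the feature names kept by the Benjamini-Yekutieli procedure.

    p_values maps a feature name to the p-value of its relevance test.
    fdr_level is the tolerated expected share of irrelevant features
    among the selected ones.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic correction: keeps the bound valid when the tests are dependent.
    harmonic = sum(1.0 / i for i in range(1, m + 1))
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    # Find the largest rank k with p_(k) <= k * fdr_level / (m * harmonic).
    cutoff = 0
    for rank, (_, p_value) in enumerate(ranked, start=1):
        if p_value <= rank * fdr_level / (m * harmonic):
            cutoff = rank
    # Keep every feature up to and including that rank.
    return [name for name, _ in ranked[:cutoff]]


# Hypothetical p-values, for illustration only.
example_p_values = {
    "mean": 0.001,
    "number_of_peaks": 0.008,
    "maximum": 0.039,
    "variance": 0.041,
    "noise_level": 0.6,
}
selected = select_relevant_features(example_p_values)
print(selected)
```

Note that `maximum` and `variance` are dropped even though their p-values are below 0.05: the threshold tightens with the number of features tested, which is what keeps the expected share of irrelevant features under control.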

The TSFRESH package is described in the following open access paper:

  • Christ, M., Braun, N., Neuffer, J. and Kempa-Liehr, A.W. (2018). Time Series FeatuRe Extraction on basis of Scalable Hypothesis tests (tsfresh – A Python package). Neurocomputing 307, 72-77, https://doi.org/10.1016/j.neucom.2018.03.067.

The FRESH algorithm is described in the following whitepaper:

  • Christ, M., Kempa-Liehr, A.W. and Feindt, M. (2017). Distributed and parallel time series feature extraction for industrial big data applications. ArXiv e-print 1610.07717, https://arxiv.org/abs/1610.07717.

Advantages of tsfresh

TSFRESH has several selling points, for example:

  1. it is field tested

  2. it is unit tested

  3. the filtering process is statistically/mathematically correct

  4. it has comprehensive documentation

  5. it is compatible with sklearn, pandas and numpy

  6. it allows anyone to easily add their favorite features

Next steps

If you are interested in the technical workings, see our comprehensive Read the Docs documentation at http://tsfresh.readthedocs.io.

The algorithm, especially the filtering part, is also described in the paper mentioned above.

If you have questions or feedback, you can find the developers in the Gitter chatroom.

We appreciate any contributions. If you are interested in helping us make TSFRESH the biggest archive of feature extraction methods in Python, just head over to our How-To-Contribute instructions.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tsfresh-0.11.1.tar.gz (3.4 MB)

Uploaded Source

Built Distribution

tsfresh-0.11.1-py2.py3-none-any.whl (1.2 MB)

Uploaded Python 2 Python 3

File details

Details for the file tsfresh-0.11.1.tar.gz.

File metadata

  • Download URL: tsfresh-0.11.1.tar.gz
  • Upload date:
  • Size: 3.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.25.0 CPython/3.5.2

File hashes

Hashes for tsfresh-0.11.1.tar.gz
Algorithm Hash digest
SHA256 f380d8dadc14ef854cb154d301e7c773c499097c1b24c572443d929261fc45dc
MD5 1f4b87c3f3ff5394a23b81ac9a67747d
BLAKE2b-256 3bea0da7e36adb9a005a9f07a82c950fd8542c6677068f5a1269db159576d2cf
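
A downloaded file can be checked against the digests listed above. The following is a minimal sketch using only the Python standard library. The helper names `file_digest` and `matches_published_hash` are hypothetical, and the archive is assumed to sit in the current working directory.

```python
import hashlib
import os


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Hex digest of a file, read in chunks so large archives use little memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_hash(path, expected_hex, algorithm="sha256"):
    """True if the file's digest equals the published one (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


if __name__ == "__main__":
    # File name and SHA256 digest as listed above; the location is an assumption.
    archive = "tsfresh-0.11.1.tar.gz"
    published = "f380d8dadc14ef854cb154d301e7c773c499097c1b24c572443d929261fc45dc"
    if os.path.exists(archive):
        ok = matches_published_hash(archive, published)
        print("hash ok" if ok else "HASH MISMATCH")
    else:
        print("archive not found, nothing to verify")
```

Passing `algorithm="md5"` checks the MD5 digest in the same way, although SHA256 is the stronger check.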

File details

Details for the file tsfresh-0.11.1-py2.py3-none-any.whl.

File metadata

  • Download URL: tsfresh-0.11.1-py2.py3-none-any.whl
  • Upload date:
  • Size: 1.2 MB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.25.0 CPython/3.5.2

File hashes

Hashes for tsfresh-0.11.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 50c49495d4c4b1b1d6eedb1e49274e1cf47f63d51e3347c0403f3cc2cd0d17f3
MD5 0b9a8178d2eebf4a0c4501a15d266baf
BLAKE2b-256 2f32265c651f4fd70751f5ada348af0f9e322b058eddcda6a6f9bb305c8d270a
