Scalable time series features computation
FastTSFeatures
Time-series feature extraction as a service. FastTSFeatures is an SDK to compute static, temporal and calendar variables as a service.
The package serves as a wrapper for tsfresh and tsfeatures. Since we take care of the whole infrastructure, feature extraction becomes as easy as running a line in your Python notebooks or calling an API.
Why?
We built FastTSFeatures because we wanted an easy and fast way to extract time series features without having to think about infrastructure and deployment. Now we want to see if other data scientists find it useful too.
Available Features (More than 600)
Static Features
- 40+ Features: https://github.com/Nixtla/tsfeatures
- 600+ Features: https://github.com/blue-yonder/tsfresh/
Temporal Features
- 10 Temporal Features (lags, mean lags, std_lags) [Currently just supported for daily data]
Calendar Features (distance in minutes to holidays)
- Calendar features for 83 Countries https://github.com/dr-prodigy/python-holidays
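To make the temporal and calendar features listed above more concrete, here is a minimal local sketch in plain Python. It is not the service's implementation: the function names `lag_features` and `minutes_to_nearest_holiday`, the window size, and the exact definitions of "mean lag" and "std lag" are assumptions made for illustration only.

```python
from datetime import datetime
from statistics import mean, pstdev


def lag_features(values, lag=1, window=3):
    """Return, per position, the lagged value plus the mean and std of a
    window of past values ending at the lagged position.

    Positions without enough history are reported as None.
    """
    rows = []
    for i in range(len(values)):
        # Value observed `lag` steps before position i (None if unavailable).
        lagged = values[i - lag] if i - lag >= 0 else None
        # Window of `window` values ending at the lagged position.
        start, end = i - lag - window + 1, i - lag + 1
        if start >= 0:
            win = values[start:end]
            rows.append({"lag": lagged, "mean_lag": mean(win), "std_lag": pstdev(win)})
        else:
            rows.append({"lag": lagged, "mean_lag": None, "std_lag": None})
    return rows


def minutes_to_nearest_holiday(timestamp, holidays):
    """Absolute distance in minutes from `timestamp` to the closest holiday."""
    return min(abs((timestamp - h).total_seconds()) / 60 for h in holidays)


sales = [10, 12, 11, 15, 14]  # hypothetical daily series
features = lag_features(sales, lag=1, window=3)

new_year = datetime(2021, 1, 1)
distance = minutes_to_nearest_holiday(datetime(2020, 12, 31, 12, 0), [new_year])
```

In this sketch the first three rows lack enough history for the window statistics, which is one reason a real pipeline has to decide how to treat the start of each series.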
API
For API documentation visit [PENDING]
Install
pip install fasttsfeatures
How to use
1. Request free trial
Request a free trial by sending an email to fede.garza.ramirez@gmail.com. You will get your API_KEY, API_ID and private URI.
2. Run fasttsfeatures on a public S3 Bucket (read and write permissions needed)
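The snippets in this guide read the credentials from environment variables. A small helper like the one below can fail early with a clear message when one of them is missing. This is only a sketch: `load_credentials` is a hypothetical helper, not part of fasttsfeatures.

```python
import os


def load_credentials(required=("API_ID", "API_KEY")):
    """Collect the required settings from the environment.

    Raises RuntimeError listing every variable that is unset or empty.
    """
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: os.environ[name] for name in required}
```

For the private bucket flow, the same helper could be called with `required=("API_ID", "API_KEY", "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")`.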
- Import and instantiate TSFeatures. Introduce your API_ID and API_KEY.
import os
from fasttsfeatures.core import TSFeatures
# Credentials are read from environment variables
tsfeatures = TSFeatures(api_id=os.environ['API_ID'],
                        api_key=os.environ['API_KEY'])
- Run the process, passing the public S3 URI.
# Run temporal features
response_tmp_ft = tsfeatures.calculate_temporal_features_from_s3_uri(s3_uri="PUBLIC S3 URI HERE",
                                                                     freq=7)
# Run static features
response_static_ft = tsfeatures.calculate_static_features_from_s3_uri(s3_uri="PUBLIC S3 URI HERE", freq=7)
# Run calendar features
response_cal_ft = tsfeatures.calculate_calendar_features_from_s3_uri(s3_uri="PUBLIC S3 URI HERE",
                                                                     country="USA")
response = response_tmp_ft  # inspect any one of the three responses
display_df(response)
| | status | body | id | message |
|---|---|---|---|---|
| 0 | 200 | "s3://nixtla-user-test/features/features.csv" | f7bdb6dc-dcdb-4d87-87e8-b5428e4c98db | Check job status at GET /tsfeatures/jobs/{job_id} |
- Monitor the process with the following code. Once it's done, access your bucket to download the generated features.
job_id = response['id'].item()
display(tsfeatures.get_status(job_id))
| | status | processing_time_seconds |
|---|---|---|
| 0 | InProgress | 3 |
Once the process is done you will find a file for each process you ran in the URI we provided.
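Instead of checking the status by hand, the job can be polled in a loop. The following is a generic sketch, not part of the SDK: `wait_for_job` is a hypothetical helper that takes any callable returning the current status as a string. Only the `InProgress` state appears in the output above, so the names of the terminal states are not assumed here; the function simply returns the first status that is not in-progress.

```python
import time


def wait_for_job(get_status, job_id, poll_seconds=5.0, timeout_seconds=600.0,
                 in_progress=("InProgress",)):
    """Poll `get_status(job_id)` until it stops reporting an in-progress state.

    Returns the first status not listed in `in_progress`.
    Raises TimeoutError if the job is still in progress at the deadline.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        if status not in in_progress:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {status} after {timeout_seconds} s")
        time.sleep(poll_seconds)
```

With the SDK, `get_status` could be something like `lambda j: tsfeatures.get_status(j)['status'].item()`, assuming the returned table exposes a `status` column as in the output above.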
3. Run fasttsfeatures on a private S3 Bucket
To run fasttsfeatures on a private S3 Bucket, you have to upload your data to a private S3 Bucket that we will provide for you. You can do this either from Python or with the AWS Console.
3.1 Case 1: Upload to S3 from Python
You will need the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY that we provided.
- Import and instantiate TSFeatures. Introduce your API_ID, API_KEY, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
import os
from fasttsfeatures.core import TSFeatures
# Credentials are read from environment variables
tsfeatures = TSFeatures(api_id=os.environ['API_ID'],
                        api_key=os.environ['API_KEY'],
                        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
                        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'])
- Upload your local file, introducing its name and the bucket's name (provided by Nixtla).
s3_uri = tsfeatures.upload_to_s3('../train.csv', 'PROVIDED URI GOES HERE')
- Run the process, passing the S3 URI returned by the upload.
# Run temporal features
response_tmp_ft = tsfeatures.calculate_temporal_features_from_s3_uri(s3_uri=s3_uri,
                                                                     freq=7)
# Run static features
response_static_ft = tsfeatures.calculate_static_features_from_s3_uri(s3_uri=s3_uri, freq=7)
# Run calendar features
response_cal_ft = tsfeatures.calculate_calendar_features_from_s3_uri(s3_uri=s3_uri,
                                                                     country="USA")
Once the process is done you will find a file for each process you ran in the URI we provided.
3.2 Case 2: Upload to S3 Manually using the S3 Console
A. Upload your dataset
- Access the URL provided by Nixtla. You'll see a login page; just enter your user and password.
- Next you'll see the bucket where you can upload your dataset.
- Upload your dataset and copy its S3 URI.
B. Run the process
- Import the library.
import os
from fasttsfeatures.core import TSFeatures
- Instantiate TSFeatures. Introduce your api_id and api_key.
tsfeatures = TSFeatures(api_id=os.environ['API_ID'],
api_key=os.environ['API_KEY'])
- Run the process, passing the S3 URI of your dataset.
# Run temporal features
response_tmp_ft = tsfeatures.calculate_temporal_features_from_s3_uri(s3_uri="PUBLIC S3 URI HERE",
                                                                     freq=7)
# Run static features
response_static_ft = tsfeatures.calculate_static_features_from_s3_uri(s3_uri="PUBLIC S3 URI HERE", freq=7)
# Run calendar features
response_cal_ft = tsfeatures.calculate_calendar_features_from_s3_uri(s3_uri="PUBLIC S3 URI HERE",
                                                                     country="USA")
response = response_tmp_ft  # inspect any one of the three responses
display_df(response)
| | status | body | id | message |
|---|---|---|---|---|
| 0 | 200 | "s3://tsfeatures-api-public/features/features.csv" | 740a410a-d138-41b4-8373-581710f020f8 | Check job status at GET /tsfeatures/jobs/{job_id} |
- Monitor the process with the following code. Once it's done, access your bucket to download the generated features.
job_id = response['id'].item()
display(tsfeatures.get_status(job_id))
| | status | processing_time_seconds |
|---|---|---|
| 0 | InProgress | 20 |
Once the process is done you will find a file for each process you ran in the URI we provided.
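The `body` field of the response holds the S3 URI of the generated file. Most S3 clients expect the bucket and the object key as separate arguments, so a download script typically needs to split that URI first. A minimal sketch using only the standard library follows; `split_s3_uri` is a hypothetical helper, not part of fasttsfeatures.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/path/to/file' into (bucket, key)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    # The path starts with '/', which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


# URI taken from the example response shown earlier in this section
bucket, key = split_s3_uri("s3://tsfeatures-api-public/features/features.csv")
```

The resulting `bucket` and `key` can then be handed to whichever S3 client you use to fetch the file.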
ToDos
- Optimizing writing and reading speed with Parquet files
- Making temporal features available for different granularities
- Fill zeros (For Data where 0 values are not reported, e.g. Retail Data)
- Empirical benchmarking of model improvement
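The "fill zeros" item is not implemented yet. Until it is, a similar effect can be obtained locally before uploading the data. The sketch below shows one possible approach for daily data; the function name `fill_missing_days` and the dictionary-based input format are assumptions for illustration, not the planned design.

```python
from datetime import date, timedelta


def fill_missing_days(observations, fill_value=0):
    """Return a gap-free daily series, using `fill_value` for unreported days.

    `observations` maps a date to its reported value.
    """
    if not observations:
        return {}
    first, last = min(observations), max(observations)
    n_days = (last - first).days + 1
    all_days = (first + timedelta(days=i) for i in range(n_days))
    return {day: observations.get(day, fill_value) for day in all_days}


# Hypothetical retail series where days without sales were not reported
sales = {date(2021, 3, 1): 4, date(2021, 3, 4): 2}
filled = fill_missing_days(sales)
```

Filling the gaps matters for lag-based features, because a lag of one step should refer to the previous day rather than to the previous reported row.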