A Python engine for evaluating Altair transforms.

altair-transform

Python evaluation of Altair/Vega-Lite transforms.

altair-transform requires Python 3.6 or later. Install with:

$ pip install altair_transform

Altair-transform evaluates Altair and Vega-Lite transforms directly in Python. This can be useful in a number of contexts, illustrated in the examples below.

Example: Extracting Data

The Vega-Lite specification includes the ability to apply a wide range of transformations to input data within the chart specification. As an example, here is a sliding window average of a Gaussian random walk, implemented in Altair:

import altair as alt
import numpy as np
import pandas as pd

rand = np.random.RandomState(12345)

df = pd.DataFrame({
    'x': np.arange(200),
    'y': rand.randn(200).cumsum()
})

points = alt.Chart(df).mark_point().encode(
    x='x:Q',
    y='y:Q'
)

line = alt.Chart(df).transform_window(
    ymean='mean(y)',
    sort=[alt.SortField('x')],
    frame=[5, 5]
).mark_line(color='red').encode(
    x='x:Q',
    y='ymean:Q'
)

points + line

[Figure: Altair visualization]

Because the transform is evaluated by the renderer rather than in Python, however, the computed values are not directly accessible from the Python layer.

This is where altair_transform comes in. It includes a (nearly) complete Python implementation of Vega-Lite's transform layer, so that you can easily extract a pandas dataframe with the computed values shown in the chart:

from altair_transform import extract_data
data = extract_data(line)
data.head()
   x         y     ymean
0  0 -0.204708  0.457749
1  1  0.274236  0.771093
2  2 -0.245203  1.041320
3  3 -0.800933  1.336943
4  4  1.164847  1.698085

From here, you can work with the transformed data directly in Python.
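To make the window frame concrete, here is a minimal pure-Python sketch of what a mean over frame=[5, 5] computes: for each row, the average of up to 5 preceding rows, the row itself, and up to 5 following rows. The function name window_mean is hypothetical and not part of altair_transform, and the sketch assumes the values are already sorted and that the frame is simply truncated at the edges of the data.

```python
def window_mean(values, before=5, after=5):
    """Mean over a sliding frame of `before` preceding and `after` following rows.

    Hypothetical helper for illustration only; assumes `values` is already
    sorted and that the frame is truncated at the start and end of the data.
    """
    n = len(values)
    result = []
    for i in range(n):
        lo = max(0, i - before)        # first row inside the frame
        hi = min(n, i + after + 1)     # one past the last row inside the frame
        frame = values[lo:hi]
        result.append(sum(frame) / len(frame))
    return result


# Small example with a frame of one row on each side
print(window_mean([1.0, 2.0, 3.0, 4.0, 5.0], before=1, after=1))
# → [1.5, 2.0, 3.0, 4.0, 4.5]
```

Applying the same idea to the y column with the default before=5, after=5 should give values comparable to the ymean column above, provided the edge-handling assumption holds.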

Example: Pre-Aggregating Large Datasets

Altair creates chart specifications containing the full dataset. The advantage of this is that the data used to make the chart is entirely transparent; the disadvantage is that it causes issues as datasets grow large. To prevent users from inadvertently crashing their browsers by trying to send too much data to the frontend, Altair limits datasets to 5000 rows by default. For example, attempting to render a histogram of 20000 points raises a MaxRowsError:

import altair as alt
import pandas as pd
import numpy as np

np.random.seed(12345)

df = pd.DataFrame({
    'x': np.random.randn(20000)
})
chart = alt.Chart(df).mark_bar().encode(
    alt.X('x', bin=True),
    y='count()'
)
chart
MaxRowsError: The number of rows in your dataset is greater than the maximum allowed (5000). For information on how to plot larger datasets in Altair, see the documentation

There are several possible ways around this, as mentioned in Altair's FAQ. Altair-transform provides another option via the transform_chart() function, which pre-transforms the data according to the chart specification, so that the final chart specification holds the aggregated data rather than the full dataset:

from altair_transform import transform_chart
new_chart = transform_chart(chart)
new_chart

[Figure: Altair visualization]

Examining the new chart specification, we can see that it contains the pre-aggregated dataset:

new_chart.data
   x_binned  x_binned2  count
0      -4.0       -3.0     29
1      -3.0       -2.0    444
2      -2.0       -1.0   2703
3      -1.0        0.0   6815
4       0.0        1.0   6858
5       1.0        2.0   2706
6       2.0        3.0    423
7       3.0        4.0     22
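The shape of this pre-aggregated table can be illustrated with a small pure-Python sketch of a bin-and-count step. The helper name bin_counts is hypothetical and not part of altair_transform, and the fixed step of 1.0 is an assumption chosen to match the table above; Vega-Lite normally selects the bin step automatically.

```python
import math
from collections import Counter


def bin_counts(values, step=1.0):
    """Count values per bin of width `step`.

    Hypothetical helper for illustration only. Each value v is assigned to
    the bin [k * step, (k + 1) * step) where k = floor(v / step).
    """
    counts = Counter(math.floor(v / step) for v in values)
    return [
        {"x_binned": k * step, "x_binned2": (k + 1) * step, "count": counts[k]}
        for k in sorted(counts)
    ]


for row in bin_counts([-0.5, 0.2, 0.7, 1.1, -1.3]):
    print(row)
```

However many input values there are, the result holds only one row per occupied bin, which is why the pre-aggregated chart stays well under the row limit.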

Limitations

altair_transform currently works only for non-compound charts; that is, it cannot transform or extract data from layered, faceted, repeated, or concatenated charts.

There are also a number of less-used transform options that are not yet fully supported. These should explicitly raise a NotImplementedError if you attempt to use them.
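Since unsupported options are expected to raise NotImplementedError, calling code can catch that exception and fall back to another approach. The following is a minimal sketch; try_extract is a hypothetical wrapper, not part of altair_transform, and it takes the extraction function as an argument so that it could be used with extract_data or transform_chart. The stand-in function below exists only to make the example self-contained.

```python
def try_extract(extract, chart, fallback=None):
    """Call `extract(chart)`, returning `fallback` if the transform is unsupported.

    Hypothetical helper for illustration only; `extract` is any callable,
    for example altair_transform.extract_data.
    """
    try:
        return extract(chart)
    except NotImplementedError as err:
        print(f"Transform not supported: {err}")
        return fallback


# Stand-in for an extraction call that hits an unsupported transform option
def unsupported(chart):
    raise NotImplementedError("example of an unsupported option")


result = try_extract(unsupported, chart=None, fallback=[])
print(result)  # → []
```

With the real package, the call would look like try_extract(extract_data, line).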
