A Python package for offline access to Vega datasets
# vega_datasets
[![build status](http://img.shields.io/travis/altair-viz/vega_datasets/master.svg?style=flat)](https://travis-ci.org/altair-viz/vega_datasets)
A Python package for offline access to [vega datasets](https://github.com/vega/vega-datasets).
This package has several goals:
- Provide straightforward access in Python to the datasets made available at [vega-datasets](https://github.com/vega/vega-datasets).
- Return the results in the form of a Pandas dataframe.
- Wherever dataset size and/or license constraints make it possible, bundle the dataset with the package so that datasets can be loaded in the absence of a web connection.
Currently the package bundles a subset of the datasets and falls back to HTTP requests for the others.
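The local-first behaviour described above can be sketched in a few lines of standard-library Python. This is only an illustration of the pattern, not the package's actual implementation: the helper name `load_records` and its `local_dir` and `base_url` parameters are assumptions made for this example, and it returns parsed JSON rather than a Pandas dataframe.

```python
import json
import urllib.request
from pathlib import Path


# Hypothetical helper for illustration only; not part of vega_datasets.
def load_records(name, local_dir, base_url):
    """Return parsed JSON for `name`, preferring a bundled copy on disk."""
    local_path = Path(local_dir) / f"{name}.json"
    if local_path.exists():
        # A bundled copy exists, so no network access is needed.
        with local_path.open(encoding="utf-8") as f:
            return json.load(f)
    # No bundled copy: fall back to an HTTP request.
    with urllib.request.urlopen(f"{base_url}/{name}.json") as response:
        return json.load(response)
```

The network is only touched when the file is missing from `local_dir`, which is what allows bundled datasets to load without a web connection.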
## Installation
```
$ pip install vega_datasets
```
## Usage
The main object in this library is ``data``:
```python
>>> from vega_datasets import data
```
It contains attributes that access all available datasets, locally if
available. For example, here is the well-known iris dataset:
```python
>>> df = data.iris()
>>> df.head()
   petalLength  petalWidth  sepalLength  sepalWidth species
0          1.4         0.2          5.1         3.5  setosa
1          1.4         0.2          4.9         3.0  setosa
2          1.3         0.2          4.7         3.2  setosa
3          1.5         0.2          4.6         3.1  setosa
4          1.4         0.2          5.0         3.6  setosa
```
If you're curious about the source data, you can access the URL for any of the available datasets:
```python
>>> data.iris.url
'https://vega.github.io/vega-datasets/data/iris.json'
```
For datasets bundled with the package, you can also find their location on disk:
```python
>>> data.iris.filepath
'/lib/python3.6/site-packages/vega_datasets/data/iris.json'
```
## Available Datasets
To list all the available datasets, use ``list_datasets``:
```python
>>> data.list_datasets()
['7zip', 'airports', 'anscombe', 'barley', 'birdstrikes', 'budget', 'budgets', 'burtin', 'cars', 'climate', 'co2-concentration', 'countries', 'crimea', 'disasters', 'driving', 'earthquakes', 'ffox', 'flare', 'flare-dependencies', 'flights-10k', 'flights-200k', 'flights-20k', 'flights-2k', 'flights-3m', 'flights-5k', 'flights-airport', 'gapminder', 'gapminder-health-income', 'gimp', 'github', 'graticule', 'income', 'iris', 'jobs', 'londonBoroughs', 'londonCentroids', 'londonTubeLines', 'lookup_groups', 'lookup_people', 'miserables', 'monarchs', 'movies', 'normal-2d', 'obesity', 'points', 'population', 'population_engineers_hurricanes', 'seattle-temps', 'seattle-weather', 'sf-temps', 'sp500', 'stocks', 'udistrict', 'unemployment', 'unemployment-across-industries', 'us-10m', 'us-employment', 'us-state-capitals', 'weather', 'weball26', 'wheat', 'world-110m', 'zipcodes']
```
To list local datasets (i.e. those that are bundled with the package and can be used without a web connection), use the ``local_data`` object instead:
```python
>>> from vega_datasets import local_data
>>> local_data.list_datasets()
['airports', 'anscombe', 'barley', 'burtin', 'cars', 'crimea', 'driving', 'iowa-electricity', 'iris', 'seattle-temps', 'seattle-weather', 'sf-temps', 'stocks', 'us-employment']
```
We plan to add more local datasets in the future, subject to size and licensing constraints. See the [local datasets issue](https://github.com/altair-viz/vega_datasets/issues/1) if you would like to help with this.
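If you want to know which datasets need a web connection, you can compare the two lists. The sketch below uses a hypothetical helper, `remote_only`, which is not part of the package; it takes plain lists of names so that it can be tried without anything installed.

```python
# Hypothetical helper for illustration only; not part of vega_datasets.
def remote_only(all_names, local_names):
    """Return the names in `all_names` that are not in `local_names`.

    The order of `all_names` is preserved.
    """
    local = set(local_names)
    return [name for name in all_names if name not in local]


# Small hand-written sample using names from the lists above.
sample_all = ["airports", "birdstrikes", "cars", "flights-2k"]
sample_local = ["airports", "cars"]
needs_network = remote_only(sample_all, sample_local)
```

With the package installed, the same helper could be called as `remote_only(data.list_datasets(), local_data.list_datasets())`.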
## Dataset Information
If you want more information about any dataset, you can use the ``description`` property:
```python
>>> data.iris.description
'This classic dataset contains lengths and widths of petals and sepals for 150 iris flowers, drawn from three species. It was introduced by R.A. Fisher in 1936 [1]_.'
```
This information is also part of the ``data.iris`` docstring.
Descriptions are not yet included for all the datasets in the package; we hope to add more information on this in the future.
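Because some datasets have no description yet, code that displays descriptions may want a fallback. The following is a minimal sketch under an assumption: it treats a missing ``description`` attribute, ``None``, and an empty string all as "no description", since what the package returns in that case is not stated here. The helper name `describe` is hypothetical.

```python
# Hypothetical helper for illustration only; not part of vega_datasets.
def describe(dataset, default="(no description available)"):
    """Return the dataset's description, or `default` if there is none."""
    # Assumption: a missing attribute, None, or "" all mean "no description".
    description = getattr(dataset, "description", None)
    return description if description else default
```

With the package installed, `describe(data.iris)` would return the text shown above.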