datapackage

Utilities to work with Data Packages as defined on dataprotocols.org

These details have been verified by PyPI

Maintainers

akariv okfn pwalsh roll tryggvib

These details have not been verified by PyPI

Project links

Homepage

Development Status
- 4 - Beta
Intended Audience
- Developers
- Information Technology
License
- OSI Approved :: MIT License
Programming Language
Topic
- Utilities

Project description

# DataPackage.py

[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
[![Build Status](https://travis-ci.org/frictionlessdata/datapackage-py.svg?branch=master)](https://travis-ci.org/frictionlessdata/datapackage-py)
[![Windows Build Status](https://ci.appveyor.com/api/projects/status/github/frictionlessdata/datapackage-py?branch=master&svg=true)](https://ci.appveyor.com/project/vitorbaptista/datapackage-py)
[![Test Coverage](https://coveralls.io/repos/frictionlessdata/datapackage-py/badge.svg?branch=master&service=github)](https://coveralls.io/github/frictionlessdata/datapackage-py)
![Support Python versions 2.7, 3.3, 3.4 and 3.5](https://img.shields.io/badge/python-2.7%2C%203.3%2C%203.4%2C%203.5-blue.svg)

A model for working with [Data Packages].

[Data Packages]: http://dataprotocols.org/data-packages/

## Install

```
pip install datapackage
```

## Examples

### Reading a Data Package and its resource

```python
import datapackage

dp = datapackage.DataPackage('http://data.okfn.org/data/core/gdp/datapackage.json')
brazil_gdp = [{'Year': int(row['Year']), 'Value': float(row['Value'])}
              for row in dp.resources[0].data if row['Country Code'] == 'BRA']

max_gdp = max(brazil_gdp, key=lambda x: x['Value'])
min_gdp = min(brazil_gdp, key=lambda x: x['Value'])
# Ratio of the highest to the lowest value (a multiple, not a percentage)
gdp_ratio = max_gdp['Value'] / min_gdp['Value']

msg = (
    'The highest Brazilian GDP occurred in {max_gdp_year}, when it peaked at US$ '
    '{max_gdp:1,.0f}. This was {gdp_ratio:1,.2f} times its '
    'minimum GDP in {min_gdp_year}.'
).format(max_gdp_year=max_gdp['Year'],
         max_gdp=max_gdp['Value'],
         gdp_ratio=gdp_ratio,
         min_gdp_year=min_gdp['Year'])

print(msg)
# The highest Brazilian GDP occurred in 2011, when it peaked at US$ 2,615,189,973,181. This was 172.44 times its minimum GDP in 1960.
```
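
Dividing the largest value by the smallest yields a ratio rather than a percentage. The sketch below illustrates the difference using only the standard library. The rows are invented for illustration (they are not real GDP figures) and only mimic the shape of the rows in the example.

```python
# Illustrative rows only: the values are invented, not real GDP data.
rows = [
    {'Country Code': 'BRA', 'Year': '1960', 'Value': '20'},
    {'Country Code': 'BRA', 'Year': '1961', 'Value': '50'},
    {'Country Code': 'ARG', 'Year': '1960', 'Value': '999'},
]

# Keep one country and convert the string fields to numbers.
series = [{'Year': int(r['Year']), 'Value': float(r['Value'])}
          for r in rows if r['Country Code'] == 'BRA']

highest = max(series, key=lambda x: x['Value'])
lowest = min(series, key=lambda x: x['Value'])

ratio = highest['Value'] / lowest['Value']   # 2.5 times the minimum
percent_increase = (ratio - 1) * 100         # 150% more than the minimum

print('{:.2f}x, +{:.0f}%'.format(ratio, percent_increase))
```

A ratio of 2.5 corresponds to an increase of 150%, so the two numbers should not be labeled interchangeably.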

### Validating a Data Package

```python
import datapackage

dp = datapackage.DataPackage('http://data.okfn.org/data/core/gdp/datapackage.json')
try:
    dp.validate()
except datapackage.exceptions.ValidationError as e:
    # Handle the ValidationError
    pass
```

### Retrieving all validation errors from a Data Package

```python
import datapackage

# This descriptor has two errors:
# * It has no "name", which is required;
# * Its resource has no "data", "path" or "url".
descriptor = {
    'resources': [
        {},
    ]
}

dp = datapackage.DataPackage(descriptor)

for error in dp.iter_errors():
    # Handle the error, e.g. log or print it
    print(error)
```
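
To make the two rules named in the comments above concrete, here is a minimal standard-library sketch that checks only those two rules. It is not how the library validates (the library checks the descriptor against a full schema), and `find_errors` is a hypothetical helper, not part of `datapackage`.

```python
def find_errors(descriptor):
    """Return a list of human-readable problems (hypothetical helper)."""
    errors = []
    # Rule 1: the descriptor needs a "name".
    if 'name' not in descriptor:
        errors.append('descriptor: "name" is required')
    # Rule 2: each resource needs at least one of "data", "path" or "url".
    for index, resource in enumerate(descriptor.get('resources', [])):
        if not any(key in resource for key in ('data', 'path', 'url')):
            errors.append(
                'resources[{}]: needs "data", "path" or "url"'.format(index))
    return errors


descriptor = {'resources': [{}]}
for error in find_errors(descriptor):
    print(error)
```

Collecting every problem in a list, rather than raising on the first one, mirrors the idea of iterating over all errors instead of stopping at the first `ValidationError`.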

### Creating a Data Package

```python
import datapackage

dp = datapackage.DataPackage()
dp.descriptor['name'] = 'my_sleep_duration'
dp.descriptor['resources'] = [
    {'name': 'data'}
]

resource = dp.resources[0]
resource.descriptor['data'] = [
    7, 8, 5, 6, 9, 7, 8
]

with open('datapackage.json', 'w') as f:
    f.write(dp.to_json())
# {"name": "my_sleep_duration", "resources": [{"data": [7, 8, 5, 6, 9, 7, 8], "name": "data"}]}
```
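
The descriptor is plain JSON, so it can also be inspected without the library. The following is a minimal sketch using only the standard `json` module. That `to_json()` orders keys the same way is an assumption drawn from the sample output above, not something confirmed here.

```python
import json

descriptor = {
    'name': 'my_sleep_duration',
    'resources': [
        {'name': 'data', 'data': [7, 8, 5, 6, 9, 7, 8]},
    ],
}

# sort_keys gives a stable key order in the serialized text.
text = json.dumps(descriptor, sort_keys=True)
print(text)

# Parsing the text back yields an equal dictionary.
assert json.loads(text) == descriptor
```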

### Using a schema that's not in the local cache

```python
import datapackage
import datapackage.registry

# This constant points to the official registry URL
# You can use any URL or path that points to a registry CSV
registry_url = datapackage.registry.Registry.DEFAULT_REGISTRY_URL
registry = datapackage.registry.Registry(registry_url)

descriptor = {} # The datapackage.json file
schema = registry.get('tabular') # Change to your schema ID

dp = datapackage.DataPackage(descriptor, schema)
```

### Push/pull Data Package to storage

The package provides the `push_datapackage` and `pull_datapackage` utilities to
push a Data Package to storage and pull it back.

This functionality requires a `jsontableschema` storage plugin to be installed. See
the [plugins](https://github.com/frictionlessdata/jsontableschema-py#plugins)
section of the `jsontableschema` docs for more information. Suppose
we have installed a plugin called `jsontableschema-mystorage` (not a real name).

We can then push a Data Package to the storage and pull it back:

> All parameters should be used as keyword arguments.

```python
from datapackage import push_datapackage, pull_datapackage

# Plugin-specific options (placeholder; see the plugin's documentation)
mystorage_options = {}

# Push
push_datapackage(
    descriptor='descriptor_path',
    backend='mystorage', **mystorage_options)

# Pull
pull_datapackage(
    descriptor='descriptor_path', name='datapackage_name',
    backend='mystorage', **mystorage_options)
```

Depending on the plugin, the options could be a SQLAlchemy engine, a BigQuery project and dataset name, and so on.
See the documentation of the specific plugin for a detailed description.

Concrete examples are in the
[plugins](https://github.com/frictionlessdata/jsontableschema-py#plugins)
section of the `jsontableschema` docs.
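
As a rough illustration of the keyword-argument calling convention, the sketch below uses a hypothetical stand-in function. `push_stub` and the `prefix` option are invented for this example and say nothing about the real signature of `push_datapackage` or the options of any real plugin. The bare `*` is Python 3 syntax.

```python
def push_stub(*, descriptor, backend, **backend_options):
    """Hypothetical stand-in; not the real push_datapackage."""
    # Everything after the bare * must be passed by keyword.
    return {'descriptor': descriptor, 'backend': backend,
            'options': backend_options}


# Hypothetical options for the imaginary "mystorage" plugin.
mystorage_options = {'prefix': 'demo_'}

call = push_stub(descriptor='descriptor_path',
                 backend='mystorage', **mystorage_options)
print(call['options'])
```

Unpacking a dictionary with `**` forwards each entry as a keyword argument, which is why backend-specific options can be kept in one place and reused for both pushing and pulling.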

## Developer notes

These notes are intended to help people who want to contribute to this
package itself. If you just want to use it, you can safely ignore them.

### Updating the local schemas cache

We cache the schemas from <https://github.com/dataprotocols/schemas>
using git-subtree. To update it, use:

    git subtree pull --prefix datapackage/schemas https://github.com/dataprotocols/schemas.git master --squash

Release history

1.15.4

Mar 12, 2024

1.15.2

Feb 24, 2021

1.15.1

Sep 26, 2020

1.15.0

Aug 14, 2020

1.14.1

Jun 9, 2020

1.14.0

May 20, 2020

1.13.1

May 18, 2020

1.13.0

Mar 31, 2020

1.12.2

Mar 27, 2020

1.12.1

Mar 27, 2020

1.12.0

Mar 25, 2020

1.11.1

Feb 12, 2020

1.11.0

Dec 17, 2019

1.10.2

Dec 17, 2019

1.10.1

Dec 15, 2019

1.10.0

Oct 31, 2019

1.9.3

Oct 21, 2019

1.9.2

Sep 18, 2019

1.9.1

Sep 18, 2019

1.9.0

Sep 3, 2019

1.8.0

Aug 27, 2019

1.7.0

Aug 16, 2019

1.6.2

Jun 7, 2019

1.6.1

Jun 7, 2019

1.6.0

Apr 11, 2019

1.5.2

Apr 11, 2019

1.5.1

Oct 18, 2018

1.5.0

Oct 16, 2018

1.4.0

Oct 12, 2018

1.3.5

Oct 8, 2018

1.3.4

Sep 4, 2018

1.3.3

Sep 3, 2018

1.3.2

Jul 12, 2018

1.3.1

Jul 6, 2018

1.2.6

Apr 6, 2018

1.2.5

Apr 6, 2018

1.2.4

Apr 6, 2018

1.2.3

Mar 22, 2018

1.2.2

Mar 6, 2018

1.2.1

Feb 12, 2018

1.2.0

Jan 18, 2018

1.1.5

Dec 20, 2017

1.1.4

Oct 12, 2017

1.1.3

Oct 11, 2017

1.1.2

Oct 3, 2017

1.1.1

Oct 2, 2017

1.1.0

Oct 1, 2017

1.0.5

Sep 28, 2017

1.0.4

Sep 20, 2017

1.0.3

Sep 18, 2017

1.0.2

Sep 11, 2017

1.0.1

Sep 6, 2017

1.0.0

Sep 4, 2017

1.0.0a14 pre-release

Aug 31, 2017

1.0.0a13 pre-release

Aug 30, 2017

1.0.0a12 pre-release

Aug 29, 2017

1.0.0a11 pre-release

Aug 22, 2017

1.0.0a10 pre-release

Aug 21, 2017

1.0.0a9 pre-release

Aug 21, 2017

1.0.0a8 pre-release

Aug 21, 2017

1.0.0a6 pre-release

May 25, 2017

1.0.0a5 pre-release

May 13, 2017

1.0.0a4 pre-release

May 3, 2017

1.0.0a3 pre-release

May 3, 2017

1.0.0a2 pre-release

Apr 18, 2017

0.8.9

May 25, 2017

0.8.8

Mar 31, 2017

0.8.7

Mar 2, 2017

0.8.6

Feb 7, 2017

0.8.5

Dec 13, 2016

0.8.4

Oct 28, 2016

0.8.1

Aug 16, 2016

0.8.0

Jul 18, 2016

This version

0.7.0

Jul 18, 2016

0.6.1

May 24, 2016

0.6.0

May 3, 2016

0.5.4

Sep 10, 2015

0.5.3

Aug 21, 2015

0.5.2

Apr 29, 2015

0.5.1

Apr 9, 2015

0.5.0

Apr 9, 2015

0.4.3

Nov 4, 2014

0.4.2

Oct 28, 2014

0.4.1

Oct 1, 2014

0.4.0

Oct 1, 2014

0.3.1

Aug 11, 2014

0.3.0

Jan 21, 2014

0.2.1

Oct 11, 2013

0.2.0

Oct 10, 2013

0.1.3

Oct 10, 2013

0.1.2

Jun 28, 2013

0.1.1

Jun 24, 2013

0.1.0

Jun 6, 2013

Download files

Source Distribution

datapackage-0.7.0.tar.gz (20.9 kB)

Uploaded Jul 18, 2016 Source

File details

Details for the file datapackage-0.7.0.tar.gz.

File metadata

Download URL: datapackage-0.7.0.tar.gz
Upload date: Jul 18, 2016
Size: 20.9 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for datapackage-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`4e8648e0245d0c4bb967140b82780a865a1d4c70662f2df56f6c09b4fb53d200`
MD5	`8888181ac68f31786cac65eef8b5f3b2`
BLAKE2b-256	`4f75c81088897f8e021725872cdd080610e1c7ad3296648a9d4e2adc59fa75d1`