Generate Pandas data frames, load and extract data, based on JSON Table Schema descriptors.

Installation

$ pip install datapackage
$ pip install jsontableschema-pandas

Quick start

You can load resources from a data package as Pandas data frames with the datapackage.push_datapackage function:

>>> import datapackage

>>> data_url = 'http://data.okfn.org/data/core/country-list/datapackage.json'
>>> storage = datapackage.push_datapackage(data_url, 'pandas')

>>> storage.tables
['data___data']

>>> type(storage['data___data'])
<class 'pandas.core.frame.DataFrame'>

>>> storage['data___data'].head()
             Name Code
0     Afghanistan   AF
1   Åland Islands   AX
2         Albania   AL
3         Algeria   DZ
4  American Samoa   AS

You can also pull an existing data frame into a data package:

>>> datapackage.pull_datapackage('/tmp/datapackage.json', 'country_list', 'pandas', tables={
...     'data': storage['data___data'],
... })

Storage

Tabular Storage

The package implements the Tabular Storage interface.

Create a storage instance like this:

>>> from jsontableschema_pandas import Storage

>>> storage = Storage()

Storage works as a container for Pandas data frames. You can define a new data frame inside the storage using the storage.create method:

>>> storage.create('data', {
...     'primaryKey': 'id',
...     'fields': [
...         {'name': 'id', 'type': 'integer'},
...         {'name': 'comment', 'type': 'string'},
...     ]
... })

>>> storage.tables
['data']

>>> storage['data'].shape
(0, 0)

Use storage.write to populate the data frame with data:

>>> storage.write('data', [(1, 'a'), (2, 'b')])

>>> storage['data']
   comment
id
1        a
2        b
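
The output above shows id as the frame's index and comment as its only column, which suggests that the descriptor's primaryKey ends up as the index. The following is a minimal sketch of that split using only the standard library. Note that split_descriptor is a hypothetical helper written for illustration; it is not part of jsontableschema_pandas and may not match how the library does this internally.

```python
def split_descriptor(descriptor):
    """Split a JSON Table Schema descriptor into index and column names.

    Hypothetical helper: it mirrors the layout seen in the output above,
    not the library's actual implementation.
    """
    primary_key = descriptor.get('primaryKey', [])
    # JSON Table Schema allows primaryKey to be a string or a list of strings.
    if isinstance(primary_key, str):
        primary_key = [primary_key]
    names = [field['name'] for field in descriptor['fields']]
    index = [name for name in names if name in primary_key]
    columns = [name for name in names if name not in primary_key]
    return index, columns


descriptor = {
    'primaryKey': 'id',
    'fields': [
        {'name': 'id', 'type': 'integer'},
        {'name': 'comment', 'type': 'string'},
    ],
}
index, columns = split_descriptor(descriptor)
print(index, columns)
```

Under this reading, each tuple passed to storage.write follows the field order of the descriptor, and the primary key value moves into the index.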

You can also use tabulator to populate the data frame from an external data file:

>>> import tabulator

>>> with tabulator.topen('data/comments.csv', with_headers=True) as data:
...     storage.write('data', data)

>>> storage['data']
   comment
id
1        a
2        b
1     good

As you can see, subsequent writes append new rows to the existing data. Existing rows are not replaced, even when the index value repeats.
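
The append behaviour can be reproduced with plain pandas. This is only a sketch: it assumes pandas.concat as a stand-in for storage.write, and whether the library uses concat internally is not stated here. The frames first and second are illustrative names. The last step shows one possible way to keep only the most recent row per id, in case duplicates in the index are not wanted.

```python
import pandas as pd

# First write: two rows indexed by id.
first = pd.DataFrame({'comment': ['a', 'b']}, index=pd.Index([1, 2], name='id'))
# Second write: one row whose id repeats an existing key.
second = pd.DataFrame({'comment': ['good']}, index=pd.Index([1], name='id'))

# Concatenation keeps both rows with id 1; nothing is overwritten.
combined = pd.concat([first, second])
print(combined)

# Optional clean-up: keep only the last row seen for each id.
latest = combined[~combined.index.duplicated(keep='last')]
print(latest)
```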

Contributing

Please read the contribution guidelines in How to Contribute.

Thanks!
