Consistent interface for stream reading and writing tabular data (csv/xls/json/etc)
Features
supports various formats: csv/tsv/xls/xlsx/json/native/etc
reads data from variables, filesystem or Internet
streams data instead of using a lot of memory
processes data via simple user processors
saves data using the same interface
Getting Started
Installation
To get started:
$ pip install tabulator
Example
Open a tabular stream from a CSV source:
from tabulator import Stream

with Stream('path.csv', headers=1) as stream:
    print(stream.headers)  # will print the headers taken from row 1
    for row in stream:
        print(row)  # will print the row as a list of values
Stream
Stream takes the source argument:
<scheme>://path/to/file.<format>
and uses the corresponding Loader and Parser to open the tabular stream and iterate over it. The scheme and format can also be passed explicitly as constructor arguments. To force Tabulator to open the table with a particular encoding, pass the encoding argument.
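To make the source convention concrete, here is a minimal sketch of how a scheme and format could be inferred from a source string, with explicit arguments taking precedence. The helper name `guess_scheme_and_format` and the `'file'` fallback are assumptions made for illustration; this is not Tabulator's own detection code.

```python
import os
from urllib.parse import urlparse


def guess_scheme_and_format(source, scheme=None, format=None):
    """Hypothetical helper: infer (scheme, format) from a source string.

    Explicitly passed values win over anything inferred from the source.
    """
    parsed = urlparse(source)
    if scheme is None:
        # Assumption: a source without "<scheme>://" is a local file.
        scheme = parsed.scheme or 'file'
    if format is None:
        # Take the file extension of the path part, without the dot.
        extension = os.path.splitext(parsed.path)[1]
        format = extension.lstrip('.').lower() or None
    return scheme, format


print(guess_scheme_and_format('path.csv'))
print(guess_scheme_and_format('http://example.com/source.xls'))
print(guess_scheme_and_format('http://example.com/download', format='csv'))
```

The last call shows why explicit arguments are useful: a URL without a file extension gives nothing to infer the format from.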
The example above uses a context manager to call stream.open() on enter and stream.close() on exit. Once the stream is open:
the stream can be iterated like a file-like object, returning row by row
the stream can be iterated manually with the iter(keyed/extended) function
the stream can be read into memory with the read(keyed/extended) function, optionally with a row count limit
headers can be accessed via the headers property
a sample of rows can be accessed via the sample property
the stream pointer can be set back to the start via the reset method
the stream can be saved to the filesystem using the save method
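The keyed and extended options change the shape of each row. The sketch below builds the three shapes by hand from plain lists, so it runs without opening a stream. The helper names `to_keyed` and `to_extended` are hypothetical, and numbering rows from 1 is an assumption; only the `(number, headers, row)` tuple layout comes from the examples in this document.

```python
def to_keyed(headers, row):
    """Hypothetical helper: pair each header with its cell value."""
    return dict(zip(headers, row))


def to_extended(headers, rows, start=1):
    """Hypothetical helper: yield (number, headers, row) tuples.

    Assumption: row numbers start at 1.
    """
    for number, row in enumerate(rows, start=start):
        yield (number, headers, row)


headers = ['id', 'name']
rows = [['1', 'english'], ['2', 'chinese']]

print(rows[0])                           # plain row: a list of values
print(to_keyed(headers, rows[0]))        # keyed row: a dict
print(list(to_extended(headers, rows)))  # extended rows: tuples
```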
A more expanded example:
from tabulator import Stream

def skip_even_rows(extended_rows):
    # Post-parse processor: keep only rows with an odd row number
    for number, headers, row in extended_rows:
        if number % 2:
            yield (number, headers, row)

stream = Stream('http://example.com/source.xls',
                headers=1, encoding='utf-8', sample_size=1000,
                post_parse=[skip_even_rows], sheet=1)
stream.open()
print(stream.sample)  # will print sample
print(stream.headers)  # will print headers list
print(stream.read(limit=10))  # will print 10 rows
stream.reset()
for keyed_row in stream.iter(keyed=True):
    print(keyed_row)  # will print row dict
for extended_row in stream.iter(extended=True):
    print(extended_row)  # will print (number, headers, row)
stream.reset()
stream.save('target.csv')
stream.close()
For the full list of options, see https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/stream.py#L17
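Because a processor such as skip_even_rows is a plain generator function over (number, headers, row) tuples, it can be tried on hand-built data without opening a stream. The sketch below chains two processors; `strip_cells` and `apply_processors` are hypothetical names, and applying the processors in list order is an assumption about how a post_parse list might be consumed, not a statement about Tabulator's internals.

```python
def skip_even_rows(extended_rows):
    """Keep only rows with an odd row number (as in the example above)."""
    for number, headers, row in extended_rows:
        if number % 2:
            yield (number, headers, row)


def strip_cells(extended_rows):
    """Hypothetical processor: trim whitespace from string cells."""
    for number, headers, row in extended_rows:
        cleaned = [cell.strip() if isinstance(cell, str) else cell
                   for cell in row]
        yield (number, headers, cleaned)


def apply_processors(extended_rows, processors):
    """Hypothetical helper: wrap the row iterator in each processor.

    Assumption: processors run in the order they are listed.
    """
    for processor in processors:
        extended_rows = processor(extended_rows)
    return extended_rows


headers = ['name', 'count']
extended_rows = [
    (1, headers, [' a ', 1]),
    (2, headers, ['b', 2]),
    (3, headers, ['c ', 3]),
]
for extended_row in apply_processors(iter(extended_rows),
                                     [skip_even_rows, strip_cells]):
    print(extended_row)
```

Each processor only wraps the iterator it receives, so rows are still handled one at a time rather than being loaded into memory first.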
API Reference
Snapshot
Stream(source,
       headers=None,
       scheme=None,
       format=None,
       encoding=None,
       post_parse=None,
       sample_size=None,
       **options)
    closed/open/close/reset
    headers -> list
    sample -> rows
    iter(keyed/extended=False) -> (generator) (keyed/extended)row[]
    read(keyed/extended=False, limit=None) -> (keyed/extended)row[]
    save(target, format=None, encoding=None, **options)
exceptions
~cli
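To illustrate how the members in the snapshot relate to each other, here is a toy stand-in built on the standard library's csv module. The class `MiniStream` is hypothetical and covers only open/close/reset, headers, iter and read over an in-memory CSV string. It assumes that read continues from the current position until reset is called, and it is not Tabulator's implementation.

```python
import csv
import io
from itertools import islice


class MiniStream:
    """Toy model of the snapshot above; NOT Tabulator's implementation."""

    def __init__(self, text, headers=None):
        self._text = text
        self._headers_row = headers  # 1-based number of the headers row
        self._rows = None
        self.headers = None

    @property
    def closed(self):
        return self._rows is None

    def open(self):
        self._rows = csv.reader(io.StringIO(self._text))
        if self._headers_row is not None:
            # Consume rows up to and including the headers row.
            for _ in range(self._headers_row):
                self.headers = next(self._rows)

    def close(self):
        self._rows = None

    def reset(self):
        # Assumption: resetting simply restarts from the first data row.
        self.open()

    def iter(self, keyed=False):
        # keyed=True assumes headers were set when the stream was opened.
        for row in self._rows:
            yield dict(zip(self.headers, row)) if keyed else row

    def read(self, keyed=False, limit=None):
        # islice with limit=None consumes all remaining rows.
        return list(islice(self.iter(keyed=keyed), limit))


stream = MiniStream('id,name\n1,a\n2,b\n3,c\n', headers=1)
stream.open()
print(stream.headers)
print(stream.read(limit=2))
stream.reset()
print(stream.read(keyed=True, limit=1))
stream.close()
```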
Detailed
Contributing
Please read the contribution guideline:
Thanks!