
Consistent interface for stream reading and writing tabular data (csv/xls/json/etc)


Features

  • supports various formats: csv/tsv/xls/xlsx/json/native/etc

  • reads data from variables, filesystem or Internet

  • streams data instead of using a lot of memory

  • processes data via simple user processors

  • saves data using the same interface

Getting Started

Installation

To get started:

$ pip install tabulator

Quick Start

Open a tabular stream from a csv source:

from tabulator import Stream

with Stream('path.csv', headers=1) as stream:
    for row in stream:
        print(row)  # will print row values list

Stream takes the source argument:

<scheme>://path/to/file.<format>

and uses the corresponding Loader and Parser to open the tabular stream and start iterating over it. The scheme and format can also be passed explicitly as constructor arguments, and the encoding argument forces Tabulator to open the table with the encoding of your choice.
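The source convention above can be illustrated with the standard library alone. This is only a sketch of the idea, not Tabulator's own detection code: guess_scheme_and_format is a hypothetical helper, and falling back to a file scheme for plain paths is an assumption.

```python
import os
from urllib.parse import urlparse


def guess_scheme_and_format(source):
    """Split a source string into (scheme, format); hypothetical helper."""
    parsed = urlparse(source)
    # Assumption: a source without "<scheme>://" is treated as a local file.
    scheme = parsed.scheme or 'file'
    # The format is taken from the file extension, without the leading dot.
    extension = os.path.splitext(parsed.path)[1]
    file_format = extension[1:].lower() or None
    return scheme, file_format


print(guess_scheme_and_format('http://example.com/source.xls'))
print(guess_scheme_and_format('path.csv'))
```

A source with no extension yields no format in this sketch, which is the kind of case where passing scheme and format explicitly is useful.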

In this example the context manager calls stream.open() on enter and stream.close() on exit. While the stream is open:

  • the stream can be iterated like a file-like object, returning rows one by one

  • the stream can be iterated manually with the iter(keyed/extended) function

  • the stream can be read into memory with the read(keyed/extended) function, optionally with a row count limit

  • headers are available via the headers property

  • a sample of rows is available via the sample property

  • the stream pointer can be moved back to the start with the reset method

  • the stream can be saved to the filesystem with the save method
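The three row shapes mentioned in the list (plain, keyed and extended) can be pictured with plain Python. The snippet below does not call Tabulator; the sample data and the to_keyed and to_extended helpers are made up for illustration, and numbering rows from 1 is an assumption.

```python
headers = ['id', 'name']
rows = [[1, 'english'], [2, 'chinese']]  # made-up sample data


def to_keyed(headers, row):
    """Keyed form: a dict mapping each header to its value."""
    return dict(zip(headers, row))


def to_extended(number, headers, row):
    """Extended form: a (number, headers, row) tuple."""
    return (number, headers, row)


# Assumption: row numbers start at 1 for the first data row.
for number, row in enumerate(rows, start=1):
    print(row)                                # plain list of values
    print(to_keyed(headers, row))             # dict keyed by header
    print(to_extended(number, headers, row))  # (number, headers, row)
```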

Advanced Usage

To get full control over the process you can use more parameters. A more complete example follows:

from tabulator import Stream

def skip_even_rows(extended_rows):
    for number, headers, row in extended_rows:
        if number % 2:
            yield (number, headers, row)

stream = Stream('http://example.com/source.xls',
    headers=1, encoding='utf-8', sample_size=1000,
    post_parse=[skip_even_rows],
    parser_options={'delimiter': ',', 'quotechar': '|'})
stream.open()
print(stream.sample)  # will print sample
print(stream.headers)  # will print headers list
print(stream.read(limit=10))  # will print 10 rows
stream.reset()
for keyed_row in stream.iter(keyed=True):
    print(keyed_row)  # will print row dict
for extended_row in stream.iter(extended=True):
    print(extended_row)  # will print (number, headers, row)
stream.reset()
stream.save('target.csv')
stream.close()
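To see what a processor such as skip_even_rows does without opening a real source, it can be run over a few in-memory extended rows. This is a rough sketch: apply_processors is a hypothetical stand-in for how a post_parse list might be chained, not Tabulator's actual implementation, and the rows are made up.

```python
def skip_even_rows(extended_rows):
    # Keep only rows whose number is odd.
    for number, headers, row in extended_rows:
        if number % 2:
            yield (number, headers, row)


def apply_processors(extended_rows, processors):
    """Chain processors in order; hypothetical stand-in for post_parse."""
    for processor in processors:
        extended_rows = processor(extended_rows)
    return extended_rows


headers = ['id', 'name']
source = []
for number in range(1, 5):
    # Made-up extended rows numbered 1..4.
    source.append((number, headers, [number, 'row%d' % number]))

kept = list(apply_processors(iter(source), [skip_even_rows]))
print([number for number, _, _ in kept])  # prints [1, 3]
```

Because each processor is a generator that consumes and yields extended rows, processors compose lazily and rows are still streamed one at a time.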

Read more

Thanks!


