tabulator-py
============

`|Travis| <https://travis-ci.org/frictionlessdata/tabulator-py>`_
`|Coveralls| <https://coveralls.io/r/frictionlessdata/tabulator-py?branch=master>`_
`|PyPi| <https://pypi-hypernode.com/pypi/tabulator>`_
`|Gitter| <https://gitter.im/frictionlessdata/chat>`_

A utility library that provides a consistent interface for reading
tabular data.

Getting Started
---------------

Installation
~~~~~~~~~~~~

To get started (under development):

::

    $ pip install tabulator

Simple interface
~~~~~~~~~~~~~~~~

For quick access to a table, use the ``topen`` function (short for
``table open``):

::

    from tabulator import topen, processors

    with topen('path.csv', with_headers=True) as table:
        for row in table:
            print(row)
            print(row.get('header'))

For most use cases the ``topen`` function is enough. It takes a
``source`` argument of the form:

``<scheme>://path/to/file.<format>`` and uses the corresponding ``Loader``
and ``Parser`` to open the table and start iterating over it. You can
also pass ``scheme`` and ``format`` explicitly as function arguments. The
last ``topen`` argument is ``encoding``, which forces Tabulator to open
the table with the encoding of your choice.
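
As a rough illustration of how a source string of this shape can be split
into a scheme and a format, here is a minimal sketch that uses only the
Python standard library. It is not Tabulator's actual detection code: the
``guess_scheme_and_format`` helper, the ``file`` default scheme, and the
example URL are assumptions made for this example.

```python
import os
from urllib.parse import urlparse


def guess_scheme_and_format(source, scheme=None, format=None):
    """Guess (scheme, format) for a source string.

    Hypothetical helper, not part of Tabulator. Explicitly passed
    values always win over guessed ones.
    """
    parsed = urlparse(source)
    if scheme is None:
        # No "<scheme>://" prefix: assume a local file path.
        scheme = parsed.scheme or 'file'
    if format is None:
        # Take the extension of the path part, without the leading dot.
        extension = os.path.splitext(parsed.path)[1]
        format = extension[1:].lower() or None
    return scheme, format


print(guess_scheme_and_format('http://example.com/data/table.csv'))
print(guess_scheme_and_format('path.csv'))
print(guess_scheme_and_format('table.txt', format='csv'))
```

The third call shows why explicit arguments are useful: the extension
suggests ``txt``, but the caller knows the content is CSV.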

Read more about ``topen`` -
`documentation <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/topen.py>`_.

The ``topen`` function returns a ``Table`` instance. We use a context
manager to call ``table.open()`` on enter and ``table.close()`` on exit:

- the table can be iterated like a file-like object, returning row by row
- the table can be read row by row using the ``readrow`` method (it
  returns a row tuple)
- the table can be read into memory using the ``read`` method (it returns
  a list of row tuples), with a ``limit`` on the number of output rows as
  a parameter
- headers can be accessed via the ``headers`` property
- the table pointer can be set back to the start via the ``reset`` method
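
To make the semantics of these methods concrete without needing a file on
disk, the sketch below imitates the described interface with a small
stand-in class built on the standard ``csv`` module. ``MiniTable`` is
hypothetical and is not Tabulator's ``Table``; only the method and
property names (``open``, ``close``, ``readrow``, ``read``, ``headers``,
``reset``) come from the description above, and treating the first row as
headers is an assumption.

```python
import csv
import io
from itertools import islice


class MiniTable:
    """Hypothetical stand-in for the Table interface described above."""

    def __init__(self, text):
        self._text = text
        self._rows = None
        self.headers = None

    def open(self):
        # Start reading from the top; the first row is taken as headers.
        self._rows = csv.reader(io.StringIO(self._text))
        self.headers = tuple(next(self._rows))

    def close(self):
        self._rows = None

    def __enter__(self):
        self.open()
        return self

    def __exit__(self, *exc_info):
        self.close()

    def __iter__(self):
        return self

    def __next__(self):
        return tuple(next(self._rows))

    def readrow(self):
        # Return the next row as a tuple.
        return next(self)

    def read(self, limit=None):
        # Load the remaining rows into memory, at most `limit` of them.
        return list(islice(self, limit))

    def reset(self):
        # Move the pointer back to the first data row.
        self.open()


with MiniTable('id,name\n1,english\n2,chinese\n') as table:
    print(table.headers)
    print(table.readrow())
    print(table.read(limit=1))
    table.reset()
    print(table.read())
```

After ``reset`` the final ``read`` call returns both data rows again,
which is the behaviour the last bullet describes.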

Read more about ``Table`` -
`documentation <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/table.py>`_.

In the example above we use ``processors.Headers`` to extract headers
from the table (via the ``with_headers=True`` shortcut). Processors are a
powerful Tabulator concept: parsed data goes through a pipeline of
processors, which can update it before it is returned as a table row.
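
The pipeline idea can be sketched in plain Python. This is only an
illustration of the concept, not the Tabulator ``Processor`` API: the
generator functions ``skip_blank_rows`` and ``strip_cells`` and the
``run_pipeline`` helper are hypothetical names, and modelling a processor
as a function over an iterable of rows is an assumption.

```python
def skip_blank_rows(rows):
    """Hypothetical processor: drop rows whose cells are all empty."""
    for row in rows:
        if any(cell.strip() for cell in row):
            yield row


def strip_cells(rows):
    """Hypothetical processor: trim whitespace around every cell."""
    for row in rows:
        yield tuple(cell.strip() for cell in row)


def run_pipeline(rows, processors):
    """Pass parsed rows through each processor, in order."""
    for processor in processors:
        rows = processor(rows)
    return rows


# Rows as a parser might emit them, before any processing.
parsed = [(' 1 ', 'english'), ('', ' '), ('2', ' chinese ')]
for row in run_pipeline(parsed, [skip_blank_rows, strip_cells]):
    print(row)
```

Because each stage is lazy, rows flow through the whole chain one at a
time, so the table never has to be held in memory just to be processed.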

Read more about ``Processor`` -
`documentation <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/processors/api.py>`_.

Read a processors tutorial -
`tutorial <https://github.com/frictionlessdata/tabulator-py/blob/master/docs/processors.md>`_.

Advanced interface
~~~~~~~~~~~~~~~~~~

To get full control over the process, you can pass more parameters. The
example below shows all parts of Tabulator:

::

    from tabulator import topen, processors, loaders, parsers

    # CustomIterator and CustomTable are placeholders for user-defined classes.
    table = topen('path.csv',
            loader_options={'encoding': 'utf-8'},
            parser_options={'delimiter': ',', 'quotechar': '|'},
            loader_class=loaders.File,
            parser_class=parsers.CSV,
            iterator_class=CustomIterator,
            table_class=CustomTable)
    table.add_processor(processors.Headers(skip=1))
    headers = table.headers
    contents = table.read(limit=10)
    print(headers, contents)
    table.close()

The ``Table`` class can also be instantiated directly (see the
documentation). The only difference from a ``topen`` call with the
extended list of parameters is that ``topen`` also calls the
``table.open()`` method.

Design Overview
---------------

Tabulator uses a modular architecture to be fully extensible and flexible.
It uses loosely coupled modules like ``Loader``, ``Parser`` and
``Processor`` to provide a clear data flow.

.. figure:: docs/diagram.png
   :align: center
   :alt: diagram

   diagram

Documentation
-------------

API documentation is presented as docstrings:

- High-level:

  - `topen <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/topen.py>`_

- Core elements:

  - `Row <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/row.py>`_
  - `Table <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/table.py>`_
  - `Iterator <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/iterator.py>`_

- Plugin elements:

  - `Loader API <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/loaders/api.py>`_
  - `Parser API <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/parsers/api.py>`_
  - `Processor API <https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/processors/api.py>`_

Contributing
------------

Please read the contribution guideline:

`How to Contribute <CONTRIBUTING.md>`_

Thanks!

.. |Travis| image:: https://img.shields.io/travis/frictionlessdata/tabulator-py/master.svg
.. |Coveralls| image:: http://img.shields.io/coveralls/frictionlessdata/tabulator-py.svg?branch=master
.. |PyPi| image:: https://img.shields.io/pypi/v/tabulator.svg
.. |Gitter| image:: https://img.shields.io/gitter/room/frictionlessdata/chat.svg
