Consistent interface for stream reading and writing tabular data (csv/xls/json/etc)
A library for reading and writing tabular data (csv/xls/json/etc).
Version 1.0 includes deprecated API removal and provisional API changes. Please read the migration guide.
Features
supports various formats: csv/tsv/xls/xlsx/json/ndjson/ods/gsheet/inline/sql/etc
reads data from local, remote, stream or text sources
streams data instead of using a lot of memory
processes data via simple user processors
saves data using the same interface
custom loaders, parsers and writers
Getting started
Installation
The package uses semantic versioning, which means that major versions could include breaking changes. It's highly recommended to specify a tabulator version range in your setup.py or requirements.txt file, e.g. tabulator<2.0.
$ pip install tabulator # OR "sudo pip install tabulator"
Examples
It’s pretty simple to start with tabulator:
from tabulator import Stream

with Stream('path.csv', headers=1) as stream:
    stream.headers  # [header1, header2, ..]
    for row in stream:
        row  # [value1, value2, ..]
There is an examples directory containing other code listings.
Documentation
The whole public API of this package is described here and follows semantic versioning rules. Everything outside of this readme is private API and could be changed without any notification in any new version.
Stream
The Stream class represents a tabular stream. It takes a source argument in the form of a source string or object:
<scheme>://path/to/file.<format>
and uses the corresponding Loader and Parser to open and start iterating over the tabular stream. The user can also pass scheme and format explicitly as constructor arguments. There are a lot of other options described in the sections below.
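For instance, if the scheme or format can't be inferred from the source string, they can be given explicitly (a minimal sketch; the file names are arbitrary):
from tabulator import Stream

# a csv file with a non-standard extension: tell Stream which parser to use
stream = Stream('data.txt', format='csv')
# the scheme can be forced the same way
stream = Stream('data.csv', scheme='file', format='csv')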
Let's create a simple stream object to read a csv file:
from tabulator import Stream
stream = Stream('data.csv')
This action just instantiates a stream instance. There are no actual IO interactions or source validity checks yet. We need to open the stream object.
stream.open()
This call will validate the data source, open the underlying stream and read a data sample (unless it's disabled). All possible exceptions will be raised on the stream.open call, not on the constructor call.
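Since errors surface on stream.open rather than on construction, it can be handy to wrap that call (a sketch; the file name is arbitrary, and the exception classes are described in the exceptions section below):
from tabulator import Stream, exceptions

stream = Stream('might-not-exist.csv')
try:
    stream.open()
except exceptions.TabulatorException as exception:
    # e.g. exceptions.IOError for a missing file or exceptions.FormatError for html contents
    print('Cannot open the source: %s' % exception)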
After work with the stream is done, it can be closed:
stream.close()
The Stream class supports the Python context manager interface, so the calls above can be written using the with syntax. It's the common and recommended way to use a tabulator stream:
with Stream('data.csv') as stream:
    # use stream
Now we can iterate over rows in our tabular data source. It's important to understand that tabulator works with underlying streams and doesn't load them into memory (just one row at a time), so the stream.iter() interface is the most effective way to use the stream:
for row in stream.iter():
    row  # [value1, value2, ..]
If you need all the data in one call, you can use the stream.read() function instead of stream.iter(). But if you just run it after the code snippet above, the stream.read() call will return an empty list. That's another important consequence of the streaming nature of tabulator: the Stream instance just iterates over an underlying stream, and the underlying stream has an internal pointer (as a file-like object has). So after we've iterated over all rows in the first listing, the pointer is set to the end of the stream.
stream.read() # []
The recommended way is to iterate (or read) over a stream just once (and save the data to memory if needed). But it's also possible to reset the stream pointer. For some sources it will not be efficient (another HTTP request for a remote source), but if you work with a local file as a source, for example, it's just a cheap file.seek() call:
stream.reset()
stream.read()  # [[value1, value2, ..], ..]
The Stream class supports saving a tabular data stream to the filesystem. Let's reset the stream again (don't forget about the pointer) and save it to disk:
stream.reset()
stream.save('data-copy.csv')
The full session will look like this:
from tabulator import Stream

with Stream('data.csv') as stream:
    for row in stream.iter():
        row  # [value1, value2, ..]
    stream.reset()
    stream.read()  # [[value1, value2, ..], ..]
    stream.reset()
    stream.save('data-copy.csv')
This is just a pretty basic Stream introduction. Please read the full documentation below; the Stream arguments are described in more detail in the following sections. There are many other goodies like headers extraction, keyed output, post parse processors and more!
Stream(source, **options)
Create stream class instance.
source (any) - stream source in a form based on scheme argument
headers (list/int) - headers list or the source row number containing headers. If a number is given, for a plain source the headers row and all rows before it will be removed, and for a keyed source no rows will be removed. See the headers section.
scheme (str) - source scheme with file as default. In most cases the scheme will be inferred from the source. See the schemes section for the list of supported schemes.
format (str) - source format with None (detect) as default. In most cases the format will be inferred from the source. See the formats section for the list of supported formats.
encoding (str) - source encoding with None (detect) as default. See the encoding section.
allow_html (bool) - a flag to allow html. See the allow html section.
sample_size (int) - rows count for stream.sample. Set to 0 to prevent any parsing activities before the actual stream.iter call. In this case headers will not be extracted from the source. See the sample size section.
bytes_sample_size (int) - sample size in bytes for operations like encoding detection. See the bytes sample size section.
force_strings (bool) - if True all output will be converted to strings. See the force strings section.
force_parse (bool) - if True, on a row parsing error the stream will return an empty row instead of raising an exception. See the force parse section.
skip_rows (int/str[]) - list of rows to skip by row number or row comment. Example: skip_rows=[1, 2, '#', '//'] - rows 1 and 2 and all rows starting with # or // will be skipped. See the skip rows section.
post_parse (generator[]) - post parse processors (hooks). The signature to follow is processor(extended_rows) -> yield (row_number, headers, row), which should yield one extended row per yield instruction. See the post parse section.
custom_loaders (dict) - custom loaders keyed by scheme. See the custom loaders section.
custom_parsers (dict) - custom parsers keyed by format. See the custom parsers section.
custom_writers (dict) - custom writers keyed by format. See the custom writers section.
<name> (<type>) - loader/parser options. See in the scheme/format section
(Stream) - returns Stream class instance
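To illustrate how these arguments combine, here is a constructor call using several of them at once (a sketch with arbitrary values):
from tabulator import Stream

stream = Stream(
    'data.csv',
    headers=1,           # first row contains the headers
    encoding='utf-8',    # skip encoding detection
    sample_size=100,     # rows to read in advance for stream.sample
    skip_rows=['#'],     # ignore commented rows
    force_parse=True,    # return empty rows instead of raising on bad rows
)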
stream.closed
(bool) - returns True if the underlying stream is closed
stream.open()
Open the stream by opening the underlying stream.
stream.close()
Close the stream by closing the underlying stream.
stream.reset()
Reset stream pointer to the first row.
stream.headers
(str[]) - returns data headers
stream.scheme
(str) - returns an actual scheme
stream.format
(str) - returns an actual format
stream.encoding
(str) - returns an actual encoding
stream.sample
(list) - returns data sample
stream.iter(keyed=False, extended=False)
Iterate over stream rows. See the keyed and extended rows section.
keyed (bool) - if True yield keyed rows
extended (bool) - if True yield extended rows
(any[]/any{}) - yields row/keyed row/extended row
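A quick sketch of the different iteration modes (the values in the comments are placeholders; note the reset calls between iterations):
with Stream('data.csv', headers=1) as stream:
    for row in stream.iter():
        row  # ['value1', 'value2', ..]
    stream.reset()
    for keyed_row in stream.iter(keyed=True):
        keyed_row  # {'header1': 'value1', ..}
    stream.reset()
    for extended_row in stream.iter(extended=True):
        extended_row  # (row_number, headers, row)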
stream.read(keyed=False, extended=False, limit=None)
Read table rows with count limit. See keyed and extended rows section.
keyed (bool) - return keyed rows
extended (bool) - return extended rows
limit (int) - rows count limit
(list) - returns rows/keyed rows/extended rows
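For example, a limited read can be used to peek at a large source without loading everything (a sketch):
with Stream('large.csv', headers=1) as stream:
    stream.read(limit=10)              # at most 10 rows as lists
    stream.reset()
    stream.read(keyed=True, limit=10)  # at most 10 rows as dicts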
stream.save(target, format=None, encoding=None, **options)
Save stream to filesystem.
target (str) - stream target
format (str) - saving format. See supported formats
encoding (str) - saving encoding
options (dict) - writer options
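For instance, data read from an inline source can be written out as csv, the format with documented write support (a sketch):
from tabulator import Stream

with Stream([['name', 'age'], ['Alex', 21]], headers=1) as stream:
    stream.save('people.csv', format='csv', encoding='utf-8')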
Schemes
Here is a list of all supported schemes.
file
The default scheme. Source should be a file in the local filesystem.
stream = Stream('data.csv')
http/https/ftp/ftps
In Python 2 tabulator can't stream a remote data source because of an underlying libraries limitation; the whole data source will be loaded into memory. In Python 3 there is no such problem and tabulator is able to stream remote data sources as expected.
Source should be a file available on the web via one of these protocols.
stream = Stream('http://example.com/data.csv')
stream
Source should be a file-like Python object which supports the corresponding protocol.
stream = Stream(open('data.csv'))
text
Source should be a string containing tabular data. In this case the format has to be passed explicitly because it's not possible to infer it from a source string.
stream = Stream('text://name,age\nJohn, 21\n', format='csv')
Formats
Here is a list of all supported formats. Formats that support the read operation can be opened by Stream.open() and formats that support the write operation can be used in Stream.save().
csv
Source should be parsable by csv parser.
stream = Stream('data.csv', delimiter=',')
Operations:
read
write
Options:
delimiter
doublequote
escapechar
quotechar
quoting
skipinitialspace
lineterminator
See the options reference in the Python csv module documentation.
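These options are passed straight through the Stream constructor, for example (a sketch with arbitrary values):
stream = Stream('data.csv', delimiter=';', quotechar="'", skipinitialspace=True)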
datapackage
This format is not included in the package by default. To use it please install tabulator with the datapackage extra: $ pip install tabulator[datapackage]
Source should be a valid Tabular Data Package (see https://frictionlessdata.io).
stream = Stream('datapackage.json', resource=1)
Operations:
read
Options:
resource - resource index (starting from 0) or resource name
gsheet
Source should be a link to publicly available Google Spreadsheet.
stream = Stream('https://docs.google.com/spreadsheets/d/<id>?usp=sharing')
stream = Stream('https://docs.google.com/spreadsheets/d/<id>/edit#gid=<gid>')
inline
Source should be a list of lists or a list of dicts.
stream = Stream([['name', 'age'], ['John', 21], ['Alex', 33]])
stream = Stream([{'name': 'John', 'age': 21}, {'name': 'Alex', 'age': 33}])
Operations:
read
json
Source should be a valid JSON document containing array of arrays or array of objects (see inline format example).
stream = Stream('data.json', property='key1.key2')
Operations:
read
Options:
property - path to the tabular data property separated by dots. For example, with a data structure like {"response": {"data": [...]}} you should set property to response.data.
ndjson
Source should be parsable by ndjson parser.
stream = Stream('data.ndjson')
Operations:
read
ods
This format is not included in the package by default. To use it please install tabulator with the ods extra: $ pip install tabulator[ods]
Source should be a valid Open Office document.
stream = Stream('data.ods', sheet=1)
Operations:
read
Options:
sheet - sheet number starting from 1
sql
Source should be a valid database URL supported by sqlalchemy.
stream = Stream('postgresql://name:pass@host:5432/database', table='data')
Operations:
read
Options:
table - database table name to read data (REQUIRED)
order_by - SQL expression to order rows e.g. name desc
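For example, combining both options (a sketch; the connection URL is arbitrary):
stream = Stream('sqlite:///data.db', table='data', order_by='name desc')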
tsv
Source should be parsable by tsv parser.
stream = Stream('data.tsv')
Operations:
read
xls/xlsx
For the xls format tabulator can't stream the data source because of an underlying libraries limitation; the whole data source will be loaded into memory. For the xlsx format there is no such problem and tabulator is able to stream the data source as expected.
Source should be a valid Excel document.
stream = Stream('data.xls', sheet=1)
Operations:
read
Options:
sheet - sheet number starting from 1
fill_merged_cells - if True it will unmerge and fill all merged cells with the visible value. With this option enabled the parser can't stream data and loads the whole document into memory.
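For example (a sketch; note that with fill_merged_cells enabled the whole document is loaded into memory):
stream = Stream('data.xlsx', sheet=1, fill_merged_cells=True)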
Headers
By default Stream considers all data source rows as values:
with Stream([['name', 'age'], ['Alex', 21]]) as stream:
    stream.headers  # None
    stream.read()  # [['name', 'age'], ['Alex', 21]]
To alter this behaviour, the headers argument is supported by the Stream constructor. This argument could be an integer - the number of the row (starting from 1) that contains the headers:
# Integer
with Stream([['name', 'age'], ['Alex', 21]], headers=1) as stream:
    stream.headers  # ['name', 'age']
    stream.read()  # [['Alex', 21]]
Or it could be a list of strings - user-defined headers:
with Stream([['Alex', 21]], headers=['name', 'age']) as stream:
    stream.headers  # ['name', 'age']
    stream.read()  # [['Alex', 21]]
If headers is a row number and the data source is not keyed, this row and all rows before it will be removed from the data stream (see the first example). For a keyed source no rows will be removed.
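For illustration, here is how that plays out for a keyed inline source, assuming the usual behaviour where headers are taken from the dict keys (a sketch):
with Stream([{'name': 'Alex', 'age': 21}], headers=1) as stream:
    stream.headers  # ['name', 'age']
    stream.read()  # [['Alex', 21]]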
Encoding
The Stream constructor accepts an encoding argument to ensure the needed encoding will be used. Any encoding name supported by Python (e.g. 'latin1', 'utf-8', ..) can be used as a value:
with Stream(source, encoding='latin1') as stream:
    stream.read()
By default an encoding will be detected automatically. If you experience a UnicodeDecodeError parsing your file, try setting this argument to ‘utf-8’.
Allow html
By default Stream will raise exceptions.FormatError on the stream.open() call if html contents are detected. Html is not a tabular format, and, for example, providing a link to a csv file rendered inside an html page (e.g. a GitHub page) instead of the raw file is a common mistake.
But sometimes this default behaviour is not what is needed, for example if you write a custom parser which should support html contents. In this case the allow_html option for Stream can be used:
with Stream(source_with_html, allow_html=True) as stream:
    stream.read()  # no exception on open
Sample size
By default Stream will read some data in advance on the stream.open() call. This data is provided as stream.sample. The size of this sample in rows can be set using the sample_size argument of the Stream constructor:
with Stream(two_rows_source, sample_size=1) as stream:
    stream.sample  # only first row
    stream.read()  # first and second rows
The data sample can be really useful if you want to implement some initial data checks without moving the stream pointer as stream.iter/read do. But if you don't want any interactions with the actual source before the first stream.iter/read call, just disable data sampling with sample_size=0.
Bytes sample size
At the initial reading stage tabulator has to detect the contents encoding. The bytes_sample_size argument customizes how many bytes will be read for the detection:
source = 'data/special/latin1.csv'

with Stream(source) as stream:
    stream.encoding  # 'iso8859-2'

with Stream(source, sample_size=0, bytes_sample_size=10) as stream:
    stream.encoding  # 'utf-8'
In this example our data file doesn't include iso8859-2 characters in the first 10 bytes, so we can see the difference in encoding detection. Note the sample_size usage here - these two parameters are independent. Here we use sample_size=0 to prevent row sample creation (which would fail with a bad encoding).
Force strings
Because tabulator supports not only sources with string data representation like csv, but also sources supporting different data types like json or inline, there is a Stream option force_strings to stringify all data values on reading.
Here is how a stream works without forcing strings:
import datetime

with Stream([['string', 1, datetime.time(17, 0)]]) as stream:
    stream.read()  # [['string', 1, datetime.time(17, 0)]]
The same data source with the force_strings option:
with Stream([['string', 1, datetime.time(17, 0)]], force_strings=True) as stream:
    stream.read()  # [['string', '1', '17:00:00']]
For all temporal values the stream will use ISO format. But if your data source doesn't support temporal values (for instance the json format), Stream just returns them as they are without converting to ISO format.
Force parse
Some data sources could be partially malformed for a parser. For example an inline source could have good rows (lists or dicts) and bad rows (for example strings). By default stream.iter/read will raise exceptions.SourceError on the first bad row:
with Stream([[1], 'bad', [3]]) as stream:
    stream.read()  # raises exceptions.SourceError
With the force_parse option for the Stream constructor this default behaviour can be changed. If it's set to True, non-parsable rows will be returned as empty rows:
with Stream([[1], 'bad', [3]], force_parse=True) as stream:
    stream.read()  # [[1], [], [3]]
Skip rows
It's a very common situation when your tabular data contains rows you want to skip. They could be blank or commented rows. The Stream constructor accepts the skip_rows argument to make this possible. The value of this argument should be a list of integers and strings where:
an integer is a row number starting from 1
a string is the starting characters of a row indicating that the row is a comment
Let's skip the first and second rows, and all rows commented with the '#' symbol:
source = [['John', 1], ['Alex', 2], ['#Sam', 3], ['Mike', 4]]
with Stream(source, skip_rows=[1, 2, '#']) as stream:
    stream.read()  # [['Mike', 4]]
Post parse
Skipping rows is a very basic ETL (extract-transform-load) feature. For more advanced data transformations there are post parse processors.
def skip_odd_rows(extended_rows):
    for row_number, headers, row in extended_rows:
        if not row_number % 2:
            yield (row_number, headers, row)

def multiply_on_two(extended_rows):
    for row_number, headers, row in extended_rows:
        yield (row_number, headers, list(map(lambda value: value * 2, row)))

with Stream([[1], [2], [3], [4]], post_parse=[skip_odd_rows, multiply_on_two]) as stream:
    stream.read()  # [[4], [8]]
A post parse processor gets an extended rows iterator of (row_number, headers, row) tuples and must yield updated extended rows back. This interface is very powerful because every processor has full control over the iteration process and can skip rows, catch exceptions, etc.
Processors are applied to the source from left to right. For example, in the listing above the multiply_on_two processor gets rows from the skip_odd_rows processor.
Keyed and extended rows
The Stream methods stream.iter/read() accept keyed and extended flags to change the data structure of the output rows.
By default a stream returns every row as a list:
with Stream([['name', 'age'], ['Alex', 21]], headers=1) as stream:
    stream.read()  # [['Alex', 21]]
With keyed=True a stream returns every row as a dict:
with Stream([['name', 'age'], ['Alex', 21]], headers=1) as stream:
    stream.read(keyed=True)  # [{'name': 'Alex', 'age': 21}]
And with extended=True a stream returns every row as a tuple containing the row number (starting from 1), the headers as a list and the row as a list:
with Stream([['name', 'age'], ['Alex', 21]], headers=1) as stream:
    stream.read(extended=True)  # [(1, ['name', 'age'], ['Alex', 21])]
Custom loaders
To create a custom loader, the Loader interface should be implemented and passed to the Stream constructor as the custom_loaders={'scheme': CustomLoader} argument.
For example let's implement a custom loader:
from tabulator import Loader, Stream

class CustomLoader(Loader):
    options = []

    def __init__(self, bytes_sample_size, **options):
        pass

    def load(self, source, mode='t', encoding=None, allow_zip=False):
        # load logic
        pass

with Stream(source, custom_loaders={'custom': CustomLoader}) as stream:
    stream.read()
There are more examples in the internal tabulator.loaders module.
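For a more complete (but still simplified) illustration, here is a hypothetical loader that serves sources from an in-memory dictionary. The example:// scheme, the FILES dictionary and the file content are made up for this sketch, and scheme/format are passed explicitly so nothing depends on inference; the exact calls Stream makes into a loader may differ between versions.
import io
from tabulator import Loader, Stream

# Made-up in-memory "filesystem" used only for this sketch
FILES = {'example://data.csv': 'name,age\nJohn,21\n'}

class InMemoryLoader(Loader):
    options = []

    def __init__(self, bytes_sample_size, **options):
        self.__bytes_sample_size = bytes_sample_size

    def load(self, source, mode='t', encoding=None, allow_zip=False):
        text = FILES[source]
        if mode == 't':
            # text mode: return a stream of chars
            return io.StringIO(text)
        # byte mode: return a stream of bytes
        return io.BytesIO(text.encode(encoding or 'utf-8'))

with Stream('example://data.csv', scheme='example', format='csv',
            custom_loaders={'example': InMemoryLoader}) as stream:
    stream.read()  # [['name', 'age'], ['John', '21']]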
Loader.options
List of supported custom options.
Loader(bytes_sample_size, **options)
bytes_sample_size (int) - sample size in bytes
options (dict) - loader options
(Loader) - returns Loader class instance
loader.load(source, mode='t', encoding=None, allow_zip=False)
source (str) - table source
mode (str) - text stream mode: ‘t’ or ‘b’
encoding (str) - encoding of source
allow_zip (bool) - if False, an exception will be raised for the zip format
(file-like) - returns file-like object of bytes or chars based on mode argument
Custom parsers
To create a custom parser, the Parser interface should be implemented and passed to the Stream constructor as the custom_parsers={'format': CustomParser} argument.
For example let's implement a custom parser:
from tabulator import Parser, Stream

class CustomParser(Parser):
    options = []

    def __init__(self, loader, force_parse, **options):
        self.__loader = loader

    @property
    def closed(self):
        return False

    def open(self, source, encoding=None):
        # open logic
        pass

    def close(self):
        # close logic
        pass

    def reset(self):
        raise NotImplementedError()

    @property
    def extended_rows(self):
        # extended rows logic
        pass

with Stream(source, custom_parsers={'custom': CustomParser}) as stream:
    stream.read()
There are more examples in the internal tabulator.parsers module.
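As a fuller (but still simplified) illustration, here is a hypothetical parser for a made-up 'pipes' format in which every line holds values separated by the | character. The format name, the file name and the reset strategy are assumptions of this sketch:
from tabulator import Parser, Stream

class PipesParser(Parser):
    options = []

    def __init__(self, loader, force_parse=False, **options):
        self.__loader = loader
        self.__force_parse = force_parse
        self.__source = None
        self.__encoding = None
        self.__chars = None
        self.__extended_rows = None

    @property
    def closed(self):
        return self.__chars is None or self.__chars.closed

    def open(self, source, encoding=None):
        self.close()
        self.__source = source
        self.__encoding = encoding
        self.__chars = self.__loader.load(source, encoding=encoding)
        self.__extended_rows = self.__iter_extended_rows()

    def close(self):
        if not self.closed:
            self.__chars.close()

    def reset(self):
        # re-open the underlying stream; cheap for local files
        self.open(self.__source, encoding=self.__encoding)

    @property
    def encoding(self):
        return self.__encoding

    @property
    def extended_rows(self):
        return self.__extended_rows

    def __iter_extended_rows(self):
        for number, line in enumerate(self.__chars, start=1):
            yield (number, None, line.rstrip('\n').split('|'))

with Stream('data.pipes', format='pipes', headers=1,
            custom_parsers={'pipes': PipesParser}) as stream:
    stream.read()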
Parser.options
List of supported custom options.
Parser(loader, force_parse, **options)
Create parser class instance.
loader (Loader) - loader instance
force_parse (bool) - if True the parser must yield (row_number, None, []) for rows that fail to parse instead of stopping the iteration by raising an exception
options (dict) - parser options
(Parser) - returns Parser class instance
parser.closed
(bool) - returns True if parser is closed
parser.open(source, encoding=None)
source (str) - table source
encoding (str) - encoding of source
parser.close()
Close the underlying stream.
parser.reset()
Reset items and the underlying stream. After a reset call, iteration over the items will start from scratch.
parser.encoding
(str) - returns an actual encoding
parser.extended_rows
(iterator) - returns extended rows iterator
Custom writers
To create a custom writer, the Writer interface should be implemented and passed to the Stream constructor as the custom_writers={'format': CustomWriter} argument.
For example let's implement a custom writer:
from tabulator import Writer, Stream

class CustomWriter(Writer):
    options = []

    def __init__(self, **options):
        pass

    def save(self, source, target, headers=None, encoding=None):
        # save logic
        pass

with Stream(source, custom_writers={'custom': CustomWriter}) as stream:
    stream.save(target)
There are more examples in the internal tabulator.writers module.
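As a fuller (but still simplified) illustration, here is a hypothetical writer for the same made-up 'pipes' format. It assumes the source argument passed to save is an iterable of rows, which is how this sketch treats it - check the internal tabulator.writers module for the exact contract:
from tabulator import Writer, Stream

class PipesWriter(Writer):
    options = []

    def __init__(self, **options):
        pass

    def save(self, source, target, headers=None, encoding=None):
        # Assumption of this sketch: source is an iterable of rows (lists of values)
        with open(target, 'w', encoding=encoding or 'utf-8') as file:
            if headers:
                file.write('|'.join(headers) + '\n')
            for row in source:
                file.write('|'.join(str(value) for value in row) + '\n')

with Stream('data.csv', headers=1, custom_writers={'pipes': PipesWriter}) as stream:
    stream.save('data.pipes', format='pipes')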
Writer.options
List of supported custom options.
Writer(**options)
Create writer class instance.
options (dict) - writer options
(Writer) - returns Writer class instance
writer.save(source, target, headers=None, encoding=None)
Save source data to target.
source (str) - data source
target (str) - save target
headers (str[]) - optional headers
encoding (str) - encoding of source
Validate
For cases where you don't need to open the source but want to know whether it's supported by tabulator or not, you can use the validate function. It also lets you know what exactly is not supported by raising the corresponding exception class.
from tabulator import validate, exceptions

try:
    tabular = validate('data.csv')
except exceptions.TabulatorException:
    tabular = False
validate(source, scheme=None, format=None)
Validate if this source has supported scheme and format.
source (any) - data source
scheme (str) - data scheme
format (str) - data format
(exceptions.SchemeError) - raises if scheme is not supported
(exceptions.FormatError) - raises if format is not supported
(bool) - returns True if scheme/format is supported
Exceptions
exceptions.TabulatorException
Base class for all tabulator exceptions.
exceptions.IOError
All underlying input-output errors.
exceptions.HTTPError
All underlying HTTP errors.
exceptions.SourceError
This class of exceptions covers all source errors like a bad data structure for JSON.
exceptions.SchemeError
This exception will be raised, for example, if you provide an unsupported source scheme like bad://source.csv.
exceptions.FormatError
This exception will be raised, for example, if you provide an unsupported source format like http://source.bad.
exceptions.EncodingError
All errors related to encoding problems.
CLI
It's a provisional API. If you use it as a part of another program, please pin a concrete tabulator version in your requirements file.
The library ships with a simple CLI to read tabular data:
$ tabulator data/table.csv
id, name
1, english
2, 中国人
$ tabulator
Usage: cli.py [OPTIONS] SOURCE
Options:
--headers INTEGER
--scheme TEXT
--format TEXT
--encoding TEXT
--limit INTEGER
--help Show this message and exit.
Contributing
The project follows the Open Knowledge International coding standards.
The recommended way to get started is to create and activate a project virtual environment. To install the package and development dependencies into the active environment:
$ make install
To run tests with linting and coverage:
$ make test
For linting, pylama (configured in pylama.ini) is used. At this stage it's already installed into your environment and can be used separately with more fine-grained control as described in the documentation - https://pylama.readthedocs.io/en/latest/.
For example to sort results by error type:
$ pylama --sort <path>
For testing, tox (configured in tox.ini) is used. It's already installed into your environment and can be used separately with more fine-grained control as described in the documentation - https://testrun.org/tox/latest/.
For example, to check a subset of tests against a Python 2 environment with increased verbosity (all positional arguments and options after -- will be passed to py.test):
$ tox -e py27 -- -v tests/<path>
Under the hood, tox uses pytest (configured in pytest.ini) and the coverage and mock packages. These packages are available only in tox environments.
Changelog
Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted commit history.
v1.5
New API added:
Argument bytes_sample_size for the Stream constructor
v1.4
Improved behaviour:
updated encoding name to a canonical form
v1.3
New API added:
stream.scheme
stream.format
stream.encoding
Promoted provisional API to stable API:
Loader (custom loaders)
Parser (custom parsers)
Writer (custom writers)
validate
v1.2
Improved behaviour:
autodetect common csv delimiters
v1.1
New API added:
added fill_merged_cells argument to xls/xlsx formats
v1.0
New API added:
published Loader/Parser/Writer API
added Stream argument force_strings
added Stream argument force_parse
added Stream argument custom_writers
Deprecated API removal:
removed topen and Table - use Stream instead
removed Stream arguments loader/parser_options - use **options instead
Provisional API changed:
updated Loader/Parser/Writer API - please use an updated version
v0.15
Provisional API added:
unofficial support for Stream arguments custom_loaders/parsers