An instant JSON API for your CSV

These details have not been verified by PyPI

Project links

Homepage

Project description

csvapi

"Instantly" publish an API for a CSV hosted anywhere on the internet. Also supports Excel files.

This tool is used by data.gouv.fr to show a preview of hosted CSV and XLS files.

Installation

Requires Python 3.7+ and a Unix OS with the file command available.

python3 -m venv pyenv && . pyenv/bin/activate
pip install csvapi

For development:

pip install -r requirements/develop.pip
pip install -e .

Quickstart

csvapi serve -h 0.0.0.0 -p 8000

Command line options

$ csvapi serve --help
Usage: csvapi serve [OPTIONS]

Options:
    --ssl-key TEXT             Path to SSL key
    --ssl-cert TEXT            Path to SSL certificate
    --cache / --no-cache       Do not parse CSV again if DB already exists
    -w, --max-workers INTEGER  Max number of ThreadPoolExecutor workers
    --reload                   Automatically reload if code change detected
    --debug                    Enable debug mode - useful for development
    -p, --port INTEGER         port for server, defaults to 8001
    -h, --host TEXT            host for server, defaults to 127.0.0.1
    --dbs DIRECTORY            Where to store sqlite DBs
    --help                     Show this message and exit.

Deploy

With SSL, using Hypercorn:

hypercorn csvapi.webservice:app -b 0.0.0.0:443 --keyfile key.pem --ca-certs cert.pem

See the documentation for more options.

You can use the environment variable CSVAPI_CONFIG_FILE to point to a custom configuration file.

API usage

Conversion

/apify?url=http://somewhere.com/a/file.csv

This converts a CSV to an SQLite database (w/ agate) and returns the following response:

{"ok": true, "endpoint": "http://localhost:8001/api/cde857960e8dc24c9cbcced673b496bb"}

Parameters

Some parameters can be used in the query string.

`encoding`

default: automatic detection

You can force an encoding (e.g. utf-8) using this parameter, instead of relying on the automatic detection.

Data API

This is the endpoint attribute of the previous response.

/api/<md5-url-hash>

This queries a previously converted API file and returns the first 100 rows like this:

    {
        "ok": true,
        "rows": [[], []],
        "columns": [],
        "query_ms": 1
    }

Parameters

Some parameters can be used in the query string.

`_size`

default: 100

This will limit the query to a certain number of rows. For instance to get only 250 rows:

/api/<md5-url-hash>?_size=250

`_sort` and `_sort_desc`

Use those to sort by a column. sort will sort by ascending order, sort_desc by descending order.

/api/<md5-url-hash>?_sort=<column-name>

`_offset`

Use this to add on offset. Combined with _size it allows pagination.

/api/<md5-url-hash>?_size=1&_offset=1

`_shape`

default: lists

The _shape argument is used to specify the format output of the json. It can take the value objects to get an array of objects instead of an array of arrays:

/api/<md5-url-hash>?_shape=objects

For instance, instead of returning:

{
    "ok": true,
    "query_ms": 0.4799365997,
    "rows": [
        [1, "Justice", "0101", 57663310],
        [2, "Justice", "0101", 2255129],
        [3, "Justice", "0101", 36290]
    ],
    "columns": ["rowid", "Mission", "Programme", "Consommation de CP"]
}

It will return:

{
    "ok": true,
    "query_ms": 2.681016922,
    "rows": [
    {
        "rowid": 1,
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 57663310
    },
    {
        "rowid": 2,
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 2255129
    },
    {
        "rowid": 3,
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 36290
    }],
    "columns": ["rowid", "Mission", "Programme", "Consommation de CP"]
}

`_rowid`

default: show

The _rowid argument is used to display or hide rowids in the returned data. Use _rowid=hide to hide.

/api/<md5-url-hash>?_shape=objects&_rowid=hide

{
    "ok": true,
    "query_ms": 2.681016922,
    "rows": [
    {
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 57663310
    },
    {
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 2255129
    },
    {
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 36290
    }],
    "columns": ["Mission", "Programme", "Consommation de CP"]
}

`_total`

default: show

The _total argument is used to display or hide the total number of rows (independent of pagination) in the returned data. Use _total=hide to hide.

{
    "ok": true,
    "query_ms": 2.681016922,
    "rows": [
    {
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 57663310
    },
    {
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 2255129
    },
    {
        "Mission": "Justice",
        "Programme": "0101",
        "Consommation de CP": 36290
    }],
    "columns": ["Mission", "Programme", "Consommation de CP"],
    "total": 3
}

Column based filters

By adding {column}__{comparator}={value} to the query string, you can filter the results based on the following criterions:

{column} must be a valid column in your CSV
{comparator} is exact (SQL = {value}) or contains (SQL LIKE %{value}%)
{value} is the value you're filtering the column against

You can add multiple filters, they will be joined with a AND at the SQL level.

Production deployment

Some example Ansible 2 roles are available here.

Credits

Inspired by the excellent Datasette.

Changelog

1.2.1 (2021-04-29)

Upgrade raven to sentry-sdk (a bit dirty so far)

1.2.0 (2021-04-29)

Add profiling support #77
Fix bug in filters w/ blanks in column names #77

1.1.0 (2021-03-23)

Use aiosqlite #76

1.0.6 (2020-12-14)

Better parsing fallback #71

1.0.5 (2020-11-17)

Parsing view now raises exception on http error response codes #69

1.0.4 (2020-10-26)

Protect custom type testers against None values #66
Fix xlsx file support #67

1.0.3 (2020-03-04)

Fix packaging problem

1.0.2 (2020-03-04)

Fix XLS parsing #60

1.0.1 (2020-01-03)

Fix aiohttp import #52

1.0.0 (2020-01-03)

Add filters support #50
Replace requests by aiohttp for asynchronous http requests. Also replace every format() string to use only f"strings. #46

0.1.0 (2019-09-06)

Upgrade to Quart-0.9.1 :warning: requires python-3.7 #21
Parse hours, SIREN and SIRET as text #42

0.0.9 (2019-01-18)

Upgrade to Quart-0.6.6 and hypercorn-0.4.6 #16

0.0.8 (2018-10-04)

Try to parse CSV w/o sniffing (excel dialect) after sniffing if it fails

0.0.7 (2018-09-17)

MAX_FILE_SIZE config variable #13
Add filter by referrer feature (REFERRERS_FILTER) #14

0.0.6 (2018-09-10)

Compute the total number of rows in a table #12

0.0.5 (2018-09-10)

Make CSV sniff limit a config variable and raise the default value #11
Properly handle not found (404) errors

0.0.4 (2018-09-04)

FORCE_SSL config variable

0.0.3 (2018-08-31)

Sentry support via SENTRY_DSN config variable

0.0.2 (2018-08-30)

CSVAPI_CONFIG_FILE env var support

0.0.1 (2018-08-30)

Initial version

Project details

These details have not been verified by PyPI

Project links

Homepage

Release history Release notifications | RSS feed

2.2.1.dev870 pre-release

Nov 4, 2022

2.2.0

Nov 4, 2022

2.1.1857

Nov 4, 2022

2.1.806

Oct 13, 2022

2.1.2.dev862 pre-release

Nov 4, 2022

2.1.1

Oct 25, 2022

2.1.0

Oct 13, 2022

2.0.1.dev786 pre-release

Oct 5, 2022

2.0.1.dev769 pre-release

Oct 5, 2022

2.0.1.dev722 pre-release

Sep 15, 2022

2.0.1.dev716 pre-release

Sep 15, 2022

2.0.0

Sep 15, 2022

2.0.0.dev708 pre-release

Sep 15, 2022

2.0.0.dev691 pre-release

Sep 14, 2022

2.0.0.dev686 pre-release

Sep 14, 2022

1.3.0.dev641 pre-release

Sep 6, 2022

1.3.0.dev604 pre-release

Aug 26, 2022

1.3.0.dev599 pre-release

Aug 26, 2022

This version

1.2.1

Apr 29, 2021

1.2.0

Apr 29, 2021

1.1.0

Mar 23, 2021

1.0.6

Dec 14, 2020

1.0.5

Nov 17, 2020

1.0.4

Oct 26, 2020

1.0.3

Mar 4, 2020

1.0.3.dev0 pre-release

Mar 4, 2020

1.0.1

Jan 3, 2020

1.0.0

Jan 3, 2020

0.1.0

Sep 6, 2019

0.0.9

Jan 18, 2019

0.0.8

Oct 4, 2018

0.0.7

Sep 17, 2018

0.0.6

Sep 10, 2018

0.0.5

Sep 10, 2018

0.0.4

Sep 4, 2018

0.0.3

Aug 31, 2018

0.0.2

Aug 30, 2018

0.0.1

Aug 30, 2018

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

csvapi-1.2.1-py3-none-any.whl (16.5 kB view hashes)

Uploaded Apr 29, 2021 Python 3

Hashes for csvapi-1.2.1-py3-none-any.whl

Hashes for csvapi-1.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`64071f473b810fac480b8427841ab8b938a5d44ee88e9f12eaab4c58655d11b8`
MD5	`35b67941336efa6e050aa4bf229ed85e`
BLAKE2b-256	`85bb33b11e28e6011ae1bd69945c0a33f003a07c68eae5bc628b2ac103b053dc`

csvapi 1.2.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

csvapi

Installation

Quickstart

Command line options

Deploy

API usage

Conversion

Parameters

encoding

Data API

Parameters

_size

_sort and _sort_desc

_offset

_shape

_rowid

_total

Column based filters

Production deployment

Credits

Changelog

1.2.1 (2021-04-29)

1.2.0 (2021-04-29)

1.1.0 (2021-03-23)

1.0.6 (2020-12-14)

1.0.5 (2020-11-17)

1.0.4 (2020-10-26)

1.0.3 (2020-03-04)

1.0.2 (2020-03-04)

1.0.1 (2020-01-03)

1.0.0 (2020-01-03)

0.1.0 (2019-09-06)

0.0.9 (2019-01-18)

0.0.8 (2018-10-04)

0.0.7 (2018-09-17)

0.0.6 (2018-09-10)

0.0.5 (2018-09-10)

0.0.4 (2018-09-04)

0.0.3 (2018-08-31)

0.0.2 (2018-08-30)

0.0.1 (2018-08-30)

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distributions

Built Distribution

`encoding`

`_size`

`_sort` and `_sort_desc`

`_offset`

`_shape`

`_rowid`

`_total`