An instant JSON API for your CSV
Project description
csvapi
"Instantly" publish an API for a CSV hosted anywhere on the internet. Also supports Excel files.
Installation
Requires Python 3.6+ and a Unix OS with the file
command available.
python3 -m venv pyenv && . pyenv/bin/activate
pip install csvapi
For development:
pip install -r requirements/develop.pip
pip install -e .
Quickstart
csvapi serve -h 0.0.0.0 -p 8000
Command line options
$ csvapi serve --help
Usage: csvapi serve [OPTIONS]
Options:
--ssl-key TEXT Path to SSL key
--ssl-cert TEXT Path to SSL certificate
-w, --max-workers INTEGER Max number of ThreadPoolExecutor workers
--cache / --no-cache Do not parse CSV again if DB already exists
--reload Automatically reload if code change detected
--debug Enable debug mode - useful for development
-p, --port INTEGER port for server, defaults to 8001
-h, --host TEXT host for server, defaults to 127.0.0.1
--dbs DIRECTORY Where to store sqlite DBs
--help Show this message and exit.
Deploy
With SSL, using Hypercorn:
hypercorn csvapi.webservice:app -b 0.0.0.0:443 --keyfile key.pem --ca-certs cert.pem
See the documentation for more options.
You can use the environment variable CSVAPI_CONFIG_FILE
to point to a custom configuration file.
API usage
Conversion
/apify?url=http://somewhere.com/a/file.csv
This converts a CSV to an SQLite database (w/ agate
) and returns the following response:
{"ok": true, "endpoint": "http://localhost:8001/api/cde857960e8dc24c9cbcced673b496bb"}
Parameters
Some parameters can be used in the query string.
encoding
default: automatic detection
You can force an encoding (e.g. utf-8
) using this parameter, instead of relying on the automatic detection.
Data API
This is the endpoint
attribute of the previous response.
/api/<md5-url-hash>
This queries a previously converted API file and returns the first 100 rows like this:
{
"ok": true,
"rows": [[], []],
"columns": [],
"query_ms": 1
}
Parameters
Some parameters can be used in the query string.
_size
default: 100
This will limit the query to a certain number of rows. For instance to get only 250 rows:
/api/<md5-url-hash>?_size=250
_sort
and _sort_desc
Use those to sort by a column. sort
will sort by ascending order, sort_desc
by descending order.
/api/<md5-url-hash>?_sort=<column-name>
_offset
Use this to add on offset. Combined with _size
it allows pagination.
/api/<md5-url-hash>?_size=1&_offset=1
_shape
default: lists
The _shape
argument is used to specify the format output of the json. It can take the value objects
to get an array of objects instead of an array of arrays:
/api/<md5-url-hash>?_shape=objects
For instance, instead of returning:
{
"ok": true,
"query_ms": 0.4799365997,
"rows": [
[1, "Justice", "0101", 57663310],
[2, "Justice", "0101", 2255129],
[3, "Justice", "0101", 36290]
],
"columns": ["rowid", "Mission", "Programme", "Consommation de CP"]
}
It will return:
{
"ok": true,
"query_ms": 2.681016922,
"rows": [
{
"rowid": 1,
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 57663310
},
{
"rowid": 2,
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 2255129
},
{
"rowid": 3,
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 36290
}],
"columns": ["rowid", "Mission", "Programme", "Consommation de CP"]
}
_rowid
default: show
The _rowid
argument is used to display or hide rowids in the returned data. Use _rowid=hide
to hide.
/api/<md5-url-hash>?_shape=objects&_rowid=hide
{
"ok": true,
"query_ms": 2.681016922,
"rows": [
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 57663310
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 2255129
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 36290
}],
"columns": ["Mission", "Programme", "Consommation de CP"]
}
_total
default: show
The _total
argument is used to display or hide the total number of rows (independent of pagination) in the returned data. Use _total=hide
to hide.
{
"ok": true,
"query_ms": 2.681016922,
"rows": [
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 57663310
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 2255129
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 36290
}],
"columns": ["Mission", "Programme", "Consommation de CP"],
"total": 3
}
Credits
Inspired by the excellent Datasette.
Changelog
0.0.6 (2018-09-10)
- Compute the total number of rows in a table #12
0.0.5 (2018-09-10)
- Make CSV sniff limit a config variable and raise the default value #11
- Properly handle not found (404) errors
0.0.4 (2018-09-04)
- FORCE_SSL config variable
0.0.3 (2018-08-31)
- Sentry support via SENTRY_DSN config variable
0.0.2 (2018-08-30)
- CSVAPI_CONFIG_FILE env var support
0.0.1 (2018-08-30)
- Initial version
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.