An instant JSON API for your CSV
Project description
csvapi
"Instantly" publish an API for a CSV hosted anywhere on the internet. Also supports Excel files.
This tool is used by data.gouv.fr to show a preview of hosted CSV and XLS files.
Installation
Requires Python 3.7+ and a Unix OS with the file
command available.
python3 -m venv pyenv && . pyenv/bin/activate
pip install csvapi
For development:
pip install -r requirements/develop.pip
pip install -e .
Quickstart
csvapi serve -h 0.0.0.0 -p 8000
Command line options
$ csvapi serve --help
Usage: csvapi serve [OPTIONS]
Options:
--ssl-key TEXT Path to SSL key
--ssl-cert TEXT Path to SSL certificate
--cache / --no-cache Do not parse CSV again if DB already exists
-w, --max-workers INTEGER Max number of ThreadPoolExecutor workers
--reload Automatically reload if code change detected
--debug Enable debug mode - useful for development
-p, --port INTEGER port for server, defaults to 8001
-h, --host TEXT host for server, defaults to 127.0.0.1
--dbs DIRECTORY Where to store sqlite DBs
--help Show this message and exit.
Deploy
With SSL, using Hypercorn:
hypercorn csvapi.webservice:app -b 0.0.0.0:443 --keyfile key.pem --ca-certs cert.pem
See the documentation for more options.
You can use the environment variable CSVAPI_CONFIG_FILE
to point to a custom configuration file.
API usage
Conversion
/apify?url=http://somewhere.com/a/file.csv
This converts a CSV to an SQLite database (w/ agate
) and returns the following response:
{"ok": true, "endpoint": "http://localhost:8001/api/cde857960e8dc24c9cbcced673b496bb"}
Parameters
Some parameters can be used in the query string.
encoding
default: automatic detection
You can force an encoding (e.g. utf-8
) using this parameter, instead of relying on the automatic detection.
Data API
This is the endpoint
attribute of the previous response.
/api/<md5-url-hash>
This queries a previously converted API file and returns the first 100 rows like this:
{
"ok": true,
"rows": [[], []],
"columns": [],
"query_ms": 1
}
Parameters
Some parameters can be used in the query string.
_size
default: 100
This will limit the query to a certain number of rows. For instance to get only 250 rows:
/api/<md5-url-hash>?_size=250
_sort
and _sort_desc
Use those to sort by a column. sort
will sort by ascending order, sort_desc
by descending order.
/api/<md5-url-hash>?_sort=<column-name>
_offset
Use this to add on offset. Combined with _size
it allows pagination.
/api/<md5-url-hash>?_size=1&_offset=1
_shape
default: lists
The _shape
argument is used to specify the format output of the json. It can take the value objects
to get an array of objects instead of an array of arrays:
/api/<md5-url-hash>?_shape=objects
For instance, instead of returning:
{
"ok": true,
"query_ms": 0.4799365997,
"rows": [
[1, "Justice", "0101", 57663310],
[2, "Justice", "0101", 2255129],
[3, "Justice", "0101", 36290]
],
"columns": ["rowid", "Mission", "Programme", "Consommation de CP"]
}
It will return:
{
"ok": true,
"query_ms": 2.681016922,
"rows": [
{
"rowid": 1,
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 57663310
},
{
"rowid": 2,
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 2255129
},
{
"rowid": 3,
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 36290
}],
"columns": ["rowid", "Mission", "Programme", "Consommation de CP"]
}
_rowid
default: show
The _rowid
argument is used to display or hide rowids in the returned data. Use _rowid=hide
to hide.
/api/<md5-url-hash>?_shape=objects&_rowid=hide
{
"ok": true,
"query_ms": 2.681016922,
"rows": [
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 57663310
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 2255129
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 36290
}],
"columns": ["Mission", "Programme", "Consommation de CP"]
}
_total
default: show
The _total
argument is used to display or hide the total number of rows (independent of pagination) in the returned data. Use _total=hide
to hide.
{
"ok": true,
"query_ms": 2.681016922,
"rows": [
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 57663310
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 2255129
},
{
"Mission": "Justice",
"Programme": "0101",
"Consommation de CP": 36290
}],
"columns": ["Mission", "Programme", "Consommation de CP"],
"total": 3
}
Column based filters
By adding {column}__{comparator}={value}
to the query string, you can filter the results based on the following criterions:
{column}
must be a valid column in your CSV{comparator}
isexact
(SQL= {value}
) orcontains
(SQLLIKE %{value}%
){value}
is the value you're filtering the column against
You can add multiple filters, they will be joined with a AND
at the SQL level.
Production deployment
Some example Ansible 2 roles are available here.
Credits
Inspired by the excellent Datasette.
Changelog
1.2.0 (2021-04-29)
1.1.0 (2021-03-23)
- Use aiosqlite #76
1.0.6 (2020-12-14)
- Better parsing fallback #71
1.0.5 (2020-11-17)
- Parsing view now raises exception on http error response codes #69
1.0.4 (2020-10-26)
1.0.3 (2020-03-04)
- Fix packaging problem
1.0.2 (2020-03-04)
- Fix XLS parsing #60
1.0.1 (2020-01-03)
- Fix aiohttp import #52
1.0.0 (2020-01-03)
- Add filters support #50
- Replace requests by aiohttp for asynchronous http requests. Also replace every format() string to use only f"strings. #46
0.1.0 (2019-09-06)
0.0.9 (2019-01-18)
- Upgrade to Quart-0.6.6 and hypercorn-0.4.6 #16
0.0.8 (2018-10-04)
- Try to parse CSV w/o sniffing (excel dialect) after sniffing if it fails
0.0.7 (2018-09-17)
0.0.6 (2018-09-10)
- Compute the total number of rows in a table #12
0.0.5 (2018-09-10)
- Make CSV sniff limit a config variable and raise the default value #11
- Properly handle not found (404) errors
0.0.4 (2018-09-04)
- FORCE_SSL config variable
0.0.3 (2018-08-31)
- Sentry support via SENTRY_DSN config variable
0.0.2 (2018-08-30)
- CSVAPI_CONFIG_FILE env var support
0.0.1 (2018-08-30)
- Initial version
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
File details
Details for the file csvapi-1.2.0-py3-none-any.whl
.
File metadata
- Download URL: csvapi-1.2.0-py3-none-any.whl
- Upload date:
- Size: 16.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.7.10
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8f8f86a2564807be3f10fdaecbd65dc1252b56e5aaa41ad45ef6cab5cdd6da0a |
|
MD5 | ebf212c35e8835c45ee280fb837260e8 |
|
BLAKE2b-256 | b2b38c94039a5cc8d86617d24238a2131c314d42c9b8b2d077be36a0b7b8b8f9 |