A flask blueprint providing an API for accessing and searching an ElasticSearch index created from source datapackages
apies
apies is a flask blueprint providing an API for accessing and searching an ElasticSearch index created from source datapackages.
endpoints
/get
/search/count
/search/
/download/<doctypes>
Downloads search results in either csv, xls or xlsx format.
Query parameters that can be sent:
- types_formatted: The type of the documents to search
- search_term: The Elasticsearch query
- size: Number of hits to return
- offset: Whether or not term offsets should be returned
- filters: What offset to use for the pagination
- dont_highlight:
- from_date: If there should be a date range applied to the search, and from what date
- to_date: If there should be a date range applied to the search, and until what date
- order:
- file_format: The format of the file to be returned, either 'csv', 'xls' or 'xlsx'. If not passed, the file format will be 'xlsx'
- file_name: The name of the file to be returned. By default the name will be 'search_results'
- column_mapping: If the columns should get different names than in the original data, a column map can be sent, for example:
{
"עיר": "address.city",
"תקציב": "details.budget"
}
For example, get a csv file with column mapping:
http://localhost:5000/api/download/jobs?q=engineering&size=2&file_format=csv&file_name=my_results&column_mapping={%22mispar%22:%22Job%20ID%22}
Or get an xlsx file without column mapping:
http://localhost:5000/api/download/jobs?q=engineering&size=2&file_format=xlsx&file_name=my_results
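The download URLs above can also be assembled programmatically, which avoids hand-escaping the JSON in column_mapping. The following is a minimal sketch using only the Python standard library. The helper name build_download_url and the base URL are assumptions made for illustration and are not part of apies; the query parameter names (q, size, file_format, file_name, column_mapping) are taken from the example URLs above.

```python
import json
from urllib.parse import urlencode


def build_download_url(base_url, doctypes, query, size=None, file_format="xlsx",
                       file_name="search_results", column_mapping=None):
    """Build a URL for the download endpoint (hypothetical helper, not part of apies)."""
    params = {"q": query, "file_format": file_format, "file_name": file_name}
    if size is not None:
        params["size"] = size
    if column_mapping:
        # The mapping is sent as a JSON object. ensure_ascii=False keeps
        # non-ASCII column names as-is; urlencode then percent-encodes them.
        params["column_mapping"] = json.dumps(column_mapping, ensure_ascii=False)
    return f"{base_url.rstrip('/')}/download/{doctypes}?{urlencode(params)}"


url = build_download_url(
    "http://localhost:5000/api", "jobs", "engineering",
    size=2, file_format="csv", file_name="my_results",
    column_mapping={"mispar": "Job ID"},
)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client once the sample server is running.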
configuration
Flask configuration for this blueprint:
from flask import Flask
from apies import apies_blueprint
import elasticsearch

app = Flask(__name__)
# Sources may be paths to datapackage.json files or Package objects
app.register_blueprint(
    apies_blueprint(['path/to/datapackage.json', Package(), ...],
                    elasticsearch.Elasticsearch(...),
                    {'doc-type-1': 'index-for-doc-type-1', ...},
                    'index-for-documents',
                    dont_highlight=['fields', 'not.to', 'highlight'],
                    text_field_rules=lambda schema_field: [],  # list of tuples: ('exact'/'inexact'/'natural', <field-name>)
                    multi_match_type='most_fields',
                    multi_match_operator='and'),
    url_prefix='/search/'
)
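The text_field_rules argument above is a callable that receives a schema field and returns a list of (rule, field-name) tuples. As a rough sketch of a non-trivial callable, the function below assumes that schema_field behaves like a dict with 'name' and 'type' keys (as in a Table Schema field descriptor); that shape, and the policy for choosing between 'exact', 'inexact' and 'natural', are assumptions made for illustration and are not confirmed by this document.

```python
def text_field_rules(schema_field):
    """Return (rule, field-name) tuples for one schema field.

    Assumption: schema_field is dict-like with 'name' and 'type' keys.
    The policy below is illustrative only, not the library's default.
    """
    name = schema_field["name"]
    if schema_field.get("type") != "string":
        return []  # no text-search rules for non-string fields
    if name.endswith("_id"):
        return [("exact", name)]  # identifier-like fields: exact match only
    return [("inexact", name), ("natural", name)]


print(text_field_rules({"name": "title", "type": "string"}))
print(text_field_rules({"name": "budget", "type": "number"}))
```

Such a function would be passed as text_field_rules=text_field_rules in place of the lambda shown in the configuration.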
local development
You can start a local development server by following these steps:
- Install dependencies:
  a. Install Docker locally
  b. Install Python dependencies:
     $ pip install dataflows dataflows-elasticsearch
     $ pip install -e .
- Go to the sample/ directory
- Start ElasticSearch locally:
  $ ./start_elasticsearch.sh
  This script will wait and poll the server until it's up and running. You can test it yourself by running:
  $ curl -s http://localhost:9200
  {
    "name" : "99cd2db44924",
    "cluster_name" : "docker-cluster",
    "cluster_uuid" : "nF9fuwRyRYSzyQrcH9RCnA",
    "version" : {
      "number" : "7.4.2",
      "build_flavor" : "default",
      "build_type" : "docker",
      "build_hash" : "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
      "build_date" : "2019-10-28T20:40:44.881551Z",
      "build_snapshot" : false,
      "lucene_version" : "8.2.0",
      "minimum_wire_compatibility_version" : "6.8.0",
      "minimum_index_compatibility_version" : "6.0.0-beta1"
    },
    "tagline" : "You Know, for Search"
  }
- Load data into the database:
  $ DATAFLOWS_ELASTICSEARCH=localhost:9200 python load_fixtures.py
  You can test that data was loaded:
  $ curl -s http://localhost:9200/jobs-job/_count?pretty
  {
    "count" : 1757,
    "_shards" : {
      "total" : 1,
      "successful" : 1,
      "skipped" : 0,
      "failed" : 0
    }
  }
- Start the sample server:
  $ python server.py
   * Serving Flask app "server" (lazy loading)
   * Environment: production
     WARNING: Do not use the development server in a production environment.
     Use a production WSGI server instead.
   * Debug mode: off
   * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
- Now you can hit the server's endpoints, for example:
  $ curl -s 'localhost:5000/api/search/jobs?q=engineering&size=2' | jq
  127.0.0.1 - - [26/Jun/2019 10:45:31] "GET /api/search/jobs?q=engineering&size=2 HTTP/1.1" 200 -
  {
    "search_counts": {
      "_current": {
        "total_overall": 617
      }
    },
    "search_results": [
      {
        "score": 18.812,
        "source": {
          "# Of Positions": "5",
          "Additional Information": "TO BE APPOINTED TO ANY CIVIL <em>ENGINEERING</em> POSITION IN BRIDGES, CANDIDATES MUST POSSESS ONE YEAR OF CIVIL <em>ENGINEERING</em> EXPERIENCE IN BRIDGE DESIGN, BRIDGE CONSTRUCTION, BRIDGE MAINTENANCE OR BRIDGE INSPECTION.",
          "Agency": "DEPARTMENT OF TRANSPORTATION",
          "Business Title": "Civil Engineer 2",
          "Civil Service Title": "CIVIL ENGINEER",
          "Division/Work Unit": "<em>Engineering</em> Review & Support",
          ...
  }
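The search response shown in the last step can be consumed from Python as well. The sketch below works on a trimmed copy of that sample response, so it runs without a server; in practice the dict would come from an HTTP client of your choice. The helper names strip_highlight and summarize are hypothetical and not part of apies, and the response layout is assumed to match the sample output above.

```python
import json
import re

# Trimmed copy of the sample response from the curl example above.
SAMPLE_RESPONSE = json.loads("""
{
  "search_counts": {"_current": {"total_overall": 617}},
  "search_results": [
    {
      "score": 18.812,
      "source": {
        "Agency": "DEPARTMENT OF TRANSPORTATION",
        "Business Title": "Civil Engineer 2",
        "Division/Work Unit": "<em>Engineering</em> Review & Support"
      }
    }
  ]
}
""")

EM_TAG = re.compile(r"</?em>")


def strip_highlight(value):
    """Remove <em>/</em> highlight markers from string values; pass others through."""
    return EM_TAG.sub("", value) if isinstance(value, str) else value


def summarize(response):
    """Return the total hit count and a list of (score, cleaned source) pairs."""
    total = response["search_counts"]["_current"]["total_overall"]
    hits = [
        (hit["score"], {k: strip_highlight(v) for k, v in hit["source"].items()})
        for hit in response["search_results"]
    ]
    return total, hits


total, hits = summarize(SAMPLE_RESPONSE)
print(total)
for score, source in hits:
    print(score, source["Business Title"], "|", source["Division/Work Unit"])
```

Stripping the highlight markers is optional; they are useful when rendering results as HTML and mostly noise when exporting plain text.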