A Flask blueprint providing an API for accessing and searching an ElasticSearch index created from source datapackages

apies

apies is a Flask blueprint providing an API for accessing and searching an ElasticSearch index created from source datapackages.

endpoints

/get

/search/count

`/search/

/download/<doctypes>

Downloads search results in either csv, xls or xlsx format.

Query parameters that can be sent:

  • types_formatted: The type of the documents to search
  • search_term: The Elastic search query
  • size: Number of hits to return
  • offset: What offset to use for the pagination
  • filters:
  • dont_highlight:
  • from_date: If there should be a date range applied to the search, and from what date
  • to_date: If there should be a date range applied to the search, and until what date
  • order:
  • file_format: The format of the file to be returned, either 'csv', 'xls' or 'xlsx'. Defaults to 'xlsx' if not passed
  • file_name: The name of the file to be returned. Defaults to 'search_results'
  • column_mapping: If the columns should get different names than in the original data, a column map can be sent, for example:
{
  "עיר": "address.city",
  "תקציב": "details.budget"
}

For example, get a csv file with column mapping:

http://localhost:5000/api/download/jobs?q=engineering&size=2&file_format=csv&file_name=my_results&column_mapping={%22mispar%22:%22Job%20ID%22}

Or get an xlsx file without column mapping:

http://localhost:5000/api/download/jobs?q=engineering&size=2&file_format=xlsx&file_name=my_results
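Building these URLs by hand gets fiddly once `column_mapping` is involved, because the JSON has to be percent-escaped. The following is a minimal sketch of one way to assemble such a URL with the standard library. The helper name `build_download_url` is hypothetical and not part of apies, and the query parameter names (`q`, `size`, `file_format`, `file_name`, `column_mapping`) are taken from the example URLs above.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_download_url(base_url, doctypes, query, file_format="xlsx",
                       file_name="search_results", column_mapping=None,
                       size=None):
    """Assemble a download URL (hypothetical helper, not part of apies)."""
    params = {"q": query, "file_format": file_format, "file_name": file_name}
    if size is not None:
        params["size"] = size
    if column_mapping is not None:
        # Serialize the mapping as compact JSON; urlencode escapes it below.
        params["column_mapping"] = json.dumps(
            column_mapping, ensure_ascii=False, separators=(",", ":"))
    return f"{base_url.rstrip('/')}/download/{doctypes}?{urlencode(params)}"


url = build_download_url(
    "http://localhost:5000/api", "jobs", "engineering",
    file_format="csv", file_name="my_results",
    column_mapping={"mispar": "Job ID"}, size=2)
print(url)

# Round trip: decode the query string again to check the mapping survived.
decoded = parse_qs(urlsplit(url).query)
print(json.loads(decoded["column_mapping"][0]))
```

Note that `urlencode` escapes spaces as `+` rather than `%20`; both forms decode to the same value in a query string.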

configuration

Flask configuration for this blueprint:

    import elasticsearch
    from datapackage import Package  # assumed source of the Package class used below
    from flask import Flask

    from apies import apies_blueprint

    app = Flask(__name__)

    app.register_blueprint(
        apies_blueprint(['path/to/datapackage.json', Package(), ...],  # source datapackages
                        elasticsearch.Elasticsearch(...),  # Elasticsearch client
                        {'doc-type-1': 'index-for-doc-type-1', ...},
                        'index-for-documents',
                        dont_highlight=['fields', 'not.to', 'highlight'],
                        text_field_rules=lambda schema_field: [], # list of tuples: ('exact'/'inexact'/'natural', <field-name>)
                        multi_match_type='most_fields',
                        multi_match_operator='and'),
        url_prefix='/search/'
    )
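The `text_field_rules` argument is only described by its comment: a callable that receives a schema field and returns a list of `(kind, field-name)` tuples, where kind is `'exact'`, `'inexact'` or `'natural'`. As a rough illustration of that shape, here is a sketch of a non-trivial callable. It assumes that `schema_field` is a dict-like descriptor with `name` and `type` keys, which the text above does not confirm, and the policy it applies (exact matching for `_id` fields, inexact for other strings) is purely hypothetical.

```python
def text_field_rules(schema_field):
    """Return (kind, field-name) tuples for a single schema field.

    Assumption: schema_field behaves like a dict with 'name' and 'type' keys.
    """
    if schema_field.get("type") != "string":
        # Non-text fields get no text search rules in this sketch.
        return []
    name = schema_field["name"]
    if name.endswith("_id"):
        # Hypothetical policy: identifier-like fields are matched exactly.
        return [("exact", name)]
    return [("inexact", name)]


print(text_field_rules({"name": "title", "type": "string"}))
print(text_field_rules({"name": "job_id", "type": "string"}))
print(text_field_rules({"name": "budget", "type": "number"}))
```

A function like this would be passed as `text_field_rules=text_field_rules` in place of the lambda shown in the configuration.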

local development

You can start a local development server by following these steps:

  1. Install Dependencies:

    a. Install Docker locally

    b. Install Python dependencies:

    $ pip install dataflows dataflows-elasticsearch
    $ pip install -e .
    
  2. Go to the sample/ directory

  3. Start ElasticSearch locally:

    $ ./start_elasticsearch.sh
    

    This script will wait and poll the server until it's up and running. You can test it yourself by running:

    $ curl -s http://localhost:9200
         {
         "name" : "99cd2db44924",
         "cluster_name" : "docker-cluster",
         "cluster_uuid" : "nF9fuwRyRYSzyQrcH9RCnA",
         "version" : {
             "number" : "7.4.2",
             "build_flavor" : "default",
             "build_type" : "docker",
             "build_hash" : "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
             "build_date" : "2019-10-28T20:40:44.881551Z",
             "build_snapshot" : false,
             "lucene_version" : "8.2.0",
             "minimum_wire_compatibility_version" : "6.8.0",
             "minimum_index_compatibility_version" : "6.0.0-beta1"
         },
         "tagline" : "You Know, for Search"
         }
    
  4. Load data into the database

    $ DATAFLOWS_ELASTICSEARCH=localhost:9200 python load_fixtures.py
    

    You can test that data was loaded:

    $ curl -s http://localhost:9200/jobs-job/_count?pretty
     {
         "count" : 1757,
         "_shards" : {
             "total" : 1,
             "successful" : 1,
             "skipped" : 0,
             "failed" : 0
         }
     }
    
  5. Start the sample server

    $ python server.py 
     * Serving Flask app "server" (lazy loading)
     * Environment: production
     WARNING: Do not use the development server in a production environment.
     Use a production WSGI server instead.
     * Debug mode: off
     * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
    
  6. Now you can hit the server's endpoints, for example:

         $ curl -s 'localhost:5000/api/search/jobs?q=engineering&size=2' | jq
         127.0.0.1 - - [26/Jun/2019 10:45:31] "GET /api/search/jobs?q=engineering&size=2 HTTP/1.1" 200 -
         {
             "search_counts": {
                 "_current": {
                 "total_overall": 617
                 }
             },
             "search_results": [
                 {
                 "score": 18.812,
                 "source": {
                     "# Of Positions": "5",
                     "Additional Information": "TO BE APPOINTED TO ANY CIVIL <em>ENGINEERING</em> POSITION IN BRIDGES, CANDIDATES MUST POSSESS ONE YEAR OF CIVIL <em>ENGINEERING</em> EXPERIENCE IN BRIDGE DESIGN, BRIDGE CONSTRUCTION, BRIDGE MAINTENANCE OR BRIDGE INSPECTION.",
                     "Agency": "DEPARTMENT OF TRANSPORTATION",
                     "Business Title": "Civil Engineer 2",
                     "Civil Service Title": "CIVIL ENGINEER",
                     "Division/Work Unit": "<em>Engineering</em> Review & Support",
             ...
         }
    
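Step 3 mentions that the start script waits and polls the server until it is up. The same idea can be useful in your own test setup, so here is a minimal Python sketch of such a polling loop. It is not the content of `start_elasticsearch.sh`; the function names `wait_until_up` and `elasticsearch_is_up` are hypothetical, and the URL is the one used in the `curl` check above.

```python
import time
import urllib.error
import urllib.request


def wait_until_up(check, attempts=30, delay=1.0, sleep=time.sleep):
    """Call check() until it returns True or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False


def elasticsearch_is_up(url="http://localhost:9200"):
    """Return True if the server answers the root URL with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as not ready yet.
        return False


# Usage (commented out so that importing this sketch does not block):
# ready = wait_until_up(elasticsearch_is_up)
```

The `check` and `sleep` arguments are injectable so the loop can be exercised without a running server.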

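The search response in step 6 wraps matched terms in `<em>` tags and reports the total under `search_counts`. As a sketch of how a client might consume it, the snippet below reads the total and strips the highlight markup from each hit. The field names are taken from the sample output above; the function name `summarize` and the trimmed sample payload are illustrative only.

```python
import json
import re

EM_TAG = re.compile(r"</?em>")


def summarize(response_text):
    """Return (total, hits) where hits is a list of (score, plain source)."""
    body = json.loads(response_text)
    total = body["search_counts"]["_current"]["total_overall"]
    hits = []
    for result in body["search_results"]:
        # Remove <em>...</em> highlight markers from string values only.
        source = {
            key: EM_TAG.sub("", value) if isinstance(value, str) else value
            for key, value in result["source"].items()
        }
        hits.append((result["score"], source))
    return total, hits


# Trimmed-down payload modeled on the sample response above.
sample = json.dumps({
    "search_counts": {"_current": {"total_overall": 617}},
    "search_results": [{
        "score": 18.812,
        "source": {
            "Agency": "DEPARTMENT OF TRANSPORTATION",
            "Division/Work Unit": "<em>Engineering</em> Review & Support",
        },
    }],
})

total, hits = summarize(sample)
print(total, hits[0][1]["Division/Work Unit"])
```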