Python utilities for creating an Elasticsearch database containing information on available CLI modules
Project description
The files in this repository allow you to create an Elasticsearch database containing information on available CLI modules. The idea is that we have a public Kibana dashboard listing CLI modules from multiple sources, so there is a script with two modes:
- ‘extract’ mode
Extracts JSON descriptions from a set of CLI modules (in one or more common directories).
# ./ctk_cli_indexer.py extract --help
usage: ctk_cli_indexer.py extract [-h] [--json_filename JSON_FILENAME]
                                  base_directory [base_directory ...]

positional arguments:
  base_directory        directories (at least one) in which to search for CLI
                        module executables, or direct paths to executables

optional arguments:
  -h, --help            show this help message and exit
  --json_filename JSON_FILENAME, -o JSON_FILENAME
This is to be run by the administrators of sites that offer CLI modules, and the idea is that the resulting .json files are published on some website.
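The extract step can be pictured with the following sketch: walk the given directories, collect the executable files, and dump a JSON description of them. This is only an illustration, not the script's actual implementation; in particular, the `name`/`path` fields are hypothetical, and the real JSON schema produced by `ctk_cli_indexer.py` is richer.

```python
import json
import os
import stat
import tempfile

def find_executables(base_directory):
    """Collect paths of executable files below base_directory."""
    result = []
    for root, _dirs, files in os.walk(base_directory):
        for name in files:
            path = os.path.join(root, name)
            if os.stat(path).st_mode & stat.S_IXUSR:
                result.append(path)
    return sorted(result)

# Demonstrate on a throwaway directory containing one fake CLI module.
with tempfile.TemporaryDirectory() as base:
    cli = os.path.join(base, "MyFilter")
    with open(cli, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(cli, 0o755)

    # Illustrative record layout only; the real schema differs.
    modules = [{"name": os.path.basename(p), "path": p}
               for p in find_executables(base)]
    print(json.dumps(modules, indent=2))
```

The resulting JSON file is what a site administrator would publish for the `index` step to pick up.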
- ‘index’ mode
Takes a JSON file (or a list of CLI executables) and updates an Elasticsearch database. An identifier for the source of the CLI modules is passed as the first parameter. The script deletes old documents from the database (CLIs that were removed) and maintains timestamps of the last change of each CLI: it does not re-upload documents that did not change, and it marks each change with the modification time of the CLI executable that introduced it. Instead of a JSON file, you may also pass a list of directories or CLI executables directly.
# ./ctk_cli_indexer.py index --help
usage: ctk_cli_indexer.py index [-h] [--host HOST] [--port PORT]
                                source_name path [path ...]

positional arguments:
  source_name  identifier for the source (e.g. 'slicer' or 'nifty-reg') of
               this set of CLI modules (will be used to remove old documents
               from this source from the Elasticsearch index if they are no
               longer present)
  path         one or more directories in which to search for CLI module
               executables, paths to CLI executables, or (exactly one) JSON
               file as created by `extract` subcommand

optional arguments:
  -h, --help   show this help message and exit
  --host HOST  hostname elasticsearch is listening on (default: localhost)
  --port PORT  port elasticsearch is listening on (default: 9200)
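The bookkeeping described above (skip unchanged CLIs, remove deleted ones) can be sketched in plain Python. This is a hedged illustration of the idea only, not the script's actual code: `plan_updates` and its dictionaries of name-to-mtime mappings are hypothetical names, and the real implementation talks to Elasticsearch instead of comparing dicts.

```python
def plan_updates(current, stored):
    """Decide what `index` mode has to do.

    current: {cli_name: mtime} from the scanned executables / JSON file
    stored:  {cli_name: mtime} already present in the Elasticsearch index

    Returns (to_upload, to_delete): CLIs whose documents must be
    (re-)uploaded because they are new or changed, and CLIs whose
    documents must be deleted because the executable is gone.
    """
    to_upload = {name: mtime for name, mtime in current.items()
                 if stored.get(name) != mtime}
    to_delete = set(stored) - set(current)
    return to_upload, to_delete

current = {"ThresholdFilter": 1700000000, "RegisterImages": 1700000500}
stored = {"ThresholdFilter": 1700000000, "OldModule": 1600000000}
uploads, deletions = plan_updates(current, stored)
print(uploads)    # only the new/changed module is re-uploaded
print(deletions)  # the module that disappeared from the source is deleted
```

Note how the unchanged `ThresholdFilter` is not re-uploaded, which is what keeps the per-CLI change timestamps meaningful.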
This script should be run by a cron job (i.e. set up by a CTK administrator), from a script that pulls the above-mentioned .json URLs regularly and updates a central database. A Kibana dashboard will then give interested people an overview of the available modules from multiple sites.
System Prerequisites
The following software packages are required to be installed on your system:
The following python packages will be automatically installed if not present (see requirements.txt, listed here in case you prefer to install them via your system’s package manager):
Installation for user
Use pip (or easy_install) for installation from PyPI:
pip install ctk-cli-indexer
Installation for developer
First download the source:
git clone git://github.com/commontk/ctk-cli-indexer.git
To use the module, you must install some external python package dependencies:
cd ctk-cli-indexer
pip install -r requirements.txt
Elasticsearch Setup
In order to use this code, you must have access to a running Elasticsearch server. This section gives only basic instructions for getting started. First, download the latest stable Elasticsearch and Kibana tarballs (Logstash is not necessary / used here).
Elasticsearch is written in Java, so you can basically just unpack the tarball and run bin/elasticsearch; the server should then be running on http://localhost:9200/ (you can try that URL in the browser, and you should get some status JSON). This default location is also built into the indexer script, so you may immediately start indexing. You can open http://localhost:9200/cli/cli/_search?pretty=1 to check whether there is data in the index.
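The same check can be done from Python with nothing but the standard library. This is a minimal sketch assuming the default host and port; it prints a notice instead of failing when no server is listening (the exact shape of the response, e.g. `hits.total`, varies between Elasticsearch versions):

```python
import json
from urllib.request import urlopen
from urllib.error import URLError

SEARCH_URL = "http://localhost:9200/cli/cli/_search?pretty=1"

def fetch_index_status(url=SEARCH_URL, timeout=2.0):
    """Return the decoded JSON response, or None if the server is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (URLError, OSError, ValueError):
        return None

status = fetch_index_status()
if status is None:
    print("no Elasticsearch server reachable on localhost:9200")
else:
    # In older Elasticsearch versions hits.total is a plain number.
    print("search response for the 'cli' index:", status["hits"]["total"])
```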
Kibana is a purely browser-based web application (based on client-side HTML and JS), so you can serve the files using any kind of HTTP server, e.g.:
cd kibana-3.1.1/
python -m SimpleHTTPServer
which will serve Kibana on http://localhost:8000/. You may even be able to use Kibana without any HTTP server, just by opening kibana-x.y.z/index.html within your browser. In that case, you may want to edit config.js to point to the server like this:
elasticsearch: "http://localhost:9200",
That’s it! If you see the welcome dashboard in the browser, you’re all set. Note that you can even store dashboards within Kibana; by default, they will be stored within Elasticsearch, so you don’t even have to care about filesystem access.
First Steps with Kibana
I suggest starting with a blank dashboard (link at the bottom of the default dashboard). Start by going to the dashboard settings (cog in the upper right corner) and, under "Index", select 'cli' as the default index and enable autocompletion under "Preload Fields".
Next, add rows (“Rows” tab in the dashboard settings), for instance, one with 200px height, one with 300px, and a third with 500px. Don’t forget to press “Create Row” for each row (in particular, also for the last one), then press “Save”.
Within each row, there is an (invisible) 12-column layout, so you want to add “widgets” now that span either 3 or 4 such columns. Start with “Terms” widgets only, try different fields (e.g. “license”), and different view options (in particular, the bar/pie/table styles).
The widgets allow interactive filtering, e.g. click on a specific term to filter by license / author / source / category; active filters will be listed and can be cleared at the top (make sure that line is not collapsed). There is also a search row where you can try entering keywords.
The last row (which we made particularly high) was intended for a “Table” widget (like on the sample dashboard), which can be used to list all matching documents.
Now play around with the various options, and don’t forget to save your dashboard (floppy symbol near the upper right corner). If you enable “Save to > Export” and “Load from > Local file” under “Controls” in the dashboard settings, you can also download/upload the dashboard as JSON. Furthermore, you can make the dashboard your default/home dashboard. Within this repository, you also find an example dashboard in the file cli_dashboard.json.
Project details
Download files
Source Distribution
File details
Details for the file ctk-cli-indexer-0.6.tar.gz
File metadata
- Download URL: ctk-cli-indexer-0.6.tar.gz
- Size: 8.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 529d0e63c6b6a6215a7c407d1c270ccde2eaf9080042b2a5834762603af5c0c4
MD5 | be5d78129224757aae96f0368c51f07d
BLAKE2b-256 | 34faa2635978fabe221ca2b40f15c3763f80f33ff1d744a8ad32182dca71b3c6