opensearch integration with CastleCMS and Plone
Project description
wildcard.hps
CastleCMS and Plone integration with OpenSearch
This product was forked from collective.elasticsearch in order to provide integration with OpenSearch instead of ElasticSearch. OpenSearch itself is a fork of ElasticSearch and compatible with, at least, the ES 7.10.x series of releases (at least at opensearch-py 1.1.0). Compatibility may diverge in the future, and while the collective.elasticsearch package will likely try to maintain compatibility with ElasticSearch, wildcard.hps is intended to maintain compatibility with OpenSearch.
Quickstart
First, start up an instance (for official guides, see the opensearch project documentation)
$ docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:latest
$ curl -XGET https://localhost:9200 -u 'admin:admin' -k
Second, setup Plone/CastleCMS:
- add
wildcard.hps
to theeggs
section of your buildout - run buildout
- restart your instance, using relevant Environment Variables to connect to your opensearch instance
- install the 'Wildcard HPS' product
- under the 'Wildcard HPS' control panel, click 'Convert Catalog' then 'Rebuild Catalog'
Configuration Settings are passed as environment variables. See the "Configuration" section below for more details.
Overview
This package aims to index all fields the portal_catalog
indexes
and allows you to delete the Title
, Description
and SearchableText
indexes which can provide significant improvement to performance and RAM usage.
OpenSearch queries are ONLY used when Title, Description and SearchableText text are in the query. Otherwise, Plone's default catalog will be used. This is because Plone's default catalog is faster on normal queries than using OpenSearch.
Configuration
Configuration for OpenSearch connections, and custom index naming, is done through Environment Variables. This allows per-instance customization without the need to modify site data, and allows for many deployments to use the same cluster(s) without needing to do per-site customized index names.
Available Environrment Variable Options:
HPS_ZOPE_CONF_PATH
- path to a zope.conf to get a Zope app instance
- NOTE: this is only needed for the
reindex_hps
script that gets installed. Seewildcard/hps/scripts/reindex.py
.
HPS_OVERRIDE_LOGGING
- if present, will tell the
reindex_hps
script to override the root logging configuration, and print logging to console at INFO level.
- if present, will tell the
HPS_FORCE_ENABLE
- default: no
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- will force the "enabled" lookup to be True
HPS_INSTANCE_INDEX_PREFIX
- default: None
- a string value prepended to index names used by the Plone instances this addon is installed into
HPS_INCLUDE_TRASHED_BY_DEFAULT
- default: no
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- will default searchResults to include trashed entries (which are not included by default)
HPS_FOCE_EXTERNAL_INDEXES
- default: None
- a list of object properties that will be included in the externally index object (IE the indexed object in opensearch)
OPENSEARCH_HOSTS
- default: https://admin:admin@localhost:9200
- a list of RFC-1738 formated urls. multiple urls can be specified by putting a space between urls.
- NOTE: for now, the opensearch-py (1.1.0) does not respect the HTTP auth info that is formatted
as part of the URL, instead use
OPENSEARCH_HTTP_USERNAME
andOPENSEARCH_HTTP_PASSWORD
to pass the same HTTP auth to each request to any node listed as a host.
OPENSEARCH_HTTP_USERNAME
- default: None
- a username to use in all connections to any node in the
OPENSEARCH_HOSTS
list
OPENSEARCH_HTTP_PASSWORD
- default: None
- a password to use in all connections to any node in the
OPENSEARCH_HOSTS
list
OPENSEARCH_TIMEOUT
- default connection timeout
OPENSEARCH_RETRY_ON_TIMEOUT
- default: Off
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- retry connection to different node when connection fails
OPENSEARCH_SNIFF_ON_START
- default: False
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- refresh nodes before doing anything
OPENSEARCH_SNIFF_ON_CONNECTION_FAIL
- default: False
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- refresh nodes after a node fails to respond
OPENSEARCH_SNIFFER_TIMEOUT
- default: None
- refresh node list on this time (in seconds) interval
OPENSEARCH_SNIFF_TIMEOUT
- default: 0.1
- timeout of sniff request
OPENSEARCH_USE_SSL
- default: False
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- connections to OpenSearch will use SSL
OPENSEARCH_VERIFY_CERTS
- default: True
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- verify SSL certificates when using SSL connections to OpenSearch
OPENSEARCH_SSL_SHOW_WARN
- default: True
- accepted values (all other values are equivalent to False): Yes, True, 1, On
- when verifying SSL certificates is disabled, then a warning will be shown by default
OPENSEARCH_CERTS_PATH
- default: None
- a path to a directory containing CA Certificates used in SSL verification
OPENSEARCH_CLIENT_CERT_PATH
- default: None
- a path to a PEM formated SSL client certificate for SSL client auth
OPENSEARCH_CLIENT_CERT_KEY
--- default: None
- a path to a PEM formated SSL client key for SSL client auth
Compatibility
Only tested with Plone 5 with Dexterity types.
Only compatible with versions of OpenSearch (and ElasticSearch) compatible
with the opensearch-py
library.
For ElasticSearch integration, see collective.elasticsearch.
State
Support for all index column types is done EXCEPT for the DateRecurringIndex index column type. If you are doing a full text search along with a query that contains a DateRecurringIndex column, it will not work.
Celery support
This package comes with Celery support where all indexing operations will be pushed into celery to be run asynchronously.
Please see instructions for collective.celery to see how this works.
Running tests
First, start an instance of OpenSearch.
Second,
$ virtualenv ./env
$ ./env/bin/pip install -r requirements.txt
$ ./env/bin/buildout -c buildout.cfg
$ ./bin/test
Changelog
1.4.3 (2023-10-11)
- handle unicode for index data derived from IAdditionalIndexDataProvider adapters
1.4.2 (2023-05-15)
- abstract unicode handling code for hook when getting index data, and handle tuples, lists, and dict values
1.4.1 (2023-05-11)
- handle unicode error and fix bug in hook when getting index data
1.4.0 (2022-11-04)
- allow a custom prefix to be defined for fetching connection settings from the environment (default to the previous hard-coded 'OPENSEARCH_' value)
1.3.0 (2022-08-17)
- add HPS_FORCE_EXTERNAL_INDEXES
- update default set returned when external indexes setting is not configured yet
1.2.1 (2022-06-23)
- fix some view name's in the control panel templates
1.2.0 (2022-05-25)
- add HPS_INCLUDE_TRASHED_BY_DEFAULT env for disabling a filter on searchResults from WildcardHPSCatalog (see readme entry for HPS_INCLUDE_TRASHED_BY_DEFAULT)
1.1.1 (2022-05-12)
- add property on wildcard.hps.opensearch.WildcardHPSCatalog for the instance prefix
1.1.0 (2022-05-12)
- initial fork from: https://github.com/collective/collective.elasticsearch/commit/d21bf7b9311a9fc923283eeff11c42f4145180b4 this fork aims to primarily maintain compatibility with the OpenSearch project, which itself has forked from ElasticSearch 7.10.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.