udata-search-service

A search service for udata. It keeps search separate from the udata MongoDB database, and the index is updated in real time through messages sent over Kafka.

See the architecture diagram (figure: Udata Search Service architecture schema).

Getting started

You can follow this recommended directory layout for your code:

$WORKSPACE
├── fs
├── udata
│   ├── ...
│   ├── setup.py
│   └── udata.cfg
├── udata-front
│   ├── ...
│   └── setup.py
└── udata-search-service
    ├── ...
    └── pyproject.toml

Clone the repository:

cd $WORKSPACE
git clone git@github.com:opendatateam/udata-search-service.git

Start the different services using docker-compose:

cd udata-search-service
docker-compose up

This will start:

  • an Elasticsearch node
  • a Kafka broker
  • a ZooKeeper instance
  • a Kafka consumer
  • a search app

Initialize the Elasticsearch indices on first setup:

docker-compose run --entrypoint /bin/bash web -c 'udata-search-service init-es'

You can feed Elasticsearch by publishing messages to Kafka. When you modify objects in udata, it sends indexation messages, which the Kafka consumer then consumes.

If you want to reindex your local MongoDB database, you can run:

cd $WORKSPACE/udata/
source ./venv/bin/activate
udata search index

You can query the search service through its API, for example: http://localhost:5000/api/1/datasets/?q=toilettes%20à%20rennes
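
The same query can also be issued from Python. The sketch below assumes the service is reachable at http://localhost:5000, as in the example above, and that the endpoint answers with JSON. It percent-encodes the search text with the standard library before sending the request. The helper names build_search_url and search_datasets are illustrative and are not part of udata-search-service.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed base URL: the address used in the example above.
BASE_URL = "http://localhost:5000/api/1"


def build_search_url(query, resource="datasets", base_url=BASE_URL):
    """Return the search URL for `query`, percent-encoding it as UTF-8."""
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    params = urlencode({"q": query}, quote_via=quote)
    return f"{base_url}/{resource}/?{params}"


def search_datasets(query, timeout=10):
    """Send the query to a running search service and decode the JSON reply."""
    with urlopen(build_search_url(query), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only builds and prints the URL; no network access is needed here.
    print(build_search_url("toilettes à rennes"))
```

Calling search_datasets("toilettes à rennes") requires the services started with docker-compose to be running. Building the URL is kept in its own function so the encoding can be checked without a live service.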

Development

You can create a virtualenv, activate it, and install the requirements with the following commands:

python3 -m venv venv
source venv/bin/activate
make deps
make install

You can start Elasticsearch, the Kafka broker, and ZooKeeper using docker-compose. You can then start the consumer locally with the following command:

udata-search-service consume-kafka

You can start the web search service with the following command:

udata-search-service run

Troubleshooting

  • If the Elasticsearch service exits with code 137, it was killed because it ran out of memory. See the following points.
  • If you are short on RAM, you can limit the heap size by setting ES_JAVA_OPTS=-Xms750m -Xmx750m as an environment variable when starting the Elasticsearch service.
  • If you are on macOS and still encounter memory issues, increase Docker's memory limit to 4GB instead of the default 2GB.
  • If you are on Linux, you may need to raise vm.max_map_count. You can set it with the following command: sysctl -w vm.max_map_count=262144.
  • If you are on Linux, you may encounter permission issues. You can either create the volume yourself beforehand or change its owner to the current user using chown.
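
The code 137 in the first point follows the common shell and Docker convention that a process terminated by a signal reports 128 plus the signal number. Here 137 - 128 = 9, which is SIGKILL, the signal the kernel's out-of-memory killer sends. A small sketch of that arithmetic follows (POSIX only). The helper name describe_exit_code is illustrative and not part of this project.

```python
import signal


def describe_exit_code(code):
    """Explain a process exit code using the 128 + signal-number convention."""
    if code > 128:
        number = code - 128
        try:
            # Look up the symbolic name, e.g. 9 -> SIGKILL.
            name = signal.Signals(number).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {number}"
        return f"killed by {name}"
    return "exited normally" if code == 0 else f"exited with error {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))
```

SIGKILL can have other causes, such as a manual docker kill, so treat the result as a hint that points to the memory settings above and not as proof.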
