
udata-search-service

A search service for udata. The idea is to have a search service separated from the udata MongoDB. Index updates are propagated in real time via Kafka messages.

See the architecture schema in the repository: Udata Search Service architecture schema.

Getting started

You can follow this recommended layout for your workspace:

$WORKSPACE
├── fs
├── udata
│   ├── ...
│   ├── setup.py
│   └── udata.cfg
├── udata-front
│   ├── ...
│   └── setup.py
└── udata-search-service
    ├── ...
    └── pyproject.toml

Clone the repository:

cd $WORKSPACE
git clone git@github.com:opendatateam/udata-search-service.git

Start the different services using docker-compose:

cd udata-search-service
docker-compose up

This will start:

  • an Elasticsearch instance
  • a Kafka broker
  • a ZooKeeper instance
  • a Kafka consumer
  • a search app
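
Conceptually, the Kafka consumer bridges Kafka and Elasticsearch: it reads indexation messages and writes documents into the matching index. The following Python sketch illustrates the idea; the topic names, message schema, and index naming are assumptions for illustration, not the project's actual code.

import json
from kafka import KafkaConsumer          # pip install kafka-python
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")
consumer = KafkaConsumer(
    "dataset", "reuse", "organization",  # assumed topic names
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    payload = message.value
    # Route each message to the index matching its topic.
    es.index(index=f"demo-{message.topic}", id=payload.get("id"), body=payload)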

Once the services are up, initialize the Elasticsearch indices:

# Locally
udata-search-service init-es

# In the docker context
docker-compose run --entrypoint /bin/bash web -c 'udata-search-service init-es'

This will create the following indices:

  • {UDATA_INSTANCE_NAME}-dataset-{yyyy}-{mm}-{dd}-{HH}-{MM}
  • {UDATA_INSTANCE_NAME}-reuse-{yyyy}-{mm}-{dd}-{HH}-{MM}
  • {UDATA_INSTANCE_NAME}-organization-{yyyy}-{mm}-{dd}-{HH}-{MM}
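
For illustration, the naming convention amounts to the pattern below (a minimal sketch; the real service builds these names internally):

from datetime import datetime

def index_name(instance_name: str, collection: str, now: datetime) -> str:
    # Suffix follows the {yyyy}-{mm}-{dd}-{HH}-{MM} pattern above.
    suffix = now.strftime("%Y-%m-%d-%H-%M")
    return f"{instance_name}-{collection}-{suffix}"

print(index_name("demo", "dataset", datetime(2022, 1, 31, 14, 5)))
# demo-dataset-2022-01-31-14-05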

Configure udata to use the search service by updating the following variables in your udata.cfg. Example for a local setup:

    SEARCH_SERVICE_API_URL = 'http://127.0.0.1:5000/api/1/'
    KAFKA_URI = 'localhost:9092'

You can feed Elasticsearch by publishing messages to Kafka. When you modify objects in udata, indexation messages are sent and consumed by the Kafka consumer. If you want to reindex your local MongoDB data from udata, you can run:

cd $WORKSPACE/udata/
source ./venv/bin/activate
udata search index

Make sure the corresponding UDATA_INSTANCE_NAME is specified in your udata settings.
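
For illustration, publishing an indexation message by hand could look like the sketch below; the topic name and payload shape are assumptions, and udata's actual messages may differ.

import json
from kafka import KafkaProducer  # pip install kafka-python

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
# Hypothetical payload; udata's real message schema may differ.
producer.send("dataset", {"id": "some-dataset-id", "title": "Public toilets in Rennes"})
producer.flush()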

After reindexing, you'll need to point the alias to the new index with the following command:

# Locally
udata-search-service set-alias <index-suffix>

# In the docker context
docker-compose run --entrypoint /bin/bash web -c 'udata-search-service set-alias <index-suffix>'
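
Under the hood, an alias switch of this kind is typically an atomic update_aliases call. Here is a hedged sketch using the elasticsearch Python client, assuming an instance name of demo:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
suffix = "2022-01-31-14-05"  # the <index-suffix> passed to set-alias

for collection in ("dataset", "reuse", "organization"):
    # Atomically repoint the alias from any older index to the new one.
    es.indices.update_aliases(body={"actions": [
        {"remove": {"index": f"demo-{collection}-*", "alias": f"demo-{collection}"}},
        {"add": {"index": f"demo-{collection}-{suffix}", "alias": f"demo-{collection}"}},
    ]})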

You can query the search service through its API, e.g.: http://localhost:5000/api/1/datasets/?q=toilettes%20à%20rennes
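
The same query from Python, assuming the response exposes a data list (a sketch, not the documented response schema):

import requests

resp = requests.get(
    "http://localhost:5000/api/1/datasets/",
    params={"q": "toilettes à rennes"},
)
resp.raise_for_status()
for result in resp.json().get("data", []):  # "data" key is an assumption
    print(result.get("title"))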

Development

You can create a virtualenv, activate it, and install the requirements with the following commands:

python3 -m venv venv
source venv/bin/activate
make deps
make install

You can start Elasticsearch, the Kafka broker, and ZooKeeper using docker-compose. You can start the consumer locally with the following command:

udata-search-service consume-kafka

You can start the web search service with the following command:

udata-search-service run

Deployment

The project depends on Kafka and Elasticsearch 7.16.

Elasticsearch requires the ICU Analysis plugin matching your specific Elasticsearch version. On Debian, you can take a look at these instructions for installation.

You will need a Kafka broker and ZooKeeper. You can follow the quick-start instructions to start all services in the correct order.

You will need to start a search service app and a Kafka consumer. You can start these using uWSGI.

Troubleshooting

  • If the Elasticsearch service exits with error 137, it was killed due to an out-of-memory error. See the following points.
  • If you are short on RAM, you can limit the heap by setting ES_JAVA_OPTS=-Xms750m -Xmx750m as an environment variable when starting the Elasticsearch service.
  • If you are on macOS and still encounter memory issues, increase Docker's memory limit to 4GB instead of the default 2GB.
  • If you are on Linux, you may need to double vm.max_map_count. You can set it with the following command: sysctl -w vm.max_map_count=262144.
  • If you are on Linux, you may encounter permission issues. You can either create the volume beforehand or change its owner to the current user using chown.
