udata-analysis-service
This service analyses files from the udata datalake to enrich their metadata, starting with CSVs. It uses csv-detective to detect the type and format of CSV columns by checking both headers and contents.
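The idea of combining header names and sampled cell values can be sketched in plain Python. This is a simplified, hypothetical illustration of the approach, not csv-detective's actual API; the regexes and type names are assumptions for the example.

```python
import csv
import io
import re

# Illustrative patterns for two French open-data types (assumptions, not
# csv-detective's real rule set).
SIREN_RE = re.compile(r"^\d{9}$")
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
INT_RE = re.compile(r"^-?\d+$")

def detect_column_types(csv_text, rows_to_analyse=500):
    """Guess a type for each column from its header name and sampled values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = {name: [] for name in reader.fieldnames}
    for i, row in enumerate(reader):
        if i >= rows_to_analyse:  # only sample the first N rows, as the service does
            break
        for name, value in row.items():
            samples[name].append(value)

    types = {}
    for name, values in samples.items():
        # Header check and content check together: a column called "date_*"
        # whose values all look like ISO dates is typed as a date.
        if "date" in name.lower() and all(DATE_RE.match(v) for v in values):
            types[name] = "date"
        elif all(SIREN_RE.match(v) for v in values):
            types[name] = "siren"
        elif all(INT_RE.match(v) for v in values):
            types[name] = "int"
        else:
            types[name] = "string"
    return types

csv_text = "siren,date_creation,nom\n552100554,1999-01-12,ACME\n842019051,2018-06-01,Widgets\n"
print(detect_column_types(csv_text))
# → {'siren': 'siren', 'date_creation': 'date', 'nom': 'string'}
```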
Installation
Install udata-analysis-service:
pip install udata-analysis-service
Rename the .env.sample file to .env and fill it with the right values:
REDIS_URL=redis://localhost:6381/0
REDIS_HOST=localhost
REDIS_PORT=6381
KAFKA_HOST=localhost
KAFKA_PORT=9092
KAFKA_API_VERSION=2.5.0
MINIO_URL=https://object.local.dev/
MINIO_USER=sample_user
MINIO_PWD=sample_pwd
ROWS_TO_ANALYSE_PER_FILE=500
CSV_DETECTIVE_REPORT_BUCKET=benchmark-de
CSV_DETECTIVE_REPORT_FOLDER=report
TABLESCHEMA_BUCKET=benchmark-de
TABLESCHEMA_FOLDER=schemas
UDATA_INSTANCE_NAME=udata
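A service like this typically reads these variables from the environment at startup. A minimal sketch, assuming the variable names from the .env sample above (the load_config helper and its defaults are illustrative, not part of the package):

```python
import os

def load_config(env=os.environ):
    """Read service settings from environment variables, with fallbacks."""
    return {
        "redis_url": env.get("REDIS_URL", "redis://localhost:6381/0"),
        # Kafka clients usually want a host:port bootstrap address.
        "kafka_bootstrap": f'{env.get("KAFKA_HOST", "localhost")}:{env.get("KAFKA_PORT", "9092")}',
        "minio_url": env.get("MINIO_URL", "https://object.local.dev/"),
        # Numeric settings arrive as strings and must be converted explicitly.
        "rows_to_analyse": int(env.get("ROWS_TO_ANALYSE_PER_FILE", "500")),
    }

config = load_config({"KAFKA_HOST": "kafka.internal", "ROWS_TO_ANALYSE_PER_FILE": "1000"})
print(config["kafka_bootstrap"], config["rows_to_analyse"])
# → kafka.internal:9092 1000
```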
Usage
Start the Kafka consumer:
udata-analysis-service consume
Start the Celery worker:
udata-analysis-service work
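The two commands reflect a producer/worker split: the Kafka consumer turns incoming resource events into analysis tasks, and the Celery worker picks those tasks up and runs the CSV analysis. A stdlib-only sketch of that handoff, with a plain queue standing in for the Celery broker (all names here are hypothetical):

```python
import queue

# Stand-in for the Celery broker: tasks enqueued by the consumer,
# consumed by the worker.
task_queue = queue.Queue()

def consume(messages):
    """Kafka-consumer stand-in: enqueue one analysis task per resource event."""
    for message in messages:
        task_queue.put({"resource_id": message["resource_id"]})

def work():
    """Celery-worker stand-in: process queued tasks until the queue is empty."""
    results = []
    while not task_queue.empty():
        task = task_queue.get()
        results.append(f"analysed {task['resource_id']}")
    return results

consume([{"resource_id": "r1"}, {"resource_id": "r2"}])
print(work())
# → ['analysed r1', 'analysed r2']
```

In the real service the two sides run as separate long-lived processes, which is why each has its own CLI command.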