Python tools to assist with standardized data ingestion workflows for the OS-Climate project.
osc-ingest-tools
Python tools to assist with standardized data ingestion workflows.
Install from PyPI
pip install osc-ingest-tools
Examples
>>> from osc_ingest_trino import *
>>> import pandas as pd
>>> data = [['tom', 10], ['nick', 15], ['juli', 14]]
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years']).convert_dtypes()
>>> df
  First Name  Age In Years
0        tom            10
1       nick            15
2       juli            14
>>> enforce_sql_column_names(df)
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
>>> enforce_sql_column_names(df, inplace=True)
>>> df
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      string
 1   age_in_years  3 non-null      Int64
dtypes: Int64(1), string(1)
memory usage: 179.0 bytes
>>> p = create_table_schema_pairs(df)
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
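The renaming shown above can be approximated in a few lines. This is only a minimal sketch of the idea, not the implementation used by enforce_sql_column_names: the helper name sketch_sql_name and its exact rules (lowercase, runs of non-alphanumeric characters collapsed to one underscore) are assumptions made for illustration.

```python
import re


def sketch_sql_name(name):
    """Approximate an SQL-friendly column name (hypothetical helper)."""
    # Lowercase and trim surrounding whitespace.
    lowered = str(name).strip().lower()
    # Collapse each run of characters outside [a-z0-9] into one underscore.
    cleaned = re.sub(r"[^a-z0-9]+", "_", lowered)
    # Drop any underscores left at the ends.
    return cleaned.strip("_")


columns = ["First Name", "Age In Years"]
print([sketch_sql_name(c) for c in columns])
# → ['first_name', 'age_in_years']
```

The library function may treat punctuation, leading digits, or reserved words differently, so prefer enforce_sql_column_names for real ingestion work.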
Adding custom type mappings to create_table_schema_pairs
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years'])
>>> enforce_sql_column_names(df, inplace=True)
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      object
 1   age_in_years  3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes
>>> p = create_table_schema_pairs(df, typemap={'object':'varchar'})
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
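To illustrate how a typemap override could combine with built-in defaults, here is a small standalone sketch. It does not use osc-ingest-tools: the function sketch_schema_pairs, the ASSUMED_DEFAULTS table, and the choice to raise on an unmapped dtype are all assumptions for illustration, and the library's real defaults and error handling may differ.

```python
# Assumed defaults for illustration only; the library's actual mapping may differ.
ASSUMED_DEFAULTS = {"string": "varchar", "Int64": "bigint", "int64": "bigint"}


def sketch_schema_pairs(dtypes, typemap=None):
    """Build 'column type' pairs from a {column: dtype name} dict (hypothetical helper)."""
    # Entries in the caller's typemap extend or override the assumed defaults.
    merged = {**ASSUMED_DEFAULTS, **(typemap or {})}
    pairs = []
    for column, dtype in dtypes.items():
        if dtype not in merged:
            raise ValueError(f"no SQL type known for dtype {dtype!r} (column {column!r})")
        pairs.append(f"{column} {merged[dtype]}")
    return ",\n".join(pairs)


# Dtype names taken from the df.info() output above.
dtypes = {"first_name": "object", "age_in_years": "int64"}
print(sketch_schema_pairs(dtypes, typemap={"object": "varchar"}))
```

In this sketch, calling sketch_schema_pairs(dtypes) without the typemap raises a ValueError for the object column, which is the situation a custom mapping is meant to resolve.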
Development
Patches may be contributed via pull requests to https://github.com/os-climate/osc-ingest-tools.
All changes must pass the automated test suite, along with various static checks.
Black code style and isort import ordering are enforced.
Enabling automatic formatting via pre-commit is recommended:
pip install black isort pre-commit
pre-commit install
To ensure compliance with static check tools, developers may wish to run:
pip install black isort
# auto-sort imports
isort .
# auto-format code
black .
Code can then be tested using tox.
# run static checks and tests
tox
# run only tests
tox -e py3
# run only static checks
tox -e static
# run tests and produce a code coverage report
tox -e cov
build and upload a new release
- update all occurrences of __version__
python3 setup.py clean
python3 setup.py sdist
twine check dist/*
twine upload dist/*
- push latest to repo
- create new release on github
upload test or release candidate:
- twine upload --repository-url https://test.pypi.org/legacy/ dist/*
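When updating __version__ before a release, it can help to list every place it is assigned. The following is a minimal sketch under the assumption that versions are assigned as quoted string literals in .py files; the helper find_version_strings is hypothetical and not part of this repository.

```python
import re
from pathlib import Path

# Matches assignments such as: __version__ = "1.2.3"
VERSION_RE = re.compile(r"""__version__\s*=\s*["']([^"']+)["']""")


def find_version_strings(root="."):
    """Return (path, line number, version) for each __version__ assignment under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = VERSION_RE.search(line)
            if match:
                hits.append((str(path), lineno, match.group(1)))
    return hits
```

Calling find_version_strings(".") from the repository root returns the locations to review; any occurrences in non-Python files, such as configuration files, would still need to be checked by hand.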
python packaging resources