Python tools to assist with standardized data ingestion workflows for the OS-Climate project
osc-ingest-tools
Python tools to assist with standardized data ingestion workflows
Install from PyPI
pip install osc-ingest-tools
Examples
>>> from osc_ingest_trino import *
>>> import pandas as pd
>>> data = [['tom', 10], ['nick', 15], ['juli', 14]]
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years']).convert_dtypes()
>>> df
  First Name  Age In Years
0        tom            10
1       nick            15
2       juli            14
>>> enforce_sql_column_names(df)
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
>>> enforce_sql_column_names(df, inplace=True)
>>> df
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      string
 1   age_in_years  3 non-null      Int64
dtypes: Int64(1), string(1)
memory usage: 179.0 bytes
>>> p = create_table_schema_pairs(df)
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
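The renaming shown in the session above can be approximated without the library. The following is a minimal sketch, assuming the rule is "lowercase the label and replace every run of non-alphanumeric characters with a single underscore". The helper name `to_sql_name` is hypothetical and not part of osc-ingest-tools, and the rules actually applied by `enforce_sql_column_names` may differ.

```python
import re


def to_sql_name(label: str) -> str:
    """Hypothetical helper (not part of osc-ingest-tools): make a label SQL-friendly."""
    # collapse each run of characters outside [0-9A-Za-z] into one underscore
    snake = re.sub(r"[^0-9A-Za-z]+", "_", str(label))
    # drop underscores left at either end, then lowercase
    return snake.strip("_").lower()


columns = ["First Name", "Age In Years"]
sql_columns = [to_sql_name(c) for c in columns]
print(sql_columns)  # ['first_name', 'age_in_years']
```

With pandas, the same function could be applied to every column label via `df.rename(columns=to_sql_name)`, which returns a renamed copy and leaves `df` unchanged.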
Adding custom type mappings to create_table_schema_pairs
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years'])
>>> enforce_sql_column_names(df, inplace=True)
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      object
 1   age_in_years  3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes
>>> p = create_table_schema_pairs(df, typemap={'object':'varchar'})
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
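To illustrate what a custom type mapping does, here is a minimal sketch of a dtype-to-SQL lookup in which caller-supplied entries extend or override a set of defaults. The function `schema_pairs`, the `DEFAULT_TYPEMAP` contents, and the table name `demo.people` are all assumptions made for this example. This is not the implementation of `create_table_schema_pairs`, whose built-in mapping may differ.

```python
from typing import Dict, Optional

# Assumed defaults for illustration only; the library's built-in mapping may differ.
DEFAULT_TYPEMAP: Dict[str, str] = {
    "string": "varchar",
    "Int64": "bigint",
    "int64": "bigint",
}


def schema_pairs(dtypes: Dict[str, str], typemap: Optional[Dict[str, str]] = None) -> str:
    """Hypothetical stand-in: render 'column sqltype' pairs, one per line."""
    # caller-supplied entries extend or override the defaults
    merged = {**DEFAULT_TYPEMAP, **(typemap or {})}
    lines = []
    for column, dtype in dtypes.items():
        if dtype not in merged:
            raise ValueError(f"no SQL type mapping for dtype {dtype!r} (column {column!r})")
        lines.append(f"{column} {merged[dtype]}")
    return ",\n".join(lines)


# dtype names as they appear in the df.info() output above
dtypes = {"first_name": "object", "age_in_years": "int64"}
pairs = schema_pairs(dtypes, typemap={"object": "varchar"})
print(pairs)

# The pairs string can be dropped into a DDL statement; the table name is hypothetical.
ddl = f"create table if not exists demo.people (\n{pairs}\n)"
print(ddl)
```

With pandas, the `dtypes` dictionary could be built from a DataFrame as `{c: str(t) for c, t in df.dtypes.items()}`. Without the `typemap` argument, the sketch raises `ValueError` for the `object` column, which is the situation a custom mapping is meant to resolve.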
Development
Patches may be contributed via pull requests to https://github.com/os-climate/osc-ingest-tools.
All changes must pass the automated test suite, along with various static checks.
Black code style and isort import ordering are enforced.
Enabling automatic formatting via pre-commit is recommended:
pip install black isort pre-commit
pre-commit install
To ensure compliance with static check tools, developers may wish to run:
pip install black isort
# auto-sort imports
isort .
# auto-format code
black .
Code can then be tested using tox.
# run static checks and tests
tox
# run only tests
tox -e py3
# run only static checks
tox -e static
# run tests and produce a code coverage report
tox -e cov
Releasing
To release a new version of this library, authorized developers should:
- Prepare a signed release commit updating version in setup.py
- Tag the commit using Semantic Versioning prepended with "v"
- Push the tag
E.g.,
git commit -sm "Release v0.3.4"
# use an annotated tag: --follow-tags only pushes annotated tags
git tag -a v0.3.4 -m "Release v0.3.4"
git push --follow-tags
A GitHub workflow will then automatically release the version to PyPI.
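Before pushing, it can help to check that a tag name has the expected shape. The sketch below is an assumption about what "Semantic Versioning prepended with v" means in its simplest form: it accepts only plain `vMAJOR.MINOR.PATCH` tags and deliberately rejects pre-release and build-metadata suffixes, which full Semantic Versioning also allows. The helper `is_release_tag` is hypothetical and not part of this project or its release workflow.

```python
import re

# "v" followed by three dot-separated integers without leading zeros
TAG_PATTERN = re.compile(r"v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def is_release_tag(tag: str) -> bool:
    """Hypothetical check: True for tags such as 'v0.3.4'."""
    return TAG_PATTERN.fullmatch(tag) is not None


for tag in ["v0.3.4", "0.3.4", "v0.3", "v01.2.3"]:
    print(tag, is_release_tag(tag))
```

Only the first tag in the loop passes; the others lack the "v" prefix, lack a patch component, or contain a leading zero.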