Python tools to assist with standardized data ingestion workflows for the OS-Climate project
osc-ingest-tools
Python tools to assist with standardized data ingestion workflows
Install from PyPI
pip install osc-ingest-tools
Examples
>>> from osc_ingest_trino import *
>>> import pandas as pd
>>> data = [['tom', 10], ['nick', 15], ['juli', 14]]
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years']).convert_dtypes()
>>> df
  First Name  Age In Years
0        tom            10
1       nick            15
2       juli            14
>>> enforce_sql_column_names(df)
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
>>> enforce_sql_column_names(df, inplace=True)
>>> df
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      string
 1   age_in_years  3 non-null      Int64
dtypes: Int64(1), string(1)
memory usage: 179.0 bytes
>>> p = create_table_schema_pairs(df)
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
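The renaming step shown above can be approximated in plain Python. This is a minimal sketch, not the library's implementation: the helper name to_sql_name and its exact rules (lowercase everything, collapse each run of non-alphanumeric characters into one underscore) are assumptions chosen to reproduce the column names in this example.

```python
import re


def to_sql_name(name: str) -> str:
    """Hypothetical helper: approximate an SQL-friendly column name."""
    # Replace each run of characters that are not letters, digits or
    # underscores with a single underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip())
    # Trim stray underscores at the edges and lowercase the result.
    return cleaned.strip("_").lower()


columns = ["First Name", "Age In Years"]
print([to_sql_name(c) for c in columns])
# prints ['first_name', 'age_in_years']
```

The real enforce_sql_column_names may handle cases this sketch does not, such as names that start with a digit or that collide after normalization.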
Adding custom type mappings to create_table_schema_pairs
>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years'])
>>> enforce_sql_column_names(df, inplace=True)
>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   first_name    3 non-null      object
 1   age_in_years  3 non-null      int64
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes
>>> p = create_table_schema_pairs(df, typemap={'object':'varchar'})
>>> print(p)
first_name varchar,
age_in_years bigint
>>>
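One way to picture typemap is as a dictionary of overrides merged over a set of default dtype-to-SQL mappings. The sketch below assumes that model and does not import the library. DEFAULT_TYPEMAP, schema_pairs and the table name demo.people are hypothetical; the defaults listed are only the mappings visible in the examples above, and raising on an unmapped dtype is a choice made for this sketch.

```python
# Assumed defaults: only the mappings visible in the examples above.
DEFAULT_TYPEMAP = {"string": "varchar", "Int64": "bigint", "int64": "bigint"}


def schema_pairs(dtypes, typemap=None):
    """Hypothetical stand-in for the idea behind create_table_schema_pairs.

    dtypes maps each column name to its pandas dtype name (a string).
    """
    # Entries in the caller's typemap win over the defaults.
    merged = {**DEFAULT_TYPEMAP, **(typemap or {})}
    lines = []
    for column, dtype in dtypes.items():
        if dtype not in merged:
            raise ValueError(f"no SQL type known for dtype {dtype!r}")
        lines.append(f"{column} {merged[dtype]}")
    return ",\n".join(lines)


dtypes = {"first_name": "object", "age_in_years": "int64"}
pairs = schema_pairs(dtypes, typemap={"object": "varchar"})
print(pairs)

# The pairs string can be dropped into a CREATE TABLE statement.
# "demo.people" is a made-up table name.
sql = f"create table demo.people (\n{pairs}\n)"
print(sql)
```

Without the typemap argument, this sketch rejects the object column, which mirrors why the example above passes typemap={'object':'varchar'} for a DataFrame that has not been through convert_dtypes().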
Build and upload a new release
- update all occurrences of __version__
python3 setup.py clean
python3 setup.py sdist
twine check dist/*
twine upload dist/*
- push latest to repo
- create new release on GitHub
Upload a test or release candidate:
- twine upload --repository-url https://test.pypi.org/legacy/ dist/*
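The first release step, updating every occurrence of __version__, is easy to get wrong by hand. The following is a rough sketch of a consistency check, assuming the version is written as a quoted string assignment in one or more .py files; the function names are hypothetical and this script is not part of the package.

```python
import re
from pathlib import Path

# Matches assignments such as __version__ = "0.2.1" (single or double quotes).
VERSION_RE = re.compile(r"""__version__\s*=\s*['"]([^'"]+)['"]""")


def find_versions(root):
    """Map each .py file under root to the __version__ strings it assigns."""
    found = {}
    for path in sorted(Path(root).rglob("*.py")):
        versions = VERSION_RE.findall(path.read_text(encoding="utf-8"))
        if versions:
            found[str(path)] = versions
    return found


def versions_consistent(found):
    """Return True when every occurrence carries the same version string."""
    distinct = {v for versions in found.values() for v in versions}
    return len(distinct) <= 1
```

Calling versions_consistent(find_versions(".")) from the repository root before running python3 setup.py sdist would flag a file that still carries the previous version.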