Python tools to assist with standardized data ingestion workflows for the OS-Climate project

Project description

osc-ingest-tools

Python tools to assist with standardized data ingestion workflows

Install from PyPI

pip install osc-ingest-tools

Examples

>>> from osc_ingest_trino import *

>>> import pandas as pd

>>> data = [['tom', 10], ['nick', 15], ['juli', 14]]

>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years']).convert_dtypes()

>>> df
  First Name  Age In Years
0        tom            10
1       nick            15
2       juli            14

>>> enforce_sql_column_names(df)
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14

>>> enforce_sql_column_names(df, inplace=True)

>>> df
  first_name  age_in_years
0        tom            10
1       nick            15
2       juli            14
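The renaming shown above can be approximated with a small standalone function. This is only a sketch of the idea, not the library's implementation: `to_sql_column_name` is a hypothetical helper, and the exact rules (lowercasing, collapsing non-alphanumeric runs into underscores, prefixing names that start with a digit) are assumptions chosen to reproduce the output in this example.

```python
import re


def to_sql_column_name(name: str) -> str:
    """Hypothetical helper: turn a free-form column label into a SQL-friendly name."""
    # Replace each run of characters outside [0-9a-zA-Z] with one underscore,
    # drop leading/trailing underscores, and lowercase the result.
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()
    # Assumption: identifiers should not begin with a digit, so prefix an underscore.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


columns = ["First Name", "Age In Years"]
normalized = [to_sql_column_name(c) for c in columns]
print(normalized)  # ['first_name', 'age_in_years']
```

With a pandas DataFrame, the same mapping could be applied through `df.rename(columns=to_sql_column_name)`.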

>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   first_name    3 non-null      string
 1   age_in_years  3 non-null      Int64 
dtypes: Int64(1), string(1)
memory usage: 179.0 bytes

>>> p = create_table_schema_pairs(df)

>>> print(p)
    first_name varchar,
    age_in_years bigint

>>> 
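The string returned by create_table_schema_pairs is shaped to sit between the parentheses of a CREATE TABLE statement. The sketch below shows one way to assemble such a statement; `build_create_table` and the table name `demo_catalog.demo_schema.people` are hypothetical and not part of the package, and the schema fragment is copied from the printed output above so the snippet runs without any dependencies.

```python
# Schema fragment copied from the print(p) output above.
schema_pairs = "    first_name varchar,\n    age_in_years bigint"


def build_create_table(table: str, pairs: str) -> str:
    """Hypothetical helper: wrap column/type pairs in a CREATE TABLE statement."""
    return f"create table if not exists {table} (\n{pairs}\n)"


# The fully qualified table name is a placeholder for illustration only.
sql = build_create_table("demo_catalog.demo_schema.people", schema_pairs)
print(sql)
```

The resulting string can then be passed to whichever SQL client or engine connection the ingestion pipeline uses.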

Adding custom type mappings to create_table_schema_pairs

>>> df = pd.DataFrame(data, columns = ['First Name', 'Age In Years'])

>>> enforce_sql_column_names(df, inplace=True)

>>> df.info(verbose=True)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   first_name    3 non-null      object
 1   age_in_years  3 non-null      int64 
dtypes: int64(1), object(1)
memory usage: 176.0+ bytes

>>> p = create_table_schema_pairs(df, typemap={'object':'varchar'})

>>> print(p)
    first_name varchar,
    age_in_years bigint

>>>
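The typemap argument can be thought of as a set of overrides layered on top of a default dtype-to-SQL mapping. The following is a minimal sketch of that idea only; `DEFAULT_TYPEMAP` and `schema_pairs_from_dtypes` are hypothetical names, and the default entries are assumptions rather than the package's actual table.

```python
from typing import Dict, Optional

# Assumed defaults for illustration; the real package may map more or different dtypes.
DEFAULT_TYPEMAP: Dict[str, str] = {
    "string": "varchar",
    "Int64": "bigint",
    "int64": "bigint",
    "float64": "double",
    "bool": "boolean",
}


def schema_pairs_from_dtypes(
    dtypes: Dict[str, str],
    typemap: Optional[Dict[str, str]] = None,
    indent: int = 4,
) -> str:
    """Hypothetical helper: render 'column sqltype' pairs, one per line."""
    # Caller-supplied entries take precedence over the defaults.
    mapping = {**DEFAULT_TYPEMAP, **(typemap or {})}
    lines = []
    for column, dtype in dtypes.items():
        if dtype not in mapping:
            raise ValueError(f"no SQL type mapping for dtype {dtype!r} (column {column!r})")
        lines.append(f"{' ' * indent}{column} {mapping[dtype]}")
    return ",\n".join(lines)


# Column dtypes taken from the df.info() output above.
dtypes = {"first_name": "object", "age_in_years": "int64"}
print(schema_pairs_from_dtypes(dtypes, typemap={"object": "varchar"}))
```

In this sketch an unmapped dtype raises an error instead of being guessed, which makes a missing typemap entry visible before any table is created.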

Build and upload a new release

  • update all occurrences of __version__
  • python3 setup.py clean
  • python3 setup.py sdist
  • twine check dist/*
  • twine upload dist/*
  • push the latest changes to the repo
  • create a new release on GitHub
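Because the first step asks for every occurrence of __version__ to be updated, a quick consistency check before building can catch a missed file. The sketch below is an assumption about how such a check might look, not a script shipped with the project: `find_versions` and `versions_consistent` are hypothetical helpers, and only assignments of the form `__version__ = "x.y.z"` at the start of a line in `.py` files are detected.

```python
import re
from pathlib import Path
from typing import Dict

# Matches lines such as: __version__ = "0.3.0"
VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def find_versions(root: Path) -> Dict[str, str]:
    """Hypothetical helper: map each .py file under root to its __version__ value."""
    found: Dict[str, str] = {}
    for path in sorted(root.rglob("*.py")):
        match = VERSION_RE.search(path.read_text(encoding="utf-8"))
        if match:
            found[path.relative_to(root).as_posix()] = match.group(1)
    return found


def versions_consistent(found: Dict[str, str]) -> bool:
    """True when no two files disagree about the version string."""
    return len(set(found.values())) <= 1
```

Running `find_versions(Path("."))` from the repository root and checking `versions_consistent` on the result before `python3 setup.py sdist` is one way to use it.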

Upload a test or release candidate:

Python packaging resources

Download files

Source Distribution

osc-ingest-tools-0.3.0.tar.gz (9.4 kB)

Uploaded Source

Built Distribution

osc_ingest_tools-0.3.0-py3-none-any.whl (11.9 kB)

Uploaded Python 3

File details

Details for the file osc-ingest-tools-0.3.0.tar.gz.

File metadata

  • Download URL: osc-ingest-tools-0.3.0.tar.gz
  • Upload date:
  • Size: 9.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.0 requests-toolbelt/0.9.1 urllib3/1.26.7 tqdm/4.62.3 importlib-metadata/4.11.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.10.2

File hashes

Hashes for osc-ingest-tools-0.3.0.tar.gz
Algorithm Hash digest
SHA256 fd36dae09be1eddc75534414962595e22e989b0165c73547a87a6ad0794f698b
MD5 2d0436de48c8bbe136a5e15594c1c4b0
BLAKE2b-256 50da0090f478c14ee65f423bf17c559e617ecae56f123d7e8044a0cfca515c6c

File details

Details for the file osc_ingest_tools-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: osc_ingest_tools-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 11.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.0 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.8.12

File hashes

Hashes for osc_ingest_tools-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 487d82f04f57078d24d0f32ba66cea2bc507d3cedbc1deda7469c470abcc7aa6
MD5 6c2d8e145ae88f9d470c814cf78ad98b
BLAKE2b-256 c64a4daa3a0fe3af5141233fc2474a6727ec5c68b671c2ba04044dc17158bbce
