
A utility library for working with Table Schema in Python


dataflows-aws


Dataflows processors for working with AWS

Features

  • dump_to_s3 processor
  • change_acl_on_s3 processor


Getting Started

Installation

The package uses semantic versioning, which means that major versions may include breaking changes. It's recommended to specify a package version range in your setup/requirements file, e.g. package>=1.0,<2.0.

$ pip install dataflows-aws

Examples

These processors must be used as part of a data flow. For example:

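import os

from dataflows import Flow, load
from dataflows_aws import dump_to_s3

# Name of an existing bucket (placeholder; replace with your own)
bucket = 'my-bucket'

flow = Flow(
    # Load the source data
    load('data/data.csv'),
    # Dump the resulting DataPackage to S3
    dump_to_s3(
        bucket=bucket,
        acl='private',
        path='my/datapackage',
        endpoint_url=os.environ['S3_ENDPOINT_URL'],
    ),
)
flow.process()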

Documentation

dump_to_s3

Saves the DataPackage to AWS S3.

Parameters

  • bucket - Name of the bucket where the DataPackage will be stored (the bucket must already exist).
  • acl - ACL to apply to the uploaded files. Default is 'public-read' (see boto3 docs for more info).
  • path - Path (key/prefix) to the DataPackage. May contain format-string placeholders for properties available in datapackage.json, e.g. my/example/path/{owner}/{name}/{version}
  • content_type - Content type to use when storing files in S3. Defaults to text/plain (the usual S3 default is binary/octet-stream, but we prefer text/plain).
  • endpoint_url - API endpoint, to allow using S3-compatible services (e.g. 'https://ams3.digitaloceanspaces.com')
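
The format-string behaviour of path can be illustrated with plain Python string formatting. This is only a sketch of the idea, assuming the placeholders are filled from top-level datapackage.json properties; the descriptor values below are made up, and the processor's actual resolution logic may differ.

```python
# Hypothetical datapackage.json descriptor; the values are illustrative only.
descriptor = {
    'owner': 'example-owner',
    'name': 'my-datapackage',
    'version': '1.0.0',
}

# Path template as it might be passed to dump_to_s3(path=...).
path_template = 'my/example/path/{owner}/{name}/{version}'

# Standard str.format substitution, used here only to illustrate the idea.
resolved_path = path_template.format(**descriptor)
print(resolved_path)  # my/example/path/example-owner/my-datapackage/1.0.0
```

A placeholder that has no matching property raises KeyError under str.format, so it is worth checking that every property named in the template is present in the descriptor.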

change_acl_on_s3

Changes the ACL of the objects in the given bucket under the given path (prefix).

Parameters

  • bucket - Name of the bucket where the objects are stored
  • acl - ACL to apply. Available options: 'private'|'public-read'|'public-read-write'|'authenticated-read'|'aws-exec-read'|'bucket-owner-read'|'bucket-owner-full-control'
  • path - Path (key/prefix) to the DataPackage.
  • endpoint_url - API endpoint, to allow using S3-compatible services (e.g. 'https://ams3.digitaloceanspaces.com')
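
One way the two processors might be combined is sketched below: the DataPackage is dumped with a private ACL and then switched to public-read. The check_acl and build_publish_flow helpers are hypothetical (they are not part of the package), the dataflows_aws import path is assumed from the package name, and the ACL list is copied from the parameters above.

```python
# ACL values listed in the change_acl_on_s3 parameter documentation.
VALID_ACLS = {
    'private',
    'public-read',
    'public-read-write',
    'authenticated-read',
    'aws-exec-read',
    'bucket-owner-read',
    'bucket-owner-full-control',
}


def check_acl(acl):
    """Return acl unchanged if it is a documented option, else raise ValueError."""
    if acl not in VALID_ACLS:
        raise ValueError('Unsupported ACL: {!r}'.format(acl))
    return acl


def build_publish_flow(bucket, path, endpoint_url):
    """Hypothetical helper: dump privately, then make the objects public."""
    # Imported here so check_acl can be used without these packages installed.
    # The dataflows_aws module name is an assumption based on the package name.
    from dataflows import Flow, load
    from dataflows_aws import change_acl_on_s3, dump_to_s3

    return Flow(
        load('data/data.csv'),
        dump_to_s3(
            bucket=bucket,
            acl=check_acl('private'),
            path=path,
            endpoint_url=endpoint_url,
        ),
        change_acl_on_s3(
            bucket=bucket,
            acl=check_acl('public-read'),
            path=path,
            endpoint_url=endpoint_url,
        ),
    )
```

Calling build_publish_flow(...).process() would run the flow; doing so requires valid credentials and an existing bucket. Validating the ACL up front makes a typo fail before any request is sent.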

Contributing

The project follows the Open Knowledge International coding standards.

The recommended way to get started is to create and activate a project virtual environment. To install the package and its development dependencies into your active environment:

$ make install

To run tests with linting and coverage:

$ make test

For linting, pylama (configured in pylama.ini) is used. At this stage it is already installed in your environment and can be used separately, with more fine-grained control, as described in its documentation - https://pylama.readthedocs.io/en/latest/.

For example, to sort results by error type:

$ pylama --sort <path>

For testing, tox (configured in tox.ini) is used. It is already installed in your environment and can be used separately, with more fine-grained control, as described in its documentation - https://testrun.org/tox/latest/.

For example, to check a subset of tests against the Python 3.7 environment with increased verbosity (all positional arguments and options after -- are passed to py.test):

$ tox -e py37 -- -v tests/<path>

Under the hood, tox uses the pytest (configured in pytest.ini), coverage and mock packages. These packages are available only in tox environments.

Changelog

Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted commit history.

v0.x

  • Initial implementation of the processors

