A Python API that enables data consumers and distributors to easily use and share datasets, and establishes a standard for exchanging data assets.

ParData (homophone of partake) is a Python API that enables data consumers and distributors to easily use and share datasets, and establishes a standard for exchanging data assets. It enables:

  • a data scientist to have a simpler and more unified way to begin working with a wide range of datasets, and

  • a data distributor to have a consistent, safe, and open source way to share datasets with interested communities.

Install the Package & its Dependencies

To install the latest version of ParData, run

$ pip install pardata

Alternatively, if you have downloaded the source, switch to the source directory (the directory that contains this README file, e.g. cd /path/to/pardata-source) and run

$ pip install -U .

Quick Start

Import the package and load a dataset. ParData downloads the WikiText-103 dataset (version 1.0.1) if it is not already downloaded, and then loads it.

import pardata
wikitext103_data = pardata.load_dataset('wikitext103')

View available ParData datasets and their versions.

>>> pardata.list_all_datasets()
{'claim_sentences_search': ('1.0.2',), ..., 'wikitext103': ('1.0.1',)}
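
The mapping returned by pardata.list_all_datasets can be post-processed with plain Python. The following is a minimal sketch, assuming the return value has the shape shown above (each dataset name mapped to a tuple of version strings) and that versions are dot-separated integers. latest_versions is a hypothetical helper written for illustration and is not part of ParData; the sample dictionary is a shortened stand-in for the real output.

```python
from typing import Dict, Mapping, Tuple


def latest_versions(datasets: Mapping[str, Tuple[str, ...]]) -> Dict[str, str]:
    """Map each dataset name to its highest version string.

    Assumes versions are dot-separated integers such as '1.0.2'.
    """
    def as_key(version: str) -> Tuple[int, ...]:
        # Compare numerically so that '1.0.10' sorts after '1.0.2'.
        return tuple(int(part) for part in version.split('.'))

    return {name: max(versions, key=as_key) for name, versions in datasets.items()}


# Shortened sample in the shape shown above. With ParData installed, the
# result of pardata.list_all_datasets() could presumably be passed instead.
sample = {'claim_sentences_search': ('1.0.2',), 'wikitext103': ('1.0.1',)}
print(latest_versions(sample))
```

Comparing versions as integer tuples rather than as strings avoids ordering surprises once a version component reaches two digits.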

To view your globally set configs for ParData, such as your default data directory, use pardata.get_config.

>>> pardata.get_config()
Config(DATADIR=PosixPath('dir/to/download/load/from'), ..., DATASET_SCHEMA_FILE_URL='file/to/load/datasets/from')

By default, pardata.load_dataset downloads to and loads from ~/.pardata/data/<dataset-name>/<dataset-version>/. To change the default data directory, use pardata.init.

pardata.init(DATADIR='new/dir/to/download/load/from')

Load a previously downloaded dataset using pardata.load_dataset. With the new default data dir set, ParData now searches for the Groningen Meaning Bank dataset (version 1.0.2) in new/dir/to/download/load/from/gmb/1.0.2/.

gmb_data = pardata.load_dataset('gmb', version='1.0.2', download=False)  # assuming the GMB dataset was already downloaded
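
To make the directory layout concrete, the sketch below rebuilds the paths described above with pathlib. It only mirrors the documented <data-dir>/<dataset-name>/<dataset-version>/ pattern; expected_dataset_dir is a hypothetical helper for illustration, not a ParData function, and ParData's own path handling may differ in detail.

```python
from pathlib import Path


def expected_dataset_dir(datadir: str, name: str, version: str) -> Path:
    """Build <datadir>/<name>/<version>, following the layout described above."""
    # expanduser() resolves a leading '~' to the current user's home directory.
    return Path(datadir).expanduser() / name / version


# Default location described above for WikiText-103, version 1.0.1.
default_dir = expected_dataset_dir('~/.pardata/data', 'wikitext103', '1.0.1')

# Location searched for GMB 1.0.2 after the pardata.init call shown earlier.
custom_dir = expected_dataset_dir('new/dir/to/download/load/from', 'gmb', '1.0.2')

print(default_dir)
print(custom_dir.as_posix())
```

Checking that such a directory exists before calling pardata.load_dataset with download=False may help explain a failed load.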

To learn more about ParData, check out the documentation and the tutorial.
