A package containing common components for the roocs project

roocs-utils

Features

    1. Data Inventories

1. Data Inventories

The module roocs_utils.inventory provides tools for writing inventories of the known data holdings in a YAML format.

For each project, roocs_utils/etc/roocs.ini has options that set the input and output file paths used by the inventory maker. A list of the datasets to include in the inventory must be provided; the path to this list is also set per project in roocs_utils/etc/roocs.ini.
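As a rough illustration of per-project settings in an INI file, the sketch below parses an inline example with Python's standard configparser module. The section name and the option names (project:c3s-cmip6, datasets_file, inventory_dir) are hypothetical and are not taken from roocs.ini; check the real file for the actual names.

```python
import configparser

# Hypothetical example content: the real section and option names in
# roocs_utils/etc/roocs.ini may differ.
EXAMPLE_INI = """
[project:c3s-cmip6]
datasets_file = /path/to/c3s-cmip6-datasets.txt
inventory_dir = /path/to/inventory
"""


def read_project_settings(ini_text, project):
    """Return the options of one project section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return dict(parser[f"project:{project}"])


settings = read_project_settings(EXAMPLE_INI, "c3s-cmip6")
print(settings["datasets_file"])
```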

Creating batches

Once the list of datasets has been collated, it must be split into batches:

$ python roocs_utils/inventory/cli.py create-batches -p c3s-cmip6

The option -p is required to specify the project.
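To make the idea of batching concrete, here is a minimal sketch of splitting a dataset list into fixed-size batches. It is not the package's own implementation; the function name make_batches, the batch size and the placeholder dataset ids are assumptions for illustration only.

```python
def make_batches(dataset_ids, batch_size):
    """Split a list of dataset ids into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        dataset_ids[start:start + batch_size]
        for start in range(0, len(dataset_ids), batch_size)
    ]


# Placeholder ids; a real list would hold one dataset id per entry.
datasets = [f"dataset-{n}" for n in range(1, 8)]
batches = make_batches(datasets, batch_size=3)

# Number the batches from 1 when printing them.
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

Because the batches do not overlap, each one can be processed without reference to the others.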

Creating inventory records

Once the batches are created, the inventory maker can be run either locally or on lotus. The number of datasets in each batch and the maximum duration of each job on lotus can also be changed in roocs_utils/etc/roocs.ini.

Each batch can be run independently, e.g. running batch 1 locally:

$ python roocs_utils/inventory/cli.py run -p c3s-cmip6 -b 1 -r local

or running all batches on lotus:

$ python roocs_utils/inventory/cli.py run -p c3s-cmip6 -r lotus

This creates a pickle file containing an ordered dictionary of the inventory for each dataset. It also creates a pickle file for any errors.
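The following sketch shows the general pattern of storing an ordered dictionary in a pickle file and reading it back, using only the standard library. The file name record.pickle and the subset of keys are assumptions for illustration; the actual file layout used by the inventory maker is not described here.

```python
import pickle
import tempfile
from collections import OrderedDict
from pathlib import Path

# A cut-down record using field names from the inventory examples.
record = OrderedDict(
    [
        ("ds_id", "c3s-cmip6.ScenarioMIP.CCCma.CanESM5.ssp370.r1i1p1f1.Amon.rsutcs.gn.v20190429"),
        ("var_id", "rsutcs"),
        ("file_count", 1),
    ]
)

with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = Path(tmp_dir) / "record.pickle"  # hypothetical file name

    # Write the record to disk.
    with pickle_path.open("wb") as handle:
        pickle.dump(record, handle)

    # Read it back; the key order is preserved.
    with pickle_path.open("rb") as handle:
        loaded = pickle.load(handle)

print(list(loaded.keys()))
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.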

Viewing records and errors

To view the records:

$ python roocs_utils/inventory/cli.py list -p c3s-cmip6

and to see any errors:

$ python roocs_utils/inventory/cli.py show-errors -p c3s-cmip6

To just get a count of how many datasets have been scanned:

$ python roocs_utils/inventory/cli.py list -p c3s-cmip6 -c

Writing the inventory

The final step is to write the inventory to a YAML file. There are two options for this.

$ python roocs_utils/inventory/cli.py write -p c3s-cmip6 -v files

writes the inventory file c3s-cmip6-inventory-files.yml and includes the file names for each dataset:

- path: ScenarioMIP/CCCma/CanESM5/ssp370/r1i1p1f1/Amon/rsutcs/gn/v20190429
  ds_id: c3s-cmip6.ScenarioMIP.CCCma.CanESM5.ssp370.r1i1p1f1.Amon.rsutcs.gn.v20190429
  var_id: rsutcs
  array_dims: time lat lon
  array_shape: 1032 64 128
  time: 2015-01-16T12:00:00 2100-12-16T12:00:00
  latitude: -87.86 87.86
  longitude: 0.00 357.19
  size: 33845952
  size_gb: 0.03
  file_count: 1
  facets:
    mip_era: c3s-cmip6
    activity_id: ScenarioMIP
    institution_id: CCCma
    source_id: CanESM5
    experiment_id: ssp370
    member_id: r1i1p1f1
    table_id: Amon
    variable_id: rsutcs
    grid_label: gn
    version: v20190429
  files:
  - rsutcs_Amon_CanESM5_ssp370_r1i1p1f1_gn_201501-210012.nc

$ python roocs_utils/inventory/cli.py write -p c3s-cmip6 -v c3s

writes the inventory file c3s-cmip6-inventory.yml and does not include file names:

- path: ScenarioMIP/CCCma/CanESM5/ssp370/r1i1p1f1/Amon/rsutcs/gn/v20190429
  ds_id: c3s-cmip6.ScenarioMIP.CCCma.CanESM5.ssp370.r1i1p1f1.Amon.rsutcs.gn.v20190429
  var_id: rsutcs
  array_dims: time lat lon
  array_shape: 1032 64 128
  time: 2015-01-16T12:00:00 2100-12-16T12:00:00
  latitude: -87.86 87.86
  longitude: 0.00 357.19
  size: 33845952
  size_gb: 0.03
  file_count: 1
  facets:
    mip_era: c3s-cmip6
    activity_id: ScenarioMIP
    institution_id: CCCma
    source_id: CanESM5
    experiment_id: ssp370
    member_id: r1i1p1f1
    table_id: Amon
    variable_id: rsutcs
    grid_label: gn
    version: v20190429

The files version is the default and is used when no version is provided.
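In the example records, the ds_id is the project name followed by the path with "/" replaced by ".", and the facets are the dot-separated parts of the ds_id in order. The sketch below reproduces that relationship for the example record. The helper names are hypothetical, and the facet order is inferred from the example alone, so other projects may use a different layout.

```python
# Facet names in the order they appear in the example record.
FACET_NAMES = [
    "mip_era",
    "activity_id",
    "institution_id",
    "source_id",
    "experiment_id",
    "member_id",
    "table_id",
    "variable_id",
    "grid_label",
    "version",
]


def ds_id_from_path(project, path):
    """Build a dataset id from the project name and the relative path."""
    return project + "." + path.replace("/", ".")


def facets_from_ds_id(ds_id):
    """Map each dot-separated part of a dataset id to its facet name."""
    parts = ds_id.split(".")
    if len(parts) != len(FACET_NAMES):
        raise ValueError(f"expected {len(FACET_NAMES)} parts, got {len(parts)}")
    return dict(zip(FACET_NAMES, parts))


path = "ScenarioMIP/CCCma/CanESM5/ssp370/r1i1p1f1/Amon/rsutcs/gn/v20190429"
ds_id = ds_id_from_path("c3s-cmip6", path)
facets = facets_from_ds_id(ds_id)
print(ds_id)
print(facets["variable_id"], facets["version"])
```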

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
