An intake plugin for parsing an ESM (Earth System Model) Collection/catalog and loading assets (netCDF files and/or Zarr stores) into xarray datasets.

Intake-esm

Motivation

Project efforts such as the Coupled Model Intercomparison Project (CMIP) and the Community Earth System Model (CESM) Large Ensemble Project produce a huge amount of climate data, persisted on tape, disk storage, and object storage across many (on the order of 300,000) data assets. These data assets are stored in netCDF and, more recently, Zarr formats. Finding, investigating, and loading these assets into data array containers such as xarray can be a daunting task because of the large number of files a user may be interested in. Intake-esm aims to address these issues by providing the functionality needed for searching, discovery, and data access/loading.

Overview

intake-esm is a data cataloging utility built on top of intake, pandas, and xarray, and it’s pretty awesome!

  • Opening an ESM collection definition file: An ESM (Earth System Model) collection file is a JSON file that conforms to the ESM Collection Specification. When provided a link/path to an ESM collection file, intake-esm establishes a link to a database (CSV file) that contains data asset locations and associated metadata (e.g., which experiment and model they come from). The collection JSON file can be stored on a local filesystem or hosted on a remote server.

    >>> import intake
    >>> col_url = "https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json"
    >>> col = intake.open_esm_datastore(col_url)
  • Search and Discovery: intake-esm provides functionality to execute queries against the database:

    >>> cat = col.search(experiment_id=['historical', 'ssp585'], table_id='Oyr',
    ...          variable_id='o2', grid_label='gn')
  • Access: when the user is satisfied with the results of their query, they can ask intake-esm to load data assets (netCDF/HDF files and/or Zarr stores) into xarray datasets:

    >>> dset_dict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True, 'decode_times': False},
    ...                        cdf_kwargs={'chunks': {}, 'decode_times': False})
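
To make the relationship between the collection JSON file, the CSV database, and a search query more concrete, here is a minimal conceptual sketch that uses only the Python standard library. It is not how intake-esm is implemented. The collection fields are assumed to follow the ESM Collection Specification, and the collection id, CSV rows, bucket paths, and the `search` helper are all hypothetical.

```python
import csv
import io
import json

# Hypothetical, heavily trimmed collection definition. Field names are assumed
# to follow the ESM Collection Specification; all values are invented.
collection_json = """
{
  "esmcat_version": "0.1.0",
  "id": "demo-cmip6",
  "description": "Toy collection for illustration only",
  "catalog_file": "demo-cmip6.csv",
  "attributes": [
    {"column_name": "experiment_id"},
    {"column_name": "table_id"},
    {"column_name": "variable_id"},
    {"column_name": "grid_label"}
  ],
  "assets": {"column_name": "path", "format": "zarr"}
}
"""

# Stand-in for the CSV file that "catalog_file" points to (invented rows).
catalog_csv = """experiment_id,table_id,variable_id,grid_label,path
historical,Oyr,o2,gn,gs://example-bucket/historical/o2
ssp585,Oyr,o2,gn,gs://example-bucket/ssp585/o2
ssp585,Omon,thetao,gn,gs://example-bucket/ssp585/thetao
piControl,Oyr,o2,gr,gs://example-bucket/piControl/o2
"""


def search(rows, **query):
    """Return the rows whose columns match every query term.

    A term may be a single value or a list of accepted values.
    """
    matches = []
    for row in rows:
        keep = True
        for column, wanted in query.items():
            accepted = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row[column] not in accepted:
                keep = False
                break
        if keep:
            matches.append(row)
    return matches


collection = json.loads(collection_json)
rows = list(csv.DictReader(io.StringIO(catalog_csv)))

# The collection tells us which CSV column holds the asset locations.
asset_column = collection["assets"]["column_name"]

hits = search(rows, experiment_id=["historical", "ssp585"], table_id="Oyr",
              variable_id="o2", grid_label="gn")
asset_paths = [row[asset_column] for row in hits]
print(asset_paths)
```

With these toy rows, the query selects the two `o2` assets on the `gn` grid from the `historical` and `ssp585` experiments. In the real package, the matched assets are then opened as xarray datasets by `to_dataset_dict`.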

See documentation for more information.

Installation

Intake-esm can be installed from PyPI with pip:

pip install intake-esm

It is also available from conda-forge for conda installations:

conda install -c conda-forge intake-esm
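
After either command, one way to confirm that the package is visible to the current Python environment is to ask the standard library for its installed version. This is a small sketch, assuming Python 3.8 or later and the distribution name `intake-esm`. The `installed_version` helper is hypothetical and is not part of intake-esm.

```python
from importlib import metadata


def installed_version(dist_name="intake-esm"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("intake-esm is not installed in this environment")
else:
    print(f"intake-esm {version} is installed")
```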

Download files

Source Distribution

intake-esm-2020.3.16.1.tar.gz (243.5 kB, source)

Built Distribution

intake_esm-2020.3.16.1-py3-none-any.whl (18.3 kB, Python 3)
