An intake plugin for building and loading earth system data sets such as CMIP and the CESM Large Ensemble
Intake-esm
Intake-esm provides an intake plugin for creating file-based Intake catalogs for climate data from project efforts such as the Coupled Model Intercomparison Project (CMIP) and the Community Earth System Model (CESM) Large Ensemble Project. These projects produce a huge amount of climate data, persisted on tape and disk storage across many (on the order of 300,000) netCDF files. Finding, investigating, and loading these files into data array containers such as xarray can be a daunting task because of the large number of files a user may be interested in. Intake-esm addresses this issue in three steps:
Dataset collection curation: collections are described in YAML files. These YAML files provide information about data locations, access patterns, directory structure, etc. Intake-esm uses them in conjunction with file name templates to construct a local database. Each row in this database holds the metadata (such as experiment, modeling realm, and frequency) for the data contained in one netCDF file.
>>> import intake
>>> col = intake.open_esm_metadatastore(collection_name="GLADE-CMIP5")
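To make the idea of turning file names into database rows more concrete, here is a minimal sketch. It is not intake-esm's own implementation: the template is a hypothetical regular expression modeled on the CMIP5 file naming convention, and the paths, the `build_rows` helper, and the `file_fullpath` key are made up for illustration.

```python
import re

# Hypothetical file name template, modeled on the CMIP5 naming convention:
# <variable>_<mip_table>_<model>_<experiment>_<ensemble_member>_<time_range>.nc
FILENAME_TEMPLATE = re.compile(
    r"^(?P<variable>[^_]+)_(?P<mip_table>[^_]+)_(?P<model>[^_]+)_"
    r"(?P<experiment>[^_]+)_(?P<ensemble_member>[^_]+)_"
    r"(?P<time_range>\d+-\d+)\.nc$"
)


def build_rows(file_paths):
    """Return one metadata dict per file whose name matches the template."""
    rows = []
    for path in file_paths:
        file_name = path.rsplit("/", 1)[-1]
        match = FILENAME_TEMPLATE.match(file_name)
        if match is None:
            continue  # skip files that do not follow the template
        row = match.groupdict()
        row["file_fullpath"] = path  # illustrative key name, an assumption
        rows.append(row)
    return rows


# Made-up paths, used only to exercise the parser.
example_paths = [
    "/data/cmip5/hfls_Amon_CanESM2_historical_r1i1p1_185001-200512.nc",
    "/data/cmip5/tas_Amon_CNRM-CM5_rcp85_r1i1p1_200601-210012.nc",
    "/data/cmip5/README.txt",
]
rows = build_rows(example_paths)
for row in rows:
    print(row["variable"], row["model"], row["experiment"], row["time_range"])
```

Each resulting dictionary plays the role of one database row: a handful of searchable attributes plus the location of the single netCDF file they describe.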
Search and discovery: once the database is built, intake-esm can be used to search for and discover climate datasets, so the user no longer needs to know the specific locations (file paths) of the data sets of interest:
>>> cat = col.search(variable=['hfls'], frequency='mon',
...                  modeling_realm='atmos',
...                  institute=['CCCma', 'CNRM-CERFACS'])
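The query above can be read as a filter over the metadata rows. The following sketch shows one plausible interpretation in plain Python, assuming that every keyword must match and that a list value means "any of these". The `search` function, the rows, and the file paths are illustrative stand-ins, not intake-esm internals.

```python
def search(rows, **query):
    """Keep rows matching every keyword; a list value matches any of its items."""

    def matches(row):
        for key, wanted in query.items():
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row.get(key) not in allowed:
                return False
        return True

    return [row for row in rows if matches(row)]


# Made-up metadata rows standing in for the local database.
rows = [
    {"variable": "hfls", "frequency": "mon", "modeling_realm": "atmos",
     "institute": "CCCma", "file_fullpath": "/data/a.nc"},
    {"variable": "hfls", "frequency": "mon", "modeling_realm": "atmos",
     "institute": "CNRM-CERFACS", "file_fullpath": "/data/b.nc"},
    {"variable": "hfls", "frequency": "day", "modeling_realm": "atmos",
     "institute": "CCCma", "file_fullpath": "/data/c.nc"},
    {"variable": "tas", "frequency": "mon", "modeling_realm": "atmos",
     "institute": "CCCma", "file_fullpath": "/data/d.nc"},
    {"variable": "hfls", "frequency": "mon", "modeling_realm": "atmos",
     "institute": "MOHC", "file_fullpath": "/data/e.nc"},
]

hits = search(rows, variable=["hfls"], frequency="mon",
              modeling_realm="atmos", institute=["CCCma", "CNRM-CERFACS"])
print([row["file_fullpath"] for row in hits])
```

The user only names attributes of the data; the file paths come back as part of the matching rows.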
Access: when the user is satisfied with the results of their query, they can ask intake-esm to load the actual netCDF files into xarray datasets:
>>> dsets = cat.to_xarray(decode_times=True, chunks={'time': 50})
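In xarray, a `chunks` mapping such as `{'time': 50}` requests dask-backed arrays split into blocks of at most 50 steps along the `time` dimension, and `decode_times=True` asks for encoded time values to be decoded into datetime objects. Assuming `to_xarray` passes these options through to xarray, the sketch below only illustrates how a time axis would be divided. The `chunk_sizes` helper and the 120-step example are hypothetical.

```python
def chunk_sizes(dim_length, chunk_size):
    """Split a dimension of dim_length into blocks of at most chunk_size."""
    full_blocks, remainder = divmod(dim_length, chunk_size)
    sizes = (chunk_size,) * full_blocks
    if remainder:
        sizes += (remainder,)  # the last block holds whatever is left over
    return sizes


# Hypothetical example: 120 monthly time steps (10 years) with chunks={'time': 50}.
sizes = chunk_sizes(120, 50)
print(sizes)  # (50, 50, 20)
```

Smaller chunks mean less memory per block but more blocks to schedule, so a suitable value depends on the size of the other dimensions.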
Intake-esm supports data holdings from the following projects:
CMIP: Coupled Model Intercomparison Project (phase 5 and phase 6)
CESM: Community Earth System Model Large Ensemble (LENS), and Decadal Prediction Large Ensemble (DPLE)
MPI-GE: The Max Planck Institute for Meteorology (MPI-M) Grand Ensemble (MPI-GE)
GMET: The Gridded Meteorological Ensemble Tool data
ERA5: ECMWF ERA5 Reanalysis dataset stored on NCAR’s GLADE in /glade/collections/rda/data/ds630.0
NA-CORDEX: The North American CORDEX program dataset residing on NCAR’s GLADE in /glade/collections/cdg/data/cordex/data/
CESM-LENS-AWS: Community Earth System Model Large Ensemble (CESM LENS) data holdings publicly available on Amazon S3 (us-west-2 region)
See documentation for more information.
Installation
Intake-esm can be installed from PyPI with pip:
pip install intake-esm
It is also available from conda-forge for conda installations:
conda install -c conda-forge intake-esm
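After either installation route, a quick way to confirm that the package is visible to the current Python environment is to ask the standard library for its installed version. This is a small sketch using `importlib.metadata` (Python 3.8 or later); the `installed_version` helper is an illustrative name, not part of intake-esm.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("intake-esm")
if version is None:
    print("intake-esm is not installed in this environment")
else:
    print("intake-esm", version)
```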
Hashes for intake_esm-2019.8.23-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 2faa4baa1288d07ec4e105538133c3b5dc5204e943e8327ca46e25b8d7829d80
MD5 | 97ddb7560eb064b5583f367920e6c9ba
BLAKE2b-256 | 08264b96b951d9e5122cafef031b16498e8e5cb31989eaf32963770dcb3dd375
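A downloaded wheel can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below assumes the file has been saved locally; the helper names and the local path in the usage note are illustrative.

```python
import hashlib

# SHA256 digest published above for intake_esm-2019.8.23-py3-none-any.whl.
PUBLISHED_SHA256 = "2faa4baa1288d07ec4e105538133c3b5dc5204e943e8327ca46e25b8d7829d80"


def sha256_of_file(path, block_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

For example, `matches_published_digest("intake_esm-2019.8.23-py3-none-any.whl", PUBLISHED_SHA256)` should return `True` for an intact download in the current directory, and `False` if the file was corrupted or altered.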