pangeo-forge-esgf

Use queries to the ESGF API to generate URLs and keyword arguments for recipe generation in pangeo-forge.

Install

You can install pangeo-forge-esgf via pip:

pip install pangeo-forge-esgf

If you want all the dependencies required for testing and development, run:

pip install "pangeo-forge-esgf[dev]"

Parsing a list of instance ids using wildcards

Pangeo Forge recipes require the user to provide the exact instance_ids of the datasets to be processed. Discovering these with the web search can become cumbersome, especially when dealing with a large number of members, models, etc.
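For orientation, an instance_id is a dot-separated string of facets. The sketch below splits one into named parts. The facet names follow the CMIP6 naming convention as I understand it, and `split_instance_id` is a hypothetical helper written for illustration, not part of pangeo-forge-esgf.

```python
# Facet names assumed from the CMIP6 data reference syntax (not taken from this package).
FACETS = (
    "mip_era",
    "activity_id",
    "institution_id",
    "source_id",
    "experiment_id",
    "member_id",
    "table_id",
    "variable_id",
    "grid_label",
    "version",
)


def split_instance_id(iid):
    """Map each facet name to its value in a fully specified instance_id."""
    values = iid.split(".")
    if len(values) != len(FACETS):
        raise ValueError(f"expected {len(FACETS)} facets, got {len(values)}: {iid!r}")
    return dict(zip(FACETS, values))


facets = split_instance_id(
    "CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817"
)
print(facets["variable_id"])  # sifb
```

Reading an instance_id this way makes it easier to see which position a wildcard replaces in the patterns used below.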

pangeo-forge-esgf provides some functions to query the ESGF API based on instance_id values with wildcards.

For example, if you want to find all the zonal (uo) and meridional (vo) velocities available for the lgm experiment of PMIP, you can do:

from pangeo_forge_esgf.parsing import parse_instance_ids
parse_iids = [
    "CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*",
]
# Comma-separated values in square brackets are expanded, so the above is equivalent to:
# parse_iids = [
#     "CMIP6.PMIP.*.*.lgm.*.*.uo.*.*",
#     "CMIP6.PMIP.*.*.lgm.*.*.vo.*.*",
# ]
iids = []
for piid in parse_iids:
    iids.extend(parse_instance_ids(piid))
iids

and you will get:

['CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gn.v20191002',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.uo.gn.v20200212',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200212',
 'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.uo.gr1.v20200911',
 'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.uo.gn.v20200909',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Omon.vo.gn.v20200212',
 'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gn.v20191002',
 'CMIP6.PMIP.AWI.AWI-ESM-1-1-LR.lgm.r1i1p1f1.Odec.vo.gn.v20200212',
 'CMIP6.PMIP.MIROC.MIROC-ES2L.lgm.r1i1p1f2.Omon.vo.gr1.v20200911',
 'CMIP6.PMIP.MPI-M.MPI-ESM1-2-LR.lgm.r1i1p1f1.Omon.vo.gn.v20190710']

Eventually, I hope to leverage this functionality to handle user requests in PRs that add wildcard instance_ids. For now, it can help you manually construct lists of instance_ids to submit to a pangeo-forge feedstock.
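To illustrate the square-bracket expansion described above, here is a minimal sketch of how a pattern with comma-separated values could be turned into several plain wildcard patterns. `expand_brackets` is a hypothetical helper; it is not the implementation used by `parse_instance_ids`, and it does not query the ESGF API.

```python
import itertools
import re


def expand_brackets(pattern):
    """Expand each "[a, b]" group in a pattern into one pattern per value."""
    # Split into bracketed groups and the literal text between them;
    # the capturing group keeps the bracketed pieces in the result.
    parts = re.split(r"(\[[^\]]*\])", pattern)
    choices = []
    for part in parts:
        if part.startswith("[") and part.endswith("]"):
            # "[uo, vo]" -> ["uo", "vo"]
            choices.append([value.strip() for value in part[1:-1].split(",")])
        else:
            choices.append([part])
    # Combine every choice of every group, keeping the original order.
    return ["".join(combo) for combo in itertools.product(*choices)]


for expanded in expand_brackets("CMIP6.PMIP.*.*.lgm.*.*.[uo, vo].*.*"):
    print(expanded)
```

Each expanded pattern still contains `*` wildcards, so it would still need to be resolved against the ESGF API to obtain exact instance_ids.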

Generating PGF recipe input (urls) from instance_ids

from pangeo_forge_esgf import get_urls_from_esgf
iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
# Top-level `await` works in Jupyter/IPython; in a plain script, wrap the call in asyncio.run()
url_dict = await get_urls_from_esgf(iids)
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']

gives

100%|██████████| 5/5 [00:01<00:00,  4.98it/s]
Processing responses
Processing responses: Expected files per iid
Processing responses: Check for missing iids
Processing responses: Flatten results
Processing responses: Group results
Find responsive urls
100%|██████████| 1/1 [00:00<00:00,  3.25it/s]
['https://esgf-data1.llnl.gov/thredds/fileServer/css03_data/CMIP6/CMIP/CSIRO-ARCCSS/ACCESS-CM2/historical/r1i1p1f1/SImon/sifb/gn/v20200817/sifb_SImon_ACCESS-CM2_historical_r1i1p1f1_gn_185001-201412.nc']

or, if you want to see detailed debugging statements:

from pangeo_forge_esgf import get_urls_from_esgf, setup_logging
setup_logging('DEBUG')
iids = ['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
url_dict = await get_urls_from_esgf(iids)
url_dict['CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817']
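The snippets above call `await` at the top level, which works in a notebook but not in a plain Python script. A minimal sketch of the script form follows. So that it runs without network access, it uses `fetch_urls`, a hypothetical stand-in that I assume returns the same shape as `get_urls_from_esgf` in the example above (a dict mapping each instance_id to a list of urls); the example.org urls are placeholders.

```python
import asyncio


async def fetch_urls(iids):
    """Hypothetical stand-in for get_urls_from_esgf (assumed return shape)."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return {iid: [f"https://example.org/{iid}.nc"] for iid in iids}


def main():
    iids = [
        "CMIP6.CMIP.CSIRO-ARCCSS.ACCESS-CM2.historical.r1i1p1f1.SImon.sifb.gn.v20200817"
    ]
    # asyncio.run starts an event loop, runs the coroutine to completion,
    # and closes the loop again.
    url_dict = asyncio.run(fetch_urls(iids))
    for iid, urls in url_dict.items():
        print(iid, len(urls))
    return url_dict


if __name__ == "__main__":
    main()
```

In real use, you would replace `fetch_urls(iids)` with `get_urls_from_esgf(iids)` imported from pangeo_forge_esgf.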
