Integrated registry of biological databases and nomenclatures

Bioregistry

A community-driven integrative registry of biological databases, ontologies, and other resources.
More information here.

⬇️ Download

The bioregistry database can be downloaded directly from here.

The manually curated portions of these data are available under the CC0 1.0 Universal License.

🙏 Contributing

There haven't been any external contributors yet, but if you want to get involved, you can make edits directly to the bioregistry.json file through the GitHub interface.

Things that would be helpful:

  1. For all entries, add a ["wikidata"]["database"] entry. Many ontologies and databases don't have a property in Wikidata because the process of adding a new property is incredibly cautious. However, anyone can add a database as a normal Wikidata item with a Q prefix. One example is UniPathway, whose Wikidata database item is Q85719315. If there's no database item on Wikidata, you can even make one! Note: don't mix this up with a paper describing the resource, such as Q35631060. If you see there's a paper, you can add it under the ["wikidata"]["paper"] key.
  2. For any entry that doesn't have an external reference, add a ["homepage"] entry.
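
To find candidates for the two tasks above, a short script can scan a local copy of bioregistry.json. This is only a minimal sketch: it assumes the file is a JSON object mapping each prefix to an entry dictionary with the keys named above, and the helper names `find_gaps` and `load_registry` are hypothetical, not part of the package.

```python
import json
from pathlib import Path


def load_registry(path):
    """Load a local copy of bioregistry.json (the path is up to the caller)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def find_gaps(registry):
    """Return (missing_wikidata, missing_homepage) as sorted lists of prefixes.

    Assumes `registry` maps prefix -> entry dict. This layout is an
    assumption for illustration, not a documented guarantee.
    """
    missing_wikidata = []
    missing_homepage = []
    for prefix, entry in sorted(registry.items()):
        # The "wikidata" block may be absent or empty, so fall back to {}.
        wikidata = entry.get("wikidata") or {}
        if "database" not in wikidata:
            missing_wikidata.append(prefix)
        if "homepage" not in entry:
            missing_homepage.append(prefix)
    return missing_wikidata, missing_homepage
```

Calling `find_gaps(load_registry("bioregistry.json"))` from a checkout would then give two worklists, one per task.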

🚀 Installation

The Bioregistry can be installed from PyPI with:

$ pip install bioregistry

It can be installed in development mode for local curation with:

$ git clone https://github.com/cthoyt/bioregistry.git
$ cd bioregistry
$ pip install -e .
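
Either route can be checked from Python afterwards. The sketch below uses only the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

After a successful install, `installed_version("bioregistry")` should return a version string rather than `None`.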

💪 Usage

The normalize_prefix() function normalizes prefixes across MIRIAM and all the (very plentiful) variants that pop up in ontologies in OBO Foundry and the OLS.

import bioregistry

# This works for synonym prefixes, like:
assert 'ncbitaxon' == bioregistry.normalize_prefix('taxonomy')

# This works for common mistaken prefixes, like:
assert 'chembl.compound' == bioregistry.normalize_prefix('chembl')

# This works for prefixes that are often written many ways, like:
assert 'eccode' == bioregistry.normalize_prefix('ec-code')
assert 'eccode' == bioregistry.normalize_prefix('EC_CODE')

# If a prefix is not registered, it gives back `None`
assert bioregistry.normalize_prefix('not a real key') is None
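
One way to picture this kind of normalization is a lookup over "squashed" spellings: lowercase the input, drop punctuation, and map the result to a canonical prefix. The sketch below is a toy illustration under that assumption, not the library's actual implementation; the table holds only the examples from above, and `toy_normalize_prefix` is a hypothetical name.

```python
import re

# Toy table: canonical prefix -> known synonyms (illustration only).
CANONICAL_TO_SYNONYMS = {
    "ncbitaxon": ["taxonomy"],
    "chembl.compound": ["chembl"],
    "eccode": ["ec-code"],
}


def _squash(text):
    """Lowercase and strip everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def build_index(table):
    """Map every squashed spelling (canonical or synonym) to its canonical prefix."""
    index = {}
    for canonical, synonyms in table.items():
        for name in [canonical, *synonyms]:
            index[_squash(name)] = canonical
    return index


INDEX = build_index(CANONICAL_TO_SYNONYMS)


def toy_normalize_prefix(prefix):
    """Return the canonical prefix, or None if the spelling is unknown."""
    return INDEX.get(_squash(prefix))
```

Squashing before the lookup is what lets 'ec-code' and 'EC_CODE' land on the same key without listing every spelling by hand.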

Entries in the Bioregistry can be looked up with the get() function.

import bioregistry

entry = bioregistry.get('ncbitaxon')
# there are lots of mysteries to discover in this dictionary!

The pattern for an entry in the Bioregistry can be looked up quickly with get_pattern() if it exists. It prefers the custom curated, then MIRIAM, then Wikidata pattern.

import bioregistry

assert '^GO:\\d{7}$' == bioregistry.get_pattern('go')
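
A pattern like this is mostly useful for validating local identifiers. The sketch below hard-codes the GO pattern shown above so that it runs without the package, and it also illustrates the "first available source wins" preference order described earlier. Both helper names are hypothetical.

```python
import re

# The pattern shown above for 'go', copied here so the example is standalone.
GO_PATTERN = "^GO:\\d{7}$"


def matches_pattern(identifier, pattern):
    """Return True if `identifier` conforms to the regular expression `pattern`."""
    return re.match(pattern, identifier) is not None


def first_pattern(curated=None, miriam=None, wikidata=None):
    """Pick the first pattern that is set: curated, then MIRIAM, then Wikidata."""
    for candidate in (curated, miriam, wikidata):
        if candidate is not None:
            return candidate
    return None
```

In real use, the `pattern` argument would come from `bioregistry.get_pattern()`, which may have no pattern to return for some entries, so callers should handle that case before matching.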

Entries in the Bioregistry can be checked for deprecation with the is_deprecated() function. MIRIAM and OBO Foundry don't often agree; when they conflict, OBO Foundry takes precedence since it seems to be updated more often.

import bioregistry

assert bioregistry.is_deprecated('nmr')
assert not bioregistry.is_deprecated('efo')
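
A typical use is filtering a list of prefixes before further processing. The sketch below takes the deprecation check as an argument so that it runs without the package installed; `drop_deprecated` is a hypothetical helper, not part of the Bioregistry API.

```python
def drop_deprecated(prefixes, is_deprecated):
    """Keep only the prefixes for which `is_deprecated(prefix)` is falsy.

    `is_deprecated` is any callable taking a prefix, for example
    bioregistry.is_deprecated when the package is installed.
    """
    return [prefix for prefix in prefixes if not is_deprecated(prefix)]
```

With the package installed, `drop_deprecated(['nmr', 'efo'], bioregistry.is_deprecated)` should, going by the assertions above, keep 'efo' and drop 'nmr'.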

The full Bioregistry can be read in a Python project using:

import bioregistry

registry = bioregistry.read_bioregistry()
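
Once loaded, the registry can be summarized with ordinary Python. The sketch below assumes the registry is a mapping from prefix to entry dictionary (an assumption, not something stated above) and counts how many entries carry each top-level key; `key_coverage` is a hypothetical helper.

```python
from collections import Counter


def key_coverage(registry):
    """Count how many entries contain each top-level key.

    Assumes `registry` maps prefix -> entry dict.
    """
    counts = Counter()
    for entry in registry.values():
        counts.update(entry.keys())
    return counts
```

Calling `key_coverage(registry).most_common()` would then show which fields are widely populated and which still need curation.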

♻️ Update

The database is automatically updated daily thanks to scheduled workflows in GitHub Actions. The workflow's configuration can be found here and the last run can be seen here. Further, a changelog can be recapitulated from the commits of the GitHub Actions bot.

If you want to manually update the database after installing in development mode, run the following:

$ bioregistry update

⚖️ License

The code in this repository is licensed under the MIT License.

📖 Citation

Hopefully there will be a paper describing this resource on bioRxiv sometime in 2021! Until then, you can use the Zenodo BibTeX or CSL.
