Idiomatic conversion between URIs and compact URIs (CURIEs).
curies
from curies import Converter
converter = Converter.from_prefix_map({
"CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
"MONDO": "http://purl.obolibrary.org/obo/MONDO_",
"GO": "http://purl.obolibrary.org/obo/GO_",
# ... and so on
"OBO": "http://purl.obolibrary.org/obo/",
})
>>> converter.compress("http://purl.obolibrary.org/obo/CHEBI_1")
'CHEBI:1'
>>> converter.expand("CHEBI:1")
'http://purl.obolibrary.org/obo/CHEBI_1'
# Unparsable
>>> assert converter.compress("http://example.com/missing:0000000") is None
>>> assert converter.expand("missing:0000000") is None
When some URI prefixes are partially overlapping (e.g., http://purl.obolibrary.org/obo/GO_ for GO and http://purl.obolibrary.org/obo/ for OBO), the longest URI prefix will always be matched. For example, compressing http://purl.obolibrary.org/obo/GO_0032571 will return GO:0032571 instead of OBO:GO_0032571.
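The longest-match rule can be illustrated with a small standalone sketch. It is not the library's implementation (the package builds on a trie, see the acknowledgements); it simply tries URI prefixes from longest to shortest. The names PREFIX_MAP, compress, and expand below are hypothetical stand-ins and do not refer to the methods on Converter.

```python
from typing import Dict, Optional

# Illustrative prefix map with two overlapping URI prefixes (assumed for this sketch).
PREFIX_MAP = {
    "GO": "http://purl.obolibrary.org/obo/GO_",
    "OBO": "http://purl.obolibrary.org/obo/",
}


def compress(uri: str, prefix_map: Dict[str, str]) -> Optional[str]:
    """Return a CURIE for ``uri`` using the longest matching URI prefix."""
    # Try longer URI prefixes first so the more specific one wins.
    candidates = sorted(
        prefix_map.items(), key=lambda item: len(item[1]), reverse=True
    )
    for prefix, uri_prefix in candidates:
        if uri.startswith(uri_prefix):
            return f"{prefix}:{uri[len(uri_prefix):]}"
    return None  # no URI prefix matched


def expand(curie: str, prefix_map: Dict[str, str]) -> Optional[str]:
    """Return the URI for ``curie``, or None if its prefix is unknown."""
    prefix, sep, identifier = curie.partition(":")
    if not sep or prefix not in prefix_map:
        return None
    return prefix_map[prefix] + identifier
```

A linear scan like this is fine for a handful of prefixes; a trie avoids re-checking every prefix on each lookup when the map is large.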
All loader functions work on local file paths, remote URLs, and pre-loaded data structures. For example, a converter can be instantiated from a web-based resource in JSON-LD format:
from curies import Converter
url = "https://raw.githubusercontent.com/biopragmatics/bioregistry/main/exports/contexts/semweb.context.jsonld"
converter = Converter.from_jsonld(url)
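For the pre-loaded data structure case, it may help to see what a prefix map inside a JSON-LD document looks like. The sketch below is an assumption about how such a context could be read, not the library's own parsing code: it keeps string-valued terms and expanded term definitions flagged with "@prefix": true, and skips JSON-LD keywords. The helper name prefix_map_from_jsonld is hypothetical.

```python
from typing import Any, Dict


def prefix_map_from_jsonld(data: Dict[str, Any]) -> Dict[str, str]:
    """Extract a prefix -> URI prefix mapping from a JSON-LD ``@context``."""
    prefix_map: Dict[str, str] = {}
    for key, value in data.get("@context", {}).items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @vocab or @base
        if isinstance(value, str):
            # Simple term definition: "GO": "http://.../GO_"
            prefix_map[key] = value
        elif (
            isinstance(value, dict)
            and value.get("@prefix") is True
            and "@id" in value
        ):
            # Expanded term definition explicitly marked as a prefix
            prefix_map[key] = value["@id"]
    return prefix_map
```

The resulting dictionary has the same shape as the one passed to Converter.from_prefix_map in the first example.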
Several converters can be instantiated from pre-defined web-based resources:
import curies
# Uses the Bioregistry, an integrative, comprehensive registry
bioregistry_converter = curies.get_bioregistry_converter()
# Uses the OBO Foundry, a registry of ontologies
obo_converter = curies.get_obo_converter()
# Uses the Monarch Initiative's project-specific context
monarch_converter = curies.get_monarch_converter()
Apply in bulk to a pandas.DataFrame with Converter.pd_expand and Converter.pd_compress:
import curies
import pandas as pd
df = pd.read_csv(...)
obo_converter = curies.get_obo_converter()
obo_converter.pd_compress(df, column=0)
obo_converter.pd_expand(df, column=0)
Apply in bulk to a CSV file with Converter.file_expand and Converter.file_compress (defaults to using tab separator):
import curies
path = ...
obo_converter = curies.get_obo_converter()
# modifies file in place
obo_converter.file_compress(path, column=0)
# modifies file in place
obo_converter.file_expand(path, column=0)
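To make "modifies file in place" concrete, here is a rough standalone sketch of rewriting one column of a tab-separated file using only the standard library. It is not how Converter.file_compress is implemented; the function name rewrite_column_in_place and its parameters are hypothetical, and leaving a cell unchanged when the conversion returns None is a choice made for this sketch rather than documented library behavior. Every row is treated as data, so a header row would also be passed through the conversion.

```python
import csv
from typing import Callable, Optional


def rewrite_column_in_place(
    path: str,
    column: int,
    func: Callable[[str], Optional[str]],
    sep: str = "\t",
) -> None:
    """Apply ``func`` to one column of a delimited file, overwriting the file."""
    # Read everything first so the same path can be safely reopened for writing.
    with open(path, newline="") as file:
        rows = list(csv.reader(file, delimiter=sep))

    for row in rows:
        if column < len(row):
            converted = func(row[column])
            if converted is not None:  # sketch choice: keep unconvertible cells
                row[column] = converted

    with open(path, "w", newline="") as file:
        csv.writer(file, delimiter=sep).writerows(rows)
```

Because the file is overwritten, working on a copy is a sensible precaution when trying this out.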
Full documentation is available here.
CLI Usage
This package comes with a built-in CLI for running a resolver web application:
$ python -m curies --host 0.0.0.0 --port 8764 bioregistry
The positional argument can be one of the following:
- A pre-defined prefix map to get from the web (bioregistry, go, obo, monarch, prefixcommons)
- A local file path or URL to a prefix map, extended prefix map, or one of several formats. Requires specifying a --format.
The framework can be swapped to use Flask (default) or FastAPI with --framework. The server can be swapped to use Werkzeug (default) or Uvicorn with --server. These functionalities are also available programmatically; see the docs for more information.
🧑🤝🧑 Related
Other packages that convert between CURIEs and URIs:
- https://github.com/prefixcommons/prefixcommons-py (Python)
- https://github.com/prefixcommons/curie-util (Java)
- https://github.com/geneontology/curie-util-py (Python)
- https://github.com/geneontology/curie-util-es5 (Node.js)
- https://github.com/endoli/curie.rs (Rust)
🚀 Installation
The most recent release can be installed from PyPI with:
$ pip install curies
👐 Contributing
Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.
👋 Attribution
🙏 Acknowledgements
This package heavily builds on the trie data structure implemented in pytrie.
⚖️ License
The code in this package is licensed under the MIT License.
🍪 Cookiecutter
This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.
🛠️ For Developers
This final section of the README is for those who want to get involved by making a code contribution.
Development Installation
To install in development mode, use the following:
$ git clone https://github.com/cthoyt/curies.git
$ cd curies
$ pip install -e .
🥼 Testing
After cloning the repository and installing tox with pip install tox, the unit tests in the tests/ folder can be run reproducibly with:
$ tox
Additionally, these tests are automatically re-run with each commit in a GitHub Action.
📖 Building the Documentation
The documentation can be built locally using the following:
$ git clone https://github.com/cthoyt/curies.git
$ cd curies
$ tox -e docs
$ open docs/build/html/index.html
The documentation automatically installs the package as well as the docs extra specified in the setup.cfg. sphinx plugins like texext can be added there. Additionally, they need to be added to the extensions list in docs/source/conf.py.
📦 Making a Release
After installing the package in development mode and installing tox with pip install tox, the commands for making a new release are contained within the finish environment in tox.ini. Run the following from the shell:
$ tox -e finish
This script does the following:
- Uses Bump2Version to switch the version number in the setup.cfg, src/curies/version.py, and docs/source/conf.py to not have the -dev suffix
- Packages the code in both a tar archive and a wheel using build
- Uploads to PyPI using twine. Be sure to have a .pypirc file configured to avoid the need for manual input at this step
- Pushes to GitHub. You'll need to make a release going with the commit where the version was bumped.
- Bumps the version to the next patch. If you made big changes and want to bump the version by minor, you can use tox -e bumpversion minor after.