grlc, the git repository linked data API constructor

grlc, the git repository linked data API constructor, automatically builds Web APIs using SPARQL queries stored in git repositories. http://grlc.io/

A cool project that can convert a random SPARQL endpoint into an OpenAPI endpoint

It enables us to quickly integrate any new API requirements in a matter of seconds, without having to worry about configuration or deployment of the system

You can store your SPARQL queries on GitHub and then you can run your queries on your favourite programming language (Python, Javascript, etc.) using a Web API (including swagger documentation) just as easily as loading data from a web page

Contributors: Albert Meroño, Rinke Hoekstra, Carlos Martínez

Copyright: Albert Meroño, VU University Amsterdam
License: MIT License (see LICENSE.txt)

What is grlc?

grlc is a lightweight server that takes SPARQL queries curated in GitHub repositories, and translates them to Linked Data Web APIs. This enables universal access to Linked Data. Users are not required to know SPARQL to query their data, but instead can access a web API.

Quick tutorial

For a quick usage tutorial check out our wiki walkthrough here

Features

  • Request parameter mappings into SPARQL: grlc is compliant with BASIL's convention on how to map GET/POST request parameters into SPARQL
  • Automatic, user customizable population of parameter values in swagger-ui's dropdown menus via SPARQL triple pattern querying
  • Parameter values as enumerations (i.e. closed lists of values that will fill a dropdown in the UI) can now also be specified in the query decorators to save endpoint requests (see this example)
  • Parameter default values can now also be indicated through decorators (see this example)
  • URL-based content negotiation: you can request specific content types by appending the corresponding extension to the operation request URL, e.g. http://localhost:8088/CEDAR-project/Queries/residenceStatus_all.csv will request results in CSV
  • Pagination of API results, as per the pagination decorator and GitHub's API Pagination Traversal
  • Docker images in Docker Hub for easy deployment
  • Compatibility with Linked Data Fragments servers, RDF dumps, and HTML+RDFa files
  • [NEW] grlc now integrates SPARQLTransformer, allowing the use of queries in JSON (see this example)
  • Generation of provenance in PROV of both the repo history (via Git2PROV) and grlc's activity additions
  • Commit-based API versioning that's coherent with the repo versioning with git hashes
  • SPARQL endpoint address can be set at the query level, repository level, and now also as a query parameter. This makes your APIs endpoint agnostic, and enables generic and transposable queries!
  • CONSTRUCT queries are now mapped automatically to GET requests, accept parameters in the WHERE clause, and return content in text/turtle or application/ld+json
  • INSERT DATA queries are now mapped automatically to POST requests. Support is limited to queries with no WHERE clause, and parameters are always expected to be values for g (the named graph in which to insert the data) and data (the triples to insert, in N-Triples format). The INSERT query pattern is so far static, as defined in static.py. Only tested with Virtuoso.
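
As a rough illustration of the pagination feature, the sketch below parses a GitHub-style Link header into its rel targets. It assumes that a paginated grlc response carries such a header, as the reference to GitHub's pagination traversal suggests. The header value and the function name parse_link_header are hypothetical, made up for this example.

```python
import re


def parse_link_header(value):
    """Return a dict mapping each rel name (e.g. "next") to its URL.

    Simplified sketch: assumes the URLs themselves contain no commas.
    """
    links = {}
    for part in value.split(","):
        match = re.match(r'\s*<([^>]*)>\s*;\s*rel="([^"]*)"', part)
        if match:
            url, rel = match.groups()
            links[rel] = url
    return links


# Hypothetical header value, written by hand for illustration only
header = (
    '<http://localhost:8088/CEDAR-project/Queries/residenceStatus_all?page=2>; rel="next", '
    '<http://localhost:8088/CEDAR-project/Queries/residenceStatus_all?page=5>; rel="last"'
)
links = parse_link_header(header)
print(links["next"])
```

A client could keep requesting links["next"] until the header no longer lists a next page.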

Install and run

grlc.io

The easiest way to use grlc is to visit grlc.io/ and use this service to convert the SPARQL queries in your GitHub repo into a RESTful API.

Pip

If you want to run grlc locally or use it as a library, you can install grlc on your machine. grlc is registered on PyPI, so you can install it using pip.

Prerequisites

  • Python3
  • development files:
sudo apt-get install libevent-dev python-all-dev

pip install

pip install grlc

Grlc includes a command line tool which you can use to start your own grlc server:

grlc-server

Using gunicorn

You can run grlc using gunicorn as follows:

gunicorn grlc.server:app

If you want to use your own gunicorn configuration, for example gunicorn_config.py:

workers = 5
worker_class = 'gevent'
bind = '0.0.0.0:8088'

Then you can run it as:

gunicorn -c gunicorn_config.py grlc.server:app

Note: Since gunicorn does not work under Windows, you can use waitress instead:

waitress-serve --port=8088 grlc.server:app

Grlc library

You can use grlc as a library directly from your own Python script. See the usage example to find out more.

Docker

To run grlc via Docker, you'll need a working installation of Docker. To deploy grlc, just pull the latest image from Docker Hub:

docker run -it --rm -p 8088:80 clariah/grlc

The Docker image allows you to set several environment variables, such as GRLC_SERVER_NAME, GRLC_GITHUB_ACCESS_TOKEN and GRLC_SPARQL_ENDPOINT:

docker run -it --rm -p 8088:80 -e GRLC_SERVER_NAME=grlc.io -e GRLC_GITHUB_ACCESS_TOKEN=xxx -e GRLC_SPARQL_ENDPOINT=http://dbpedia.org/sparql -e DEBUG=true clariah/grlc

Access token

In order for grlc to communicate with GitHub, you'll need to tell grlc what your access token is:

  1. Get a GitHub personal access token. On your GitHub profile page, go to Settings, then Developer settings, Personal access tokens, and Generate new token
  2. You'll get an access token string. Copy it and save it somewhere safe (GitHub won't let you see it again!)
  3. Edit your docker-compose.yml or docker-compose.default.yml file, and paste this token as the value of the environment variable GRLC_GITHUB_ACCESS_TOKEN

If you want to run grlc at system boot as a service, you can find example upstart scripts at upstart/

Usage

grlc assumes a GitHub repository (support for general git repos is on the way) where you store your SPARQL queries as .rq files (like in this one). grlc will create one API operation for each SPARQL query (.rq file).
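
The mapping from a query file to an operation URL can be sketched as follows. This is a minimal illustration, assuming the /user/repo/query URL pattern that appears elsewhere in this README, plus the optional content-type extension described under Features. The helper name operation_url is hypothetical and is not part of grlc.

```python
from pathlib import PurePosixPath


def operation_url(base_url, user, repo, query_file, extension=None):
    """Build the URL of the API operation assumed to correspond to a .rq file."""
    # "residenceStatus_all.rq" -> "residenceStatus_all"
    name = PurePosixPath(query_file).stem
    url = f"{base_url.rstrip('/')}/{user}/{repo}/{name}"
    if extension:
        # Optional extension for URL-based content negotiation, e.g. "csv"
        url += f".{extension}"
    return url


print(operation_url("http://localhost:8088", "CEDAR-project", "Queries",
                    "residenceStatus_all.rq", "csv"))
# prints http://localhost:8088/CEDAR-project/Queries/residenceStatus_all.csv
```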

If you're seeing this, your grlc instance is up and running, and ready to build APIs. Assuming you got it running at http://localhost:8088/ and your queries are at https://github.com/CEDAR-project/Queries, just point your browser to the following locations:

By default, grlc will direct your queries to the DBpedia SPARQL endpoint. To change this, either:

  • Add an endpoint parameter to your request: http://grlc.io/user/repo/query?endpoint=http://sparql-endpoint/. You can add a #+ endpoint_in_url: False decorator if you DO NOT want to see the endpoint parameter in the swagger-ui of your API.
  • Add a #+ endpoint: decorator in the first comment block of the query text (preferred, see below)
  • Add the URL of the endpoint on a single line in an endpoint.txt file within the GitHub repository that contains the queries.
  • Or you can directly modify the grlc source code (but it's nicer if the queries are self-contained)

That's it!
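
The first option above, passing the endpoint as a request parameter, can be sketched with the standard library alone. The helper name with_endpoint is hypothetical. The URLs are the placeholders used in this README, not real services.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_endpoint(operation_url, endpoint):
    """Return operation_url with its endpoint query parameter set to endpoint."""
    parts = urlsplit(operation_url)
    query = dict(parse_qsl(parts.query))  # keep any existing parameters
    query["endpoint"] = endpoint          # add or override the endpoint
    return urlunsplit(parts._replace(query=urlencode(query)))


url = with_endpoint("http://grlc.io/user/repo/query", "http://sparql-endpoint/")
print(url)
```

urlencode percent-encodes the endpoint address, so it travels safely as a single parameter value.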

Example APIs

Check these out:

You'll find the sources of these and many more in GitHub

Decorator syntax

Several decorators, embedded in SPARQL comments, are available to make your swagger-ui look nicer. Note that all decorator comments start with #+, and that ':' is reserved for keys and list representations, so it cannot be used in the summary text:

  • To specify a query-specific endpoint, #+ endpoint: http://example.com/sparql
  • To indicate the HTTP request method, #+ method: GET
  • To paginate the results in e.g. groups of 100, #+ pagination: 100
  • To create a summary of your query/operation, #+ summary: This is the summary of my query/operation
  • To assign tags to your query/operation,
    #+ tags:
    #+   - firstTag
    #+   - secondTag
  • To indicate which parameters of your query/operation should get enumerations (and get dropdown menus in the swagger-ui) using values from the SPARQL endpoint,
    #+ enumerate:
    #+   - var1
    #+   - var2
  • These parameters can also be hard-coded into the query decorators to save endpoint requests and speed up the API generation:
    #+ enumerate:
    #+   - var1:
    #+     - value1
    #+     - value2

Notice that the names listed under enumerate should be plain variable names without SPARQL/BASIL conventions (so var1 instead of ?_var1_iri).
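
To see how the decorators sit inside a query file, the sketch below pulls the #+ lines out of a query text and strips the marker, leaving the decorator block on its own. It only illustrates the syntax described above and is not grlc's own parser. The function name extract_decorators and the sample query are hypothetical.

```python
def extract_decorators(query_text):
    """Return the decorator block of a query, without the leading "#+" markers."""
    lines = []
    for line in query_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#+"):
            body = stripped[2:]
            # Drop a single separating space but keep the list indentation
            lines.append(body[1:] if body.startswith(" ") else body)
    return "\n".join(lines)


# Hypothetical .rq file contents, for illustration only
query = """\
#+ summary: Hypothetical example operation
#+ endpoint: http://example.com/sparql
#+ pagination: 100
#+ enumerate:
#+   - var1
SELECT ?s WHERE { ?s ?p ?_var1_iri }
"""

decorators = extract_decorators(query)
print(decorators)
```

The extracted block reads like a small YAML document, so a YAML library could turn it into a dictionary. That step is left out here to keep the example free of third-party dependencies.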

See examples at https://github.com/albertmeronyo/lodapi.

Use this GitHub search to see examples from other users of grlc.

Contribute!

grlc needs you to continue bringing Semantic Web content to developers, applications and users. No matter if you are just a curious user, a developer, or a researcher; there are many ways in which you can contribute:

  • File bug reports
  • Request new features
  • Set up your own environment and start hacking

Check our contributing guidelines for these and more, and join us today!

If you cannot code, that's no problem! There's still plenty you can contribute:

  • Share your experience using grlc on Twitter (mention the handle @grlcldapi)
  • If you are good with HTML/CSS, let us know

Related tools

  • SPARQL2Git is a Web interface for editing SPARQL queries and saving them in GitHub as grlc APIs.
  • grlcR is a package for R that brings Linked Data into your R environment easily through grlc.
  • Hay's tools lists grlc as a Wikimedia-related tool :-)

Academic publications

  • Albert Meroño-Peñuela, Rinke Hoekstra. “grlc Makes GitHub Taste Like Linked Data APIs”. The Semantic Web – ESWC 2016 Satellite Events, Heraklion, Crete, Greece, May 29 – June 2, 2016, Revised Selected Papers. LNCS 9989, pp. 342-353 (2016). (PDF)
  • Albert Meroño-Peñuela, Rinke Hoekstra. “SPARQL2Git: Transparent SPARQL and Linked Data API Curation via Git”. In: Proceedings of the 14th Extended Semantic Web Conference (ESWC 2017), Poster and Demo Track. Portoroz, Slovenia, May 28th – June 1st, 2017 (2017). (PDF)
  • Albert Meroño-Peñuela, Rinke Hoekstra. “Automatic Query-centric API for Routine Access to Linked Data”. In: The Semantic Web – ISWC 2017, 16th International Semantic Web Conference. Lecture Notes in Computer Science, vol 10587, pp. 334-339 (2017). (PDF)
