"GA4GH Variation Representation Specification (VRS) reference implementation (https://github.com/ga4gh/vrs-python/)"

Project description

vrs-python

vrs-python provides Python language support for the GA4GH Variation Representation Specification (VRS).

This repository contains several related components:

  • ga4gh.core package: Python language support for certain nascent standards in GA4GH. Eventually, this package should be moved to a distinct repo.

  • ga4gh.vrs package: Python language support for VRS.

  • ga4gh.vrs.extras package: Python language support for additional functionality, including translating from and to other variant formats, and a REST service that exposes similar functionality. ga4gh.vrs.extras requires access to supporting data, as described below.

  • Jupyter notebooks: Demonstrations of the functionality of ga4gh.vrs and ga4gh.vrs.extras in the form of easy-to-read notebooks.

VRS-Python and VRS Version Correspondence

The ga4gh/vrs-python repo embeds the ga4gh/vrs repo as a git submodule, and therefore each ga4gh.vrs package on PyPI embeds a particular version of VRS. The correspondence between the two is summarized below:

vrs-python branch    vrs branch
main                 main
0.6                  1.1
0.7                  1.2
0.8                  1.3
0.9                  metaschema-update

Developers: See the development section below for recommendations for using submodules gracefully (and without causing problems for others!).

Installation

Installing with pip

pip install 'ga4gh.vrs[extras]'

The [extras] suffix tells pip to also install the packages that the ga4gh.vrs.extras package depends on.
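
As a quick post-install check, the following minimal sketch reports which release of the ga4gh.vrs distribution pip installed into the current environment. It uses only the standard library (Python 3.8 or later); the helper name installed_version is hypothetical and is not part of vrs-python.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of vrs-python.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "ga4gh.vrs" is the distribution name used in the pip command above.
    version = installed_version("ga4gh.vrs")
    print(version if version else "ga4gh.vrs is not installed in this environment")
```

This confirms only that the distribution is present. It does not verify that the supporting data services described in the next section are available.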

Installing dependencies for ga4gh.vrs.extras

The ga4gh.vrs.extras modules are not part of the VRS specification per se. They are bundled with ga4gh.vrs for development and installation convenience. These modules depend, directly and indirectly, on external data sources of sequences, transcripts, and genome-transcript alignments. This section recommends one way to install the biocommons tools that provide these data.

docker volume create --name=uta_vol
docker volume create --name=seqrepo_vol
docker-compose up

This should start three containers:

  • seqrepo: downloads seqrepo into a docker volume and exits
  • seqrepo-rest-service: a REST service on seqrepo (localhost:5000)
  • uta: a database of transcripts and alignments (localhost:5432)

Check that the containers are running:

$ docker ps
CONTAINER ID        IMAGE                                    //  NAMES
86e872ab0c69        biocommons/seqrepo-rest-service:latest   //  vrs-python_seqrepo-rest-service_1
a40576b8cf1f        biocommons/uta:uta_20180821              //  vrs-python_uta_1

Depending on your network and host, the first run is likely to take 5-15 minutes in order to download and install data. Subsequent startups should be nearly instantaneous.

You can test the UTA installation like so:

snafu$ psql -XAt postgres://anonymous@localhost/uta -c 'select count(*) from transcript'
249909
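
If psql is not at hand, a rough alternative is to check that something is listening on the two ports listed above (localhost:5000 for seqrepo-rest-service and localhost:5432 for uta). The sketch below assumes the default port mappings from the container list; the port_open helper is hypothetical and uses only the standard library. It tests TCP reachability only and does not confirm that the services return valid data.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


# Ports taken from the container list above; adjust if your mappings differ.
SERVICES = {"seqrepo-rest-service": 5000, "uta": 5432}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if port_open("localhost", port) else "not reachable"
        print(f"{name} (localhost:{port}): {state}")
```

A "not reachable" result for seqrepo-rest-service on a first run may simply mean that the seqrepo download has not finished yet.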

It doesn't work!

Here are some things to try.

  • Bring up one service at a time. For example, if you haven't downloaded seqrepo yet, you might see this:

    snafu$ docker-compose up seqrepo-rest-service
    Starting vrs-python_seqrepo-rest-service_1 ... done
    Attaching to vrs-python_seqrepo-rest-service_1
    seqrepo-rest-service_1  | 2022-07-26 15:59:59 snafu seqrepo_rest_service.__main__[1] INFO Using seqrepo_dir='/usr/local/share/seqrepo/2021-01-29' from command line
    ⋮
    seqrepo-rest-service_1  | OSError: Unable to open SeqRepo directory /usr/local/share/seqrepo/2021-01-29
    vrs-python_seqrepo-rest-service_1 exited with code 1
    

Running the Notebooks

Once installed as described above, type:

$ source venv/3.7/bin/activate
$ jupyter notebook --notebook-dir notebooks/

The following Jupyter extensions are recommended but not required:

$ pip install jupyter_contrib_nbextensions
$ jupyter contrib nbextension install --user
$ jupyter nbextension enable toc2/main

Development

Submodules!

vrs-python embeds vrs as a submodule. When checking out vrs-python and switching branches, it is important to make sure that the submodule tracks vrs-python correctly. The recommended way to do this is to run git config --global submodule.recurse true. Without this setting, developers and reviewers must be extremely careful not to accidentally upgrade or downgrade schemas with respect to vrs-python.

Alternatively, see misc/githooks/.
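
A script or pre-flight check can confirm that submodule.recurse is set before switching branches. The sketch below is one possible approach and is not taken from misc/githooks/; both helper names are hypothetical. The boolean parsing is simplified and covers only the common spellings git treats as true.

```python
import subprocess
from typing import Optional


def parse_git_bool(value: Optional[str]) -> bool:
    """Interpret common true spellings of a git config boolean (simplified)."""
    if value is None:
        return False
    return value.strip().lower() in {"true", "yes", "on", "1"}


def submodule_recurse_enabled() -> bool:
    """Ask git for submodule.recurse; an unset key or missing git counts as disabled."""
    try:
        result = subprocess.run(
            ["git", "config", "--get", "submodule.recurse"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # git is not installed or not on PATH.
        return False
    if result.returncode != 0:
        # git config --get exits non-zero when the key is not set.
        return False
    return parse_git_bool(result.stdout)


if __name__ == "__main__":
    if not submodule_recurse_enabled():
        print("submodule.recurse is not enabled; "
              "run: git config --global submodule.recurse true")
```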

Installing for development

Fork the repo at https://github.com/ga4gh/vrs-python/ .

$ git clone --recurse-submodules git@github.com:YOUR_GITHUB_ID/vrs-python.git
$ cd vrs-python
$ make devready

Testing

This package implements typical unit tests for ga4gh.core and ga4gh.vrs. This package also implements the compliance tests from vrs (vrs/validation) in the tests/validation/ directory.

$ make test

Security Note (from the GA4GH Security Team)

A stand-alone security review has been performed on the specification itself. This implementation is offered as-is, and without any security guarantees. It will need an independent security review before it can be considered ready for use in security-critical applications. If you integrate this code into your application it is AT YOUR OWN RISK AND RESPONSIBILITY to arrange for a security audit.
