DataLad support for the UK Biobank

DataLad extension for working with the UK Biobank

This software is a DataLad extension that equips DataLad with a set of commands to obtain (and monitor) imaging data releases of the UK Biobank (see documentation for more information).

UK Biobank is a national and international health resource with unparalleled research opportunities, open to all bona fide health researchers. UK Biobank aims to improve the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses – including cancer, heart diseases, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia. It is following the health and well-being of 500,000 volunteer participants and provides health information, which does not identify them, to approved researchers in the UK and overseas, from academia and industry.

Command(s) provided by this extension

  • ukb-init -- Initialize an existing dataset to track a UK Biobank participant
  • ukb-update -- Update an existing dataset of a UK Biobank participant

Installation

Before you install this package, make sure that a recent version of git-annex is installed. Afterwards, install the latest version of datalad-ukbiobank from PyPI. Using a dedicated virtualenv is recommended:

# create and enter a new virtual environment (optional)
virtualenv --system-site-packages --python=python3 ~/env/datalad
. ~/env/datalad/bin/activate

# install from PyPI
pip install datalad_ukbiobank

Use

To track UKB data for a single participant (example ID: 1234), start by creating and initializing a new dataset:

% datalad create 1234
% cd 1234
% datalad ukb-init --bids 1234 20227_2_0 20227_3_0 25755_2_0 25755_3_0

In this example, only two data records, each with two instances, are selected, but any other selection is supported as well. The --bids flag enables an additional dataset layout with a BIDS-like structure.
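Long record selections can be generated rather than typed out. The following is a minimal sketch, not part of the extension: ukb_record_ids is a hypothetical helper, and it assumes that record IDs follow the <field>_<instance>_<array> pattern seen in the example, with the array index fixed at 0.

```shell
# Hypothetical helper (not provided by datalad-ukbiobank): print one record
# ID per field/instance pair. Assumes the <field>_<instance>_<array> naming
# seen in the example and a fixed array index of 0.
ukb_record_ids() {
    fields=$1      # space-separated field IDs, e.g. "20227 25755"
    instances=$2   # space-separated instance numbers, e.g. "2 3"
    ids=""
    for field in $fields; do
        for instance in $instances; do
            ids="$ids ${field}_${instance}_0"
        done
    done
    # strip the leading space before printing
    printf '%s\n' "${ids# }"
}

# Usage: reproduces the selection from the example
# datalad ukb-init --bids 1234 $(ukb_record_ids "20227 25755" "2 3")
```

The command substitution is left unquoted on purpose, so that each record ID is passed to ukb-init as a separate argument.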

After initialization, run ukb-update at any time to (re-)download data from UKB, and update the dataset in order to track changes longitudinally.

% datalad -c datalad.ukbiobank.keyfile=<pathtoaccesstoken> ukb-update
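For repeated or scheduled updates it can help to check the key file before starting a download. The wrapper below is a sketch under that assumption. ukb_update_with_key is a hypothetical name, and the only extension-specific parts are the datalad.ukbiobank.keyfile setting and the ukb-update command from the invocation above.

```shell
# Hypothetical wrapper (not part of the extension): refuse to start an
# update when the key file is missing or unreadable, then pass it on via
# the same -c override used in the command above.
ukb_update_with_key() {
    keyfile=$1
    if [ ! -r "$keyfile" ]; then
        echo "cannot read key file: $keyfile" >&2
        return 1
    fi
    datalad -c "datalad.ukbiobank.keyfile=$keyfile" ukb-update
}

# Usage, from inside the participant dataset (the path is a placeholder):
# ukb_update_with_key ~/ukb/access.key
```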

This will maintain two or three branches:

  • incoming: tracking the pristine UKB downloads
  • incoming-native: a "native" representation of the extracted downloads for single file access using UKB naming conventions
  • incoming-bids: an alternative dataset layout using BIDS conventions (if enabled with ukb-init --bids)

Changes can then be merged manually into the main branch. Alternatively, ukb-update --merge merges incoming-native (or incoming-bids if enabled) automatically.
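A manual merge could look like the following sketch. pick_merge_branch is a hypothetical helper that mirrors the rule described above: it prefers incoming-bids when that branch exists and falls back to incoming-native otherwise.

```shell
# Hypothetical helper (not part of the extension): read branch names, one
# per line, on stdin and print the branch to merge: incoming-bids if it is
# present, otherwise incoming-native. Prints nothing and returns non-zero
# if neither branch is listed.
pick_merge_branch() {
    target=""
    while IFS= read -r branch; do
        case $branch in
            incoming-bids) target=incoming-bids ;;
            incoming-native) [ -n "$target" ] || target=incoming-native ;;
        esac
    done
    [ -n "$target" ] && printf '%s\n' "$target"
}

# Usage, from inside the participant dataset with the main branch checked out:
# git merge "$(git branch --format='%(refname:short)' | pick_merge_branch)"
```

Merging by hand makes it possible to inspect the incoming changes first, for example with git diff against the chosen branch.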

Use with pre-downloaded data

Re-downloading can be avoided, while keeping all other functionality, by replacing the ukbfetch utility with a shim that obtains the relevant files from the location where they were previously downloaded. An example script is provided at tools/ukbfetch_surrogate.sh.
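The core of such a shim can be sketched as follows. This is an illustration only and not the provided tools/ukbfetch_surrogate.sh. The function name ukbfetch_from_dir, the batch file layout (one "<participant> <record>" pair per line), and the file naming (<participant>_<record>.<ext>) are all assumptions. A real shim would also have to accept whatever arguments ukb-update passes to ukbfetch, which are not described here.

```shell
# Hypothetical stand-in for the download step. Assumptions (not taken from
# the extension's code): the batch file lists one "<participant> <record>"
# pair per line, and pre-downloaded files are named
# <participant>_<record>.<ext>.
ukbfetch_from_dir() {
    src_dir=$1     # directory holding the pre-downloaded files
    batch_file=$2  # file listing what would otherwise be downloaded
    dest_dir=$3    # directory to populate
    while read -r participant record; do
        [ -n "$participant" ] || continue   # skip blank lines
        for f in "$src_dir/${participant}_${record}".*; do
            # an unmatched glob stays literal, so test for existence
            [ -e "$f" ] || { echo "missing: ${participant}_${record}" >&2; return 1; }
            cp "$f" "$dest_dir/"
        done
    done < "$batch_file"
}

# Usage (all paths are placeholders):
# ukbfetch_from_dir /data/ukb_downloads batchfile.txt .
```

Copying only the files named in the batch file keeps the shim's result limited to the records that were requested, even when the source directory holds data for many participants.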

Use on non-UNIX-like operating systems

This code relies on a number of POSIX filesystem features that may make it somewhat hard to get working on Windows. Contributions to port this extension to non-POSIX platforms are welcome, but presently this is not supported.

Support

For general information on how to use or contribute to DataLad (and this extension), please see the DataLad website or the main GitHub project page.

All bugs, concerns and enhancement requests for this software can be submitted here: https://github.com/datalad/ukbiobank/issues

If you have a problem or would like to ask a question about how to use DataLad, please submit a question to NeuroStars.org with a datalad tag. NeuroStars.org is a platform similar to StackOverflow but dedicated to neuroinformatics.

All previous DataLad questions are available here: http://neurostars.org/tags/datalad/

Acknowledgements

This development was supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement VirtualBrainCloud (H2020-EU.3.1.5.3, grant no. 826421).
