VICC normalization routine for therapies

Therapy Normalization

Services and guidelines for normalizing drug (and non-drug therapy) terms

Developer instructions

The following sections include instructions specifically for developers.

Installation

For a development install, we recommend using Pipenv. See the Pipenv docs for directions on installing Pipenv in your compute environment.

Once installed, from the project root dir, just run:

pipenv sync

Deploying DynamoDB Locally

We use Amazon DynamoDB for our database. To deploy it locally, follow the DynamoDB Local setup instructions in the AWS documentation.

Init coding style tests

Code style is managed by flake8 and checked prior to commit.

We use pre-commit to run conformance tests. The pre-commit hooks:

  • Check code style
  • Check for added large files
  • Detect AWS credentials
  • Detect private keys

Before your first commit, run:

pre-commit install

Running unit tests

Running unit tests is as easy as pytest.

pipenv run pytest

Updating the therapy normalization database

Before you use the CLI to update the database, run the following in a separate terminal to start DynamoDB on port 8000:

java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb

To change the port, add -port followed by the port number.
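
Before running the CLI, it can help to confirm that something is actually listening on the DynamoDB port. The following is a minimal sketch, not part of the project: the helper name is_port_open is hypothetical, and it only checks that a TCP connection succeeds, not that the listener is DynamoDB.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8000,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: it cannot tell DynamoDB Local apart from
    any other process bound to the same port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("A process is listening on localhost:8000.")
    else:
        print("Nothing is listening on localhost:8000; start DynamoDB Local first.")
```

If you started DynamoDB with a different -port value, pass that number as the port argument.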

Setting Environment Variables

RxNorm requires a UMLS license, which you can register for through UMLS Terminology Services (UTS). You must set the RXNORM_API_KEY environment variable to your API key. The key can be found in the UTS 'My Profile' area after signing in.

export RXNORM_API_KEY={rxnorm_api_key}
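
A script that wraps the update step can fail early when the key is missing. The sketch below is an illustration only: the function get_rxnorm_api_key is hypothetical and is not how the package itself reads the variable; only the variable name RXNORM_API_KEY comes from the instructions above.

```python
import os
from typing import Mapping


def get_rxnorm_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the RxNorm API key from the environment, or raise if unset.

    Hypothetical helper; `env` is injectable so the check can be tested
    without touching the real environment.
    """
    key = env.get("RXNORM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "RXNORM_API_KEY is not set; export it before updating RxNorm."
        )
    return key
```

Calling get_rxnorm_api_key() before launching an RxNorm update surfaces a missing key immediately instead of partway through the run.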

Update source(s)

The sources we currently use are: ChEMBL, NCIt, DrugBank (CC0 data only), RxNorm, ChemIDplus, Wikidata, and HemOnc.org.

To update source(s), simply set --normalizer to the source(s) you wish to update separated by spaces. For example, the following command updates ChEMBL and Wikidata:

python3 -m therapy.cli --normalizer="chembl wikidata"

You can update all sources at once with the --update_all flag:

python3 -m therapy.cli --update_all
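
If you drive updates from a script, the two invocations above can be assembled programmatically. This is a rough sketch: build_update_command is a hypothetical helper, and it uses only the --normalizer and --update_all flags shown above. Source names other than chembl and wikidata are not confirmed by this document, so the helper does not validate them.

```python
import shlex
from typing import List, Optional, Sequence


def build_update_command(sources: Optional[Sequence[str]] = None) -> List[str]:
    """Build the argument list for an update run.

    With no sources, fall back to --update_all; otherwise pass the
    space-separated source names to --normalizer as a single argument.
    """
    base = ["python3", "-m", "therapy.cli"]
    if not sources:
        return base + ["--update_all"]
    return base + ["--normalizer=" + " ".join(sources)]


if __name__ == "__main__":
    # Print a shell-safe version of the command rather than running it.
    print(shlex.join(build_update_command(["chembl", "wikidata"])))
```

The returned list can be handed to subprocess.run as-is; because the source names travel as one list element, no extra shell quoting is needed.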

The data/ subdirectory within the application should include all source data. The normalizer is capable of acquiring most of these files automatically; the exception is the HemOnc.org data, which must be manually downloaded from the Harvard Dataverse and placed within the data/hemonc subdirectory. Files for all sources should follow the naming convention demonstrated below (with version numbers/dates changed where applicable).

therapy/data
├── chembl
│   └── chembl_27.db
├── chemidplus
│   └── chemidplus_20200327.xml
├── drugbank
│   └── drugbank_5.1.8.csv
├── hemonc
│   ├── hemonc_concepts_20210225.csv
│   ├── hemonc_rels_20210225.csv
│   └── hemonc_synonyms_20210225.csv
├── ncit
│   └── ncit_20.09d.owl
├── rxnorm
│   ├── drug_forms.yaml
│   └── rxnorm_20210104.RRF
└── wikidata
    └── wikidata_20210425.json
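
Because the HemOnc.org files are placed by hand, a quick check of file names can catch mistakes before an update. The sketch below is an assumption-heavy illustration: the regular expressions are inferred from the example tree above (for instance, eight-digit dates), not taken from the normalizer's own code, and find_misnamed_files is a hypothetical helper.

```python
import re
from pathlib import Path
from typing import Dict, List

# Patterns inferred from the example layout; the normalizer's actual
# rules may be stricter or looser than these.
EXPECTED_PATTERNS: Dict[str, List[str]] = {
    "chembl": [r"chembl_\d+\.db"],
    "chemidplus": [r"chemidplus_\d{8}\.xml"],
    "drugbank": [r"drugbank_\d+(\.\d+)*\.csv"],
    "hemonc": [r"hemonc_(concepts|rels|synonyms)_\d{8}\.csv"],
    "ncit": [r"ncit_\d+\.\d+[a-z]?\.owl"],
    "rxnorm": [r"drug_forms\.yaml", r"rxnorm_\d{8}\.RRF"],
    "wikidata": [r"wikidata_\d{8}\.json"],
}


def find_misnamed_files(data_dir: Path) -> List[str]:
    """Return relative paths of files that match no expected pattern."""
    misnamed: List[str] = []
    for source, patterns in EXPECTED_PATTERNS.items():
        source_dir = data_dir / source
        if not source_dir.is_dir():
            continue  # A missing source directory is not treated as an error.
        for path in source_dir.iterdir():
            if path.is_file() and not any(
                re.fullmatch(p, path.name) for p in patterns
            ):
                misnamed.append(path.relative_to(data_dir).as_posix())
    return sorted(misnamed)
```

For example, find_misnamed_files(Path("therapy/data")) returns an empty list when every file in the known source directories follows the convention.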

Create Merged Concept Groups

The /normalize endpoint relies on merged concept groups. The --update_merged flag generates these groups:

python3 -m therapy.cli --update_merged

Specifying the database URL endpoint

The default URL endpoint is http://localhost:8000. There are two different ways to specify the database URL endpoint.

The first way is to set the --db_url flag to the URL endpoint.

python3 -m therapy.cli --update_all --db_url="http://localhost:8001"

The second way is to set the THERAPY_NORM_DB_URL environment variable to the URL endpoint.

export THERAPY_NORM_DB_URL="http://localhost:8001"
python3 -m therapy.cli --update_all
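
The two options and the default can be pictured as a simple fallback chain. The sketch below is illustrative only: resolve_db_url is a hypothetical function, and the precedence it applies (flag first, then environment variable, then default) is an assumption, since this document does not say which wins when both are set.

```python
import os
from typing import Mapping, Optional

DEFAULT_DB_URL = "http://localhost:8000"


def resolve_db_url(cli_value: Optional[str] = None,
                   env: Mapping[str, str] = os.environ) -> str:
    """Pick the database URL endpoint.

    Assumed precedence: --db_url value, then THERAPY_NORM_DB_URL,
    then the documented default.
    """
    if cli_value:
        return cli_value
    return env.get("THERAPY_NORM_DB_URL") or DEFAULT_DB_URL
```

With neither option set, resolve_db_url(None, {}) falls back to http://localhost:8000.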

Starting the therapy normalization service

From the project root, run the following:

uvicorn therapy.main:app --reload

Next, view the OpenAPI docs on your local machine:

http://127.0.0.1:8000/therapy
