VICC normalization routine for therapies
Therapy Normalization
Services and guidelines for normalizing drug (and non-drug therapy) terms
Developer instructions
The following sections include instructions specifically for developers.
Installation
For a development install, we recommend using Pipenv. See the pipenv docs for directions on installing pipenv in your compute environment.
Once installed, from the project root dir, just run:
pipenv sync
Deploying DynamoDB Locally
We use Amazon DynamoDB for our database. To deploy locally, follow these instructions.
Init coding style tests
Code style is managed by flake8 and checked prior to commit.
We use pre-commit to run conformance tests.
The pre-commit hooks:
- Check code style
- Check for added large files
- Detect AWS credentials
- Detect private keys
Before first commit run:
pre-commit install
Running unit tests
Running unit tests is as easy as pytest.
pipenv run pytest
Updating the therapy normalization database
Before you use the CLI to update the database, run the following in a separate terminal to start DynamoDB on port 8000:
java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb
To change the port, add -port followed by the port number.
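Before running an update, it can help to confirm that something is actually listening on the DynamoDB port. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical and not part of this package, and the check only verifies that a TCP connection succeeds, not that the listener is DynamoDB.

```python
import socket


def is_port_open(host="localhost", port=8000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper, not part of the therapy package. Port 8000 is
    the local DynamoDB port used in the command above.
    """
    try:
        # create_connection raises OSError (e.g. connection refused,
        # timeout) when nothing is accepting connections.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on localhost:8000")
    else:
        print("Nothing is listening on localhost:8000; start DynamoDB first")
```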
Setting Environment Variables
RxNorm requires a UMLS license, which you can register for here.
You must set the RXNORM_API_KEY environment variable to your API key. This can be found in the UTS 'My Profile' area after signing in.
export RXNORM_API_KEY={rxnorm_api_key}
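A script that wraps the update could fail early when the key is missing. This is a sketch under assumptions: the function name get_rxnorm_api_key is hypothetical, and the package itself may read or validate the variable differently.

```python
import os


def get_rxnorm_api_key(environ=None):
    """Return the RxNorm API key, or raise if it is unset or blank.

    Hypothetical helper, not part of the therapy package. `environ` can
    be any mapping, which makes the check easy to exercise without
    modifying the real process environment.
    """
    env = os.environ if environ is None else environ
    key = env.get("RXNORM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "RXNORM_API_KEY is not set; export it before updating RxNorm."
        )
    return key
```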
Update source(s)
The sources we currently use are: ChEMBL, NCIt, DrugBank (CC0 data only), RxNorm, ChemIDplus, Wikidata, and HemOnc.org.
To update source(s), set --normalizer to the source(s) you wish to update, separated by spaces. For example, the following command updates ChEMBL and Wikidata:
python3 -m therapy.cli --normalizer="chembl wikidata"
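To illustrate the space-separated format, here is a sketch of how such a value could be split and validated. It is not the CLI's actual implementation. The names chembl and wikidata come from the command above; the remaining names are an assumption that each source is spelled like its data subdirectory, and the lowercasing is likewise an assumption.

```python
# Assumed source names: only "chembl" and "wikidata" are confirmed by the
# example command; the rest are guessed from the data directory names.
KNOWN_SOURCES = {
    "chembl", "chemidplus", "drugbank", "hemonc", "ncit", "rxnorm", "wikidata",
}


def parse_normalizer_arg(value):
    """Split a space-separated --normalizer value into source names.

    Hypothetical helper. Raises ValueError if any name is not in
    KNOWN_SOURCES.
    """
    names = value.lower().split()
    unknown = [name for name in names if name not in KNOWN_SOURCES]
    if unknown:
        raise ValueError(f"Unknown source(s): {', '.join(unknown)}")
    return names


if __name__ == "__main__":
    print(parse_normalizer_arg("chembl wikidata"))
```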
You can update all sources at once with the --update_all flag:
python3 -m therapy.cli --update_all
The data/ subdirectory within the application should include all source data. The normalizer is capable of acquiring most of these files automatically; the exception is the HemOnc.org data, which must be manually downloaded from the Harvard Dataverse and placed within the data/hemonc subdirectory. Files for all sources should follow the naming convention demonstrated below (with version numbers/dates changed where applicable).
therapy/data
├── chembl
│   └── chembl_27.db
├── chemidplus
│   └── chemidplus_20200327.xml
├── drugbank
│   └── drugbank_5.1.8.csv
├── hemonc
│   ├── hemonc_concepts_20210225.csv
│   ├── hemonc_rels_20210225.csv
│   └── hemonc_synonyms_20210225.csv
├── ncit
│   └── ncit_20.09d.owl
├── rxnorm
│   ├── drug_forms.yaml
│   └── rxnorm_20210104.RRF
└── wikidata
    └── wikidata_20210425.json
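When placing files manually (as with HemOnc.org), a quick check of the file name against the convention can catch mistakes. The patterns below are inferred from the example listing above and are assumptions, not the rules the normalizer itself applies; the function name matches_convention is hypothetical.

```python
import re

# Patterns inferred from the example tree; the normalizer's real
# expectations may be stricter or looser.
EXPECTED_PATTERNS = {
    "chembl": [r"chembl_\d+\.db"],
    "chemidplus": [r"chemidplus_\d{8}\.xml"],
    "drugbank": [r"drugbank_\d+(\.\d+)*\.csv"],
    "hemonc": [r"hemonc_(concepts|rels|synonyms)_\d{8}\.csv"],
    "ncit": [r"ncit_.+\.owl"],
    "rxnorm": [r"drug_forms\.yaml", r"rxnorm_\d{8}\.RRF"],
    "wikidata": [r"wikidata_\d{8}\.json"],
}


def matches_convention(source, filename):
    """Return True if filename fits an inferred pattern for the source."""
    patterns = EXPECTED_PATTERNS.get(source, [])
    # fullmatch ensures the whole name conforms, not just a prefix.
    return any(re.fullmatch(pattern, filename) for pattern in patterns)


if __name__ == "__main__":
    print(matches_convention("hemonc", "hemonc_rels_20210225.csv"))
```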
Create Merged Concept Groups
The /normalize endpoint relies on merged concept groups. The --update_merged flag generates these groups:
python3 -m therapy.cli --update_merged
Specifying the database URL endpoint
The default URL endpoint is http://localhost:8000.
There are two different ways to specify the database URL endpoint.
The first way is to set the --db_url flag to the URL endpoint.
python3 -m therapy.cli --update_all --db_url="http://localhost:8001"
The second way is to set the THERAPY_NORM_DB_URL environment variable to the URL endpoint.
export THERAPY_NORM_DB_URL="http://localhost:8001"
python3 -m therapy.cli --update_all
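The two options and the default can be combined into one lookup. The sketch below assumes the flag takes precedence over the environment variable, which this document does not state; the function name resolve_db_url is hypothetical and not part of the package.

```python
import os

DEFAULT_DB_URL = "http://localhost:8000"


def resolve_db_url(cli_value=None, environ=None):
    """Pick the database URL endpoint.

    Assumed precedence (not confirmed by the project): the --db_url
    value, then THERAPY_NORM_DB_URL, then the default endpoint.
    """
    env = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    # An unset or empty variable falls through to the default.
    return env.get("THERAPY_NORM_DB_URL") or DEFAULT_DB_URL


if __name__ == "__main__":
    print(resolve_db_url())
```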
Starting the therapy normalization service
From the project root, run the following:
uvicorn therapy.main:app --reload
Next, view the OpenAPI docs on your local machine: