Automate downloading UMLS data.
UMLS Downloader
Don't worry about UMLS licensing and distribution rules - just use umls_downloader to write code that knows how to download it and use it automatically.
Installation
$ pip install umls_downloader
Download A Specific Version
import os
from umls_downloader import download_umls
# Get this from https://uts.nlm.nih.gov/uts/edit-profile
api_key = ...
path = download_umls(version="2021AB", api_key=api_key)
# This is where it gets downloaded: ~/.data/umls/2021AB/umls-2021AB-mrconso.zip
expected_path = os.path.join(
    os.path.expanduser("~"), ".data", "umls", "2021AB",
    "umls-2021AB-mrconso.zip",
)
assert expected_path == path.as_posix()
After it's been downloaded once, it's smart and doesn't need to download again. It gets stored using pystow automatically in the ~/.data/umls directory.
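The download-once behavior can be pictured with a small standard-library sketch. This only illustrates the idea and is not how umls_downloader or pystow implement it; `ensure_download` and its `fetch` callback are hypothetical names.

```python
from pathlib import Path
from typing import Callable


def ensure_download(path: Path, fetch: Callable[[Path], None]) -> Path:
    """Call ``fetch`` only when ``path`` is missing, then return ``path``.

    Hypothetical helper for illustration; not part of umls_downloader.
    """
    if not path.is_file():
        # Create the versioned directory before the first download
        path.parent.mkdir(parents=True, exist_ok=True)
        fetch(path)
    return path
```

A second call with the same path skips `fetch` entirely, which is why repeated runs do not hit the network again.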
Automating Configuration of UMLS Credentials
There are two ways to automatically set the API key so you don't have to worry about getting it and passing it around in your Python code:
- Set UMLS_API_KEY in the environment
- Create ~/.config/umls.ini and set an api_key key in the [umls] section
from umls_downloader import download_umls
# Same path as before
path = download_umls(version="2021AB")
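If you'd rather script the second option than edit the file by hand, here is a minimal sketch using only the standard library. `write_umls_config` is a hypothetical helper, not part of umls_downloader; the file name, section, and key follow the description above.

```python
import configparser
from pathlib import Path
from typing import Optional


def write_umls_config(api_key: str, config_dir: Optional[Path] = None) -> Path:
    """Write the API key to umls.ini and return the path of the file.

    Hypothetical helper for illustration; not part of umls_downloader.
    """
    # Default to ~/.config, the location described above
    directory = Path(config_dir) if config_dir is not None else Path.home() / ".config"
    directory.mkdir(parents=True, exist_ok=True)

    parser = configparser.ConfigParser()
    parser["umls"] = {"api_key": api_key}

    path = directory / "umls.ini"
    with path.open("w") as file:
        parser.write(file)
    return path
```

Note that this overwrites any existing umls.ini in the target directory, so back it up first if it holds other settings.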
Download the Latest Version
First, you'll have to install bioversions (pip install bioversions), whose job it is to look up the latest version of many databases. Then, you can modify the previous code slightly by omitting the version keyword argument:
from umls_downloader import download_umls
# Same path as before (as of November 21st, 2021)
path = download_umls()
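The fallback from an explicit version to the latest one can be sketched as below. This is only an illustration of the pattern, not the package's actual code; `resolve_version` and `lookup_latest` are hypothetical names.

```python
from typing import Callable, Optional


def resolve_version(version: Optional[str], lookup_latest: Callable[[], str]) -> str:
    """Return ``version`` if given, otherwise ask ``lookup_latest`` for one.

    Hypothetical helper for illustration; not part of umls_downloader.
    """
    if version is not None:
        return version
    # Only pay for the lookup when no version was pinned
    return lookup_latest()
```

With bioversions installed, `lookup_latest` could plausibly be `lambda: bioversions.get_version("umls")`; treat that call as an assumption and check it against the bioversions documentation. Pinning a version keeps results reproducible, while omitting it always tracks the newest release.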
Why not an API?
The UMLS provides an API for access to tiny bits of data at a time. There are even two recent (last 5 years) packages, umls-api and connect-umls, that provide wrappers around it. However, API access is generally rate limited, difficult to use in bulk, and slow. For working with UMLS (or any other database, for that matter) in bulk, it's necessary to download full database dumps.
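As a rough idea of what bulk access looks like once a dump is on disk, the sketch below streams rows from a pipe-delimited RRF file inside a zip archive using only the standard library. It assumes the archive contains a member whose name ends with MRCONSO.RRF; `iter_rrf_rows` is a hypothetical helper, not part of umls_downloader.

```python
import io
import zipfile
from typing import Iterator, List


def iter_rrf_rows(zip_path: str, member_suffix: str = "MRCONSO.RRF") -> Iterator[List[str]]:
    """Yield each row of a pipe-delimited RRF file stored inside a zip archive.

    Hypothetical helper for illustration; the member name is an assumption.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Pick the first member whose name ends with the expected suffix
        name = next((n for n in archive.namelist() if n.endswith(member_suffix)), None)
        if name is None:
            raise KeyError(f"no member ending with {member_suffix!r}")
        with archive.open(name) as binary:
            for line in io.TextIOWrapper(binary, encoding="utf-8"):
                fields = line.rstrip("\r\n").split("|")
                # RRF lines end with a trailing "|", which leaves one empty field
                if fields and fields[-1] == "":
                    fields.pop()
                yield fields
```

Because rows are yielded one at a time, the whole file never has to fit in memory, which is the main advantage over paging through an API.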
👋 Attribution
⚖️ License
The code in this package is licensed under the MIT License.
🍪 Cookiecutter
This package was created with @audreyfeldroy's cookiecutter package using @cthoyt's cookiecutter-snekpack template.
Hashes for umls_downloader-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 1a5db4b1c166c89842b29a5e0a351030ad852c40d062d06f8f9fa4feb29fc251
MD5 | 91c8c76df085169c83914784114f8b46
BLAKE2b-256 | 188d2baec38529e01fca055b4ab88ac6f61401eb7cb8de9916ccb86bae2b6ebf