CRATE: clinical records anonymisation and text extraction
# CRATE
**Clinical Records Anonymisation and Text Extraction (CRATE)**
## Purpose
- Anonymises relational databases.
- Operates a GATE natural language processing (NLP) pipeline.
- Web app for
  - querying the anonymised database
  - managing a consent-to-contact process
## Key directories and files
- `crate_anon/anonymise/`
  - **`anonymise.py`** – core program
  - `launch_multiprocess_anonymiser.sh` – parallel processing
    (multiprocess) launcher for `anonymise.py`
  - `make_demo_database.py` – creates a demonstration (test) database
  - `test_anonymisation.py` – generates a comparison of records between
    source and destination databases, to check anonymisation
- **`crate_anon/crateweb/`** – Django web application, as above
- **`docs/`** – documentation
- `crate_anon/nlp_manager/` – NLP interface tool
  - `buildjava.py` – script to compile the necessary Java source on your
    machine, and create a script to test the pipeline using the ANNIE demo
    GATE app
  - `CrateGatePipeline.java` – Java code to interface between
    `nlp_manager.py` (via stdin/stdout) and the Java-based external GATE tools
    (via code); must be compiled before use
  - `launch_multiprocess_nlp.py` – parallel processing (multiprocess)
    launcher for `nlp_manager.py`
  - `nlp_manager.py` – core program to pipe parts of a database to a GATE
    program and insert the output back into a database; uses
    `CrateGatePipeline.java` to communicate with the NLP app
- `tools/`
  - **`install_virtualenv.sh`** – creates a suitable virtualenv for CRATE
  - ...
- `changelog.Debian` – Debian package changelog and general version history
- `LICENCE` – license applicable to CRATE
- `README.rst` – this file
- `setup.py` – file to set up package for distribution, etc.
## Copyright/licensing
- CRATE: copyright © 2015-2017 Rudolf Cardinal (rudolf@pobox.com).
- Licensed under the GNU GPL v3: see LICENSE file.
- Third-party code/libraries included:
  - aspects of `CamAnonGatePipeline.java` are based on demonstration GATE code,
    copyright © University of Sheffield, and licensed under the GNU LGPL.