CLUMPS-PTM driver gene discovery using 3D protein structure (Getz Lab).
CLUMPS-PTM
An algorithm for identifying 3D clusters ("clumps") of post-translational modifications (PTMs). Developed for the Clinical Proteomic Tumor Analysis Consortium (CPTAC). The full project repository for the pan-cancer project can be found here.
Author: Shankara Anand
Email: sanand@broadinstitute.org
Requires Python 3.6.0 or higher.
Installation
PIP
pip3 install clumps-ptm
or
Git Clone
git clone git@github.com:getzlab/CLUMPS-PTM.git
cd CLUMPS-PTM
pip3 install -e .
Use
CLUMPS-PTM has 3 general phases of analysis:
- Mapping: taking input PTM proteomic data and mapping them onto PDB structural data.
Mapping depends on the source data and involves programmatic calls to blastp+, chosen according to the source database, to map to UniProt and ultimately to PDB structures. An example notebook that walks through the mapping and demonstrates the clumps-ptm API for running these steps programmatically can be found here. Once the mapping has been performed for a new data set, the mapping file is passed to the clumpsptm command (below) via the --maps flag.
- CLUMPS: running the algorithm for identifying statistically significant clustering of PTM sites.
CLUMPS-PTM was designed for use with differential expression proteomic data. Because of drop-out in mass-spectrometry data, we use broad changes in PTM levels across sample groups to interrogate "clumping" of modifications. The input must therefore be the output of a Limma-Voom differential expression analysis.
usage: clumpsptm [-h] -i INPUT -m MAPS -w WEIGHT -s PDBSTORE [-o OUTPUT_DIR]
                 [-x XPO] [--threads THREADS] [-v]
                 [-f [FEATURES [FEATURES ...]]] [-g GROUPING] [-q]
                 [--min_sites MIN_SITES] [--subset {positive,negative}]
                 [--protein_id PROTEIN_ID] [--site_id SITE_ID] [--alphafold]
                 [--alphafold_threshold ALPHAFOLD_THRESHOLD]

Run CLUMPS-PTM.

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        <Required> Input file.
  -m MAPS, --maps MAPS  <Required> Mapping with index as indices that overlap
                        input.
  -w WEIGHT, --weight WEIGHT
                        <Required> Weighting for CLUMPS-PTM (ex. logFC).
  -s PDBSTORE, --pdbstore PDBSTORE
                        <Required> path to PDBStore directory.
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Output directory.
  -x XPO, --xpo XPO     Soft threshold parameter for truncated Gaussian.
  --threads THREADS     Number of threads for sampling.
  -v, --verbose         Verbosity.
  -f [FEATURES [FEATURES ...]], --features [FEATURES [FEATURES ...]]
                        Assays to subset for.
  -g GROUPING, --grouping GROUPING
                        DE group to use.
  -q, --use_only_significant_sites
                        Only use significant sites for CLUMPS-PTM.
  --min_sites MIN_SITES
                        Minimum number of sites.
  --subset {positive,negative}
                        Subset sites.
  --protein_id PROTEIN_ID
                        Unique protein id in input.
  --site_id SITE_ID     Unique site id in input.
  --alphafold           Run using alphafold structures.
  --alphafold_threshold ALPHAFOLD_THRESHOLD
                        Threshold confidence level for alphafold sites.
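As a rough illustration of how the flags in the help text fit together, the sketch below assembles a clumpsptm command line from Python. The helper name build_clumpsptm_command and all file and directory names are hypothetical; only the flags themselves come from the usage message above, and the sketch covers just a subset of them.

```python
import shlex
from typing import List, Optional


def build_clumpsptm_command(
    input_path: str,
    maps_path: str,
    weight: str,
    pdbstore: str,
    output_dir: Optional[str] = None,
    threads: Optional[int] = None,
    subset: Optional[str] = None,
    alphafold: bool = False,
) -> List[str]:
    """Assemble an argument list for the clumpsptm CLI (hypothetical helper)."""
    # The help text restricts --subset to these two choices.
    if subset not in (None, "positive", "negative"):
        raise ValueError("subset must be 'positive' or 'negative'")
    # The four flags marked <Required> in the help text.
    cmd = ["clumpsptm", "-i", input_path, "-m", maps_path,
           "-w", weight, "-s", pdbstore]
    # Optional flags are appended only when a value is supplied.
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    if subset is not None:
        cmd += ["--subset", subset]
    if alphafold:
        cmd.append("--alphafold")
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_clumpsptm_command(
    "de_results.tsv", "mapping.tsv", "logFC", "pdbstore/",
    output_dir="clumpsptm_out", threads=4, subset="positive",
)
print(" ".join(shlex.quote(arg) for arg in cmd))
# To execute it (requires clumps-ptm to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces.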
- Post-Processing: FDR correction and visualization in PyMOL.
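The FDR correction step can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the text above does not name, and it uses made-up protein identifiers and p-values rather than real CLUMPS-PTM output; the package's own post-processing may differ.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-protein p-values, for illustration only.
pvals = {"PROT_A": 0.01, "PROT_B": 0.04, "PROT_C": 0.03, "PROT_D": 0.5}
qvals = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))

# Keep proteins whose adjusted p-value passes an (assumed) 10% FDR cutoff.
significant = [protein for protein, q in qvals.items() if q <= 0.1]
print(significant)
```

The 10% cutoff is an arbitrary choice for the example; pick a threshold that suits the analysis.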