Bayesian NMF methods for mutational signature analysis & transcriptomic profiling on GPUs (Getz Lab).

SignatureAnalyzer

Automatic Relevance Determination (ARD) - NMF of mutational signature & expression data. Designed for scalability: built on PyTorch and runs on GPUs when available.

  • See docs for a more in-depth description of how to use the method.

Requires Python 3.6.0 or higher.
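
Before installing, it can help to confirm that the interpreter meets the version requirement and whether PyTorch can see a GPU. The snippet below is a minimal sketch, not part of the package: `check_environment` is a hypothetical helper, and it falls back to "cpu" when PyTorch is not installed yet.

```python
import sys


def check_environment(min_version=(3, 6, 0)):
    """Return (python_ok, device) for the current interpreter.

    Hypothetical helper for illustration; not part of signatureanalyzer.
    """
    python_ok = sys.version_info[:3] >= min_version
    try:
        import torch  # may not be installed before the package is set up

        device = "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        device = "cpu"
    return python_ok, device


python_ok, device = check_environment()
print("Python >= 3.6.0:", python_ok, "| device:", device)
```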

Installation

PIP

pip3 install signatureanalyzer

or

Git Clone
  • git clone --recursive https://github.com/broadinstitute/getzlab-SignatureAnalyzer.git
  • cd getzlab-SignatureAnalyzer
  • pip3 install -e .

Note: the --recursive flag is required to clone submodules.

Docker

Link: http://gcr.io/broad-cga-sanand-gtex/signatureanalyzer

  • docker pull gcr.io/broad-cga-sanand-gtex/signatureanalyzer:latest
  • docker run -it --rm gcr.io/broad-cga-sanand-gtex/signatureanalyzer

Source Publications

PCAWG Mutational Signatures

  • Alexandrov, L. B., Kim, J., Haradhvala, N. J., Huang, M. N., Ng, A. W. T., Wu, Y., ... & Islam, S. A. (2020). The repertoire of mutational signatures in human cancer. Nature, 578(7793), 94-101.
  • see: https://www.nature.com/articles/s41586-020-1943-3
  • see ./PCAWG/

SignatureAnalyzer-GPU source publication

SignatureAnalyzer-CPU source publications

  • Kim, J. et al. Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors. Nat. Genet. 48, 600–606 (2016). (https://www.nature.com/articles/ng.3557)

  • Kasar, S. et al. Whole-genome sequencing reveals activation-induced cytidine deaminase signatures during indolent chronic lymphocytic leukaemia evolution. Nat. Commun. 6, 8866 (2015). (https://www.nature.com/articles/ncomms9866)

Mathematical details

  • Tan, V. Y. F. & Févotte, C. Automatic Relevance Determination in Nonnegative Matrix Factorization with the β-Divergence. (2012). (https://arxiv.org/pdf/1111.6085.pdf)
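
For orientation, the core factorization X ≈ WH under the Poisson (generalized KL) objective can be sketched with the classic multiplicative updates. This is a simplified NumPy illustration under stated assumptions: it omits the ARD priors that prune unused components, it is not the package's PyTorch implementation, and `nmf_kl` and `kl_divergence` are names made up for this example.

```python
import numpy as np


def kl_divergence(X, WH, eps=1e-10):
    """Generalized KL divergence D(X || WH), i.e. the Poisson objective."""
    return float(np.sum(X * np.log((X + eps) / (WH + eps)) - X + WH))


def nmf_kl(X, k, n_iter=200, seed=0, eps=1e-10):
    """Plain NMF with multiplicative updates for the KL objective (no ARD)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps  # features x components
    H = rng.random((k, m)) + eps  # components x samples
    for _ in range(n_iter):
        WH = W @ H + eps
        # H <- H * (W^T (X / WH)) / (W^T 1)
        H *= (W.T @ (X / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        # W <- W * ((X / WH) H^T) / (1 H^T)
        W *= ((X / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


# Synthetic count matrix drawn from a known rank-3 model (toy data).
rng = np.random.default_rng(42)
W_true = rng.random((20, 3)) * 10
H_true = rng.random((3, 15))
X = rng.poisson(W_true @ H_true).astype(float)

W, H = nmf_kl(X, k=3)
print("KL divergence after fitting:", kl_divergence(X, W @ H))
```

In the ARD variant described in the reference above, K0 components are initialized and a prior on W and H shrinks the irrelevant ones toward zero, so the effective number of signatures is inferred rather than fixed in advance.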

Command Line Interface

usage: signatureanalyzer [-h] [-t {maf,spectra,matrix}] [-n NRUNS] [-o OUTDIR]
                         [--cosmic {cosmic2,cosmic3,cosmic3_exome,cosmic3_DBS,cosmic3_ID,cosmic3_TSB}]
                         [--hg_build HG_BUILD] [--cuda_int CUDA_INT]
                         [--verbose] [--K0 K0] [--max_iter MAX_ITER]
                         [--del_ DEL_] [--tolerance TOLERANCE] [--phi PHI]
                         [--a A] [--b B] [--objective {poisson,gaussian}]
                         [--prior_on_W {L1,L2}] [--prior_on_H {L1,L2}]
                         [--report_freq REPORT_FREQ]
                         [--active_thresh ACTIVE_THRESH] [--cut_norm CUT_NORM]
                         [--cut_diff CUT_DIFF]
                         input

Example:

signatureanalyzer input.maf -n 10 --cosmic cosmic2 --objective poisson
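
The same invocation can be assembled from Python, for example when scripting several runs. The sketch below uses only flags and choices that appear in the usage text above; `build_command` is a hypothetical helper, not part of the package, and it only builds the argument list without running anything.

```python
import shlex

# Choices copied from the usage text above.
COSMIC_CHOICES = {
    "cosmic2", "cosmic3", "cosmic3_exome",
    "cosmic3_DBS", "cosmic3_ID", "cosmic3_TSB",
}
OBJECTIVE_CHOICES = {"poisson", "gaussian"}


def build_command(input_path, nruns=None, cosmic=None, objective=None, outdir=None):
    """Assemble an argv list for the signatureanalyzer CLI (hypothetical helper)."""
    if cosmic is not None and cosmic not in COSMIC_CHOICES:
        raise ValueError("unknown --cosmic value: %r" % (cosmic,))
    if objective is not None and objective not in OBJECTIVE_CHOICES:
        raise ValueError("unknown --objective value: %r" % (objective,))
    cmd = ["signatureanalyzer", str(input_path)]
    if nruns is not None:
        cmd += ["-n", str(nruns)]
    if cosmic is not None:
        cmd += ["--cosmic", cosmic]
    if objective is not None:
        cmd += ["--objective", objective]
    if outdir is not None:
        cmd += ["-o", str(outdir)]
    return cmd


cmd = build_command("input.maf", nruns=10, cosmic="cosmic2", objective="poisson")
print(" ".join(shlex.quote(part) for part in cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```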

Python API

import signatureanalyzer as sa

# ---------------------
# RUN SIGNATURE ANALYZER
# ---------------------

# Run array of decompositions with mutational signature processing
sa.run_maf('input.maf', outdir='./ardnmf_output/', cosmic='cosmic2', hg_build='./ref/hg19.2bit', nruns=10)

# Run ARD-NMF algorithm standalone
sa.ardnmf(...)

# ---------------------
# LOADING RESULTS
# ---------------------
import pandas as pd

H = pd.read_hdf('nmf_output.h5', 'H')
W = pd.read_hdf('nmf_output.h5', 'W')
Hraw = pd.read_hdf('nmf_output.h5', 'Hraw')
Wraw = pd.read_hdf('nmf_output.h5', 'Wraw')
feature_signatures = pd.read_hdf('nmf_output.h5', 'signatures')
markers = pd.read_hdf('nmf_output.h5', 'markers')
cosine = pd.read_hdf('nmf_output.h5', 'cosine')
log = pd.read_hdf('nmf_output.h5', 'log')

# Output for each run may be found at...
Hrun1 = pd.read_hdf('nmf_output.h5', 'run1/H')
Wrun1 = pd.read_hdf('nmf_output.h5', 'run1/W')
# etc...

# Aggregate output information for each run
aggr = pd.read_hdf('nmf_output.h5', 'aggr')

# ---------------------
# PLOTTING
# ---------------------
sa.pl.marker_heatmap(...)
sa.pl.signature_barplot(...)
sa.pl.stacked_bar(...)
sa.pl.k_dist(...)
sa.pl.consensus_matrix(...)
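
The 'cosine' key above presumably holds similarities between extracted and reference signatures. As an illustration of that kind of comparison, the sketch below computes cosine similarity between the columns of two toy matrices. The data, the features x signatures layout, and the `cosine_similarity_table` helper are all assumptions made for this example; this is not the package's own code.

```python
import numpy as np
import pandas as pd


def cosine_similarity_table(W, ref):
    """Cosine similarity between every column of W and every column of ref.

    Both frames are assumed to share the same row index (features).
    """
    Wn = W / np.linalg.norm(W.values, axis=0)    # unit-length columns
    Rn = ref / np.linalg.norm(ref.values, axis=0)
    return Wn.T @ Rn                              # rows: W columns, cols: ref columns


# Toy data: 4 features, 2 extracted signatures, 3 reference signatures.
features = ["f1", "f2", "f3", "f4"]
W = pd.DataFrame({"S1": [1.0, 0, 0, 0], "S2": [0, 1.0, 1.0, 0]}, index=features)
ref = pd.DataFrame(
    {"RefA": [2.0, 0, 0, 0], "RefB": [0, 1.0, 1.0, 0], "RefC": [0, 0, 0, 1.0]},
    index=features,
)

cos = cosine_similarity_table(W, ref)
best_match = cos.idxmax(axis=1)  # closest reference for each extracted signature
print(cos.round(3))
print(best_match)
```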
