Bayesian NMF methods for mutational signature analysis & transcriptomic profiling on GPUs (Getz Lab).

SignatureAnalyzer

Automatic Relevance Determination (ARD) NMF of mutational signature and expression data. Designed for scalability: it is built on PyTorch and runs on GPUs when available.

  • See the docs for a more in-depth description of how to use the method.

Requires Python 3.6.0 or higher.

Installation

PIP

pip3 install signatureanalyzer

or

Git Clone
  • git clone --recursive https://github.com/broadinstitute/getzlab-SignatureAnalyzer.git
  • cd getzlab-SignatureAnalyzer
  • pip3 install -e .

Note: the --recursive flag is required to clone submodules.

Docker

Coming soon.


Source Publications

SignatureAnalyzer-GPU source publication

SignatureAnalyzer-CPU source publications

  • Kim, J. et al. Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors. Nat. Genet. 48, 600–606 (2016). (https://www.nature.com/articles/ng.3557)

  • Kasar, S. et al. Whole-genome sequencing reveals activation-induced cytidine deaminase signatures during indolent chronic lymphocytic leukaemia evolution. Nat. Commun. 6, 8866 (2015). (https://www.nature.com/articles/ncomms9866)

Mathematical details

  • Tan, V. Y. F. & Févotte, C. Automatic Relevance Determination in Nonnegative Matrix Factorization with the β-Divergence. (2012). (https://arxiv.org/pdf/1111.6085.pdf)
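
For orientation, the Poisson objective corresponds to minimizing the generalized KL divergence between the data matrix X and the product WH. The sketch below shows plain multiplicative-update NMF for that objective in pure Python. It deliberately omits the ARD prior that prunes unused components, and it is not the package's PyTorch implementation; the function names (`matmul`, `kl_divergence`, `kl_nmf`) are illustrative only.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def kl_divergence(X, WH, eps=1e-12):
    """Generalized KL divergence D(X || WH), the Poisson NMF objective."""
    total = 0.0
    for x_row, r_row in zip(X, WH):
        for x, r in zip(x_row, r_row):
            r = r + eps
            total += (x * math.log(x / r) if x > 0 else 0.0) - x + r
    return total


def kl_nmf(X, k, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative n x m matrix X into W (n x k) and H (k x m).

    Plain multiplicative updates for the KL objective; no ARD prior.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Strictly positive random initialization keeps every update well defined.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]

    for _ in range(n_iter):
        # Update H: H <- H * (W^T (X / WH)) / (column sums of W)
        WH = matmul(W, H)
        R = [[X[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        w_sums = [sum(W[i][a] for i in range(n)) for a in range(k)]
        H = [[H[a][j] * sum(W[i][a] * R[i][j] for i in range(n)) / (w_sums[a] + eps)
              for j in range(m)] for a in range(k)]

        # Update W: W <- W * ((X / WH) H^T) / (row sums of H)
        WH = matmul(W, H)
        R = [[X[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        h_sums = [sum(H[a]) for a in range(k)]
        W = [[W[i][a] * sum(R[i][j] * H[a][j] for j in range(m)) / (h_sums[a] + eps)
              for a in range(k)] for i in range(n)]
    return W, H


if __name__ == "__main__":
    # Toy count matrix (made-up values), factored with k = 2 components.
    X = [[5, 0, 3, 1], [10, 0, 6, 2], [0, 4, 1, 7], [0, 8, 2, 14], [5, 4, 4, 8]]
    W, H = kl_nmf(X, k=2)
    print("KL divergence after fitting:", kl_divergence(X, matmul(W, H)))
```

In the ARD variant described by the paper above, a prior on W and H additionally shrinks unneeded components toward zero, so the effective number of components is learned rather than fixed in advance.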

Command Line Interface

usage: signatureanalyzer [-h] [-t {maf,spectra,matrix}] [-n NRUNS] [-o OUTDIR]
                         [--cosmic {cosmic2,cosmic3,cosmic3_exome,cosmic3_DBS,cosmic3_ID,cosmic3_TSB}]
                         [--hg_build HG_BUILD] [--cuda_int CUDA_INT]
                         [--verbose] [--K0 K0] [--max_iter MAX_ITER]
                         [--del_ DEL_] [--tolerance TOLERANCE] [--phi PHI]
                         [--a A] [--b B] [--objective {poisson,gaussian}]
                         [--prior_on_W {L1,L2}] [--prior_on_H {L1,L2}]
                         [--report_freq REPORT_FREQ]
                         [--active_thresh ACTIVE_THRESH] [--cut_norm CUT_NORM]
                         [--cut_diff CUT_DIFF]
                         input

Example:

signatureanalyzer input.maf -n 10 --cosmic cosmic2 --objective poisson
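
When runs are launched from a script, the same command can be assembled programmatically. The helper below is a hypothetical convenience (it is not part of the package) that uses only flags and choices listed in the usage text above and rejects values the CLI would not accept.

```python
import shlex

# Choices copied from the usage text above.
COSMIC_CHOICES = ("cosmic2", "cosmic3", "cosmic3_exome",
                  "cosmic3_DBS", "cosmic3_ID", "cosmic3_TSB")
OBJECTIVE_CHOICES = ("poisson", "gaussian")


def build_command(input_path, nruns=10, cosmic="cosmic2",
                  objective="poisson", outdir=None):
    """Return the signatureanalyzer command as an argument list."""
    if cosmic not in COSMIC_CHOICES:
        raise ValueError("unknown --cosmic value: %r" % (cosmic,))
    if objective not in OBJECTIVE_CHOICES:
        raise ValueError("unknown --objective value: %r" % (objective,))
    cmd = ["signatureanalyzer", str(input_path),
           "-n", str(nruns),
           "--cosmic", cosmic,
           "--objective", objective]
    if outdir is not None:
        cmd += ["-o", str(outdir)]
    return cmd


if __name__ == "__main__":
    cmd = build_command("input.maf")
    # Print a shell-safe rendering of the command.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the package is installed.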

Python API

import signatureanalyzer as sa

# ---------------------
# RUN SIGNATURE ANALYZER
# ---------------------

# Run array of decompositions with mutational signature processing
sa.run_maf('input.maf', outdir='./ardnmf_output/', cosmic='cosmic2', hg_build='hg19', nruns=10)

# Run ARD-NMF algorithm standalone
sa.ardnmf(...)

# ---------------------
# LOADING RESULTS
# ---------------------
import pandas as pd

H = pd.read_hdf('nmf_output.h5', 'H')
W = pd.read_hdf('nmf_output.h5', 'W')
Hraw = pd.read_hdf('nmf_output.h5', 'Hraw')
Wraw = pd.read_hdf('nmf_output.h5', 'Wraw')
feature_signatures = pd.read_hdf('nmf_output.h5', 'signatures')
markers = pd.read_hdf('nmf_output.h5', 'markers')
cosine = pd.read_hdf('nmf_output.h5', 'cosine')
log = pd.read_hdf('nmf_output.h5', 'log')

# Output for each run may be found at...
Hrun1 = pd.read_hdf('nmf_output.h5', 'run1/H')
Wrun1 = pd.read_hdf('nmf_output.h5', 'run1/W')
# etc...

# Aggregate output information for each run
aggr = pd.read_hdf('nmf_output.h5', 'aggr')

# ---------------------
# PLOTTING
# ---------------------
sa.pl.marker_heatmap(...)
sa.pl.signature_barplot(...)
sa.pl.stacked_bar(...)
sa.pl.k_dist(...)
sa.pl.consensus_matrix(...)
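
The `cosine` table and the `cosmic` option suggest that extracted signatures are compared with COSMIC reference signatures by cosine similarity; that reading is an assumption, as the exact layout of the table is not described here. Under that assumption, the measure itself can be sketched in plain Python. The helper names and the toy vectors are made up for illustration and are not real signatures.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def best_match(signature, references):
    """Return (name, similarity) for the reference closest to `signature`."""
    scored = ((name, cosine_similarity(signature, ref))
              for name, ref in references.items())
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy 4-channel profiles; real signatures have many more channels.
    references = {"ref_A": [0.7, 0.1, 0.1, 0.1], "ref_B": [0.1, 0.1, 0.1, 0.7]}
    extracted = [0.6, 0.2, 0.1, 0.1]
    print(best_match(extracted, references))
```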
