Pandas API for Gene Set Enrichment Analysis in Python (GSEApy, cudaGSEA, GSEA)

  • This Python wrapper around various GSEA implementations aims to provide a unified programming interface, built on pandas DataFrames and a hierarchy of Pythonic classes.

  • The file exports that provide input for GSEA were written with performance in mind, using lower-level numpy functions where necessary, and are therefore much faster than the usual pandas-based exports.

  • This project aims to let scientists in the Python community easily compare different GSEA implementations and integrate them into projects that require a high-performance GSEA interface.

  • The project is a work in progress; a major refactor and more complete documentation are planned.
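To illustrate the kind of numpy-based export mentioned in the list above, here is a minimal sketch that writes a genes-by-samples matrix as tab-separated text with numpy.savetxt. It is not the export code of gsea_api: the function name export_tsv_with_numpy, the 'gene' index label, and the toy matrix are all hypothetical, and the output is a generic tab-separated table rather than any specific GSEA file format.

```python
import io

import numpy as np
import pandas as pd


def export_tsv_with_numpy(matrix: pd.DataFrame, handle, index_label: str = 'gene') -> None:
    """Write a genes-by-samples matrix as tab-separated text via numpy.savetxt."""
    # Put the row labels next to the values; mixing strings and floats
    # yields an object array, which savetxt can format with '%s'.
    table = np.column_stack([
        matrix.index.to_numpy(dtype=object),
        matrix.to_numpy(dtype=object),
    ])
    header = '\t'.join([index_label, *map(str, matrix.columns)])
    # comments='' stops savetxt from prefixing the header line with '# '.
    np.savetxt(handle, table, fmt='%s', delimiter='\t', header=header, comments='')


# Hypothetical toy matrix: three genes (rows), two samples (columns).
toy = pd.DataFrame(
    [[1.5, 2.0], [0.25, 4.0], [3.0, 0.5]],
    index=['TP53', 'BRCA1', 'EGFR'],
    columns=['sample_1', 'sample_2'],
)
buffer = io.StringIO()
export_tsv_with_numpy(toy, buffer)
exported_text = buffer.getvalue()
```

Whether this is faster than DataFrame.to_csv depends on the data and library versions, so it is worth timing both on a realistic matrix before relying on either.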

Example usage

from pandas import read_csv
from gsea_api.expression_set import ExpressionSet
from gsea_api.gsea import GSEADesktop
from gsea_api.molecular_signatures_db import GeneMatrixTransposed

reactome_pathways = GeneMatrixTransposed.from_gmt('ReactomePathways.gmt')

gsea = GSEADesktop()

design = ['Disease', 'Disease', 'Disease', 'Control', 'Control', 'Control']
matrix = read_csv('expression_data.csv')

result = gsea.run(
    # note: contrast() is not necessary in this simple case,
    # because the design contains only these two classes
    ExpressionSet(matrix, design).contrast('Disease', 'Control'),
    reactome_pathways,
    metric='Signal2Noise',
    permutations=1000
)
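To give a feel for what metric='Signal2Noise' ranks genes by, the sketch below computes the basic signal-to-noise score, (mean of one class minus mean of the other) divided by the sum of the two standard deviations, directly with pandas. It is an illustration only and not part of gsea_api: the function signal_to_noise and the toy data are hypothetical, the matrix is assumed to have genes in rows and samples in columns in the same order as the design list, and the sample standard deviation (ddof=1) is an assumption. GSEA's documentation describes additional adjustments to the standard deviations that this sketch omits.

```python
import numpy as np
import pandas as pd


def signal_to_noise(matrix: pd.DataFrame, design, case: str, control: str) -> pd.Series:
    """Rank genes by (mean_case - mean_control) / (std_case + std_control)."""
    labels = np.asarray(design)
    if len(labels) != matrix.shape[1]:
        raise ValueError('design needs one label per sample column')
    # Split the sample columns by class label.
    case_values = matrix.loc[:, labels == case]
    control_values = matrix.loc[:, labels == control]
    difference = case_values.mean(axis=1) - control_values.mean(axis=1)
    # Assumption: sample standard deviation (ddof=1), without any floor on sigma.
    spread = case_values.std(axis=1, ddof=1) + control_values.std(axis=1, ddof=1)
    return (difference / spread).sort_values(ascending=False)


# Hypothetical toy data: three genes, three Disease and three Control samples.
toy = pd.DataFrame(
    [[4.0, 5.0, 6.0, 1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0, 4.0, 4.0, 4.0],
     [1.0, 2.0, 3.0, 5.0, 7.0, 9.0]],
    index=['A', 'B', 'C'],
    columns=['s1', 's2', 's3', 's4', 's5', 's6'],
)
design = ['Disease', 'Disease', 'Disease', 'Control', 'Control', 'Control']
ranking = signal_to_noise(toy, design, 'Disease', 'Control')
```

Genes at the top of the ranking are higher in the first class, genes at the bottom are higher in the second; a gene whose two classes are both constant would get a zero denominator, which this sketch does not handle.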

Installation

To install the API use:

pip3 install gsea_api

Installing GSEA from Broad Institute

Log in or register on the official GSEA website and download the gsea_3.0.jar file (or a newer version).

Please place the downloaded file in the thirdparty directory.

Installing GSEApy

To use GSEApy, please install it with:

pip3 install gseapy

and link its binary into the thirdparty directory:

ln -s virtual_environment_path/bin/gseapy thirdparty/gseapy
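If the path of the virtual environment is not at hand, the same link can be created from Python by looking the executable up on the PATH. This is a minimal sketch using only the standard library; the helper link_executable and its search_path parameter are hypothetical and not part of gsea_api, and it assumes the environment where gseapy was installed is active.

```python
import shutil
from pathlib import Path
from typing import Optional


def link_executable(name: str, thirdparty: Path,
                    search_path: Optional[str] = None) -> Optional[Path]:
    """Symlink an executable found on PATH into `thirdparty`.

    Returns the path of the created link, or None if the executable
    was not found. `search_path` overrides PATH (useful for testing).
    """
    found = shutil.which(name, path=search_path)
    if found is None:
        return None
    thirdparty.mkdir(parents=True, exist_ok=True)
    link = thirdparty / name
    # Replace a stale link or file left over from an earlier setup.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(Path(found).resolve())
    return link


# Example (run from the project root with the environment active):
# link_executable('gseapy', Path('thirdparty'))
```

A return value of None means gseapy was not found, which usually indicates that the environment is not active or the package is not installed.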

Installing cudaGSEA

Please clone this fork of cudaGSEA into the thirdparty directory and compile the binary:

git clone https://github.com/krassowski/cudaGSEA

or use the original version, which does not implement FDR calculations.

Citation

Please cite the authors of the wrapped tools that you use.

References

The initial version of this code was written for my Master's thesis project at Imperial College London.

