Generating dense embeddings for proteins using kernel PCA
Installation
Install directly from the source with:
$ pip install git+https://jira.iais.fraunhofer.de/stash/scm/meml/protein_vectors.git
Install in development mode with:
$ git clone https://jira.iais.fraunhofer.de/stash/scm/meml/protein_vectors.git
$ cd protein_vectors
$ pip install -e .
The -e flag installs the package in editable mode: it links the code in the git repository to the Python site-packages, so your changes take effect immediately.
How to Use
Installing ratvec automatically installs a command line interface. Check it out with:
$ ratvec --help
RatVec has four main commands: generate, train, evaluate and optimize:
Generate. Downloads and prepares the SwissProt data set that is showcased in the RatVec paper.
Train. Compute KPCA embeddings on a given data set. Please run the following command to see the arguments:
$ ratvec train --help
Evaluate. Evaluate KPCA embeddings. Please run the following command to see the arguments:
$ ratvec evaluate --help
Optimize. Optimize KPCA embeddings. Please run the following command to see the arguments:
$ ratvec optimize --help
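To give an intuition for what the train command computes, here is a minimal sketch of kernel PCA on string data. It is not the RatVec implementation: the cosine n-gram kernel and every function name (ngram_kernel, center, top_eigenpairs, kpca_embed) are assumptions made for illustration. To stay dependency-free it uses power iteration with deflation in place of a library eigensolver.

```python
import math
from collections import Counter


def ngram_counts(seq, n=2):
    """Count the overlapping n-grams of a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_kernel(a, b, n=2):
    """Cosine similarity of n-gram count vectors (an assumed, illustrative kernel)."""
    ca, cb = ngram_counts(a, n), ngram_counts(b, n)
    dot = sum(count * cb[gram] for gram, count in ca.items())
    norm = math.sqrt(sum(c * c for c in ca.values()) * sum(c * c for c in cb.values()))
    return dot / norm if norm else 0.0


def center(K):
    """Double-center a symmetric kernel matrix (zero mean in feature space)."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    total_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
            for i in range(n)]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def top_eigenpairs(M, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix via power iteration with deflation."""
    n = len(M)
    M = [row[:] for row in M]  # work on a copy so the caller's matrix is untouched
    pairs = []
    for c in range(k):
        start = [1.0 / (i + 1 + c) for i in range(n)]  # deterministic, generic start
        start_norm = math.sqrt(sum(x * x for x in start))
        v = [x / start_norm for x in start]
        for _ in range(iters):
            w = matvec(M, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to extract
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(M, v)))  # Rayleigh quotient
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                M[i][j] -= lam * v[i] * v[j]
    return pairs


def kpca_embed(seqs, dim=2, n=2):
    """Embed sequences as the first `dim` kernel principal components."""
    K = [[ngram_kernel(a, b, n) for b in seqs] for a in seqs]
    pairs = top_eigenpairs(center(K), dim)
    # The projection of training point i on a component is sqrt(eigenvalue) * eigenvector[i].
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(seqs))]


# Toy sequences, made up for this example (not taken from SwissProt).
sequences = ["MKTAYIAK", "MKTAYIAR", "GGSGGSGG", "GGSGGSGA"]
for seq, vec in zip(sequences, kpca_embed(sequences, dim=2)):
    print(seq, [round(x, 3) for x in vec])
```

Sequences that share many n-grams end up close together in the embedding space. For real data sets a numerical library's eigensolver would be a better choice than pure-Python power iteration.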
Showcase Dataset
The data set for the application presented in the paper (the SwissProt dataset [1] used by Boutet et al. [2]) can be downloaded directly from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/JMFHTN or by running the following command:
$ ratvec generate
References