Generating dense embeddings for proteins using kernel PCA
This tool generates low-dimensional, continuous, distributed vector representations for non-numeric entities such as text or biological sequences (e.g. DNA or proteins) via kernel PCA with rational kernels.
The current implementation accepts any input dataset that can be read as a list of strings.
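To make the idea concrete, here is a minimal sketch of kernel PCA on a list of strings. It is not RatVec's implementation or API: the function names (`ngram_counts`, `ngram_kernel`, `kernel_pca`), the toy sequences, and the choice of a bigram count kernel (one simple example of a rational kernel) are all assumptions made for illustration.

```python
import numpy as np


def ngram_counts(sequence, n=2):
    """Count the overlapping n-grams of a string."""
    counts = {}
    for i in range(len(sequence) - n + 1):
        gram = sequence[i:i + n]
        counts[gram] = counts.get(gram, 0) + 1
    return counts


def ngram_kernel(a, b, n=2):
    """Inner product of the n-gram count vectors of two strings."""
    counts_a, counts_b = ngram_counts(a, n), ngram_counts(b, n)
    return float(sum(v * counts_b.get(g, 0) for g, v in counts_a.items()))


def kernel_pca(sequences, n_components=2, n=2):
    """Embed strings with kernel PCA on a cosine-normalized n-gram kernel."""
    m = len(sequences)
    K = np.array([[ngram_kernel(a, b, n) for b in sequences] for a in sequences])

    # Cosine-normalize so that each sequence has self-similarity 1.
    # Sequences shorter than n have a zero diagonal; avoid dividing by zero.
    d = np.sqrt(np.diag(K))
    d[d == 0] = 1.0
    K = K / np.outer(d, d)

    # Center the kernel matrix in feature space: Kc = H K H, H = I - 1/m.
    H = np.eye(m) - np.full((m, m), 1.0 / m)
    Kc = H @ K @ H

    # Eigendecomposition of the symmetric centered kernel matrix,
    # sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Scale eigenvectors by sqrt(eigenvalue) to obtain the embeddings;
    # tiny negative eigenvalues from round-off are clipped to zero.
    return eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))


# Hypothetical toy input: any list of strings works.
sequences = ["MKTAYIAKQR", "MKTAYIAKQL", "GAVLIPFMWG", "MKVLAAGIVG"]
embeddings = kernel_pca(sequences, n_components=2)
print(embeddings.shape)
```

Each row of `embeddings` is the vector for the corresponding input string. The sketch builds the full pairwise kernel matrix, so it only scales to small inputs.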
Installation
Install directly from the source with:
$ pip install git+https://github.com/ratvec/ratvec.git
Install in development mode with:
$ git clone https://github.com/ratvec/ratvec.git
$ cd ratvec
$ pip install -e .
The -e flag installs the package in editable mode: it links the code in the git repository into the Python site-packages, so your changes take effect immediately.
How to Use
Installing ratvec also installs a command line interface. Check it out with:
$ ratvec --help
RatVec has three main commands: generate, train, and evaluate.
Generate. Downloads and prepares the SwissProt dataset that is showcased in the RatVec paper.
$ ratvec generate
Train. Computes KPCA embeddings on a given dataset. Run the following command to see the arguments:
$ ratvec train --help
Evaluate. Evaluates and optimizes KPCA embeddings. Run the following command to see the arguments:
$ ratvec evaluate --help
Showcase Dataset
The dataset showcased in the paper (the SwissProt dataset [1] used by Boutet et al. [2]) can be downloaded directly from here or by running the following command:
$ ratvec generate
References