MS²ReScore: Sensitive PSM rescoring with predicted MS² peak intensities and retention times.
About MS²ReScore
MS²ReScore performs sensitive peptide identification rescoring with predicted spectra using MS²PIP, DeepLC, and Percolator. This results in more confident peptide identifications, which allows you to get more peptide IDs at the same false discovery rate (FDR) threshold, or to set a more stringent FDR threshold while still retaining a similar number of peptide IDs. MS²ReScore is ideal for challenging proteomics identification workflows, such as proteogenomics, metaproteomics, or immunopeptidomics.
MS²ReScore uses identifications from a Percolator IN (PIN) file, or from the output of one of these search engines:
- MaxQuant: Start from the `msms.txt` identification file and a directory with `.mgf` files. (Be sure to export without FDR filtering!)
- MSGFPlus: Start with an `.mzid` identification file and the corresponding `.mgf` file.
- X!Tandem: Start with an X!Tandem `.xml` identification file and the corresponding `.mgf` file.
- PeptideShaker: Start with a PeptideShaker Extended PSM Report and the corresponding `.mgf` file.
If you use MS²ReScore for your research, please cite the following article:
Accurate peptide fragmentation predictions allow data driven approaches to replace and improve upon proteomics search engine scoring functions. Ana S C Silva, Robbin Bouwmeester, Lennart Martens, and Sven Degroeve. Bioinformatics (2019) doi:10.1093/bioinformatics/btz383
To replicate the experiments described in this article, check out the pub branch of this repository.
Installation
MS²ReScore requires:
- Python 3.6 or higher on Linux, macOS, or Windows Subsystem for Linux
- If the option `run_percolator` is set to `True`, Percolator needs to be callable with the `percolator` command (tested with version 3.02.1)
- Some pipelines require the Percolator converters, such as `tandem2pin`, as well. These are usually installed alongside Percolator.
Minimal installation:
pip install ms2rescore
Recommended installation, including DeepLC for retention time prediction:
pip install ms2rescore[deeplc]
We recommend using a venv or conda virtual environment.
Usage
Command line interface
Run MS²ReScore from the command line as follows:
ms2rescore -c <path-to-config-file> -m <path-to-mgf> <path-to-identification-file>
Run `ms2rescore --help` to see all command line options.
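For scripted workflows, the same call can be assembled from Python. The following is a minimal sketch, assuming `ms2rescore` is installed and available on the `PATH`. The helper names `build_ms2rescore_command` and `run_ms2rescore` and the example file names are hypothetical, and only the `-c` and `-m` options shown above are used. Treating both options as optional is also an assumption of this sketch.

```python
import subprocess
from typing import List, Optional


def build_ms2rescore_command(
    identification_file: str,
    mgf_path: Optional[str] = None,
    config_file: Optional[str] = None,
) -> List[str]:
    """Assemble the ms2rescore command line shown above (hypothetical helper)."""
    command = ["ms2rescore"]
    if config_file is not None:
        command += ["-c", config_file]
    if mgf_path is not None:
        command += ["-m", mgf_path]
    # The identification file is the positional argument and goes last.
    command.append(identification_file)
    return command


def run_ms2rescore(command: List[str]) -> None:
    """Run the command; check=True raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Example file names are placeholders, not files shipped with MS²ReScore.
    cmd = build_ms2rescore_command(
        "msms.txt", mgf_path="mgf_dir", config_file="config.json"
    )
    print(" ".join(cmd))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.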
Configuration file
MS²ReScore can be further configured through a JSON configuration file. A correct configuration is required, for example, to parse the peptide modifications from the search engine output. If no configuration file is passed, or if some options are not configured, the default values for those settings are used. Options passed on the command line override the configuration file. The full configuration is validated against a JSON Schema.
A full example configuration file can be found in ms2rescore/package_data/config_default.json.
The config file contains three top-level categories (`general`, `ms2pip`, and `percolator`) and additional categories for specific search engines (e.g. `maxquant`). The most important options in `general` are:
- `pipeline` (string): Pipeline to use, depending on input format. Must be one of: `['infer', 'pin', 'tandem', 'maxquant', 'msgfplus', 'peptideshaker']`. Default: `infer`.
- `feature_sets` (array): Feature sets for which to generate PIN files and optionally run Percolator. Default: `['all']`.
  - Items (string): Must be one of: `['all', 'ms2pip_rt', 'searchengine', 'rt', 'ms2pip']`.
An overview of all options can be found in configuration.md.
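
As an illustration of the options above, the following sketch builds a small configuration from Python and checks the two `general` options against the allowed values listed above. It is a minimal example and not the package's own validation, which uses the JSON Schema. The helper name `make_general_config` is hypothetical.

```python
import json

# Allowed values as listed above.
PIPELINES = ["infer", "pin", "tandem", "maxquant", "msgfplus", "peptideshaker"]
FEATURE_SETS = ["all", "ms2pip_rt", "searchengine", "rt", "ms2pip"]


def make_general_config(pipeline="infer", feature_sets=("all",)):
    """Return a config dict holding only the `general` options described above."""
    if pipeline not in PIPELINES:
        raise ValueError(f"Unknown pipeline: {pipeline!r}")
    unknown = [fs for fs in feature_sets if fs not in FEATURE_SETS]
    if unknown:
        raise ValueError(f"Unknown feature sets: {unknown}")
    return {"general": {"pipeline": pipeline, "feature_sets": list(feature_sets)}}


if __name__ == "__main__":
    config = make_general_config(
        pipeline="maxquant", feature_sets=["searchengine", "all"]
    )
    # Print the JSON; redirect it to a file to pass it with the -c option.
    print(json.dumps(config, indent=2))
```

Options left out of the file fall back to their default values, so a configuration this small is enough to select a pipeline and the feature sets.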
Notes for specific search engines
- MSGFPlus: Run MSGFPlus in a concatenated target-decoy search, with the `-addFeatures 1` flag.
- MaxQuant:
  - Run MaxQuant without FDR filtering (set to 1)
  - MaxQuant requires additional options in the configuration file:
    - `modification_mapping`: Maps MaxQuant output to MS²PIP modifications list. Keys must contain MaxQuant's two-letter modification codes and values must match one of the modifications listed in the MS²PIP configuration (see MS2PIP config).
    - `fixed_modifications`: Must list all modifications set as fixed during the MaxQuant search (as this is not denoted in the msms.txt file). Keys refer to the amino acid, values to the modification name used in the MS²PIP configuration.
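
To make the two MaxQuant options concrete, here is a hedged sketch of such a configuration fragment, built in Python. The two-letter codes (`ox`, `ac`) and modification names (`Oxidation`, `Acetyl`, `Carbamidomethyl`) are illustrative assumptions and must be replaced with the ones from your own MaxQuant search and MS²PIP modification list. Placing the options under a `maxquant` category follows the example category name mentioned earlier and should be checked against configuration.md. `check_maxquant_options` is a hypothetical helper.

```python
import json

# Assumed names of modifications defined in the MS²PIP configuration (illustrative).
ms2pip_modification_names = {"Oxidation", "Acetyl", "Carbamidomethyl"}

maxquant_options = {
    # MaxQuant two-letter code -> MS²PIP modification name
    "modification_mapping": {"ox": "Oxidation", "ac": "Acetyl"},
    # Amino acid -> MS²PIP modification name, for modifications set as fixed
    "fixed_modifications": {"C": "Carbamidomethyl"},
}


def check_maxquant_options(options, known_names):
    """Return modification names that are not defined in the MS²PIP configuration."""
    used = set(options["modification_mapping"].values())
    used |= set(options["fixed_modifications"].values())
    return sorted(used - set(known_names))


if __name__ == "__main__":
    missing = check_maxquant_options(maxquant_options, ms2pip_modification_names)
    if missing:
        raise SystemExit(f"Not defined in the MS²PIP configuration: {missing}")
    print(json.dumps({"maxquant": maxquant_options}, indent=2))
```

Checking the names before a run catches the most common mismatch early: a modification name in the mapping that the MS²PIP configuration does not define.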
Output
Several intermediate files are created when the entire pipeline is run. These can be accessed by specifying the `tmp_dir` option. Depending on whether or not Percolator is run, the following output files can be expected:
For each feature set (e.g. `all`, `ms2pip`, `searchengine`...):
- `<file>.pin`: Percolator IN file
- `<file>.pout`: Percolator OUT file with target PSMs
- `<file>.pout_dec`: Percolator OUT file with decoy PSMs
- `<file>.weights`: Internal feature weights used by Percolator's scoring function.
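
As a quick check on the results, the target `.pout` file can be summarized from Python. The sketch below counts target PSMs at or below a q-value threshold, which makes it easy to compare feature sets at the same FDR. It assumes Percolator's tab-delimited output with a header row that contains a `q-value` column; verify this against your Percolator version. The helper name `count_psms_at_fdr` is hypothetical, and the example rows are made up.

```python
import csv
import io


def count_psms_at_fdr(handle, threshold=0.01):
    """Count rows with a q-value at or below `threshold` in a tab-delimited .pout file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    q_index = header.index("q-value")  # assumed column name
    # Rows are read by position because the protein column may span
    # several tab-separated fields, giving rows of unequal length.
    return sum(1 for row in reader if row and float(row[q_index]) <= threshold)


if __name__ == "__main__":
    # Tiny made-up example in the assumed format, not real results.
    example = io.StringIO(
        "PSMId\tscore\tq-value\tposterior_error_prob\tpeptide\tproteinIds\n"
        "psm_1\t2.1\t0.001\t0.0004\tK.PEPTIDER.A\tprotA\n"
        "psm_2\t0.3\t0.04\t0.2\tR.SAMPLEK.T\tprotB\tprotC\n"
    )
    print(count_psms_at_fdr(example))
```

For a real file, open it with `open(path, newline="")` and pass the handle to the function.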