Document Type
Article
Publication Date
Fall 10-21-2022
Abstract
Metaproteomics based on high-throughput tandem mass spectrometry (MS/MS) plays a crucial role in characterizing microbiome functions. The acquired MS/MS data is searched against a protein sequence database to identify peptides, which are then used to infer a list of proteins present in a metaproteome sample. While the problem of protein inference has been well studied for proteomics of single organisms, it remains a major challenge for metaproteomics of complex microbial communities because of the large number of degenerate peptides shared among homologous proteins in different organisms. This challenge calls for improved discrimination of true protein identifications from false ones, given a set of unique and degenerate peptides identified in metaproteomics. MetaLP was developed here for protein inference in metaproteomics using an integrative linear programming method. Taxonomic abundance information extracted from metagenomic shotgun sequencing or 16S rRNA gene amplicon sequencing was incorporated as prior information in MetaLP. Benchmarking with mock, human gut, soil, and marine microbial communities demonstrated significantly higher numbers of protein identifications by MetaLP than by ProteinLP, PeptideProphet, DeepPep, PIPQ, and Sipros Ensemble. In conclusion, MetaLP could substantially improve protein inference for complex metaproteomes by incorporating taxonomic abundance information in a linear programming model.
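To make the distinction between unique and degenerate peptides concrete, the following minimal Python sketch splits identified peptides by how many candidate proteins contain them. It is an illustration only and is not the MetaLP implementation: the function name `classify_peptides`, the protein labels, and the peptide sequences are all hypothetical, and no linear programming step is shown.

```python
from collections import defaultdict


def classify_peptides(protein_peptides):
    """Split peptides into unique (one parent protein) and degenerate (several).

    protein_peptides maps a protein name to the set of identified peptides
    that occur in its sequence (hypothetical input format).
    """
    # Invert the mapping: peptide -> set of proteins that contain it.
    parents = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            parents[peptide].add(protein)

    # A unique peptide points to exactly one protein.
    unique = {p: next(iter(ps)) for p, ps in parents.items() if len(ps) == 1}
    # A degenerate peptide is shared by two or more proteins.
    degenerate = {p: sorted(ps) for p, ps in parents.items() if len(ps) > 1}
    return unique, degenerate


# Hypothetical toy data: homologous proteins from three organisms share "LSEK".
toy = {
    "protA_speciesX": {"LSEK", "VGAHAGEYGAEALER"},
    "protA_speciesY": {"LSEK", "TYFPHFDLSHGSAQVK"},
    "protB_speciesZ": {"LSEK"},
}

unique, degenerate = classify_peptides(toy)
print("unique:", unique)
print("degenerate:", degenerate)
```

In this toy case, `protB_speciesZ` is supported only by a degenerate peptide, so peptide evidence alone cannot confirm or rule out its presence. This is the kind of ambiguity for which the abstract proposes using taxonomic abundance as prior information.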
Persistent Identifier
http://hdl.handle.net/10950/4387
Publisher
PLOS One
Permanent Email Address
hji@uttyler.edu
Recommended Citation
Feng, Shichao; Ji, Hong-Long; Wang, Huan; Zhang, Bailu; Sterzenbach, Ryan; Pan, Chongle; and Guo, Xuan, "MetaLP: An integrative linear programming method for protein inference in metaproteomics" (2022). Cellular and Molecular Biology Faculty Publications and Presentations. Paper 8.
Description
© 2022 Feng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.