Speaker
Description
Bottom-up mass spectrometry-based proteomics studies changes in protein abundance and structure across various biological conditions. Since the currency of these experiments is peptides, i.e., subsets of protein sequences that carry the quantitative information, conclusions at a different level, e.g., at the level of proteins or of post-translational modifications, must be computationally inferred. The inference is particularly challenging when peptides are shared by multiple proteins.
From a statistical perspective, including shared peptides in the estimates of protein abundances induces a data structure in which observations (peptide intensities) may belong to multiple groups defined by proteins. Typically, shared peptides are removed from the analysis of MS data, which leads to a loss of information. Alternatively, proteins that share peptides are grouped together, which eliminates the possibility of estimating their distinct quantitative patterns.
In this talk, we present a statistical approach for estimating protein abundances based on quantitative information that includes shared peptides. The approach extends the existing MSstatsTMT framework for summarization and differential analysis of labeled MS data. It treats the quantitative pattern of each shared peptide as a convex combination of the abundances of the individual proteins, and estimates the abundance of each source in a sample together with the weights of the combination. We demonstrate the utility of this new summarization method using computer simulations and examples based on data from experiments with diverse biological objectives, including protein degradation, thermal proteome profiling, and modeling post-translational modifications.
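To make the convex-combination idea concrete, the following is a minimal sketch and not the MSstatsTMT implementation. It assumes two proteins whose abundance profiles across samples are already known (for example, estimated from their unique peptides) and recovers only the mixing weight of one shared peptide by least squares. The approach described above estimates abundances and weights jointly, which this simplification does not attempt. The function name `estimate_weight`, the variable names, and the numeric values are all hypothetical and chosen for illustration.

```python
import numpy as np


def estimate_weight(shared, prot_a, prot_b):
    """Least-squares weight w in [0, 1] with shared ~ w*prot_a + (1 - w)*prot_b.

    Illustrative only: protein profiles are treated as fixed and known.
    """
    diff = prot_a - prot_b
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical protein profiles: the weight is not identifiable,
        # so return an arbitrary midpoint.
        return 0.5
    # Unconstrained minimizer of ||shared - prot_b - w * diff||^2.
    w = float((shared - prot_b) @ diff) / denom
    # The objective is a one-dimensional quadratic in w, so clipping the
    # unconstrained minimizer to [0, 1] gives the constrained minimizer.
    return min(1.0, max(0.0, w))


# Hypothetical abundance profiles of two proteins across four samples.
prot_a = np.array([10.0, 12.0, 15.0, 11.0])
prot_b = np.array([20.0, 18.0, 14.0, 19.0])

# Simulated noise-free shared peptide: 30% from protein A, 70% from protein B.
true_w = 0.3
shared = true_w * prot_a + (1.0 - true_w) * prot_b

w_hat = estimate_weight(shared, prot_a, prot_b)
print(f"estimated weight for protein A: {w_hat:.3f}")
```

With more than two source proteins, the same idea becomes a least-squares problem over weights that are non-negative and sum to one, which generally needs a constrained solver rather than a closed form.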