Speaker
Description
Aging is the dominant risk factor for neurodegenerative and systemic diseases, yet its molecular signatures remain obscured within high-dimensional, noisy, and strongly correlated proteomes. To address this challenge, we introduce the Protein Risk Score (ProtRS) framework, a systematic evaluation of how different multivariate approaches extract age-associated signals from cerebrospinal fluid (CSF) proteomics.
Using the Emory CSF cohort (n = 504; 2,067 proteins), we applied a stringent normalization pipeline that minimizes technical artifacts while preserving biological structure. Four variable-selection strategies were evaluated for predicting chronological age: univariate regression, LASSO, Elastic Net, and Bayesian regression with a regularized horseshoe (RHS) prior. Elastic Net achieved the highest predictive accuracy (Pearson correlation r = 0.73), effectively leveraging correlated protein clusters. RHS performed comparably (r = 0.72) while offering superior parsimony and principled uncertainty quantification. In contrast, LASSO (r = 0.68) and univariate screening (r < 0.60) underperformed, largely due to their inability to model joint proteomic structure.
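The comparison described above can be illustrated with a minimal sketch. It runs on synthetic data, not the Emory cohort, and everything in it is an assumption made for illustration: the block-correlated design, the top-k univariate screen, the scikit-learn estimators, and all hyperparameters. The RHS model is left out because it would need a probabilistic programming library.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV, LassoCV, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for a proteomic matrix (assumption): 30 blocks of 10
# features, where features in a block share one latent factor.
n, n_blocks, block_size = 240, 30, 10
p = n_blocks * block_size
latent = rng.normal(size=(n, n_blocks))
X = np.repeat(latent, block_size, axis=1) + 0.6 * rng.normal(size=(n, p))

# Hypothetical "age": driven by the first three blocks plus noise.
beta = np.zeros(p)
beta[: 3 * block_size] = 1.0
age = 60.0 + X @ beta + rng.normal(scale=3.0, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, age, test_size=0.25, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)


def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


def univariate_top_k(X_tr, y_tr, X_te, k=20):
    """Screen features one at a time, then fit OLS on the k strongest."""
    corr = np.array([pearson_r(X_tr[:, j], y_tr) for j in range(X_tr.shape[1])])
    keep = np.argsort(-np.abs(corr))[:k]
    model = LinearRegression().fit(X_tr[:, keep], y_tr)
    return model.predict(X_te[:, keep])


predictions = {
    "univariate": univariate_top_k(X_tr, y_tr, X_te),
    "lasso": LassoCV(cv=5, max_iter=5000, random_state=0)
    .fit(X_tr, y_tr)
    .predict(X_te),
    "elastic_net": ElasticNetCV(l1_ratio=0.5, cv=5, max_iter=5000, random_state=0)
    .fit(X_tr, y_tr)
    .predict(X_te),
}

# Held-out Pearson correlation between predicted and simulated age.
results = {name: pearson_r(y_te, pred) for name, pred in predictions.items()}
for name, r in results.items():
    print(f"{name}: r = {r:.2f}")
```

The numbers printed here depend entirely on the synthetic design and say nothing about the reported CSF results; the sketch only shows how the methods can be put on a common held-out footing.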
Biological fidelity was evaluated using overlap with established aging-associated markers from the UK Biobank plasma aging clock. Elastic Net and RHS recovered the largest subset of known aging-related proteins, demonstrating strong pathway relevance and improved interpretability.
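One simple way to quantify such overlap is sketched below. The protein identifiers are placeholders, not actual markers from either dataset, and the two summary statistics (fraction of reference markers recovered and Jaccard index) are an assumed choice, since the abstract does not state which overlap measure was used.

```python
def marker_overlap(selected, reference):
    """Summarize overlap between selected proteins and a reference marker set."""
    selected, reference = set(selected), set(reference)
    shared = selected & reference
    union = selected | reference
    return {
        "shared": sorted(shared),
        # Fraction of the reference markers that the model recovered.
        "recovered_fraction": len(shared) / len(reference) if reference else 0.0,
        # Jaccard index: shared markers relative to all markers in either set.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers (hypothetical, for illustration only).
model_selected = ["PROT_A", "PROT_B", "PROT_C", "PROT_D"]
reference_markers = ["PROT_B", "PROT_C", "PROT_E"]

summary = marker_overlap(model_selected, reference_markers)
print(summary["shared"], summary["recovered_fraction"], summary["jaccard"])
```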
To generalize these findings beyond a single dataset, we developed a realistic proteomic simulator capable of replicating empirical correlation structures or generating synthetic ones with controlled sparsity, dimensionality, and multicollinearity. A comprehensive evaluation across 561 high-dimensional scenarios revealed consistent trends: sample size is the primary determinant of predictive performance, while extreme p ≫ n regimes and strong correlations challenge all methods. Across settings, Elastic Net and RHS demonstrated notable resilience, maintaining strong predictive accuracy, stable support recovery, and robustness under multicollinearity. RHS achieved the best false-discovery control, whereas Elastic Net provided substantial computational efficiency. Both consistently outperformed LASSO and univariate approaches in support recovery and predictive stability.
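A simulator of this kind might look roughly like the sketch below. It is not the simulator used in the study. The function names, the AR(1)-style synthetic correlation, the unit-magnitude effects, and the support-recovery metrics are all assumptions chosen to illustrate controlled sparsity, dimensionality, and multicollinearity.

```python
import numpy as np


def simulate_proteome(n, p, n_signal, rho=0.5, noise_sd=1.0, corr=None, seed=0):
    """Draw correlated features and a sparse linear outcome.

    If `corr` is given (e.g. an empirical correlation matrix), it is used
    directly; otherwise an AR(1)-style matrix with entries rho^|i-j| is built.
    """
    rng = np.random.default_rng(seed)
    if corr is None:
        idx = np.arange(p)
        corr = rho ** np.abs(idx[:, None] - idx[None, :])
    # Small jitter keeps the Cholesky factorization numerically stable;
    # a rank-deficient empirical matrix may need stronger shrinkage first.
    chol = np.linalg.cholesky(corr + 1e-8 * np.eye(p))
    X = rng.normal(size=(n, p)) @ chol.T

    # Sparse truth: n_signal features with effects of +1 or -1.
    beta = np.zeros(p)
    support = rng.choice(p, size=n_signal, replace=False)
    beta[support] = rng.choice([-1.0, 1.0], size=n_signal)

    y = X @ beta + rng.normal(scale=noise_sd, size=n)
    return X, y, beta


def support_metrics(beta_true, selected):
    """Return (false discovery proportion, true positive rate) for a selection."""
    true_support = set(np.flatnonzero(beta_true).tolist())
    selected = {int(j) for j in selected}
    true_pos = len(true_support & selected)
    fdp = (len(selected) - true_pos) / max(len(selected), 1)
    tpr = true_pos / max(len(true_support), 1)
    return fdp, tpr


# One p >> n scenario with strong neighbor correlation.
X, y, beta = simulate_proteome(n=100, p=500, n_signal=10, rho=0.7)
print(X.shape, int(np.count_nonzero(beta)))
```

Looping this function over a grid of `n`, `p`, `n_signal`, and `rho` values, and scoring each method's selected features with `support_metrics`, gives one way to build a scenario sweep like the one summarized above.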
Together, these empirical and simulation-based results form the building blocks of ProtRS and underscore a central principle: accurate proteomic aging models require multivariate regularization that embraces, rather than suppresses, the inherent correlation structure of the proteome. ProtRS offers a scalable, interpretable foundation for constructing precision aging clocks, identifying mechanistic pathways, and integrating proteomics into multimodal aging research. By bridging statistical rigor with biological insight, ProtRS advances the development of next-generation diagnostics and interventions for age-related disease.