Speaker
Description
We consider the following prediction problem using observational data obtained from routine health-care visits. Biomarkers such as blood pressure and cholesterol are repeatedly measured over time, resulting in sparse and irregular longitudinal data for thousands of individuals. In addition, we observe corresponding survival outcomes, such as the time to cardiovascular disease or death, which are correlated with the underlying biomarker trajectories. Unlike traditional survival settings, individuals do not share a common starting time; instead, they enter the study at varying ages and without an intervention. Furthermore, individual observation windows are relatively short (e.g., five years), either because long-term data are not available or because older data might not reflect current patient characteristics. Given these constraints, can such short-term longitudinal data be used to make reliable long-term risk predictions, projecting 10 to 20 years into the future?
From a methodological perspective, there is a way to approach this seemingly contradictory problem. Assuming proportional hazards throughout the whole age range, we can employ proportional hazards models that use age as the underlying time scale. In this setting, individuals enter the study at various ages, resulting in left-truncated survival data and an assumed baseline hazard that spans the entire observed age range. Consequently, the prediction horizon for a new patient can extend up to the maximum age in the training data, potentially decades beyond the short observation windows of individuals. What remains unclear, however, is how to adequately harness the longitudinal information for survival prediction. In more traditional time-on-study settings, different methods have been proposed for longitudinal and associated survival data. Among them, joint models have been shown to reduce bias and improve efficiency in parameter estimation. However, these advantages may come at the cost of substantial computational demands. This burden, as well as the quality of the resulting predictions, may be further challenged by the left-truncation in the survival data, the long prediction horizon, and the sparsity of visit times.
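To illustrate why age as the time scale lets short observation windows support long-horizon predictions, here is a minimal sketch. It is not the method evaluated in the talk: it uses a simple Nelson-Aalen estimate of the baseline cumulative hazard with delayed entry instead of a fitted proportional hazards or joint model. The toy data, the function names, and the `relative_risk` parameter (a stand-in for the exponentiated linear predictor of a fitted model) are all assumptions made for illustration.

```python
import math

def nelson_aalen_left_truncated(entry_age, exit_age, event):
    """Nelson-Aalen cumulative hazard on the age scale with delayed entry.

    An individual counts as at risk at age a only if entry_age < a <= exit_age,
    which is how left truncation is handled in the risk set.
    Returns a list of (event_age, cumulative_hazard) pairs sorted by age.
    """
    event_ages = sorted({x for x, d in zip(exit_age, event) if d})
    cumulative, steps = 0.0, []
    for a in event_ages:
        at_risk = sum(1 for e, x in zip(entry_age, exit_age) if e < a <= x)
        n_events = sum(1 for x, d in zip(exit_age, event) if d and x == a)
        cumulative += n_events / at_risk
        steps.append((a, cumulative))
    return steps

def cumulative_hazard_at(steps, age):
    """Evaluate the right-continuous step function at a given age."""
    value = 0.0
    for a, h in steps:
        if a > age:
            break
        value = h
    return value

def conditional_survival(steps, current_age, horizon, relative_risk=1.0):
    """P(event-free at current_age + horizon | event-free at current_age).

    relative_risk is a hypothetical placeholder for exp(linear predictor)
    under the proportional hazards assumption.
    """
    h_now = cumulative_hazard_at(steps, current_age)
    h_later = cumulative_hazard_at(steps, current_age + horizon)
    return math.exp(-relative_risk * (h_later - h_now))

# Toy data (invented): every individual is followed for exactly five years,
# but entry ages are staggered so the cohort jointly covers ages 50 to 73.
entry_age = [50, 52, 54, 58, 60, 63, 66, 68]
exit_age = [55, 57, 59, 62, 65, 68, 71, 73]
event = [0, 1, 0, 1, 0, 1, 0, 1]

H = nelson_aalen_left_truncated(entry_age, exit_age, event)

# A 15-year prediction for a 55-year-old, although nobody was observed
# for longer than five years.
risk_15y = 1.0 - conditional_survival(H, current_age=55, horizon=15)
print(f"15-year risk from age 55: {risk_15y:.3f}")
```

The staggered entry ages are what make this work: each individual contributes only to the risk sets within their own window, yet together the windows tile the whole age range. The sketch also shows the limit of the approach, since the estimate is undefined beyond the maximum observed age and relies on the hazard being transferable across individuals who were observed at different ages.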
In this study, we explore the applicability of joint models and related approaches for long-term risk prediction in data with left-truncated survival times and multiple longitudinal markers as predictors. We use simulation studies to assess the methods, starting with an ideal data scenario with ample repeated measurements per individual and gradually moving towards a setting that mimics a real-world example of routine health-care data from Austria. We evaluate the methods in terms of prediction accuracy as well as the feasibility of model estimation.