Description
Longitudinal observational studies and clinical trials routinely collect extensive phenotypic data under changing organisational, technical, and environmental conditions. Variations in examiners, devices, protocols, or ambient factors can introduce consequential measurement heterogeneity and measurement error over time. Although these sources of bias are well recognised, systematic and transparent procedures for detecting temporal patterns in such data have not been sufficiently evaluated.
Using data from the Study of Health in Pomerania (SHIP-START-4, 2019–2021) as a case example, we demonstrate the susceptibility of real-world cohort data to time-related measurement variability. SHIP-START-4 comprised 1182 participants, who underwent interviews, questionnaires, and various clinical examinations, including ultrasound examinations, ECG, blood pressure, spirometry, hand grip, and laboratory assays. This resulted in about 1300 metric phenotypic variables with a median of 877 observations per variable (Q10=251; Q90=1182).
To characterise temporal trends, we applied seven commonly used statistical approaches (ARIMA, fused LASSO signal approximator (FLSA), GAM, LOWESS, moving average, PELT, and piecewise regression) to SHIP-START-4 data. The estimands were the range of the systematic change, the variance, the mean absolute deviation around the median, and the number of change points. The detection of these temporal patterns was also implemented as part of an automated assessment pipeline.
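As a rough illustration of three of these estimands, the following minimal sketch uses a moving average, one of the seven approaches listed, on synthetic data. It is not the SHIP pipeline: the function names, the window width, the simulated level shift, and the choice to compute the variance and the mean absolute deviation on the fitted trend rather than on the raw values are all assumptions made for this example. The number of change points is omitted because it would require a dedicated change-point method such as PELT.

```python
import numpy as np


def moving_average(values, window):
    """Moving average that keeps only positions covered by a full window."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def trend_estimands(values, window=50):
    """Summarise the smoothed trend of one variable ordered by examination time.

    Hypothetical helper: all three summaries are computed on the fitted
    trend, which is an assumption of this sketch.
    """
    values = np.asarray(values, dtype=float)
    trend = moving_average(values, window)
    trend_median = np.median(trend)
    return {
        # Range of the systematic (smoothed) component
        "trend_range": float(trend.max() - trend.min()),
        # Variance of the smoothed component
        "trend_variance": float(np.var(trend, ddof=1)),
        # Mean absolute deviation of the trend around its median
        "trend_mad_median": float(np.mean(np.abs(trend - trend_median))),
    }


# Synthetic example: pure noise versus noise with a level shift halfway
# through, mimicking for instance a device change during data collection.
rng = np.random.default_rng(0)
n_obs = 800
noise = rng.normal(loc=0.0, scale=1.0, size=n_obs)
level_shift = np.where(np.arange(n_obs) >= n_obs // 2, 2.0, 0.0)

stable = trend_estimands(noise)
shifted = trend_estimands(noise + level_shift)
```

In this sketch, comparing `stable` with `shifted` shows how a level shift inflates all three summaries. The window width acts as a tuning parameter, and changing it changes the estimates.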
Applying these methods to the same data yielded markedly different estimates of heterogeneity and error, illustrating how difficult it is to choose a method that reliably informs on the presence and magnitude of measurement heterogeneity and measurement error. This diversity is consistent with the findings of a parallel simulation study, in which the same methods were evaluated under controlled conditions representing key types of patterns empirically observed in SHIP.
This work highlights two key insights. First, empirical cohort data are intrinsically vulnerable to temporal heterogeneity and should routinely be assessed for it. Second, metadata-driven pipelines allow large-scale studies to incorporate trend detection and measurement-error diagnostics into routine data-quality workflows, but the effectiveness of such monitoring depends critically on robust statistical methods.