18–21 May 2026
Europe/Warsaw timezone

Comparison of different methods for the meta-analysis of diagnostic test accuracy studies – a simulation study

21 May 2026, 11:57
18m
Room 13 A

Speaker

Ferdinand Valentin Stoye (Biostatistics and Medical Biometry, Medical School OWL, Bielefeld University)

Description

Meta-analysis of diagnostic test accuracy (DTA) studies aggregates information on sensitivity and specificity from multiple studies. Classical approaches select a single pair of sensitivity and specificity per study (single threshold methods, STM) and may therefore discard information when studies report results for multiple diagnostic thresholds. In recent years, models have been proposed that use all available information and enable inference on the optimal diagnostic threshold (multiple threshold methods, MTM). We compare five STM and six MTM in a simulation study to evaluate their performance in various situations. For each generated meta-analysis dataset, we estimate a pair of summary sensitivity and specificity (either identified directly or by maximizing the unweighted Youden index) and the area under the summary ROC curve (AUC). To cover a broad range of real-life data settings, we vary eight parameter dimensions in the data-generating mechanisms, including the outcome type (continuous or ordinal), the number of diagnostic thresholds per study, and the disease prevalence. Overall, STM and MTM perform comparably with regard to bias in optimal sensitivity, specificity, and AUC, as well as empirical coverage and convergence. However, we observe that one MTM, a binomial generalized linear mixed model with a bivariate random effect that models sensitivity and specificity with a logit link and an additional covariate for the diagnostic threshold, is slightly superior to the other models in many situations. Model performance depends most strongly on the outcome type in the data generation, while the number of thresholds has only a minor impact. We therefore see the main advantage of MTM in providing threshold-dependent estimates of sensitivity and specificity.
Additionally, we illustrate differences between model estimates in two real-data examples: the diagnosis of type 2 diabetes using the continuous biomarker HbA1c, and the diagnosis of any anxiety disorder using the ordinal questionnaire HADS-A. The applications reveal substantial variation in model estimates within and between STM and MTM, which can be reduced by adjusting for the bias estimated in the simulation settings that most closely resemble the real-data situation. Our study highlights the importance of careful model selection when conducting meta-analyses of DTA studies, which should be informed by the observed data structure of the application (e.g., whether the diagnostic test is measured on a continuous or ordinal scale).
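
As a rough illustration of two quantities named in the description, the following Python sketch picks the threshold that maximizes the unweighted Youden index J = sensitivity + specificity - 1 and computes a trapezoidal AUC from a handful of summary points. It is not the authors' implementation: the function names `youden_optimal` and `sroc_auc`, the anchoring of the curve at (0, 0) and (1, 1), and all numeric values are assumptions made up for this example.

```python
def youden_optimal(thresholds, sens, spec):
    """Return (threshold, sensitivity, specificity, J) for the threshold
    that maximizes the unweighted Youden index J = sens + spec - 1."""
    thr, se, sp = max(zip(thresholds, sens, spec),
                      key=lambda t: t[1] + t[2] - 1)
    return thr, se, sp, se + sp - 1


def sroc_auc(sens, spec):
    """Trapezoidal area under a ROC curve through the points
    (1 - specificity, sensitivity), anchored at (0, 0) and (1, 1)."""
    points = sorted(zip((1 - sp for sp in spec), sens))
    points = [(0.0, 0.0)] + points + [(1.0, 1.0)]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical summary estimates at four thresholds (illustrative values
# only, not results from the study).
thresholds = [5.5, 6.0, 6.5, 7.0]
sens = [0.95, 0.85, 0.70, 0.50]
spec = [0.60, 0.80, 0.90, 0.97]

best_thr, best_se, best_sp, best_j = youden_optimal(thresholds, sens, spec)
auc = sroc_auc(sens, spec)
print(f"optimal threshold: {best_thr}, J = {best_j:.2f}, AUC = {auc:.3f}")
```

A weighted Youden index, or a model-based summary ROC curve in place of linear interpolation between points, would change both functions; the sketch only fixes the definitions in their simplest form.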

Author

Ferdinand Valentin Stoye (Biostatistics and Medical Biometry, Medical School OWL, Bielefeld University)
