Description
Objectives: To evaluate whether machine learning (ML) applied to comprehensive claims data without diagnostic codes can correctly classify a high proportion of antibiotic treatment episodes as urinary tract infection (UTI) or non-UTI. Such approaches may be valuable for antimicrobial stewardship when diagnosis-linked datasets are unavailable.
Methods: Outpatient antibiotic prescription claims from three major Swiss insurers (2017–2020; ~40% of the Swiss population) were analyzed. Based on clinical input, specific constellations of claims codes (e.g., positive urine culture plus typical antibiotic) were assigned a priori as indicating UTI episodes, providing the reference classification. Predictors included sex, age group, comorbidity, and diagnostic tests ordered during the episode. Four ML classifiers were tested and compared on performance and interpretability, with XGBoost prioritized.
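The rule-based reference classification described above can be sketched as follows. This is a minimal illustration only: the code values, the antibiotic list, and the names `Episode` and `reference_label` are hypothetical placeholders, not the constellations actually defined in the study, and the abstract does not describe how non-UTI episodes were assigned.

```python
from dataclasses import dataclass

# Hypothetical placeholder values; the study's real code constellations
# were defined with clinical input and are not listed in the abstract.
UTI_TEST_CODES = frozenset({"URINE_CULTURE_POSITIVE"})
TYPICAL_UTI_ANTIBIOTICS = frozenset({"nitrofurantoin", "fosfomycin"})


@dataclass(frozen=True)
class Episode:
    """One antibiotic treatment episode reconstructed from claims."""
    antibiotic: str
    test_codes: frozenset = frozenset()


def reference_label(episode: Episode) -> int:
    """Return 1 if the episode matches the example UTI constellation, else 0.

    Assumption: anything that does not match is treated as non-UTI here,
    purely to keep the sketch short.
    """
    has_uti_test = bool(episode.test_codes & UTI_TEST_CODES)
    has_typical_antibiotic = episode.antibiotic in TYPICAL_UTI_ANTIBIOTICS
    return int(has_uti_test and has_typical_antibiotic)


uti_like = Episode("fosfomycin", frozenset({"URINE_CULTURE_POSITIVE"}))
other = Episode("amoxicillin", frozenset({"CHEST_XRAY"}))
labels = [reference_label(uti_like), reference_label(other)]
```

Labels produced this way would then serve as the target for the classifiers, with sex, age group, comorbidity, and ordered tests as predictors.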
Results: After data cleaning and class balancing, 38,982 records (19,491 UTI; 19,491 non-UTI) were included. XGBoost achieved an AUC of 0.94, accuracy of 87.6%, sensitivity of 79.2%, and specificity of 96.1%. Misclassification was asymmetric: 11% of non-UTI cases were labeled UTI, while 2% of UTI cases were misclassified as non-UTI. Ordered diagnostic tests were the strongest predictors, followed by female sex and older age.
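On a balanced dataset, overall accuracy is the mean of sensitivity and specificity, which gives a quick consistency check on the reported figures. The sketch below assumes UTI is the positive class and uses only the numbers quoted above; the function name `accuracy_from_rates` is illustrative.

```python
def accuracy_from_rates(sensitivity: float, specificity: float,
                        n_positive: int, n_negative: int) -> float:
    """Overall accuracy implied by sensitivity, specificity and class sizes."""
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)


# Figures from the abstract: 19,491 records per class after balancing.
implied_accuracy = accuracy_from_rates(0.792, 0.961, 19_491, 19_491)
# implied_accuracy is about 0.8765, in line with the reported 87.6%
# once rounding of the inputs is taken into account.
```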
Conclusions: Even in the absence of diagnosis codes, ML applied to claims data can reliably identify UTI-related prescriptions. This supports the feasibility of claims-based surveillance tools for stewardship, while also highlighting the need for scalable, low-burden approaches to improve direct diagnostic coding in routine data.