18–21 May 2026
Europe/Warsaw timezone

[17] Machine Learning Imputation of Air Pollutant Concentrations Using XGBoost

19 May 2026, 10:00
7h 15m
Poster display area

Speaker

Dorota Domagała (University of Life Science)

Description

This study evaluated how well the XGBoost method imputes missing values in air quality data. The analysis used complete measurements of PM2.5, PM10, SO₂, NO, NO₂, and C₆H₆ recorded in Lublin in January 2020. To simulate missing data, 15%, 20%, and 25% of the observations were randomly removed from each variable and then imputed using XGBoost trained on the remaining data. Missing values were also generated in a non-random manner to test the method under more challenging conditions. Imputation accuracy was assessed using the sum of absolute differences between the observed and imputed values. The results show that XGBoost reconstructs missing data effectively under both random and non-random patterns, with minimal deviation from the true measurements.
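
The evaluation protocol described above (remove a fixed share of values, impute them, score with the sum of absolute differences) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the series is synthetic rather than the Lublin measurements, and a simple mean fill stands in for the XGBoost imputer so that the sketch needs only the standard library.

```python
import random
from statistics import mean


def mask_at_random(values, fraction, rng):
    """Replace a random share of entries with None; return the masked copy and removed indices."""
    n_remove = round(len(values) * fraction)
    removed = sorted(rng.sample(range(len(values)), n_remove))
    masked = list(values)
    for i in removed:
        masked[i] = None
    return masked, removed


def mean_impute(masked):
    """Placeholder imputer (assumption): fill gaps with the mean of the remaining values.

    In the study this step is performed by XGBoost trained on the remaining data.
    """
    fill = mean(v for v in masked if v is not None)
    return [fill if v is None else v for v in masked]


def sum_abs_diff(observed, imputed, removed):
    """Sum of absolute differences between observed and imputed values at the removed positions."""
    return sum(abs(observed[i] - imputed[i]) for i in removed)


rng = random.Random(0)  # fixed seed so the masking is reproducible

# Synthetic hourly series standing in for one pollutant (not real data).
observed = [20.0 + 5.0 * ((hour % 24) / 24.0) for hour in range(200)]

scores = {}
for fraction in (0.15, 0.20, 0.25):
    masked, removed = mask_at_random(observed, fraction, rng)
    imputed = mean_impute(masked)
    scores[fraction] = sum_abs_diff(observed, imputed, removed)

for fraction, score in scores.items():
    print(f"{fraction:.0%} removed: sum of absolute differences = {score:.2f}")
```

Swapping `mean_impute` for a model-based imputer leaves the masking and scoring steps unchanged, so different imputers can be compared on identical sets of removed values.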

Author

Dorota Domagała (University of Life Science)

Co-author

Małgorzata Szczepanik (University of Life Science)
