Speaker
Description
In many real-world datasets, observations are hierarchically structured, such as students nested within classrooms, hospitals within cities, or repeated measurements from the same patient. Performing machine learning without accounting for this clustered structure can lead to biased predictions and misleading interpretations of feature effects.
Recently, mixed effect machine learning, an extension of the traditional linear mixed effect model, has gained popularity for analyzing clustered and longitudinal data, particularly in healthcare applications, because it captures both population-level and group-specific variation. Despite their predictive advantages, however, such models often remain black boxes that are difficult to interpret.
Applying explainable AI (XAI) tools such as Shapley values (SHAP) directly to mixed effect models has limited effectiveness because clustered data contain both cluster-level and observation-level features. Moreover, mixed effect models inherently separate their structure into fixed effects, which are shared across all observations, and random effects, which vary across clusters. Standard SHAP values cannot distinguish how contributions operate at these different hierarchical levels, which leads to incomplete explanations of model behavior.
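For orientation, the fixed/random split described above can be written in one common textbook form. This is an assumed formulation, not necessarily the one used in this study: $f$ stands for any learned fixed-effect function, $z_{ij}$ for the random-effect covariates of observation $j$ in cluster $i$, and $b_i$ for the random effects of cluster $i$. The second line is the standard SHAP additivity property applied to $f$ alone, which shows why an unmodified attribution says nothing about the $z_{ij}^\top b_i$ term or about the level at which a feature varies.

```latex
% Assumed notation: i indexes clusters, j indexes observations within a cluster.
\begin{align}
  y_{ij} &= f(x_{ij}) + z_{ij}^\top b_i + \varepsilon_{ij},
  \qquad b_i \sim \mathcal{N}(0, \Sigma_b),
  \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2) \\
  f(x_{ij}) &= \phi_0 + \sum_{k=1}^{p} \phi_k(x_{ij})
\end{align}
```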
This study proposes an extension of the SHAP framework tailored specifically to mixed effect machine learning. The proposed approach enables a clear decomposition of feature contributions across the cluster and observation levels, offering interpretable insights into how models use features in structured data. Beyond quantifying how much each feature contributes, the method reveals at which level (cluster or individual) the model uses each feature. Consequently, it also provides diagnostic guidance on whether additional random effects should be incorporated and how the model’s hierarchical structure should be refined.
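To make the idea of a level-wise decomposition concrete, here is a minimal sketch. It is not the method proposed in this study. It assumes a purely linear fixed-effect part with independent features, for which the SHAP value of a feature is known to be `beta * (x - mean(x))`, and splits that value into a cluster-level part (cluster mean minus overall mean) and an observation-level part (value minus cluster mean). The function name `split_linear_shap`, the coefficients, and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def split_linear_shap(x, clusters, beta):
    """Split the linear SHAP value beta * (x - overall mean) of ONE feature
    into a cluster-level (between) and an observation-level (within) part.

    Illustrative assumption: linear model, independent features.
    """
    overall = fmean(x)

    # Collect the feature values of each cluster to compute cluster means.
    by_cluster = defaultdict(list)
    for value, c in zip(x, clusters):
        by_cluster[c].append(value)
    cluster_mean = {c: fmean(values) for c, values in by_cluster.items()}

    # Between part: how far the observation's cluster sits from the overall mean.
    between = [beta * (cluster_mean[c] - overall) for c in clusters]
    # Within part: how far the observation sits from its own cluster mean.
    within = [beta * (value - cluster_mean[c]) for value, c in zip(x, clusters)]
    return between, within


# Toy data (hypothetical): two clusters with two observations each.
clusters = ["A", "A", "B", "B"]
city_size = [1.0, 1.0, 3.0, 3.0]   # constant within each cluster
age = [30.0, 50.0, 40.0, 60.0]     # varies within each cluster

b_size, w_size = split_linear_shap(city_size, clusters, beta=2.0)
b_age, w_age = split_linear_shap(age, clusters, beta=0.1)

print("city_size between:", b_size, "within:", w_size)
print("age       between:", b_age, "within:", w_age)
```

In this toy case the contribution of `city_size` is entirely cluster-level (its within part is zero), while `age` contributes at both levels. The two parts always sum to the ordinary linear SHAP value, so the split refines the attribution without changing its total.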