Background:
A random forest (RF) is an efficient method for prediction, but it is difficult
to interpret.
Artificial Representative Trees (ARTs) are a special type of surrogate model
that approximates the original structure of the RF in a single tree, achieving
similar predictive accuracy.
Conformal Predictive Systems (CPS) provide a framework for uncertainty
quantification by generating prediction intervals. They also make it possible
to calculate the probability that an observation exceeds a selected threshold.
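As a minimal sketch of how a split conformal predictive system can yield both
quantities, assume a point prediction y_hat from any regression tree and a list
of calibration residuals (observed minus predicted) from held-out data. The
function names, the fixed tie-breaking value tau = 0.5, and the numbers in the
usage lines are illustrative assumptions, not the authors' implementation.

```python
import math


def prob_above(y_hat, cal_residuals, threshold, tau=0.5):
    """Estimate P(y > threshold) from a split conformal predictive distribution.

    The support points are y_hat + residual. tau is normally a uniform random
    number in [0, 1]; fixing it at 0.5 is a simplifying assumption.
    """
    support = [y_hat + r for r in cal_residuals]
    n = len(support)
    below = sum(1 for c in support if c < threshold)
    equal = sum(1 for c in support if c == threshold)
    cdf_at_threshold = (below + tau * equal + tau) / (n + 1)
    return 1.0 - cdf_at_threshold


def prediction_interval(y_hat, cal_residuals, alpha=0.1):
    """Return a (lower, upper) interval with nominal coverage 1 - alpha.

    Ranks outside 1..n give an unbounded side, which happens when the
    calibration set is too small for the requested coverage.
    """
    support = sorted(y_hat + r for r in cal_residuals)
    n = len(support)
    lo_rank = math.floor((alpha / 2) * (n + 1))
    hi_rank = math.ceil((1 - alpha / 2) * (n + 1))
    lower = support[lo_rank - 1] if lo_rank >= 1 else -math.inf
    upper = support[hi_rank - 1] if hi_rank <= n else math.inf
    return lower, upper


if __name__ == "__main__":
    # Hypothetical calibration residuals and prediction, for illustration only.
    residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
    print(prob_above(100.0, residuals, threshold=100.5))
    print(prediction_interval(100.0, residuals, alpha=0.5))
```

To obtain uncertainty estimates that are local to a tree leaf, one option is to
keep a separate residual list per leaf and pass the list for the leaf that the
new observation falls into. This is an assumption about how locality could be
achieved, not a description of the method used in the study.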
Motivation:
Our aim is to combine the strengths of ARTs and CPS in order to make
reliable predictions of ordinal outcomes and to generate a stable tree that is
easy to interpret while still conveying accurate local uncertainty estimates for
each leaf. We use a single ART to predict both the regression value and the
probability of exceeding a diagnostic threshold. A particular strength of
the proposed approach is the ability to adapt the ART dynamically to different
thresholds. For illustration, we use the NHANES dataset, which contains blood
glucose levels and related health indicators.
Methods:
We compared four modeling strategies: (i) an integrated ART + CPS
approach, (ii) a decision tree + CPS model, (iii) multiple ARTs, and (iv)
decision trees. Approaches (i) and (ii) output joint predictions and
uncertainty estimates, whereas (iii) and (iv) require separate models for
continuous and probability outcomes. Prediction performance (mean squared
error, Brier score) was evaluated using 10-fold cross-validation. Coverage of
prediction intervals and model interpretability, measured by tree depth and
cross-fold similarity, were also evaluated.
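The evaluation metrics named above can be sketched as follows. This is a
hedged illustration in plain Python: the function names, the shuffling seed,
and the round-robin fold assignment are assumptions, and the abstract does not
say how folds were formed or how cross-fold similarity was computed, so the
latter is omitted.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def brier_score(outcomes, probs):
    """Mean squared difference between binary outcomes (0/1) and probabilities."""
    return sum((p - o) ** 2 for o, p in zip(outcomes, probs)) / len(outcomes)


def interval_coverage(y_true, intervals):
    """Fraction of observed values that fall inside their (lower, upper) interval."""
    hits = sum(1 for y, (lo, hi) in zip(y_true, intervals) if lo <= y <= hi)
    return hits / len(y_true)


def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Round-robin assignment keeps fold sizes within one of each other.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]
```

In each fold, a model would be fitted on the training indices, and the three
metrics would be computed on the test indices and then averaged across folds.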
Results:
The ART + CPS and decision tree + CPS models achieve comparable
predictive performance for both glucose level prediction and probability
prediction of prediabetes and diabetes. However, the ART + CPS models are
substantially more interpretable: their trees are about half the size of the
decision trees and more stable. Additionally, they avoid the need for multiple
specialized models for each prediction task, which might lead to contradictory
interpretations.
Conclusion:
This study shows that combining ARTs with CPS produces a single
interpretable model that balances accuracy, stability, and explainability, while
providing quantitative predictions with uncertainty estimates and probabilities
for diagnostic categories.