18–21 May 2026
Europe/Warsaw timezone

Comparing variable selection in Cox and accelerated failure time models: noncollapsibility, the phantom hazard

20 May 2026, 11:39
18m
Room 13 B

Oral presentation: Censored data 3

Speaker

Lorena Hafermann (Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin)

Description

In descriptive studies, where the primary goal is to identify key predictors of a time-to-event outcome, and in predictive research involving numerous candidate predictors, data-driven variable selection methods are often employed to narrow down the pool of variables. This is particularly necessary when domain expertise is limited or when the practical utility of a prediction model is compromised by an overly complex structure. Despite their utility, variable selection methods may introduce challenges, especially in the context of survival analysis, where the choice of modeling framework can influence the results.

One such challenge may arise from the noncollapsibility of the Cox proportional hazards model. Noncollapsibility means that the hazard ratio for a predictor changes when other prognostic variables are added to or removed from the model, even if those variables are independent of the predictor of interest; the conditional hazard ratio therefore does not coincide with the marginal one (1). In contrast, the regression coefficients of the accelerated failure time (AFT) model are collapsible. This raises the question of whether the noncollapsibility of the Cox model affects the operating characteristics of variable selection methods compared with the AFT model.
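The effect can be illustrated with a small numerical sketch. It is not part of the study: it assumes a constant baseline hazard of 1, a binary predictor X, and an independent binary covariate Z with conditional hazard exp(beta*x + gamma*z). The function name and all parameter values are hypothetical.

```python
import numpy as np

def marginal_hazard_ratio(t, beta=1.0, gamma=1.0, p_z=0.5):
    """Hazard ratio for X=1 vs X=0 at time t after averaging over Z.

    Assumed conditional model (illustration only):
    hazard(t | x, z) = exp(beta * x + gamma * z), Z ~ Bernoulli(p_z),
    with Z independent of X.
    """
    weights = np.array([1.0 - p_z, p_z])  # P(Z=0), P(Z=1)

    def marginal_hazard(x):
        rates = np.exp(beta * x + gamma * np.array([0.0, 1.0]))
        # Share of each Z stratum still at risk at time t
        at_risk = weights * np.exp(-rates * t)
        # Marginal hazard = average of stratum hazards over the risk set
        return np.sum(rates * at_risk) / np.sum(at_risk)

    return marginal_hazard(1) / marginal_hazard(0)

conditional_hr = np.exp(1.0)
for t in (0.0, 0.5, 1.0, 2.0):
    print(f"t={t:.1f}  marginal HR={marginal_hazard_ratio(t):.3f}  "
          f"conditional HR={conditional_hr:.3f}")
```

Under these assumptions the marginal hazard ratio equals exp(beta) only at t = 0 and falls below it afterwards, because subjects with Z = 1 leave the risk set faster in the X = 1 group. In an AFT model, log T = beta*x + gamma*z + error, an omitted independent Z is absorbed into the error term and the coefficient beta is unchanged, although the error distribution is.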

To investigate this question, we applied backward elimination with different stopping criteria to Cox and Weibull AFT models for cardiovascular event prediction, using a previously published data set of health screening examinations with short follow-up (2). We also applied variable selection to bootstrap resamples to assess selection stability. The selected models and their selection stability were very similar in both modeling frameworks, despite the noncollapsibility of the Cox model. We are currently running a larger simulation study to investigate this issue further.
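The workflow of backward elimination followed by bootstrap inclusion frequencies can be sketched as below. This is not the authors' analysis: to stay self-contained it substitutes an exponential regression (fitted by Newton-Raphson) for the Cox and Weibull models, uses AIC as one possible stopping criterion, and runs on simulated data with administrative censoring. All function names, sample sizes, and coefficients are hypothetical.

```python
import numpy as np

def fit_exponential(X, time, event, n_iter=50):
    """Maximized log-likelihood of log-hazard = intercept + X @ coef."""
    X1 = np.column_stack([np.ones(len(time)), X])
    beta = np.zeros(X1.shape[1])
    beta[0] = np.log(event.sum() / time.sum())  # intercept-only MLE as start

    def loglik(b):
        eta = X1 @ b
        return np.sum(event * eta - time * np.exp(eta))

    current = loglik(beta)
    for _ in range(n_iter):
        mu = time * np.exp(X1 @ beta)       # expected cumulative hazards
        grad = X1.T @ (event - mu)
        hess = X1.T @ (X1 * mu[:, None])    # negative Hessian
        step = np.linalg.solve(hess, grad)
        for _ in range(20):                 # step halving for stability
            if loglik(beta + step) >= current:
                break
            step = step / 2
        beta = beta + step
        new = loglik(beta)
        converged = new - current < 1e-10
        current = new
        if converged:
            break
    return current

def aic(X, time, event, cols):
    """AIC of the model using the covariate columns in `cols`."""
    sub = X[:, np.asarray(cols, dtype=int)]
    return 2 * (len(cols) + 1) - 2 * fit_exponential(sub, time, event)

def backward_elimination(X, time, event):
    """Drop one covariate at a time while doing so lowers the AIC."""
    selected = list(range(X.shape[1]))
    best = aic(X, time, event, selected)
    while selected:
        trials = [(aic(X, time, event, [c for c in selected if c != j]), j)
                  for j in selected]
        trial_aic, drop = min(trials)
        if trial_aic >= best:
            break
        selected.remove(drop)
        best = trial_aic
    return selected

# Simulated data (assumption): two true predictors, three noise variables
rng = np.random.default_rng(0)
n, p = 600, 5
X = rng.standard_normal((n, p))
true_coef = np.array([0.8, 0.5, 0.0, 0.0, 0.0])
t_event = rng.exponential(1.0 / np.exp(-1.0 + X @ true_coef))
follow_up = 1.0                              # administrative censoring time
time = np.minimum(t_event, follow_up)
event = (t_event <= follow_up).astype(float)

selected_full = backward_elimination(X, time, event)

# Bootstrap inclusion frequencies as a simple measure of selection stability
n_boot = 30
inclusion_freq = np.zeros(p)
for _ in range(n_boot):
    idx = rng.integers(0, n, n)
    for j in backward_elimination(X[idx], time[idx], event[idx]):
        inclusion_freq[j] += 1
inclusion_freq /= n_boot

print("selected on full data:", selected_full)
print("bootstrap inclusion frequencies:", inclusion_freq)
```

To compare frameworks in the spirit of the abstract, the fitting function would be replaced by a Cox and a Weibull AFT fit, and the two sets of inclusion frequencies compared variable by variable.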

Selection bias in risk sets may be a reason for the noncollapsibility of Cox models, because it changes the correlation structure between covariates within risk sets over time. Our data set had a high proportion of censored observations (95.7%), which likely dominated this selection effect: censoring introduces a random element into the composition of risk sets and thereby attenuates the induced correlation between covariates. We preliminarily conclude that the noncollapsibility of Cox models may be negligible for the purpose of variable selection in observational studies with high censoring proportions.

  1. Martinussen, T., Vansteelandt, S., 2013. On collapsibility and confounding bias in Cox and Aalen regression models. Lifetime Data Analysis 19, 279–296.
  2. Wallisch, C., et al., 2021. Selection of variables for multivariable models: Opportunities and limitations in quantifying model stability by resampling. Statistics in Medicine 40, 369–381.

Author

Lorena Hafermann (Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin)

Co-author

Georg Heinze (Institute of Clinical Biometrics, Centre for Medical Data Science, Medical University of Vienna)
