Variable screening for censored survival data is most challenging when both survival and censoring times are correlated with an ultrahigh‐dimensional vector of covariates. Existing approaches to handling censoring often rely on inverse probability weighting, which assumes that censoring is independent of both the survival time and the covariates. This assumption is convenient but rather restrictive, and it may not hold in real applications, especially when the censoring mechanism is complex and the number of covariates is large. To accommodate the heterogeneous (covariate‐dependent) censoring that is often present in high‐dimensional survival data, we propose a Gehan‐type rank screening method to select features that are relevant to the survival time. The method is invariant to monotone transformations of the response and of the predictors, and works robustly for a general class of survival models. We establish the sure screening property of the proposed methodology. Simulation studies and a lymphoma data analysis demonstrate its favorable performance and practical utility. Copyright © 2020 Board of the Foundation of the Scandinavian Journal of Statistics.
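To make the idea of rank-based marginal screening concrete, the following is a minimal sketch built on the classical Gehan-type statistic evaluated at a zero coefficient, U_j = n^{-2} Σ_i Σ_k δ_i (x_ij − x_kj) 1{Y_k ≥ Y_i}, where Y is the observed time and δ the event indicator. It is not necessarily the exact statistic proposed in the paper, and in particular it does not include any adjustment for covariate-dependent censoring. The function names `rank_columns`, `gehan_marginal_scores`, and `screen_top_d` are hypothetical and chosen only for illustration.

```python
import numpy as np


def rank_columns(X):
    """Replace each column by its ranks (0..n-1); ties are broken by order.

    Illustrative assumption: ranking the predictors is one simple way to
    make a screening score invariant to monotone increasing transformations
    of each predictor.
    """
    X = np.asarray(X, dtype=float)
    return np.argsort(np.argsort(X, axis=0), axis=0).astype(float)


def gehan_marginal_scores(time, event, X):
    """Absolute classical Gehan-type marginal statistic for each covariate.

    time  : (n,) observed times, min(survival, censoring)
    event : (n,) 1 if the survival time was observed, 0 if censored
    X     : (n, p) covariate matrix
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    X = np.asarray(X, dtype=float)
    n = time.shape[0]

    # at_risk[i, k] = 1 when subject k is still under observation at time[i]
    at_risk = (time[None, :] >= time[:, None]).astype(float)
    risk_size = at_risk.sum(axis=1)   # size of each risk set
    risk_sum = at_risk @ X            # covariate sums over each risk set

    # sum_k (x_i - x_k) 1{Y_k >= Y_i} = risk_size_i * x_i - risk_sum_i
    diff = risk_size[:, None] * X - risk_sum
    return np.abs((event[:, None] * diff).sum(axis=0)) / n**2


def screen_top_d(time, event, X, d):
    """Indices of the d covariates with the largest marginal scores."""
    scores = gehan_marginal_scores(time, event, X)
    return np.argsort(-scores)[:d]
```

Because the statistic depends on the observed times only through their ordering, the scores are unchanged by any strictly increasing transformation of the response. Passing `rank_columns(X)` instead of `X` gives the analogous invariance for the predictors. The cutoff `d` is a user-chosen tuning parameter in this sketch.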
Citation: Xu, J., Li, W. K., & Ying, Z. (2020). Variable screening for survival data in the presence of heterogeneous censoring. Scandinavian Journal of Statistics. Advance online publication. doi: 10.1111/sjos.12458
- Gehan-type rank statistics
- High-dimensional survival data
- Heterogeneous censoring
- Sure screening property
- Variable screening