Fitting high-dimensional mixture cure models using the hdcuremodels R package.
Kellie J Archer, Han Fu
Abstract
BACKGROUND AND OBJECTIVE: Time-to-event outcomes are often of interest in biomedical studies. When the dataset includes long-term survivors or subjects who will not experience the event of interest, mixture cure models (MCMs) should be fit. Further, it is clinically relevant to identify molecular features from high-throughput assays that are associated with time-to-event outcomes, both to elucidate important pathways and to identify features that may serve as therapeutic targets or support improved risk stratification systems. Herein, we describe our hdcuremodels R package, which can be used to model right-censored time-to-event data when a cured fraction is present and the predictor space is high-dimensional. METHODS: We implemented two different optimization methods, the expectation-maximization and generalized monotone incremental forward stagewise algorithms, for fitting high-dimensional penalized Weibull, exponential, and Cox mixture cure models. Cross-validation functions are provided for each optimization method and can be run with or without controlling the false discovery rate. The modeling functions are flexible in that there is no requirement for the predictors to be the same in the incidence and latency components of the model. The package also includes functions for testing mixture cure modeling assumptions and evaluating performance, as well as generic functions for extracting meaningful results. RESULTS: We demonstrate fitting a high-dimensional penalized mixture cure model to an acute myeloid leukemia dataset; the fitted model had strong predictive performance on an independent test set. CONCLUSION: Our hdcuremodels package fits penalized mixture cure models that can accommodate datasets where the number of predictors exceeds the sample size.
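To make the incidence and latency components concrete, the following is a minimal Python sketch of the population survival function in a Weibull mixture cure model. It is not the hdcuremodels API. The function names, the logistic form of the incidence component, and the Weibull parametrization S_u(t) = exp(-lam * t^alpha * exp(z'beta)) are illustrative assumptions, and all coefficient values are made up. The example uses different predictors for the two components, as the modeling functions allow.

```python
import math


def susceptible_prob(x, b0, b):
    """Incidence component (assumed logistic): probability of being uncured."""
    eta = b0 + sum(xi * bi for xi, bi in zip(x, b))
    return 1.0 / (1.0 + math.exp(-eta))


def weibull_survival(t, z, beta, lam, alpha):
    """Latency component (assumed parametrization): survival among the uncured."""
    lin = sum(zi * bi for zi, bi in zip(z, beta))
    return math.exp(-lam * t ** alpha * math.exp(lin))


def population_survival(t, x, z, b0, b, beta, lam, alpha):
    """Mixture cure model: S_pop(t) = (1 - p) + p * S_u(t)."""
    p = susceptible_prob(x, b0, b)
    return (1.0 - p) + p * weibull_survival(t, z, beta, lam, alpha)


# Hypothetical subject: one incidence predictor, two latency predictors.
x = [0.5]          # predictors for the incidence component
z = [1.0, -0.2]    # predictors for the latency component (need not match x)
params = dict(b0=0.3, b=[0.8], beta=[0.4, 0.1], lam=0.1, alpha=1.5)

for t in (0.0, 1.0, 5.0, 50.0):
    s = population_survival(t, x, z, **params)
    print(f"t = {t:5.1f}  S_pop = {s:.4f}")

# The curve plateaus at the cured fraction rather than dropping to zero.
print(f"cured fraction = {1.0 - susceptible_prob(x, 0.3, [0.8]):.4f}")
```

The plateau in the population survival curve is what distinguishes a mixture cure model from a standard survival model, in which survival eventually reaches zero.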