Evaluation of machine-learning models to measure individualized treatment effects from randomized clinical trial data with time-to-event outcomes

Abstract

Objective: In randomized clinical trials, prediction models can be used to explore the relationships between patients' variables (e.g., clinical, pathological, or lifestyle variables, and also biomarker or genomic data) and treatment effect magnitude. Our aim was to evaluate flexible machine learning models capable of incorporating interactions and nonlinear effects from high-dimensional data to estimate individualized treatment recommendations in trials with time-to-event outcomes. Methods: We compared survival models based on neural networks (CoxCC and CoxTime) and random survival forests (Interaction Forests) against a Cox proportional hazards model with an adaptive LASSO (ALASSO) penalty as a benchmark. For individualized treatment recommendations in the survival setting, we adapted metrics originally designed for binary outcomes to accommodate time-to-event data with censoring. These adapted metrics included the C-for-Benefit, the E50-for-Benefit, and the root mean squared error for treatment benefit. An extensive simulation study was conducted using two different data generation processes incorporating nonlinearity and interactions. The models were applied to gene expression and clinical data from three cancer clinical trial data sets. Results: In the first data generation process, neural networks outperformed ALASSO in terms of calibration while the Interaction Forests showed superior C-for-benefit performance. In the second data generation process, both machine learning methods outperformed the benchmark linear ALASSO method across discrimination, calibration, and RMSE metrics. In the cancer trial data sets, the machine learning methods often performed better than ALASSO, particularly IF in terms of C-for-benefit, and either a neural network or IF for calibration measures addressing treatment benefit.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…