Assessing Performance and Clinical Usefulness in Prediction Models With Survival Outcomes: Practical Guidance for Cox Proportional Hazards Models.
McLernon DJ., Giardiello D., Van Calster B., Wynants L., van Geloven N., van Smeden M., Therneau T., Steyerberg EW., topic groups 6 and 8 of the STRATOS Initiative.
Risk prediction models need thorough validation to assess their performance. Validation of models for survival outcomes poses challenges due to the censoring of observations and the varying time horizon at which predictions can be made. This article describes measures to evaluate predictions and the potential improvement in decision making from survival models based on Cox proportional hazards regression. As a motivating case study, the authors consider the prediction of the composite outcome of recurrence or death (the "event") in patients with breast cancer after surgery. They developed a simple Cox regression model with 3 predictors, as in the Nottingham Prognostic Index, in 2982 women (1275 events over 5 years of follow-up) and externally validated this model in 686 women (285 events over 5 years). Improvement in performance was assessed after the addition of progesterone receptor as a prognostic biomarker. The model predictions can be evaluated across the full range of observed follow-up times or for the event occurring by the end of a fixed time horizon of interest. The authors first discuss recommended statistical measures that evaluate model performance in terms of discrimination, calibration, or overall performance. Further, they evaluate the potential clinical utility of the model to support clinical decision making according to a net benefit measure. They provide SAS and R code to illustrate internal and external validation. The authors recommend the proposed set of performance measures for transparent reporting of the validity of predictions from survival models.
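To make the net benefit measure at a fixed time horizon concrete, the following is a minimal sketch and not the authors' SAS or R code. It assumes the common approach for censored data in which the event probability among patients whose predicted risk is at or above the decision threshold is estimated with a Kaplan-Meier estimator. The function names (`kaplan_meier_survival`, `net_benefit`) and the toy data are hypothetical and chosen for illustration only.

```python
def kaplan_meier_survival(times, events, horizon):
    """Kaplan-Meier estimate of survival at `horizon`.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    """
    surv = 1.0
    # Distinct observed event times up to and including the horizon.
    event_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - n_events / at_risk
    return surv


def net_benefit(times, events, risks, threshold, horizon):
    """Net benefit of treating patients whose predicted risk >= threshold.

    risks : predicted probability of the event by `horizon` (e.g. from a Cox model)
    Assumption: true and false positives are estimated from the Kaplan-Meier
    curve of the patients classified as high risk, to account for censoring.
    """
    n = len(times)
    flagged = [i for i in range(n) if risks[i] >= threshold]
    if not flagged:
        return 0.0  # nobody treated: no benefit and no harm
    p_flagged = len(flagged) / n
    surv = kaplan_meier_survival(
        [times[i] for i in flagged], [events[i] for i in flagged], horizon
    )
    true_pos = (1.0 - surv) * p_flagged   # treated and event by horizon
    false_pos = surv * p_flagged          # treated but event-free at horizon
    return true_pos - false_pos * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical toy data: 8 patients, horizon of 5 years.
    times = [1, 2, 3, 4, 6, 6, 6, 6]
    events = [1, 1, 0, 1, 0, 0, 0, 0]
    risks = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.1, 0.1]
    for threshold in (0.05, 0.5):
        print(threshold, net_benefit(times, events, risks, threshold, horizon=5))
```

Evaluating `net_benefit` over a range of thresholds and comparing it with the "treat all" strategy (every patient flagged) and "treat none" (net benefit of zero) gives the points of a decision curve.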