Impact of measurement error and sample size on the performance of multivariable risk prediction models: a simulation study
Vazquez Montes M., Altman DG., Perera R., Collins GS.
Risk prediction models, developed to estimate the probability of an individual developing a particular outcome, are frequently published. Few are adequately validated, with the result that many prediction models are never used in practice. Data are often measured with some degree of error, and this error can influence the performance of a prediction model. It is unknown how random or systematic error in a particular covariate affects model performance, how this effect depends on the covariate’s strength, and at what sample size the effect of measurement error becomes negligible. This simulation study investigates the impact of measurement error, and its relationship to sample size and a covariate’s strength, on calibration (i.e. how close observed and predicted probabilities are, quantified by the calibration slope and Brier score), discrimination (i.e. how well the model differentiates between individuals with and without the outcome, quantified by the c-index and D statistic), and explained variation (e.g. R²).
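To make the performance measures concrete, the following is a minimal sketch of how random measurement error in one covariate might be examined in a simulation. It is not the authors' simulation design: the two-covariate logistic data-generating model, the coefficient values, the sample size, the error standard deviation, the choice to add error only at validation, and all function names (`fit_logistic`, `simulate`, `evaluate`) are assumptions made for illustration. Only the calibration slope, Brier score, and c-index are computed; the D statistic and R² are omitted for brevity.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return beta


def brier_score(y, p):
    """Mean squared difference between predicted probability and observed outcome."""
    return float(np.mean((p - y) ** 2))


def c_index(y, p):
    """Proportion of event/non-event pairs in which the event has the higher
    predicted risk (ties count as one half)."""
    diff = p[y == 1][:, None] - p[y == 0][None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)


def calibration_slope(y, lp):
    """Slope from regressing the outcome on the model's linear predictor."""
    X = np.column_stack([np.ones_like(lp), lp])
    return float(fit_logistic(X, y)[1])


def simulate(n, rng):
    """Assumed data-generating model: two standard normal covariates, x1 stronger than x2."""
    x = rng.standard_normal((n, 2))
    lp = -1.0 + 1.0 * x[:, 0] + 0.5 * x[:, 1]
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-lp))).astype(float)
    return x, y


def evaluate(error_sd, n=3000, seed=0):
    """Develop a model on error-free data, then validate it on data in which
    the first covariate carries random (zero-mean normal) error of SD error_sd."""
    rng = np.random.default_rng(seed)
    x_dev, y_dev = simulate(n, rng)
    x_val, y_val = simulate(n, rng)
    beta = fit_logistic(np.column_stack([np.ones(n), x_dev]), y_dev)

    x_obs = x_val.copy()
    x_obs[:, 0] += error_sd * rng.standard_normal(n)  # random measurement error
    lp = np.column_stack([np.ones(n), x_obs]) @ beta
    p = 1.0 / (1.0 + np.exp(-lp))
    return {
        "calibration_slope": calibration_slope(y_val, lp),
        "brier": brier_score(y_val, p),
        "c_index": c_index(y_val, p),
    }


clean = evaluate(error_sd=0.0)  # validation covariates measured without error
noisy = evaluate(error_sd=1.0)  # x1 measured with random error at validation
print("no error:  ", clean)
print("with error:", noisy)
```

Because both calls use the same seed, the development and validation data are identical and only the added error differs, so the two sets of metrics can be compared directly. A systematic error could be sketched in the same way by adding a constant shift to the covariate instead of zero-mean noise.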