Comparison of statistical approaches for analyzing incomplete longitudinal patient-reported outcome data in randomized controlled trials
Rombach I., Jenkinson C., Gray AM., Murray DW., Rivero Arias O.
Purpose: Missing data are a potential source of bias in the results of randomized controlled trials (RCTs), but are often unavoidable in clinical research, particularly in patient-reported outcome measures (PROMs). Maximum likelihood (ML), multiple imputation (MI), and inverse probability weighting (IPW) can be used to handle incomplete longitudinal data. This paper compares their performance when analyzing PROMs, using a simulation study based on one RCT dataset.
Methods: Realistic missing-at-random data were simulated based on patterns observed during the follow-up of the Knee Arthroscopy Trial (KAT; ISRCTN45837371). Simulation scenarios covered different sample sizes, with PROMs data missing for 10-60% of participants. Both monotone and non-monotone missing data patterns were considered. Missing data were addressed using ML, MI, and IPW, and the resulting data were analyzed using multilevel mixed-effects linear regression models. Performance was measured by the root mean square error of the estimated treatment effects across 1000 simulations.
Results: Non-convergence issues were observed for IPW at small sample sizes. The performance of all three approaches worsened with decreasing sample size and increasing proportions of missing data. MI and ML performed similarly when the MI model was restricted to baseline variables, but MI performed better when post-randomization data were included in the imputation model, and also in non-monotone compared with monotone missing data scenarios. IPW performed worse than ML and MI in all simulation scenarios.
Conclusions: When additional post-randomization information is available, MI can be beneficial over ML for handling incomplete longitudinal PROMs data. IPW is not recommended for handling missing PROMs data in the simulated scenarios.
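
The performance measure named in the Methods, the root mean square error (RMSE) of the estimated treatment effect across simulation replicates, can be illustrated with a minimal Python sketch. This is not the authors' simulation code. The data-generating model, the missingness rule (dropout probability depending only on the observed baseline score, which makes the data missing at random), the parameter values, and all function names are assumptions chosen for illustration. A simple complete-case difference in means stands in for the ML, MI, and IPW analyses compared in the paper.

```python
import math
import random


def rmse(estimates, true_effect):
    """Root mean square error of estimates around a known true effect."""
    if not estimates:
        raise ValueError("need at least one estimate")
    squared_errors = [(e - true_effect) ** 2 for e in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def simulate_estimate(rng, n_per_arm, true_effect, p_missing):
    """One replicate of a hypothetical two-arm trial with MAR dropout.

    Returns the complete-case difference in mean outcomes (arm 1 minus
    arm 0), or None if an arm has no observed outcomes.
    """
    observed = {0: [], 1: []}
    for arm in (0, 1):
        for _ in range(n_per_arm):
            baseline = rng.gauss(0.0, 1.0)
            outcome = 0.5 * baseline + true_effect * arm + rng.gauss(0.0, 1.0)
            # Assumed MAR rule: dropout depends only on the observed baseline.
            # Low-baseline participants drop out three times as often, and the
            # overall missing proportion is about p_missing (for p <= 2/3).
            p_drop = 1.5 * p_missing if baseline < 0 else 0.5 * p_missing
            if rng.random() >= min(1.0, p_drop):
                observed[arm].append(outcome)
    if not observed[0] or not observed[1]:
        return None
    mean0 = sum(observed[0]) / len(observed[0])
    mean1 = sum(observed[1]) / len(observed[1])
    return mean1 - mean0


def run_simulation(n_sims=1000, n_per_arm=50, true_effect=2.0,
                   p_missing=0.3, seed=1):
    """RMSE of the complete-case estimator across n_sims replicates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        est = simulate_estimate(rng, n_per_arm, true_effect, p_missing)
        if est is not None:
            estimates.append(est)
    return rmse(estimates, true_effect)


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.6):
        print(f"missing={p:.0%}  RMSE={run_simulation(p_missing=p):.3f}")
```

Running the script prints one RMSE per missing-data proportion. To compare approaches as in the paper, the complete-case estimator would be replaced by each method under study, with RMSE computed in the same way.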