Systematic review identifies the design and methodological conduct of studies on machine learning-based prediction models.
Andaur Navarro CL., Damen JA., van Smeden M., Takada T., Nijman SW., Dhiman P., Ma J., Collins GS., Bajpai R., Riley RD., Moons KG., Hooft L.
OBJECTIVE: We sought to summarize the study design, modelling strategies, and performance measures reported in studies on clinical prediction models developed using machine learning techniques. STUDY DESIGN AND SETTING: We searched PubMed for articles published between 01/01/2018 and 31/12/2019 that described the development, or the development with external validation, of a multivariable prediction model using any supervised machine learning technique. No restrictions were made based on study design, data source, or predicted patient-related health outcomes. RESULTS: We included 152 studies: 58 (38.2% [95%CI 30.8-46.1]) were diagnostic and 94 (61.8% [95%CI 53.9-69.2]) were prognostic. Most studies reported only the development of prediction models (n=133, 87.5% [95%CI 81.3-91.8]), focused on binary outcomes (n=131, 86.2% [95%CI 79.8-90.8]), and did not report a sample size calculation (n=125, 82.2% [95%CI 75.4-87.5]). The most common algorithms used were support vector machine (n=86/522, 16.5% [95%CI 13.5-19.9]) and random forest (n=73/522, 14% [95%CI 11.3-17.2]). Values for area under the Receiver Operating Characteristic curve ranged from 0.45 to 1.00. Calibration metrics were often missing (n=494/522, 94.6% [95%CI 92.4-96.3]). CONCLUSIONS: Our review revealed that greater attention is needed to the handling of missing values, methods for internal validation, and reporting of calibration to improve the methodological conduct of studies on machine learning-based prediction models. SYSTEMATIC REVIEW REGISTRATION: PROSPERO, CRD42019161764.
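The proportions above are reported with 95% confidence intervals. The abstract does not state how these intervals were computed; the sketch below assumes a Wilson score interval, which appears to agree with the reported figures for 58 of 152 diagnostic studies. The function name `wilson_interval` is hypothetical and used only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, total: int, level: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its interval method;
    Wilson is used here purely as an illustration.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Diagnostic studies: 58 of 152 included studies.
low, high = wilson_interval(58, 152)
print(f"{58 / 152:.1%} [95%CI {low * 100:.1f}-{high * 100:.1f}]")
```

Unlike the simple normal-approximation (Wald) interval, the Wilson interval stays within 0 and 1 and behaves better for proportions near the extremes, such as the 94.6% of models without calibration metrics.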