Cookies on this website

We use cookies to ensure that we give you the best experience on our website. If you click 'Accept all cookies' we'll assume that you are happy to receive all cookies and you won't see this message again. If you click 'Reject all non-essential cookies' only necessary cookies providing core functionality such as security, network management, and accessibility will be enabled. Click 'Find out more' for information on how to change your cookie settings.

OBJECTIVE: To describe the frequency of open science practices in a contemporary sample of studies developing prognostic models using machine learning methods in the field of oncology. STUDY DESIGN AND SETTING: We conducted a systematic review, searching the MEDLINE database between December 1, 2022, and December 31, 2022, for studies developing a multivariable prognostic model using machine learning methods (as defined by the authors) in oncology. Two authors independently screened records and extracted open science practices. RESULTS: We identified 46 publications describing the development of a multivariable prognostic model. The adoption of open science principles was poor. Only one study reported availability of a study protocol, and only one study was registered. Funding statements and conflicts of interest statements were common. Thirty-five studies (76%) provided data sharing statements, with 21 (46%) indicating data were available on request to the authors and seven declaring data sharing was not applicable. Two studies (4%) shared data. Only 12 studies (26%) provided code sharing statements, including 2 (4%) that indicated the code was available on request to the authors. Only 11 studies (24%) provided sufficient information to allow their model to be used in practice. The use of reporting guidelines was rare: eight studies (18%) mentioning using a reporting guideline, with 4 (10%) using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis Or Diagnosis statement, 1 (2%) using Minimum Information About Clinical Artificial Intelligence Modeling and Consolidated Standards Of Reporting Trials-Artificial Intelligence, 1 (2%) using Strengthening The Reporting Of Observational Studies In Epidemiology, 1 (2%) using Standards for Reporting Diagnostic Accuracy Studies, and 1 (2%) using Transparent Reporting of Evaluations with Nonrandomized Designs. CONCLUSION: The adoption of open science principles in oncology studies developing prognostic models using machine learning methods is poor. Guidance and an increased awareness of benefits and best practices of open science are needed for prediction research in oncology.

Original publication




Journal article


J clin epidemiol

Publication Date





Code sharing, Data sharing, Machine learning, Open science, Prognosis, Reporting