BACKGROUND: Clinical prediction models are widely used in health and medical research. The area under the receiver operating characteristic curve (AUC) is a frequently used estimate to describe the discriminatory ability of a clinical prediction model. The AUC is often interpreted relative to thresholds, with "good" or "excellent" models defined at 0.7, 0.8 or 0.9. These thresholds may create targets that result in "hacking", where researchers are motivated to re-analyse their data until they achieve a "good" result.

METHODS: We extracted AUC values from PubMed abstracts to look for evidence of hacking. We used histograms of the AUC values in bins of size 0.01 and compared the observed distribution to a smooth distribution from a spline.

RESULTS: The distribution of 306,888 AUC values showed clear excesses above the thresholds of 0.7, 0.8 and 0.9 and shortfalls below the thresholds.

CONCLUSIONS: The AUCs for some models are over-inflated, which risks exposing patients to sub-optimal clinical decision-making. Greater modelling transparency is needed, including published protocols, and data and code sharing.
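The METHODS describe binning reported AUC values in widths of 0.01 and comparing the observed histogram to a smooth spline fit, looking for excesses just above and shortfalls just below the 0.7, 0.8 and 0.9 thresholds. The sketch below illustrates that idea in Python on simulated AUC values; it is not the authors' code, and the simulated data, smoothing factor and variable names are illustrative assumptions only.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical AUC values standing in for the 306,888 values extracted from
# PubMed abstracts; the real data are not reproduced here.
rng = np.random.default_rng(0)
aucs = np.clip(rng.normal(0.78, 0.08, size=5000), 0.5, 1.0)

# Histogram in bins of width 0.01, as described in the METHODS.
edges = np.linspace(0.50, 1.00, 51)
counts, _ = np.histogram(aucs, bins=edges)
centres = (edges[:-1] + edges[1:]) / 2

# Smooth the binned counts with a spline; the smoothing factor is an
# arbitrary illustrative choice, not the value used in the paper.
spline = UnivariateSpline(centres, counts, s=len(centres) * counts.var() * 0.05)
expected = spline(centres)

# Compare observed counts with the smooth expectation in the bins just above
# and just below each threshold: an excess above and a shortfall below would
# be consistent with the "hacking" pattern reported in the RESULTS.
for t in (0.7, 0.8, 0.9):
    above = np.argmin(np.abs(centres - (t + 0.005)))
    below = np.argmin(np.abs(centres - (t - 0.005)))
    print(f"threshold {t}: observed - expected = "
          f"{counts[above] - expected[above]:+.1f} (just above), "
          f"{counts[below] - expected[below]:+.1f} (just below)")
```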

Original publication

DOI

10.1186/s12916-023-03048-6

Type

Journal article

Journal

BMC Medicine

Publication Date

04/09/2023

Volume

21

Keywords

Area under curve, Diagnosis, Hacking, Prediction model, Prognosis, Receiver operating characteristic, Statistics, Humans, Models, Statistical, Biomedical Research, ROC Curve