BACKGROUND: Machine-learning methods are gaining popularity for predicting medical events, but their added value over other methods is still to be determined. We compared the performance of clinical prediction models for bronchopulmonary dysplasia (BPD) or death in very preterm infants using logistic regression and random forest methods.

METHODS: Two population-based cohorts of very preterm infants were used: EPIPAGE-2 (France, 2011) for development and internal validation, and EPICE (Europe, 2011) for external validation. Eligible infants were born before 30 weeks' gestation and admitted to neonatal units. BPD was defined as any respiratory support at 36 weeks postmenstrual age. Candidate predictors were available shortly after birth or at day 3. The performance of the logistic regression and random forest models was assessed in terms of discrimination (c-statistic) and calibration (calibration plots).

RESULTS: Prevalence of BPD/death was 32.1% (668/1923) in EPIPAGE-2 and 41.0% (1368/3335) in EPICE. At both time points, logistic regression and random forest models showed similar performance during internal validation. At birth, external validation in EPICE showed good discrimination (logistic regression: c-statistic 0.81, 95% CI 0.80-0.83; random forest: c-statistic 0.80, 95% CI 0.79-0.81), but both models underestimated the probability of BPD/death. Model performance was heterogeneous across European regions.

CONCLUSIONS: Both modelling methods performed similarly in predicting BPD/death shortly after birth in very preterm children.

IMPACT: Whether machine-learning methods predict short-term respiratory outcomes in very preterm infants better than logistic regression models is debated.

Random forest-based prediction models did not perform better than logistic regression in predicting bronchopulmonary dysplasia or death shortly after birth in very preterm infants.

Calibration performance varied among European countries.

While offering the same performance, regression models are easier to understand, to disseminate and to apply to different populations.
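
The two performance measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the six-infant toy data are hypothetical, and the calibration summary used here (observed event rate minus mean predicted risk, often called calibration-in-the-large) is a simpler stand-in for the calibration plots the study reports. The c-statistic is computed as the share of (event, non-event) pairs in which the event received the higher predicted risk, with ties counted as half.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """Share of (event, non-event) pairs ranked correctly; ties count as 0.5."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            concordant += 1.0
        elif p_event == p_non_event:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_in_the_large(y_true, y_prob):
    """Observed event rate minus mean predicted risk.

    A positive value means the model underestimates risk on average.
    """
    observed_rate = sum(y_true) / len(y_true)
    mean_predicted = sum(y_prob) / len(y_prob)
    return observed_rate - mean_predicted


# Hypothetical toy data (NOT from EPIPAGE-2 or EPICE): 1 = BPD or death.
outcomes = [0, 0, 1, 0, 1, 1]
predicted_risk = [0.10, 0.20, 0.30, 0.30, 0.40, 0.60]

print("c-statistic:", round(c_statistic(outcomes, predicted_risk), 3))
print("observed minus predicted:",
      round(calibration_in_the_large(outcomes, predicted_risk), 3))
```

On this toy data the predictions rank most pairs correctly yet sit below the observed event rate on average, which is the same qualitative pattern the abstract describes for external validation: good discrimination together with underestimated risk.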

Original publication

DOI

10.1038/s41390-025-04170-2

Type

Journal article

Journal

Pediatr Res

Publication Date

12/06/2025