Confounding factors and biases abound when predicting molecular biomarkers from histological images.

Dawood M., Branson K., Tejpar S., Rajpoot N., Minhas FUAA.

Deep learning models that infer clinically relevant biomarker status from tissue images are being explored as rapid and low-cost alternatives to molecular testing. Here we show, through statistical analysis across multiple cancer types, datasets and modelling approaches, that the datasets used to train these models contain strong dependencies between biomarkers and clinicopathological features, which prevent models from isolating the effect of a single biomarker and lead them to learn confounded signals. Consequently, their prediction accuracy varies substantially with the status of codependent biomarkers and clinicopathological variables, and for several biomarkers, the gain over what a pathologist can already infer from routine histopathological features, such as grade, remains modest. These findings indicate that current approaches are not yet suitable as substitutes for molecular testing but can support triage or complementary decision-making with caution. Unconfounded biomarker prediction will require models that learn causal rather than correlational relationships between biomarkers and tissue morphology.

DOI: 10.1038/s41551-026-01616-8

Type: Journal article

Publication Date: 2026-03-02
