Ethnicity, Health Equity and AI

In this video, Sara and her team discuss their study on ethnicity data in NHS England, and how they will use this research to improve health equity in research and practice.
Research groups
- Planetary Health Informatics
- Big Health Data Research
- Pharmaco- and Device epidemiology
- EHDEN - European Health Data & Evidence Network
- HDRUK - Health Data Research UK
- OHDSI - Observational Health Data Sciences and Informatics
- National Geographic Society - Early Career Explorer
- OPTIMA - Tackling cancer with artificial intelligence and real-world data
- UKRI NERC - Creating Digital Environments
- Centre for Statistics in Medicine - Research Group
- ATLAS Programme - ATLAS - Enhanced Recovery for Arthroplasty Patients
- STAR Programme - STAR - Support and Treatment After Replacement
Sara Khalid
BE, MSc (Oxon), DPhil
Associate Professor of Health Informatics and Biomedical Data Science
- Group Head - Planetary Health Informatics, Centre for Statistics in Medicine
- Senior Research Fellow in Biomedical Data Science and Health Informatics
- Machine Learning Lead - Pharmaco-device Epidemiology Group, NDORMS
- UKRI NERC Senior Fellow in Creating Digital Environments
- National Geographic Explorer - Remote Monitoring and Machine Learning
- Former Ambassador for Women in Data Science - University of Oxford
Health Informatics, Intelligent Patient Monitoring, Planetary Health, Real-world Data Science
RESEARCH
Sara leads the Planetary Health Informatics Group at the Centre for Statistics in Medicine (Oxford) and the Machine Learning and Big Data Team of the Health Data Sciences Section in NDORMS (Oxford), which she joined in 2016. She is also affiliated with the Institute of Biomedical Engineering (Oxford), where she completed her doctoral and post-doctoral research in the Biomedical Signal Processing and Image Analysis Groups.
Her research applies artificial intelligence to international real-world health data to further our understanding of disease and fill gaps in global health, leveraging common data models and federated network analytics. She works closely with clinicians, engineers, clinical and environmental epidemiologists, conservationists, data scientists, and public and patient groups in the UK, Europe, Latin America, South Asia, and Africa to co-create models for equitable and ethical solutions to planetary health problems.
Sara completed her DPhil in Engineering Science at the IBME, University of Oxford, as a Rhodes Scholar. Prior to that she received a Distinction for her MSc in Biomedical Engineering from the University of Oxford in 2009, as a Qualcomm Scholar. In 2007 she graduated with a BE in Electronics Engineering from the National University of Sciences and Technology in Pakistan.
Teaching and Supervision
Sara teaches a number of health data science courses at NDORMS and University-wide, and is Director of the "Observational health data science: epidemiology, machine learning, and health economics" course. She is also a faculty member at the NIHR BRC course "Data analysis: statistics - designing clinical research and biostatistics", and the "Real-world epidemiology with OMOP common data model" summer school organised by the Health Data Science Section at NDORMS.
Sara supervises a number of research students including DPhil and MSc students at Oxford, as well as UK and overseas PhD students. Interested students are welcome to get in touch.
Recent publications
-
Sociodemographic factors, biomarkers and comorbidities associated with post-acute COVID-19 sequelae in UK Biobank.
Journal article
Alcalde-Herraiz M. et al, (2025), Nat commun, 16
-
Ethnic disparities in COVID-19 mortality and cardiovascular disease in England and Wales between 2020-2022.
Journal article
Pineda-Moncusí M. et al, (2025), Nat commun, 16
-
Is Fine-Tuning Useful in EHR-Based Prediction Models? A Use Case on Mortality Prediction with Longitudinal Data from Spanish (SIDIAP) and UK (CPRD) Populations Aged Over 65 Years
Conference paper
Carrasco-Ribelles LA. et al, (2025), Proceedings IEEE symposium on computer based medical systems, 00, 107 - 110
-
Clusters of post-acute COVID-19 symptoms: a latent class analysis across 9 databases and 7 countries.
Journal article
López-Güell K. et al, (2025), J clin epidemiol
-
Recommendations for Successful Development and Implementation of Digital Health Technology Tools
Journal article
Loo RTJ. et al, (2025), Journal of medical internet research, 27, e56747 - e56747
-
Use of Machine Learning to Compare Disease Risk Scores and Propensity Scores Across Complex Confounding Scenarios: A Simulation Study.
Journal article
Guo Y. et al, (2025), Pharmacoepidemiol drug saf, 34
-
Advancing breast, lung and prostate cancer research with federated learning. A systematic review.
Journal article
Ankolekar A. et al, (2025), Npj digit med, 8
-
Causal Forests versus Inverse Probability of Treatment Weighting to adjust for Cluster-Level Confounding: A Parametric and Plasmode Simulation Study based on US Hospital Electronic Health Record Data
Preprint
Du M. et al, (2025)
-
Incidence and prevalence of asthma, chronic obstructive pulmonary disease and interstitial lung disease between 2004 and 2023: harmonised analyses of longitudinal cohorts across England, Wales, South-East Scotland and Northern Ireland.
Journal article
Whittaker H. et al, (2025), Thorax