Planetary Health Informatics explained
Ethnicity, Health Equity and AI

In this video, Sara and her team discuss their study on ethnicity data in NHS England, and how they will use this research to improve health equity in research and practice.
Research groups
- Planetary Health Informatics
- Big Health Data Research
- Pharmaco- and Device epidemiology
- EHDEN - European Health Data & Evidence Network
- HDRUK - Health Data Research UK
- OHDSI - Observational Health Data Sciences and Informatics
- National Geographic Society - Early Career Explorer
- OPTIMA - Tackling cancer with artificial intelligence and real-world data
- UKRI NERC - Creating Digital Environments
- Centre for Statistics in Medicine - Research Group
- ATLAS Programme - ATLAS - Enhanced Recovery for Arthroplasty Patients
- STAR Programme - STAR - Support and Treatment After Replacement
Sara Khalid
BE, MSc (Oxon), DPhil
Associate Professor of Health Informatics and Biomedical Data Science
- Group Head - Planetary Health Informatics, Centre for Statistics in Medicine
- Senior Research Fellow in Biomedical Data Science and Health Informatics
- Machine Learning Lead - Pharmaco-device Epidemiology Group, NDORMS
- UKRI NERC Senior Fellow in Creating Digital Environments
- National Geographic Explorer - Remote Monitoring and Machine Learning
- Former Ambassador for Women in Data Science - University of Oxford
Health Informatics, Intelligent Patient Monitoring, Planetary Health, Real-world Data Science
RESEARCH
Sara leads the Planetary Health Informatics Group at the Centre for Statistics in Medicine (Oxford) and the Machine Learning and Big Data Team of the Health Data Sciences Section in NDORMS (Oxford), which she joined in 2016. She is also affiliated with the Institute of Biomedical Engineering (Oxford), where she completed her doctoral and post-doctoral research in the Biomedical Signal Processing and Image Analysis Groups.
Her research applies artificial intelligence to international real-world health data to further our understanding of disease and fill gaps in global health, leveraging common data models and federated network analytics. She works closely with clinicians, engineers, clinical and environmental epidemiologists, conservationists, data scientists, and public and patient groups in the UK, Europe, Latin America, South Asia, and Africa to co-create models for equitable and ethical solutions to planetary health problems.
Sara completed her DPhil in Engineering Science at the IBME, University of Oxford, as a Rhodes Scholar. Before that, she received a Distinction for her MSc in Biomedical Engineering from the University of Oxford in 2009, as a Qualcomm Scholar. In 2007, she graduated with a BE in Electronics Engineering from the National University of Sciences and Technology in Pakistan.
Teaching and Supervision
Sara teaches a number of health data science courses at NDORMS and University-wide, and is Director of the "Observational health data science: epidemiology, machine learning, and health economics" course. She is also a faculty member at the NIHR BRC course "Data analysis: statistics - designing clinical research and biostatistics", and the "Real-world epidemiology with OMOP common data model" summer school organised by the Health Data Science Section at NDORMS.
Sara supervises a number of research students including DPhil and MSc students at Oxford, as well as UK and overseas PhD students. Interested students are welcome to get in touch.
Recent publications
Evaluating large language models for clinical note processing: local fine-tuning and internal-external validation using electronic health records from South Asia.
Journal article
Hasheminasab SA. et al, (2026), BMC Med Inform Decis Mak
Mapping the potential and limitations of using generative AI technologies to address socio-economic challenges in LMICs.
Journal article
Adams R. et al, (2026), Nat Comput Sci
Detecting brick kiln infrastructure at scale: graph, foundation, and remote sensing models for satellite imagery data
Conference paper
Nazir U. et al, (2026)
Causal Forests Versus Inverse Probability of Treatment Weighting to Adjust for Cluster-Level Confounding: A Parametric and Plasmode Simulation Study Based on US Hospital Electronic Health Record Data.
Journal article
Du M. et al, (2025), Pharmacoepidemiol Drug Saf, 34
Changes in use and utilisation patterns of drugs with reported shortages between 2010 and 2024 in Europe and North America: a network cohort study
Journal article
Pineda-Moncusí M. et al, (2025), The Lancet Public Health, 10, e835 - e847
Clusters of post-acute COVID-19 symptoms: a latent class analysis across 9 databases and 7 countries.
Journal article
López-Güell K. et al, (2025), J Clin Epidemiol, 185
Is fine-tuning useful in EHR-based prediction models? A use case on mortality prediction with longitudinal data from Spanish (SIDIAP) and UK (CPRD) populations aged over 65 years
Conference paper
Carrasco-Ribelles LA. et al, (2025), 107 - 110
Ethnic disparities in COVID-19 mortality and cardiovascular disease in England and Wales between 2020-2022.
Journal article
Pineda-Moncusí M. et al, (2025), Nat Commun, 16
Developing and externally validating a multivariable prediction model to predict the risk of developing psoriatic arthritis in adults newly diagnosed with psoriasis in primary care: an observational cohort study
Conference paper
Vivekanantham A. et al, (2025), Annals of the Rheumatic Diseases, 84, 562 - 563