The Covid pandemic has highlighted inequalities in health systems around the world. However, inequity is not limited to the pandemic – it is in fact a long-standing and multifaceted issue. In addition to socio-economic complexities, imbalances in healthcare technologies can worsen existing biases. 

An example is the artificial intelligence technology behind clinical prediction models. If the data used to train the models are imbalanced, or if there are algorithmic biases within the analytical pipeline, the resulting models can be biased and can mis-estimate patients' health risks in real time. This in turn can lead to some groups of patients being under- or over-prioritised.
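One simple way to see this kind of mis-estimation is to compare, for each patient group, the number of events that actually occurred with the number the model predicted. The sketch below is illustrative only and is not the project's method: the function name `observed_expected_by_group`, the group labels and the toy records are all made-up assumptions, and no study data are used.

```python
from collections import defaultdict


def observed_expected_by_group(records):
    """Return the observed/expected event ratio for each group.

    `records` is an iterable of (group, predicted_risk, outcome) tuples,
    where predicted_risk is a probability in [0, 1] and outcome is 0 or 1.
    A ratio above 1 suggests risk is under-estimated for that group;
    a ratio below 1 suggests it is over-estimated.
    """
    observed = defaultdict(float)
    expected = defaultdict(float)
    for group, predicted_risk, outcome in records:
        observed[group] += outcome          # events that actually happened
        expected[group] += predicted_risk   # events the model expected
    # Skip groups where the model expected no events (ratio undefined).
    return {g: observed[g] / expected[g] for g in expected if expected[g] > 0}


# Toy, made-up records (hypothetical groups, not study data).
toy_records = [
    ("group_a", 0.25, 0), ("group_a", 0.25, 1),
    ("group_a", 0.25, 0), ("group_a", 0.25, 0),
    ("group_b", 0.25, 1), ("group_b", 0.25, 1),
    ("group_b", 0.25, 0), ("group_b", 0.25, 0),
]

ratios = observed_expected_by_group(toy_records)
for group, ratio in sorted(ratios.items()):
    print(group, round(ratio, 2))
```

In this toy example the model predicts the same risk for everyone, yet events are twice as common in `group_b`, so that group's risk is under-estimated and its patients could be under-prioritised.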

This research will develop prediction models that are based on bias-minimisation guidelines (developed by the UK EQUATOR Centre, housed in the Centre for Statistics in Medicine) and that are tailored to specific patient groups, including patients with different ethnic backgrounds, patients with rare conditions and patients with disabilities. By addressing any sources of bias in the data and in the analytical pipelines, prediction models can be made more targeted and equitable.

The project is conducted via Trusted Research Environments, such as NHS Digital and SAIL. The study uses routinely collected data from UK GDPPR/GPES, Hospital Episode Statistics (HES), and the Office for National Statistics.

Patient and public engagement and involvement will be an important element of this research.  

Study materials 

Ethnicity, Data, Health Research poster - Punjabi  

Ethnicity, Data, Health Research poster - Gujarati

Ethnicity, Data, Health Research poster - English 

Ethnicity, Data, Health Research Project infographic  


1/10 patients in England don't have an ethnicity record 

Individuals with no ethnicity record tend to be younger and are more likely to be male than individuals with a recorded ethnicity.

 One in ten map


Granularity of ethnicity concepts 

Ethnicity data recorded in the National Health Service in the UK can be disaggregated from the 6 high-level ethnic groups (Asian, Black/African/Caribbean, White, Mixed, Other Ethnic Groups and Unknown), to 19 NHS ethnicity codes and up to 489 SNOMED-CT ethnic concepts.
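The levels of granularity described above can be pictured as a roll-up from detailed codes to broad groups. The sketch below is a minimal illustration under stated assumptions: the dictionary `NHS_CODE_TO_GROUP` covers only a handful of NHS ethnic category codes chosen as examples, the helper `to_high_level_group` is hypothetical, and neither is the project's actual codelist or grouping.

```python
from collections import Counter

# Illustrative subset only: a few NHS ethnic category codes rolled up to
# the high-level groups. This is NOT the full 19-code list and NOT the
# project's codelist; the grouping shown here is an assumption.
NHS_CODE_TO_GROUP = {
    "A": "White",                    # British
    "D": "Mixed",                    # White and Black Caribbean
    "H": "Asian",                    # Indian
    "M": "Black/African/Caribbean",  # Caribbean
    "Z": "Unknown",                  # Not stated
}


def to_high_level_group(nhs_code):
    """Map one NHS ethnicity code to a high-level group.

    Missing codes, and codes outside the illustrative subset above,
    fall back to "Unknown".
    """
    if nhs_code is None:
        return "Unknown"
    return NHS_CODE_TO_GROUP.get(nhs_code.strip().upper(), "Unknown")


# Made-up example codes (not study data), including a missing record.
example_codes = ["A", "h", "M", None, "Z", "A"]
group_counts = Counter(to_high_level_group(code) for code in example_codes)
print(group_counts)
```

Rolling up in this direction loses detail, which is why keeping the more granular codes and concepts matters when models are tailored to specific patient groups.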

Tree diagram


Ethnicity breakdown N (%) in primary care data (England) 

From primary care records (GDPPR data source), we observed that 9.8% of individuals self-identified as Asian/Asian British, 3.6% as Black/African/Caribbean/Black British, 77.3% as White, 2.2% as Mixed and 3.6% as Other Ethnic Groups, while 3.2% were recorded as Unknown/Not stated.