Analysing potential biases in health-specific Large Language Models (LLMs) for global use, trained on real-world patient data.

This project works with Artificial Intelligence in the form of Large Language Models (LLMs), training them to extract and present important information from complex and lengthy patient records to clinicians, supporting time-sensitive decision-making.


Healthcare settings in South Asia currently face unprecedented demand, and most are over-subscribed and under-resourced. To support clinicians in time-sensitive decision-making, this project is applying AI technology to extract and present relevant information from patient histories, most of which is contained in long full-text patient notes. Using open-source LLMs that have already been trained on clinical data, our researchers are collaborating with Shaukat Khanum Memorial Cancer Hospital and Research Centre in Pakistan to fine-tune these models on local medical data, creating a model better suited for use in South Asia.

This project won a Grand Challenges Catalysing Equitable Artificial Intelligence Use award from the Gates Foundation, which supports the use of Artificial Intelligence and LLMs in low- and middle-income countries in pursuit of health equity. Read more about this project on the Grand Challenges website.

Initially, our goal is for the LLMs to automatically extract essential medical concepts from patient records and answer healthcare practitioners’ questions. From a technical standpoint, these models will analyse patients’ medical histories, which are complex and data-rich, and transform them into rich representations. Other researchers can then use these representations for various applications, such as predicting medical events or mapping adverse event relationships for drugs.

Ultimately, the team aims to assess whether open-source clinical LLMs contain biases arising from the data they are trained on. Typically, these LLMs are trained on large collections of digital patient histories, but these collections overrepresent high-income countries and populations, meaning that the resultant LLMs can contain biases that make them less accurate in low- or middle-income countries. To reveal the impact of these biases, our researchers will compare the performance of models trained on Pakistani patients’ medical histories with that of models trained on the publicly available datasets, identifying differences in the models’ outcomes. Understanding the effects of these biases will help prevent them from perpetuating health disparities within the region as AI technology is adopted by healthcare workers.
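
As a rough illustration of the comparison described above, the sketch below scores two sets of model predictions against the same reference labels and reports the difference. It is a minimal sketch and not the project's actual evaluation method: the choice of accuracy as the metric, the function names `accuracy` and `performance_gap`, and the toy labels are all assumptions made for this example.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one labelled example is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def performance_gap(
    local_predictions: Sequence[str],
    public_predictions: Sequence[str],
    labels: Sequence[str],
) -> float:
    """Accuracy of the locally fine-tuned model minus that of the public one.

    A positive value means the locally fine-tuned model (hypothetical here)
    did better on this test set than the model trained only on public data.
    """
    return accuracy(local_predictions, labels) - accuracy(public_predictions, labels)


if __name__ == "__main__":
    # Toy, invented data: concepts a clinician marked as correct for four notes.
    labels = ["anaemia", "diabetes", "tuberculosis", "asthma"]
    # Hypothetical outputs of a model fine-tuned on local records.
    local_predictions = ["anaemia", "diabetes", "tuberculosis", "copd"]
    # Hypothetical outputs of a model trained only on public datasets.
    public_predictions = ["anaemia", "copd", "pneumonia", "copd"]

    gap = performance_gap(local_predictions, public_predictions, labels)
    print(f"Accuracy gap (local minus public): {gap:.2f}")
```

In practice a single accuracy figure would be only a starting point. Comparing results separately for different patient groups or conditions would show where the two models diverge most.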