  • Project No: KTPS-NC-17
  • Intake: 2021 KTPS-NC


The human immune system is highly dynamic, and most of the effector functions that immune cells perform are only seen when these cells encounter a pathogen or receive activating signals from other cells. This makes studying the genetics of the immune system difficult, as genetic variation may have no effect on baseline (i.e. unstimulated) cells and may only manifest when the cells are provided with a specific type of stimulus. As a result, studying the genetic risk factors for inflammatory or autoimmune conditions can be challenging, as we have to find the “correct” stimulation that causes the genetic variants to become active. As each such stimulation requires a large, expensive genetic mapping study to give an answer either way, only limited numbers of stimulations can be studied.

However, even when a genetic variant has no effect on effector functions (such as gene expression) at baseline, it may still have detectable genetic effects on other measurable quantitative traits (such as protein binding). When combined with non-genetic information (e.g. maps of gene regulation after a specific stimulation), it may be possible to predict which genetic variants are likely to impact gene expression after stimulation, without carrying out a full-scale genetic experiment. There is also information contained within the sequence context itself (e.g. through prediction of non-coding regulatory elements such as uORFs) and in evolutionary measures such as conservation or haploinsufficiency, which can give important clues as to how a genetic variant will impact molecular phenotypes.
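One way to picture the idea of combining baseline genetic signals with non-genetic annotations is a simple classifier that scores each variant. The sketch below is only an illustration under stated assumptions, not the method this project will use: the three features (absolute z-score of a baseline chromatin QTL, overlap with a stimulation-responsive regulatory region, and a conservation score), the toy values, and the function names are all hypothetical, and logistic regression is chosen purely for simplicity.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict(weights, x):
    """Probability that a variant is stimulation-responsive, given weights [bias, w1, ...]."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    n = len(features)
    weights = [0.0] * (len(features[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(features, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Hypothetical toy data, invented for illustration only. Each row is
# (|z| of baseline chromatin QTL, in stimulation-responsive region 0/1, conservation 0-1).
TOY_FEATURES = [
    (4.1, 1, 0.8),
    (3.5, 1, 0.6),
    (0.3, 0, 0.2),
    (0.5, 1, 0.1),
    (3.9, 0, 0.7),
    (0.2, 0, 0.3),
    (2.8, 1, 0.9),
    (0.9, 0, 0.4),
]
# 1 = variant affects expression after stimulation, 0 = it does not (toy labels).
TOY_LABELS = [1, 1, 0, 0, 1, 0, 1, 0]

weights = fit_logistic(TOY_FEATURES, TOY_LABELS)
strong_candidate = predict(weights, (3.8, 1, 0.7))
weak_candidate = predict(weights, (0.4, 0, 0.2))
```

In practice the labels would come from a stimulated expression dataset held out for validation, and the features would need to be chosen and scaled with much more care than this toy example suggests.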

The first aim of this project will be to build a predictive algorithm to infer which genetic variants are likely to impact gene expression after stimulation, using only a combination of baseline genetic association data and non-genetic information for that stimulation condition. We will test this algorithm on publicly available data from primary monocytes and iPSC-derived macrophages, accessed from the EBI eQTL Catalogue. The second aim will be to use this algorithm, in combination with summary statistics from genome-wide association studies of inflammatory and autoimmune diseases, to infer which stimulation conditions are most strongly associated with the action of genetic risk variation for each disease.
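As a rough illustration of the second aim, one could ask, for each stimulation condition, whether disease-associated variants are over-represented among the variants predicted to be active in that condition. The sketch below uses a one-sided hypergeometric (Fisher-style) enrichment test as a stand-in; the condition names, counts, and function name are hypothetical, and a real analysis would also need to account for linkage disequilibrium and uncertainty in the predictions.

```python
from math import comb


def enrichment_p_value(overlap, n_predicted, n_disease, n_total):
    """One-sided hypergeometric test: P(X >= overlap).

    X is the number of disease-associated variants that fall among the
    n_predicted variants, when n_disease of n_total variants are
    disease-associated and membership is otherwise random.
    """
    upper = min(n_predicted, n_disease)
    favourable = sum(
        comb(n_predicted, k) * comb(n_total - n_predicted, n_disease - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(n_total, n_disease)


# Hypothetical toy counts, invented for illustration only.
N_TOTAL = 1000    # variants tested
N_DISEASE = 50    # variants associated with the disease
TOY_CONDITIONS = {
    "condition_A": {"n_predicted": 100, "overlap": 15},
    "condition_B": {"n_predicted": 100, "overlap": 5},
}

p_values = {
    name: enrichment_p_value(c["overlap"], c["n_predicted"], N_DISEASE, N_TOTAL)
    for name, c in TOY_CONDITIONS.items()
}

# Conditions ordered from most to least enriched for disease variants.
ranked = sorted(p_values, key=p_values.get)
```

Ranking conditions by this p-value gives a first-pass ordering of which stimulations look most relevant to a disease; it requires Python 3.8 or later for `math.comb`.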

In the longer term, there are multiple directions that this DPhil can take. The first is to develop more sophisticated statistical or machine learning approaches to refine predictions, and to produce further insight into the pathways driving stimulation-specific gene regulation in immune cells. The second is to follow up the biology of genetic risk for specific diseases on the basis of predicted stimulated QTLs, including generating new genomic data from patients to characterise the effect of these variants.

Note that this project will be entirely computational in the first instance. However, there is significant scope for the full DPhil to include an experimental or data generating component, in collaboration with other members of our research groups and/or external collaborators.


Co-Supervisor 1: Luke Jostins-Dean, Kennedy Institute of Rheumatology

Co-Supervisor 2: Nicky Whiffin, Wellcome Trust Centre for Human Genetics


  • Genetics
  • Statistics
  • Functional genomics
  • Immune cells
  • Inflammatory disease


  • Training in using R, if required. We will develop an R training plan, with weekly feedback on code from Luke or another experienced R user.
  • Training and experience in using publicly available databases, including accessing data through APIs.
  • Gaining experience in high-performance computing, statistical modelling, machine learning and statistical genetics, and knowledge of a range of high-throughput genome-wide datasets.
  • Opportunity to gain experience and feedback on presenting scientific data (the student will give presentations at our group meeting and the Kennedy-wide Genomics Forum, in addition to the GMS presentation).
  • Opportunity to develop active collaborations between computational and wet-lab scientists.


  • Fairfax BP, Humburg P, Makino S, et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science. 2014;343(6175):1246949
  • Alasoo K, Rodrigues J, Mukhopadhyay S, et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat Genet. 2018;50(3):424-431. doi:10.1038/s41588-018-0046-z
  • Calderon D, Nguyen MLT, Mezger A, et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat Genet. 2019;51(10):1494-1505. doi:10.1038/s41588-019-0505-9
  • Whiffin N, Karczewski K, Zhang X, et al. Characterising the loss-of-function impact of 5’ untranslated region variants in 15,708 individuals. Nat Commun. 2020;11:2523

