
NDORMS DPhil & MSc by Research

Research project outline

Observational studies of medications and clinical interventions may provide evidence on effectiveness and safety in real-life patients. Regulatory bodies including the FDA and EMA have shown interest in the use of evidence from studies using routinely collected data. However, the critical challenge in these studies is confounding. Methods from statistics, econometrics and artificial intelligence have helped to minimize measured confounding in drug safety and comparative effectiveness research. These methods still have limitations that have not been addressed, for example the comparison of multiple treatments, the handling of missing data, and unmeasured confounding. Machine learning methods have the potential to address these challenges, yet their performance warrants new research.

We aim to assess the performance of different machine learning methods for the observational study of the risks and benefits of musculoskeletal medications and devices as used in actual practice conditions and in potentially all NHS patients.

To do so, we will use routinely collected big health data (i.e. information from national registries, audits, pseudonymised NHS records, and international datasets available via our collaborations, such as the 100 Million Brazilian Cohort) as well as simulated datasets. These will be analysed using machine learning methods in combination with different approaches including (high-dimensional) propensity scores, disease risk scores and marginal structural models. These different observational analyses will be applied to clinical use cases and compared to ongoing surgical randomized controlled trials.

The performance of these methods in handling multiple treatments (more than two levels), unmeasured confounding, and missing data will be compared with that of conventional methods such as regression with multiple imputation.
In addition, they will be applied to simulated datasets to study their ability to minimize confounding and related bias in comparative medication/device risk-benefit studies.
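
As a rough illustration of the kind of simulation study described above (a minimal sketch, not the project's actual analysis pipeline), the following Python example generates a dataset with one measured binary confounder, estimates propensity scores within each confounder stratum, and compares a naive treated-versus-untreated difference with an inverse-probability-weighted estimate. All function names, the seed, and the parameter values (treatment probabilities of 0.2 and 0.8, a true treatment effect of 1.0) are hypothetical choices made for illustration only.

```python
import random


def simulate(n, seed=0):
    """Simulate (confounder, treatment, outcome) rows; the true treatment effect is 1.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0        # measured binary confounder
        p_treat = 0.8 if x == 1 else 0.2          # confounder drives treatment choice
        t = 1 if rng.random() < p_treat else 0
        y = 1.0 * t + 2.0 * x + rng.gauss(0, 1)   # confounder also drives the outcome
        rows.append((x, t, y))
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome between treated and untreated."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def propensity_by_stratum(rows):
    """Estimate P(treated | confounder level) as the observed proportion in each stratum."""
    scores = {}
    for level in (0, 1):
        stratum = [t for x, t, _ in rows if x == level]
        scores[level] = sum(stratum) / len(stratum)
    return scores


def ipw_effect(rows):
    """Normalised inverse-probability-weighted estimate of the average treatment effect."""
    scores = propensity_by_stratum(rows)
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in rows:
        e = scores[x]
        if t == 1:
            w = 1.0 / e
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - e)
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0


if __name__ == "__main__":
    data = simulate(20000, seed=1)
    print("naive estimate:", round(naive_difference(data), 2))
    print("IPW estimate:  ", round(ipw_effect(data), 2))
```

In this toy setting the naive difference is biased away from the simulated effect because the confounder influences both treatment and outcome, whereas the weighted estimate should fall close to it. With many covariates, the stratum proportions would typically be replaced by a fitted model such as logistic regression or a machine learning classifier.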



The DPhil will be jointly supervised by Associate Prof Prieto-Alhambra, Dr M Sanni Ali, Dr Sara Khalid, and Prof Gary Collins, all based at the Oxford Centre for Statistics in Medicine (CSM), NDORMS, University of Oxford.

Prof Daniel Prieto-Alhambra has published extensively in the field of pharmaco-epidemiology, and is recognised internationally as an authority on the use of routine data for musculoskeletal pharmaco- and device epidemiology.

Dr M Sanni Ali is a Senior Researcher in Pharmaco-epidemiologic methods and Assistant Professor of epidemiology at LSHTM. He has extensive expertise in the use, validation and development of pharmaco-epidemiological methods, both for the analysis of routinely collected data and for simulated datasets.

Prof Gary Collins's research interests focus on methodological aspects of the development and validation of multivariable prediction models, and he has published widely in this area. He has a particular focus on the role that big data has in evaluating prediction models.

Current DPhil Students within the group: 5



The Botnar Research Centre plays host to the University of Oxford's Institute of Musculoskeletal Sciences, which enables and encourages research and education into the causes of musculoskeletal disease and their treatment. The proposed project would be part of the work of the Big Health Data Research group.

Training will be provided in research methodology, including the handling and analysis of large datasets, and in advanced statistical and machine learning techniques. Attendance at formal training courses will be encouraged, and will include the "Real world epidemiology Oxford summer school" and advanced statistics courses.

In addition, courses from the Oxford Learning Institute and the Oxford University Computer Sciences on key skills for the completion of a successful DPhil thesis will be available. Additional on-the-job training opportunities will arise, and the supervisors will encourage the student to pursue them.

A core curriculum of lectures organized departmentally will be taken in the first term to provide a solid foundation in a broad range of subjects including epidemiology, health economics, and data analysis.

Students will attend regular seminars within the department and those relevant in the wider University.

Students will be expected to present data regularly at the departmental PGR seminars and at the Centre for Statistics in Medicine (CSM), and to attend external conferences to present their research globally.

Students will also have the opportunity to work closely with the CSM and the London School of Hygiene and Tropical Medicine’s Electronic Medical Record group, and there is potential for collaboration with CIDACS, the Brazilian Center for Data and Knowledge Integration.

Students will have access to various courses run by the Medical Sciences Division Skills Training Team and other departments. All students are required to attend a 2-day Statistical and Experimental Design course at NDORMS.


Related Publications

  • Mortality rates at 10 years after metal-on-metal hip resurfacing compared with total hip replacement in England: retrospective cohort analysis of hospital episode statistics. Kendal AR., Prieto-Alhambra D., Arden NK., Carr A., Judge A. BMJ 2013.
  • Statistical Primer: developing and validating a risk prediction model. Grant SW., Collins GS., Nashef SAM. Eur J Cardiothorac Surg.
  • The Comparative Performance of Logistic Regression and Random Forest in Propensity Score Methods: a Simulation Study. Poster. M Sanni Ali, Sara Khalid, Gary S. Collins, and Daniel Prieto-Alhambra. 33rd ICPE, August 26-30, 2017, Montreal, Canada. Pharmacoepidemiology and Drug Safety 2017;26(Suppl.2): 3–636.
  • Methodological comparison of marginal structural model, time-varying Cox regression and propensity score methods: the example of antidepressant use and the risk of hip fracture. M Sanni Ali, Rolf HH Groenwold, Svetlana V Belitser, Patrick C Souverein, Elisa Martín, Nicolle M Gatto, Consuelo Huerta, Helga Gardarsdottir, Kit C.B. Roes, Arno W Hoes, Antonius de Boer, Olaf H Klungel. Pharmacoepidemiol Drug Safety 2016; Suppl 1:114-21.
  • Best (but often forgotten) Practices: Propensity Score Methods in Clinical Nutrition Research. Ali M Sanni, Rolf HH Groenwold, Klungel OH. American Journal of Clinical Nutrition 2016; 104(2):247-58.
  • Propensity score balance measures in pharmacoepidemiology: a simulation study. M Sanni Ali, Groenwold RH, Pestman WR, Belitser SV, Roes KC, Hoes AW, de Boer A, Klungel OH. Pharmacoepidemiol Drug Safety 2014; 25:770-772.



Requests for additional information regarding the project can be addressed to Associate Prof D Prieto-Alhambra:


How to Apply

Interested applicants should have, or expect to obtain, a first or upper second class BSc degree or equivalent, and will also need to provide evidence of English language competence. The application guide and form are available online, and the DPhil or MSc by Research will commence in October 2019.

For further information, please visit

 Project reference number #NDORMS-2019/3

