Linking satellite data on vegetation levels, nighttime lights, rainfall and temperature to malaria levels in South Asia to improve malaria prediction models.

Photo by Erik Karits on Unsplash

In this project, a collaboration with LUMS in Pakistan, we developed a deep learning model to improve the prediction of malaria outbreaks in South Asia. The model combines vegetation levels, nighttime light pollution, rainfall and temperature data for specific locations and times to produce location- and time-specific predictions. Our study used data spanning 2000-2016 and then tested the prediction model on 2017 outbreaks.

The abstract for this project has just been published in the Lancet Planetary Health journal - read it here

GIF shows two maps of South Asia, on the left showing malaria incidence correlated with rainfall, and on the right showing malaria correlated with temperature

GIF shows malaria levels (red circles) correlated with temperature on the left (Dark Blue to Light Blue), and with rainfall on the right (Yellow to Dark Purple)

Currently, approximately 50% of the global population is at risk of malaria infection, particularly in Africa and South Asia. Outbreaks are also connected to extreme weather events, which are becoming more frequent as climate change continues. Tracking and predicting outbreaks is difficult because data is often extrapolated from small-scale household surveys and because outbreaks are influenced by a broad range of factors.

To combat this, we used a multi-dimensional long short-term memory (LSTM) model, which combines data from satellites or weather stations with historic malaria outbreaks to build a more accurate prediction model. These inputs are either known to influence malaria outbreaks, like the levels of vegetation, rainfall and temperature, or tell us something about the local populations. The nighttime light pollution recorded by satellites was used in this study to represent the socioeconomic status of regions, as lower nighttime light levels are associated with higher levels of poverty.
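As a rough illustration of what "multi-dimensional" input means for an LSTM, the sketch below stacks the variables for one district into fixed-length sequences and holds out 2017 for testing. It is not the project's code: the record layout, the `build_windows` and `split_by_year` helpers, the yearly time step, the window length of 3 and all the numbers are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical feature order; the names mirror the inputs described above.
FEATURES = ["vegetation", "nightlights", "rainfall", "temperature", "incidence"]

Record = Dict[str, float]
Sample = Tuple[List[List[float]], float, int]


def build_windows(series: List[Record], window: int = 3) -> List[Sample]:
    """Turn one district's records into (sequence, target, target_year) samples.

    Each sequence holds `window` consecutive time steps, each a vector of all
    FEATURES; the target is the malaria incidence at the following time step.
    """
    ordered = sorted(series, key=lambda r: r["year"])
    samples = []
    for end in range(window, len(ordered)):
        steps = ordered[end - window:end]
        sequence = [[step[name] for name in FEATURES] for step in steps]
        target = ordered[end]["incidence"]
        samples.append((sequence, target, int(ordered[end]["year"])))
    return samples


def split_by_year(samples: List[Sample], test_year: int = 2017):
    """Hold out samples whose target falls in `test_year`; train on earlier ones."""
    train = [s for s in samples if s[2] < test_year]
    test = [s for s in samples if s[2] == test_year]
    return train, test


# Made-up values for a single hypothetical district, 2000-2017.
district = [
    {
        "year": year,
        "vegetation": 0.40 + 0.01 * i,
        "nightlights": 5.0 + 0.2 * i,
        "rainfall": 800.0 + 10.0 * i,
        "temperature": 25.0 + 0.05 * i,
        "incidence": 12.0 - 0.3 * i,
    }
    for i, year in enumerate(range(2000, 2018))
]

samples = build_windows(district, window=3)
train, test = split_by_year(samples)
print(len(train), len(test))  # 14 training samples, 1 held-out sample for 2017
```

Each training sample is a 3 x 5 matrix (time steps by features), which is the shape a sequence model would consume once samples from many districts are batched together.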

A GIF showing two maps of South Asia, one showing malaria cases as correlated with light pollution (left) and one showing malaria cases correlated with vegetation levels (right).

GIF shows malaria levels (red circles) correlated with rainfall on the left (Dark Purple to Yellow), and with vegetation levels on the right (Light to Dark Green)

Our nighttime light data came from the DMSP OLS (2000-2013) and VIIRS (2014-2017) satellite sensors, while vegetation, temperature and rainfall data were all derived from the ARENA project, which combined Demographic Health Data and geo-referenced environmental data.

When we tested our model, we found that its predictions compared well with the true malaria incidence for the districts of Pakistan, India and Bangladesh, as shown in this figure: purple districts show high accuracy, pink districts show overestimation of risk, and blue districts show underestimation. Compared with existing deep learning models (in particular the model of Shi et al., 2015), our model reduced prediction error rates.

Map of South Asia, showing the accuracy of malaria prediction models

Map of Pakistan, India and Bangladesh showing the accuracy of our prediction model
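The three map categories and the error comparison can be sketched in a few lines. This is only an illustration under stated assumptions: the 10% relative tolerance, the use of root-mean-square error as the error measure, the `classify_districts` and `rmse` helpers and the district values are all hypothetical and not taken from the study.

```python
import math
from typing import Dict


def classify_districts(
    predicted: Dict[str, float],
    observed: Dict[str, float],
    tolerance: float = 0.1,  # assumed threshold, not from the study
) -> Dict[str, str]:
    """Label each district by the sign and size of its relative prediction error."""
    labels = {}
    for name, truth in observed.items():
        # Relative error; the small floor avoids dividing by zero incidence.
        error = (predicted[name] - truth) / max(abs(truth), 1e-9)
        if error > tolerance:
            labels[name] = "overestimate"
        elif error < -tolerance:
            labels[name] = "underestimate"
        else:
            labels[name] = "accurate"
    return labels


def rmse(predicted: Dict[str, float], observed: Dict[str, float]) -> float:
    """Root-mean-square error across districts (an assumed error measure)."""
    squared = [(predicted[name] - observed[name]) ** 2 for name in observed]
    return math.sqrt(sum(squared) / len(squared))


# Made-up incidence values for three hypothetical districts.
observed = {"district_a": 10.0, "district_b": 4.0, "district_c": 7.0}
predicted = {"district_a": 10.5, "district_b": 6.0, "district_c": 5.0}

labels = classify_districts(predicted, observed)
score = rmse(predicted, observed)
print(labels)
print(round(score, 3))
```

Computing the same score for two models on the same held-out districts gives a like-for-like comparison of their prediction error.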

Further details of this study, including a talk by one of the team, can be found here on the NeurIPS conference website, and we'll share other materials as they are published! This permalink will also be updated as the work is published.