In recent years, there has been a significant increase in the amount of routine malaria surveillance data from national health systems released into the public domain. Under the leadership of Dr Harry Gibson, the ROAD-MAP team is investing resource in pulling these data together to make a coherent global dataset of reported incidence and deaths at a sub-national level.

Data gathering takes place in annual cycles, starting each August. The data collected feed into a number of MAP’s research agendas and collaborations, and we intend to release as much of these data as we can, subject to the permissions granted by the data owners.

Data gathering is conducted according to the following protocol:

MAP stores all data in accordance with our data policy.

Global Burden of Disease (GBD) Annual Parasite Incidence Datasets

MAP provides estimates of malaria morbidity and mortality for the Global Burden of Disease Report published by the Institute for Health Metrics and Evaluation (IHME), Seattle. The routine surveillance data provided by ROAD-MAP is a key set of input data to these models.

Although we are working towards making these input data available for download, at present the data is only available on request. The main obstacle is that some data has been obtained from countries’ online data portals. Although we always credit sources, we nonetheless feel that duplicating data downloads already available on another institution’s online platform is discourteous.

Burundi Health District API, 2015, extracted from routine surveillance data reports.

GBD 2019

The data comprises calculated annual parasite incidence rate (API) values at national and sub-national level down to administrative level three. The API values are calculated per thousand head of population and presented for both Plasmodium falciparum and P. vivax, according to the formula published by Cibulskis et al. (1), using raw case data collected from reports published by country ministries of health and other sources. The population figures used as the denominator are provided by IHME and are explained later in this document.
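As a rough illustration of the per-thousand scaling only, and not of the Cibulskis et al. adjustment itself, the sketch below assumes that low and high case estimates have already been derived for one administrative unit. The function name and all figures are hypothetical and are not taken from the dataset.

```python
def api_per_thousand(cases: float, population: float) -> float:
    """Annual parasite incidence: cases per 1,000 head of population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * cases / population


# Hypothetical figures for one administrative unit in one year.
cases_low, cases_high = 1200.0, 1800.0   # assumed low/high case estimates
population = 250_000.0                   # assumed denominator

api_low = api_per_thousand(cases_low, population)
api_high = api_per_thousand(cases_high, population)
api_mean = (api_low + api_high) / 2      # mean of the low and high values
```

The same calculation would be repeated per species and per administrative unit; how the low and high case estimates are obtained is described in reference (1) and is not reproduced here.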

Although data is available down to administrative level three, coverage is not global at the lower administrative levels, and availability decreases with each level down. This is because different sources report at different administrative levels.

Globally, the data covers the years 1980 to 2018 although few individual countries have data for this full period. Coverage in data increases through time as a product of increasing availability of sources.

All sources of case data are listed in the data download and the majority of them are publicly available via the source owners’ websites. Links to source owners’ websites are maintained on MAP’s Country Trends page. The raw case figures are deliberately not included as they remain the property of the respective sources. For a small number of countries, data was provided in confidence via personal communications and case figures are not in the public domain. For these countries, API data is deliberately absent from our data download and only the source information is included.

Data was collected on a global scale but largely excludes sub-Saharan Africa. This is because the statistical methods employed to model malaria burden in this region rely on cross-sectional survey data rather than surveillance data. MAP is updating its statistical models to include both cross-sectional survey data and routine surveillance data. Future releases of API data will include whatever publicly available case data we can source for sub-Saharan Africa.

MAP has provided an R package that can download data points from our Explorer tool.

Downloads

Data is available on request. The data comprises:

  • Data dictionaries
  • A CSV of the calculated API and the source of the raw case figures used in the calculations
  • A geometry file for use in mapping software, containing the same data as in the CSV

Data Dictionary

Field: Description
iso_2_code: The ISO 3166-1 alpha-2 code for the country this administrative unit is in
iso_3_code: The ISO 3166-1 alpha-3 code for the country this administrative unit is in
country_name: The name of the country this administrative unit is in
admin_level: The administrative level of the administrative unit this data is for. ADMIN0 is the country level; ADMIN1, ADMIN2, and ADMIN3 are sub-national levels. Data availability diminishes at lower levels of the administrative hierarchy
admin_unit: The name of the administrative unit. Note that for ADMIN0 level rows of data, this column will be the same as the country_name column
year: The year this row of data is for. Note that some countries define a year differently from the Western definition of January to December. To address this, the data for a given year is defined as the records that encompass the Western date of 1st January
gender: Set to ALL for all records. The figures released here are those that fed into the GBD 2017 modelling and for that, no disaggregation by gender was used for the raw data. MAP does hold data disaggregated by gender and will present this in future releases
age_bin: Set to ALL for all records. Please see the note for gender for an explanation
api_low_pf: The low value of API for Plasmodium falciparum calculated using the Cibulskis et al. method
api_high_pf: The high value of API for Plasmodium falciparum calculated using the Cibulskis et al. method
api_mean_pf: The mean of the low and high values of API calculated for Plasmodium falciparum
api_low_pv: The low value of API for Plasmodium vivax calculated using the Cibulskis et al. method
api_high_pv: The high value of API for Plasmodium vivax calculated using the Cibulskis et al. method
api_mean_pv: The mean of the low and high values of API calculated for Plasmodium vivax
api_low_all: The low value of API for any malaria calculated using the Cibulskis et al. method
api_high_all: The high value of API for any malaria calculated using the Cibulskis et al. method
api_mean_all: The mean of the low and high values of API calculated for any malaria
source_title: The title of the source
source_index: The index of the source. This column will be filled in for peer-reviewed literature and online data portals (in which case, the URL is presented). It will be blank for ministry of health reports and reports by similar organisations
source_authors: The authors of the source
source_year: The year of the source
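As an illustration of how the fields above might be used, the following minimal sketch reads a CSV with these column names and keeps the ADMIN1 rows that carry a P. falciparum mean API. The sample rows, the country "Exampleland", and the helper name are invented for the example. It also assumes that a deliberately absent API value appears as an empty cell, which may not match the actual download.

```python
import csv
import io

# Made-up sample rows using column names from the data dictionary.
# In practice this would be the CSV supplied by MAP on request.
SAMPLE_CSV = """iso_3_code,country_name,admin_level,admin_unit,year,api_low_pf,api_high_pf,api_mean_pf
XXX,Exampleland,ADMIN0,Exampleland,2015,2.0,4.0,3.0
XXX,Exampleland,ADMIN1,North,2015,1.0,3.0,2.0
XXX,Exampleland,ADMIN1,South,2015,,,
"""


def rows_at_level(handle, admin_level):
    """Yield rows for one administrative level that have a P. falciparum mean API."""
    for row in csv.DictReader(handle):
        # Assumption: a withheld API value is stored as an empty cell.
        if row["admin_level"] == admin_level and row["api_mean_pf"] != "":
            yield row


admin1 = list(rows_at_level(io.StringIO(SAMPLE_CSV), "ADMIN1"))
units = [(r["admin_unit"], int(r["year"]), float(r["api_mean_pf"])) for r in admin1]
```

Filtering on admin_level first avoids mixing national and sub-national rows, which would otherwise double count the same population.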

Population Figures

The population figures used as the denominator to calculate API were provided by the Institute for Health Metrics and Evaluation (IHME). The IHME population figures are derived from the United Nations (UN) official estimates of population. Figures were provided at national level for all countries and at administrative level one for Kenya, Saudi Arabia, Brazil, India, China, Mexico, and Indonesia.

Most of the sub-national case data collected by ROAD-MAP from Ministry of Health reports had associated population figures. These population figures were collected but were disregarded in favour of the figures provided by IHME, because the latter are derived from the UN official estimates.

For the purposes of calculating API, the population-at-risk was used as the denominator.

In order to convert the IHME national population figures to subnational figures of population-at-risk to match the administrative areas for which case data were collected, the following steps were taken:

  • A global raster surface of population was created using a hybrid of data from GPWv4 and WorldPop, with the latter taking priority for those pixels where both had population data.
  • A raster of IHME population was then created by distributing the IHME population figures for country/administrative units across the pixels bounded by each country/administrative unit in the same proportions as the corresponding pixels in the hybrid GPWv4 / WorldPop raster.
  • MAP has previously published a global limits layer outside which transmission of malaria is highly unlikely (2). This layer was based on environmental factors, travel guidelines, and statements by the countries regarding their malaria-endemic status in 2010. An amended version of this global limits layer was created excluding the malaria-endemic status of the country. This exclusion was necessary because the research project covered data extending back to 1980 during which time the status of all countries has changed. This new global limits layer was applied over the IHME population raster to set population values in pixels outside the limits of transmission to be zero.
  • Rasterized versions of GADM geometry files for both sub-national administrative units and national borders were then used to provide a set of pixels in the IHME raster to sum to produce the population-at-risk for those sub-national units. Eritrea was an exception in that GAUL geometry files were used rather than GADM.
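The steps above can be sketched on a toy grid. This is a minimal illustration in plain Python lists rather than MAP's actual raster workflow. The function names, the use of None for missing pixels, and every number are assumptions made for the example.

```python
from collections import defaultdict

NODATA = None  # marks a pixel with no population value


def hybrid_population(gpw, worldpop):
    """Combine two rasters, preferring WorldPop wherever it has data."""
    return [
        [w if w is not NODATA else g for g, w in zip(g_row, w_row)]
        for g_row, w_row in zip(gpw, worldpop)
    ]


def redistribute(hybrid, unit_ids, unit_totals):
    """Spread each unit's total over its pixels in proportion to the hybrid raster."""
    sums = defaultdict(float)
    for h_row, u_row in zip(hybrid, unit_ids):
        for h, u in zip(h_row, u_row):
            if h is not NODATA:
                sums[u] += h
    return [
        [unit_totals[u] * h / sums[u] if h is not NODATA and sums[u] > 0 else 0.0
         for h, u in zip(h_row, u_row)]
        for h_row, u_row in zip(hybrid, unit_ids)
    ]


def population_at_risk(pop, within_limits, zone_ids):
    """Zero pixels outside the transmission limits, then sum per zone."""
    totals = defaultdict(float)
    for p_row, m_row, z_row in zip(pop, within_limits, zone_ids):
        for p, inside, z in zip(p_row, m_row, z_row):
            totals[z] += p if inside else 0.0
    return dict(totals)


# Toy 2x2 grid: one country "C" with a hypothetical IHME total of 1000,
# split into two sub-national zones "A" (top row) and "B" (bottom row).
gpw = [[10, 20], [30, NODATA]]
worldpop = [[NODATA, 30], [30, 30]]
country = [["C", "C"], ["C", "C"]]
limits = [[True, True], [False, True]]   # False = outside transmission limits
zones = [["A", "A"], ["B", "B"]]

hybrid = hybrid_population(gpw, worldpop)
ihme_raster = redistribute(hybrid, country, {"C": 1000.0})
par = population_at_risk(ihme_raster, limits, zones)
```

Because the redistribution preserves each unit's total, the pixel values of ihme_raster still sum to the IHME figure before the limits mask is applied. Only the masking step removes population.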

References
1. Cibulskis RE, Aregawi M, Williams R, Otten M, Dye C (2011) Worldwide Incidence of Malaria in 2009: Estimates, Time Trends, and a Critique of Methods. PLoS Med 8(12): e1001142. doi:10.1371/journal.pmed.1001142
2. Gething PW*, Patil AP*, Smith DL*, Guerra CA, Elyazar IRF, Johnston GL, Tatem AJ, Hay SI (2011) A new world malaria map: Plasmodium falciparum endemicity in 2010. Malaria Journal 10: 378. *Indicates equal authorship.