# MAP Data Summary and Download Page

This page supports the Wellcome Trust’s “Data Re-Use Prize – Malaria” competition. It summarises MAP’s data estate and links to where these data can be downloaded. Data for the example questions are also grouped together on this page for convenience.

MAP provides the outputs of our research, as well as broader technical advice and support, to National Malaria Control Programmes (NMCPs), non-governmental organisations (NGOs), Ministries of Health, and other third parties as part of our commitment to open access data.

To this end, MAP obtains, curates, and shares a wide variety of malariometric data. These fall into two categories:

  1. Input data for models. These include data that are expanded upon in the pages linked below:

    Further input data are available from the MAP Data Explorer page and Country Profiles, including:

    • Mosquito vector occurrence surveys
    • Duffy negativity surveys
    • G6PD deficiency surveys
    • HbS (sickle haemoglobin) surveys
  2. Modelled outputs. These include data that are expanded upon in the pages linked below:

    Further modelled outputs are available from the MAP Data Explorer page and Country Profiles, including:

    • Mosquito vector occurrence and relative abundance
    • The spatial limits of Plasmodium falciparum and P. vivax malaria
    • Temperature suitability for malaria transmission
    • P. vivax relapse incidence
    • Duffy-negativity phenotype frequency
    • G6PD deficiency allele frequency
    • HbS (sickle haemoglobin) allele frequency

# Data for the Example Questions

The Wellcome Data Re-Use Prize for Malaria is in the form of an open question. Participants are challenged to explore MAP’s data and come up with innovative uses or insights. Submissions might combine our data or modelled outputs with their own open datasets to address questions either directly associated with malaria or for which malaria might be a potential covariate.

Three example questions are included on the Wellcome Trust’s competition page. The suggested datasets to use for these example questions are gathered below for convenience. Participants should not feel constrained to just use these data.

# Example Question 1: Explaining unattributed transmission

The resources for this question come from MAP’s paper on the effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015 and include unpublished data from intermediary steps in the modelling process.

  1. Rasters of prevalence means, as published in the paper. There is one raster for each year; the value in each pixel indicates the estimated parasite rate (PR) in children between the ages of two and ten.
  2. Rasters of PR upper and lower bounds. These rasters provide the upper and lower bounds of the credible interval for each of the rasters of prevalence means.
  3. Rasters of residuals. These rasters show the remaining (residual) transmission that is not accounted for by the covariates already in the model, i.e. what is left once the effects of insecticide-treated nets, access to artemisinin-based combination therapies, and indoor residual spraying with insecticides have been accounted for (see next item).
  4. Rasters of covariates already used in the models. How these covariates were used is explained in the MAP paper "Re-examining environmental correlates of Plasmodium falciparum malaria endemicity: a data-intensive variable selection approach". The CSV file below shows how each of the subsequent covariates relates to Table 5 in the paper.
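One possible first step on this question is to check whether a candidate explanatory layer is associated with the residual rasters. The sketch below is only an illustration and is not MAP's method. It assumes that a residual raster and a hypothetical candidate covariate have already been read into aligned grids of equal shape (plain nested lists here), with `None` marking pixels that have no data, and it computes a Pearson correlation over the pixels where both layers are valid.

```python
from math import sqrt


def paired_valid_pixels(residuals, covariate):
    """Collect (residual, covariate) pairs for pixels where both grids have data.

    Both arguments are 2-D grids (lists of rows) of equal shape; None marks nodata.
    """
    if len(residuals) != len(covariate):
        raise ValueError("grids must have the same number of rows")
    pairs = []
    for res_row, cov_row in zip(residuals, covariate):
        if len(res_row) != len(cov_row):
            raise ValueError("grids must have the same number of columns")
        for res, cov in zip(res_row, cov_row):
            if res is not None and cov is not None:
                pairs.append((res, cov))
    return pairs


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid pixels")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant layer")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy grids standing in for a residual raster and a candidate layer.
residual_grid = [[0.10, 0.05], [None, -0.02]]
candidate_grid = [[3.0, 2.0], [1.5, 0.5]]

r = pearson(paired_valid_pixels(residual_grid, candidate_grid))
print(f"Pearson r over valid pixels: {r:.3f}")
```

A pixel-wise correlation ignores spatial autocorrelation, so a result like this is best treated as an exploratory screen rather than evidence of an effect.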

# Example Question 2: Downscaling areal incidence data

The resources for this question come from data published in MAP’s paper on travel times to cities to assess inequalities in accessibility and unpublished data from MAP’s paper on the effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015.

The resources also include a subset (Senegal, Ethiopia, and Zambia) of our forthcoming database of annual parasite index (API) data, calculated from Ministry of Health routine surveillance systems.

  1. API Data for Senegal, Ethiopia, and Zambia. The data comprise:
    • Data dictionaries
    • A CSV of the calculated API and the source of the raw case figures used in the calculations
    • A geometry file for use in mapping software, containing the same data as in the CSV

API Data for Senegal, Zambia, and Ethiopia – 0.3MB

  • Rasters of temperature suitability for P. falciparum malaria transmission (only years 2000-2012 are available).
  • Global rasters of accessibility to centres of population.
  • Rasters of covariates already used in the models. How these covariates were used is explained in the MAP paper "Re-examining environmental correlates of Plasmodium falciparum malaria endemicity: a data-intensive variable selection approach". The CSV file below shows how each of the subsequent covariates relates to Table 5 in the paper.
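The annual parasite index in the API data above is conventionally defined as the number of confirmed malaria cases per 1,000 population in a year. The sketch below assumes that definition. The record layout and field names are hypothetical, and the data dictionaries supplied with the download remain the authority on how MAP's figures were actually calculated.

```python
def annual_parasite_index(confirmed_cases, population, per=1000):
    """Confirmed malaria cases per `per` people for one area and one year."""
    if population <= 0:
        raise ValueError("population must be positive")
    if confirmed_cases < 0:
        raise ValueError("confirmed_cases cannot be negative")
    return per * confirmed_cases / population


# Hypothetical records; the field names are illustrative and are not taken
# from the data dictionaries in the download.
records = [
    {"area": "District A", "year": 2015, "cases": 250, "population": 50_000},
    {"area": "District B", "year": 2015, "cases": 1_200, "population": 80_000},
]

for rec in records:
    api = annual_parasite_index(rec["cases"], rec["population"])
    print(f'{rec["area"]} {rec["year"]}: API = {api:.1f} per 1,000')
```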

# Example Question 3: Visualisation of uncertainty

The resources for this question come from MAP’s paper on the effect of malaria control on Plasmodium falciparum in Africa between 2000 and 2015 and include unpublished data from intermediary steps in the modelling process.

  1. Rasters of prevalence means, as published in the paper. There is one raster for each year; the value in each pixel indicates the estimated parasite rate (PR) in children between the ages of two and ten.
  2. Rasters of PR upper and lower bounds. These rasters provide the upper and lower bounds of the credible interval for each of the rasters of prevalence means.
  3. Tables of national P. falciparum PR with credible intervals.
  4. The rasters for each of the 100 runs (or realisations) of the models, by year. These realisations are the data from which the mean PR and credible interval rasters are produced. There are 1,600 files in total (100 for each of the 16 years from 2000 to 2015 covered by the study), so they have been grouped by year to make downloading more manageable.
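The relationship between the realisations and the summary rasters can be sketched as a per-pixel calculation across the stack of realisations for one year. This is a minimal illustration under stated assumptions, not MAP's production code. The 95% interval (2.5th and 97.5th percentiles) is an assumption, since the interval level is not stated here. The grids are synthetic nested lists with `None` as nodata, and real rasters would normally be read with a raster library instead.

```python
import random
from statistics import fmean


def percentile(sorted_values, q):
    """Percentile q (0-100) of an already sorted list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("percentile of an empty list is undefined")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def summarise_realisations(realisations, lower_q=2.5, upper_q=97.5):
    """Per-pixel mean and interval bounds across equally sized 2-D grids.

    None marks nodata; a pixel with no valid values stays None in every output.
    """
    n_rows, n_cols = len(realisations[0]), len(realisations[0][0])
    mean_grid = [[None] * n_cols for _ in range(n_rows)]
    lower_grid = [[None] * n_cols for _ in range(n_rows)]
    upper_grid = [[None] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            values = sorted(g[r][c] for g in realisations if g[r][c] is not None)
            if values:
                mean_grid[r][c] = fmean(values)
                lower_grid[r][c] = percentile(values, lower_q)
                upper_grid[r][c] = percentile(values, upper_q)
    return mean_grid, lower_grid, upper_grid


# Synthetic stand-in for the 100 realisations of one year (2 x 2 pixels).
rng = random.Random(0)
realisations = [
    [[rng.uniform(0.1, 0.5), rng.uniform(0.0, 0.2)],
     [None, rng.uniform(0.3, 0.9)]]
    for _ in range(100)
]

mean_grid, lower_grid, upper_grid = summarise_realisations(realisations)
# Interval width is one simple per-pixel uncertainty measure to visualise.
width = upper_grid[0][0] - lower_grid[0][0]
print(f"pixel (0, 0): mean={mean_grid[0][0]:.3f}, interval width={width:.3f}")
```

Mapping the interval width alongside the mean is one straightforward way to show where the estimates are least certain.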