
PhD Defence Andrea Araujo Navas

Statistical evaluation of spatial uncertainty in schistosomiasis mapping

Andrea Araujo Navas is a PhD student in the Department of Earth Observation Science (EOS). Her supervisor is prof.dr.ir. A. Stein from the Faculty of Geo-information Science and Earth Observation.

The World Health Organization has identified seventeen neglected tropical diseases (NTDs) for targeted control. Schistosomiasis (SCH) is one of the most prevalent NTDs worldwide and is highly significant for public health. Spatial modelling of SCH using earth observation data identifies the geographic areas where at-risk populations need mass anthelmintic drug treatment. Several sources of uncertainty may decrease the quality and reliability of SCH modelling. This dissertation investigates three methods to reduce the uncertainty derived from the use of earth observation data in SCH modelling studies. It uses spatial statistics for uncertainty quantification and representation, and describes the potential consequences for SCH control of ignoring these uncertainties.

First, a systematic review and evaluation of uncertainty in SCH and soil-transmitted helminth (STH) modelling studies was performed. The definition, quantification, and main sources of uncertainty were investigated, as well as the implications for SCH and STH control. The literature search combined three groups of terms referring to uncertainty, geography, and the type of disease (SCH or STH) in the Web of Knowledge and PubMed. Uncertainty was mostly defined as lack of precision. In total, 91% of the studies quantified uncertainty in their predictions, and 23% of the studies mapped uncertainty. Furthermore, uncertainty in the regression coefficients was quantified by 57% of the studies, but only 7% incorporated it in the predictions. Uncertainty in the covariates was identified but not quantified in 50% of the studies. Bayesian statistics was used to quantify uncertainty by means of credible intervals. The main sources of uncertainty were related to sample design and to spatial aggregation and disaggregation methods.
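As an illustration of how a credible interval summarizes uncertainty, the sketch below computes an equal-tailed interval from posterior draws. It is not taken from the dissertation: the function name `credible_interval`, the nearest-rank quantile rule, and the simulated draws are all assumptions made for this example.

```python
import random


def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior draws.

    Uses a simple nearest-rank rule on the sorted draws; this is an
    illustrative choice, not the method used in the reviewed studies.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower_idx = round(tail * (n - 1))
    upper_idx = round((1.0 - tail) * (n - 1))
    return ordered[lower_idx], ordered[upper_idx]


# Hypothetical posterior draws for one regression coefficient
# (simulated here; a real analysis would take them from an MCMC sampler).
rng = random.Random(0)
draws = [rng.gauss(0.3, 0.05) for _ in range(10_000)]

low, high = credible_interval(draws, level=0.95)
print(f"95% credible interval: [{low:.3f}, {high:.3f}]")
```

A wide interval signals an imprecise estimate, which matches the definition of uncertainty as lack of precision that most of the reviewed studies used.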

Second, uncertainty due to positional mismatch between covariate and survey data was addressed using exposure areas as potential locations for SCH transmission. Exposure areas were delineated using a spatial Bayesian network (sBN) with five observable exposure risk factors. Prior and conditional probabilities were obtained from the literature and inserted as weights based upon their relative contribution to exposure. From these, joint probabilities of exposure were obtained for use within the sBN. High probability values of exposure corresponded to areas where snails could be present and where people could easily access water bodies. Extracting covariate values from areas with high probability of exposure, instead of from survey locations, is a way to address this mismatch. These results can be used to guide local SCH control teams to exposed communities, and in this way improve the efficiency of mass drug administration campaigns.
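To make the combination of prior and conditional probabilities concrete, the sketch below computes a probability of exposure for a single grid cell. It assumes the simplest possible network structure, in which the risk factors are conditionally independent given exposure; the actual sBN structure is not described here and may differ. The factor names and all probability values are hypothetical.

```python
def exposure_posterior(prior, likelihoods, evidence):
    """P(exposed | evidence) for one grid cell.

    Assumes a naive-Bayes structure (factors independent given exposure).
    likelihoods[name] = (P(factor present | exposed),
                         P(factor present | not exposed))
    evidence[name]    = True if the factor is observed in the cell.
    """
    p_exposed = prior
    p_not_exposed = 1.0 - prior
    for name, present in evidence.items():
        p_given_exp, p_given_not = likelihoods[name]
        p_exposed *= p_given_exp if present else 1.0 - p_given_exp
        p_not_exposed *= p_given_not if present else 1.0 - p_given_not
    return p_exposed / (p_exposed + p_not_exposed)


# Hypothetical risk factors and probabilities, for illustration only.
likelihoods = {
    "near_water_body": (0.9, 0.3),
    "snail_habitat": (0.7, 0.2),
    "near_settlement": (0.8, 0.5),
}
cell = {"near_water_body": True, "snail_habitat": True, "near_settlement": False}

print(round(exposure_posterior(0.2, likelihoods, cell), 3))
```

Applying such a function to every cell of a raster would yield a map of exposure probability, from which high-probability areas could be selected for covariate extraction.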

Third, uncertainty due to pure specification bias was addressed by using a convolutional model. The model used barangay or ecological-level survey data and city-level environmental data. Covariate city-level data were considered as individual-level exposure. Differences between ecological and individual-level estimates and predictions were quantified and compared using Bayesian statistics. The estimated parameter corresponding to the nearest distance to water bodies presented the minimum difference between the convolutional and ecological models (0.03), whereas the estimated normalized difference water index parameter presented the maximum difference (0.28). Land surface temperature at night and elevation also presented high differences, with uncertainty values equal to 0.23 and 0.13, respectively. The convolutional model presented less uncertain parameter estimates, showing its ability to correct for pure specification bias.

Fourth, the effects of the modifiable areal unit problem (MAUP) on environmental drivers of SCH were quantified. Five spatial supports of increasing size were used. All covariates were brought to the same spatial support of analysis (SSA). Differences between individual-level parameter estimates from the models at five increasing SSAs were quantified and compared. Increasing the SSA to 500 m gradually increased the parameter estimates and their associated uncertainties. Abrupt changes in parameter estimates occurred at SSA = 1 km, resulting in loss of significance of almost all covariates on SCH prevalence. These results suggest the use of an adequate spatial data structure to provide more reliable parameter estimates and a realistic relationship between the risk factors and SCH prevalence.
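The effect of changing the spatial support can be sketched with a toy experiment: aggregate a covariate grid and a response grid to coarser blocks and refit a simple regression at each support. This is only an illustration under stated assumptions. The data are synthetic, both variables are gridded, the helper names are invented for this example, and ordinary least squares stands in for the Bayesian models used in the dissertation.

```python
import random


def block_mean(grid, factor):
    """Aggregate a 2D grid to a coarser support by averaging factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def flatten(grid):
    """Return the grid values as a flat list, row by row."""
    return [value for row in grid for value in row]


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Synthetic 16 x 16 covariate and response grids (hypothetical data).
rng = random.Random(1)
size = 16
covariate = [[rng.random() for _ in range(size)] for _ in range(size)]
response = [[0.5 * v + rng.gauss(0.0, 0.2) for v in row] for row in covariate]

# Refit the same regression at increasingly coarse supports.
for factor in (1, 2, 4, 8):
    x = flatten(block_mean(covariate, factor))
    y = flatten(block_mean(response, factor))
    print(f"support factor {factor}: n = {len(x)}, slope = {ols_slope(x, y):.3f}")
```

As the support grows, fewer and smoother observations remain, so the estimated slope can drift and becomes less stable; this is the kind of sensitivity that the MAUP analysis quantifies.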

To summarize, the research presented in this dissertation investigates methods to deal with uncertainties derived from the use of earth observation data in SCH modelling. It uses Bayesian statistics for uncertainty quantification and highlights the implications of uncertainty interpretation in the public health domain. These implications are intended to support best practice in survey design and to improve both the identification of at-risk populations and the quantification of people in need of anthelmintic treatment. This research thus presents a framework for the future development of spatial decision support systems for SCH surveillance and control.