Adaptive modelling approach for predicting causes of death: insights from verbal autopsy data in Tanzania.
Mahadia Tunga, James Chambua, Juma Lungo
Abstract
BACKGROUND: The World Health Organization (WHO) has approved the use of verbal autopsy (VA), a survey-based approach for determining causes of death (CoDs) for deaths that occur outside hospitals. In this study, an adaptive Bayesian network machine learning model was developed and tested. The model is scalable and can be adapted to predict new causes as the dataset expands.
METHODS: The 2016 WHO questionnaire was used to collect data from Iringa, Tanzania. Data augmentation was performed using the Synthetic Minority Oversampling Technique for nominal features (SMOTE-N) to increase the dataset size and reduce bias in the CoD classification. Model development was guided by a CoD decision flow that integrates the essential factors and steps for accurate CoD prediction. To our knowledge, no previous study has provided such an operational guide for VA-based CoD prediction.
RESULTS: The model was evaluated using accuracy, sensitivity, specificity and F1 score, and was compared with Support Vector Machine and Naïve Bayesian models. It achieved an average accuracy of 97%, specificity of 97%, sensitivity of 94% and F1 score of 94%, exceeding the performance of both the Naïve Bayesian and Support Vector Machine models.
CONCLUSIONS: The reported performance demonstrates the potential of the model to enhance VA-based CoD data by integrating a machine learning approach with physician expertise. The results highlight the effectiveness of combining Bayesian networks with physician Symptom Cause Information as a tool for improving the performance of CoD predictions.
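As an illustration of the four evaluation metrics named in RESULTS, the following minimal Python sketch computes accuracy, sensitivity, specificity and F1 score for each cause in a one-vs-rest fashion and then averages them across causes. The abstract does not state how the averages were taken, so macro-averaging is an assumption here, and the function names and cause labels are hypothetical rather than taken from the study.

```python
def per_cause_metrics(y_true, y_pred, cause):
    """One-vs-rest metrics for a single cause of death."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cause and p == cause for t, p in pairs)
    fp = sum(t != cause and p == cause for t, p in pairs)
    fn = sum(t == cause and p != cause for t, p in pairs)
    tn = sum(t != cause and p != cause for t, p in pairs)

    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1}


def macro_average(y_true, y_pred):
    """Unweighted mean of each metric over all causes seen in y_true."""
    causes = sorted(set(y_true))
    per_cause = [per_cause_metrics(y_true, y_pred, c) for c in causes]
    return {name: sum(m[name] for m in per_cause) / len(per_cause)
            for name in per_cause[0]}


if __name__ == "__main__":
    # Hypothetical labels, for illustration only (not study data).
    y_true = ["malaria", "malaria", "hiv", "hiv", "tb", "tb"]
    y_pred = ["malaria", "hiv", "hiv", "hiv", "tb", "tb"]
    for name, value in macro_average(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Macro-averaging weights every cause equally, which matters when some causes are rare; a study could equally report prevalence-weighted averages, which would give different figures on the same predictions.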