Explainable Machine Learning Models SHAP-based for Feature Importance Affecting Stunting Prevalence

Asysta Amalia Pasaribu; Nur Fitriyani Sahamony; Khairil Anwar Notodiputro; Bagus Sartono

doi:10.21512/comtech.v17i1.13732

Authors

Asysta Amalia Pasaribu, Department of Statistics, School of Computer Science, BINUS University, Jakarta
Nur Fitriyani Sahamony, Universitas Binawan; IPB University
Khairil Anwar Notodiputro, (1) IPB University, Department of Statistics and Data Science, Bogor, 16680, Indonesia; (2) Universitas Binawan, Faculty of Business and Social Science, Jl. Dewi Sartika Raya, Jakarta Timur, 13630, Indonesia
Bagus Sartono, (1) IPB University, Department of Statistics and Data Science, Bogor, 16680, Indonesia; (2) Universitas Binawan, Faculty of Business and Social Science, Jl. Dewi Sartika Raya, Jakarta Timur, 13630, Indonesia

Keywords:

feature importance, logistic regression, explainable machine learning, SHAP value, stunting prevalence

Abstract

Stunting, a form of chronic nutritional deficiency in toddlers, remains a major public health concern because of its impact on child growth and development. Efforts to reduce its prevalence continue to be strengthened in Indonesia, particularly in Sumatra Province. This study evaluates the accuracy of a logistic regression model and three machine learning models, namely decision tree, random forest, and support vector machine (SVM), in classifying stunting prevalence. The response variable, the prevalence of stunting among toddlers, is categorized into two classes based on the 2024 national threshold: exceeding the national target and not exceeding it. Although classification models can provide accurate predictions, they often lack interpretability. Therefore, this study applies the SHAP method to the best-performing machine learning model to identify the key factors influencing stunting. The use of Shapley values is justified by the uniqueness theorem, which establishes them as the only attribution method that satisfies a set of desirable fairness properties. SHAP values explain the model by referencing both the trained model and the underlying data. The results show that the random forest model achieves the highest accuracy (90.00%), outperforming the other models. SHAP analysis reveals that Underweight is the most influential predictor of stunting prevalence in Sumatra Province. These findings highlight the importance of machine learning interpretability in supporting policy decisions to reduce stunting.
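To make the Shapley attribution mentioned in the abstract concrete, the following is a minimal sketch of an exact Shapley value computation, not the authors' implementation. The toy model, its coefficients, the feature names other than Underweight, the input values, and the all-zero baseline are hypothetical and chosen only for illustration. Real SHAP averages over background data, whereas this sketch replaces absent features with a single baseline value.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature order (illustration only): underweight, wasting, sanitation.
def toy_model(features):
    """Assumed stand-in for a fitted model: two linear terms plus one interaction."""
    x0, x1, x2 = features
    return 0.5 * x0 + 0.3 * x1 + 0.2 * x0 * x2

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(coalition):
        # Features in the coalition keep their observed value,
        # all others are set to the baseline value (a simplification).
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

x = [1.0, 2.0, 3.0]          # hypothetical observation
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the observation and the prediction for the baseline, which is the efficiency property, one of the fairness properties behind the uniqueness result. The interaction term is split equally between the two features involved. Exact enumeration grows exponentially with the number of features, so for a fitted random forest a dedicated library such as `shap` would normally be used. The abstract does not state which implementation the authors relied on.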
