Variable Selection in Clustering for Sanitation Access Analysis in East Java Supporting SDG 6

Authors

  • Mohammad Dian Purnama, State University of Surabaya (UNESA)

DOI:

https://doi.org/10.21512/emacsjournal.v8i1.14729

Keywords:

Clustering, Variable Selection, Sanitation Access, SDG 6

Abstract

Adequate sanitation is essential for public health and for sustainable development. This study examines access to sanitation across the cities and districts of East Java and identifies the variables that matter most for characterizing it. Clustering was used to compare sanitation conditions across these regions. The analysis started from six variables covering the five pillars of community-based total sanitation (STBM). The variables retained after the selection process were awareness of open defecation (SBS), awareness of hand washing with soap (CPTM), and drinking water and food management (PAMMRT). The analysis yielded three distinct clusters, each reflecting a different level of sanitation across cities and districts in East Java. However, it is important to recognize that the excluded variables remain valuable as indicators established by the government. Furthermore, by demonstrating how variable selection can be applied in the context of clustering, this research is expected to serve as a resource for policymakers, giving them a framework for prioritizing specific areas in their efforts to improve sanitation access and advance sustainable development.
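The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea of variable selection for clustering, not the study's method. It assumes a simple stand-in technique: an exhaustive search over variable subsets, each clustered with k-means into three groups and scored by average silhouette width. The data are synthetic, and the names `var_1` to `var_6`, the count of 38 regions, and all numeric levels are hypothetical choices made purely for illustration.

```python
import itertools
import math
import random


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then keep adding
    the point that is farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Average silhouette width; singleton clusters contribute 0."""
    clusters = set(labels)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        if labels[i] not in mean_dist:
            continue
        a = mean_dist[labels[i]]
        b = min(v for c, v in mean_dist.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Synthetic data (NOT the study's data): three variables carry group
# structure by construction, three are pure noise.
rng = random.Random(42)
names = ["var_1", "var_2", "var_3", "var_4", "var_5", "var_6"]
informative = {"var_1", "var_2", "var_3"}
n_regions = 38  # illustrative count of regions
true_group = [i % 3 for i in range(n_regions)]
level = [40.0, 60.0, 80.0]  # hypothetical group means
data = [
    [rng.gauss(level[g], 2.0) for _ in range(3)] + [rng.uniform(40.0, 80.0) for _ in range(3)]
    for g in true_group
]

# Exhaustive search over subsets of at least two variables.
best_score, best_vars, best_labels = -1.0, None, None
for size in range(2, len(names) + 1):
    for idx in itertools.combinations(range(len(names)), size):
        pts = [tuple(row[j] for j in idx) for row in data]
        labels = kmeans(pts, 3)
        score = silhouette(pts, labels)
        if score > best_score:
            best_score, best_vars, best_labels = score, [names[j] for j in idx], labels

print("selected variables:", best_vars, "silhouette:", round(best_score, 3))
```

Silhouette-based subset search is only one of several possible criteria; model-based alternatives compare mixture models with an information criterion instead. Variables dropped by such a search are not necessarily unimportant: they simply add little to separating the clusters in the data at hand.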

Author Biography

Mohammad Dian Purnama, State University of Surabaya (UNESA)

Department of Mathematics, Faculty of Mathematics and Natural Sciences

Published

2026-05-15

How to Cite

Purnama, M. D. (2026). Variable Selection in Clustering for Sanitation Access Analysis in East Java Supporting SDG 6. Engineering, MAthematics and Computer Science Journal (EMACS), 8(1), 49–55. https://doi.org/10.21512/emacsjournal.v8i1.14729

Section

Articles