The Implementation of the Fuzzy C-Means Method in Handling Outlier Data in the 2021 Village Potential Data of Bengkulu Province
DOI:
https://doi.org/10.21512/comtech.v16i1.12274
Keywords:
Fuzzy C-Means (FCM) method, outlier data, village potential data
Abstract
Clustering aims to group observations so that members of the same cluster are similar and members of different clusters are dissimilar. The research evaluated the effectiveness of the Fuzzy C-Means (FCM) method in clustering large datasets containing outliers, focusing on the 2021 Village Potential data from Bengkulu Province. The dataset, comprising 1,514 observations from villages and urban villages, provided a comprehensive resource for understanding regional development. Outliers, a common challenge in cluster analysis, were detected using univariate and multivariate methods, revealing substantial variability. Principal Component Analysis (PCA) was applied to address multicollinearity among variables, which improved clustering quality. The results show that the fuzzifier (w) parameter in the FCM method plays a crucial role in controlling the degree of membership of data points in clusters, which can potentially reduce the impact of outliers and enhance clustering robustness and accuracy. The FCM method effectively produced clusters with high intra-cluster homogeneity and inter-cluster heterogeneity. Using the Elbow method, three optimal clusters were identified. Cluster 1, dominated by villages in Bengkulu City, is the most advanced, with superior infrastructure and services, but has the fewest village business units, necessitating economic empowerment. Cluster 2, comprising villages in North Bengkulu Regency, demonstrates moderate development but suffers from poor transportation access, requiring improvements to support socio-economic activities. Cluster 3, dominated by villages in Kaur Regency, is the least developed, with limited basic services and infrastructure, highlighting the need for substantial investments in governance and essential services. These findings provide actionable insights for village development in Bengkulu Province, supporting targeted policies tailored to each cluster’s unique characteristics.
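To make the role of the fuzzifier (w) concrete, the following is a minimal sketch of the standard Fuzzy C-Means update rules in NumPy. It is not the authors' implementation: the function name `fuzzy_c_means`, its parameters, and the synthetic two-blob data with one distant point are assumptions chosen for illustration, and the Village Potential data are not used.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, w=2.0, max_iter=300, tol=1e-6, seed=0):
    """Illustrative Fuzzy C-Means; w > 1 is the fuzzifier.

    Returns (centers, U), where U[i, k] is the membership of point i
    in cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    # Random initial memberships, normalised so that each row sums to 1.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Uw = U ** w
        # Cluster centers are means weighted by u_ik^w.
        centers = (Uw.T @ X) / Uw.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2 / (w - 1)).
        inv = dist ** (-2.0 / (w - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic data (assumption): two compact groups plus one distant point.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=(0.0, 0.0), scale=1.0, size=(50, 2))
blob_b = rng.normal(loc=(10.0, 0.0), scale=1.0, size=(50, 2))
outlier = np.array([[5.0, 20.0]])
X = np.vstack([blob_a, blob_b, outlier])

for w in (1.5, 2.0, 3.0):
    centers, U = fuzzy_c_means(X, n_clusters=2, w=w)
    print(f"w={w}: mean max membership={U.max(axis=1).mean():.3f}, "
          f"memberships of the distant point={np.round(U[-1], 3)}")
```

In this sketch a larger w gives softer memberships overall. A point far from every center receives a membership that is split across clusters, and because the center update weights each point by its membership raised to the power w, such a point pulls the centers less than it would under a hard assignment.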
License
Copyright (c) 2025 Intan Juliana Panjaitan, Indahwati, Farit Mochamad Afendi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
a. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License - Share Alike that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.