Comparison of the Performance Results of C4.5 and Random Forest Algorithm in Data Mining to Predict Childbirth Process

DOI:

https://doi.org/10.21512/commit.v17i1.8236

Keywords:

C4.5 Algorithm, Random Forest Algorithm, Data Mining, Childbirth Process

Abstract

Advances in information technology have made it easier for many people to process data. Data mining is the process of extracting valuable information from large data sets. The research aims to determine the difference between the C4.5 and random forest algorithms in data mining for predicting the childbirth process of pregnant women by comparing the accuracy of their performance results. Experimental research is conducted to classify the childbirth process in Situbondo, Indonesia, by applying both algorithms. The J48 decision tree algorithm is used as the C4.5 implementation in the research. The two algorithms are compared on their classification error and accuracy level. The research uses 1,000 records for training and 200 records for testing. The results show that, using 10-fold cross-validation, the C4.5 and random forest algorithms correctly classify 96% and 95% of the data, respectively. The Relative Absolute Error is the same for both algorithms, at 15%. Based on this comparison, the C4.5 algorithm performs better than the random forest algorithm. Further research can add more data and use other algorithms to improve the accuracy of the analysis results.
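The two measures named in the abstract, the share of correctly classified data and the Relative Absolute Error, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class labels, the toy probabilities, and the function names are hypothetical, and the error definition used here (absolute error of the predicted class probabilities divided by the absolute error of a baseline that always predicts the class frequencies) is an assumption that may differ in detail from the tool used in the research.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of instances whose predicted label equals the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def relative_absolute_error(actual, predicted_probs, classes):
    """Absolute error of the predicted class probabilities, relative to a
    baseline that always predicts the class frequencies found in `actual`.

    Assumed definition for illustration only; the baseline must not be
    perfect (at least two classes must occur in `actual`).
    """
    counts = Counter(actual)
    prior = {c: counts[c] / len(actual) for c in classes}
    model_error = 0.0
    baseline_error = 0.0
    for label, probs in zip(actual, predicted_probs):
        for c in classes:
            target = 1.0 if c == label else 0.0
            model_error += abs(probs.get(c, 0.0) - target)
            baseline_error += abs(prior[c] - target)
    return model_error / baseline_error


# Hypothetical toy data: two delivery classes and four test instances.
classes = ["normal", "caesarean"]
actual = ["normal", "normal", "caesarean", "caesarean"]
predicted_probs = [
    {"normal": 0.9, "caesarean": 0.1},
    {"normal": 0.8, "caesarean": 0.2},
    {"normal": 0.3, "caesarean": 0.7},
    {"normal": 0.6, "caesarean": 0.4},
]
# The predicted label is the class with the highest probability.
predicted = [max(p, key=p.get) for p in predicted_probs]

acc = accuracy(actual, predicted)
rae = relative_absolute_error(actual, predicted_probs, classes)
print(f"accuracy = {acc:.2%}, relative absolute error = {rae:.2%}")
```

Under this definition, a value below 100% means the classifier's probability estimates are closer to the true classes than the class-frequency baseline, so two classifiers can share the same Relative Absolute Error while differing slightly in accuracy.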

Author Biographies

Muhasshanah, Universitas Ibrahimy

Information Technology Study Program

Mohammad Tohir, Universitas Ibrahimy

Mathematics Education (Tadris Matematika) Study Program

Dewi Andariya Ningsih, Universitas Ibrahimy

Midwifery Study Program

Neny Yuli Susanti, Universitas Ibrahimy

Midwifery Study Program

Astik Umiyah, Universitas Ibrahimy

Midwifery Study Program

Lia Fitria, Universitas Ibrahimy

Midwife Professional Education Study Program

Published

2023-03-17