Comparison of the Performance Results of C4.5 and Random Forest Algorithm in Data Mining to Predict Childbirth Process
DOI:
https://doi.org/10.21512/commit.v17i1.8236
Keywords:
C4.5 Algorithm, Random Forest Algorithm, Data Mining, Childbirth Process
Abstract
Advances in information technology have made data processing easier for many people. Data mining is the process of extracting valuable information from large data sets. The research aims to determine the difference between the C4.5 and random forest algorithms in data mining for predicting the childbirth process of pregnant women by comparing the accuracy of their performance results. Experimental research is conducted to classify the childbirth process in Situbondo, Indonesia, by applying both algorithms. The J48 decision tree algorithm is used as the implementation of C4.5. The two algorithms are compared on classification error and accuracy level. The research uses 1,000 records for training and 200 records for testing. Using 10-fold cross-validation, the C4.5 and random forest algorithms correctly classify 96% and 95% of the data, respectively. The Relative Absolute Error is the same for both algorithms, at 15%. Based on these performance results, the C4.5 algorithm performs better than the random forest algorithm. Further research can add more data and apply other algorithms to improve the accuracy of the analysis results.
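The evaluation measures named in the abstract (10-fold cross-validation, the share of correctly classified records, and Relative Absolute Error) can be illustrated with a minimal Python sketch. This is not the authors' code or tool output. The function names are hypothetical, and so are the class labels used in the comments, because the abstract does not list the classes. The Relative Absolute Error here follows one common definition (the classifier's total absolute error divided by that of a baseline that always predicts the observed class frequencies); the exact computation in the software used for the study may differ.

```python
from collections import Counter


def k_fold_indices(n_records, k=10):
    """Split record indices 0..n_records-1 into k disjoint test folds."""
    return [list(range(start, n_records, k)) for start in range(k)]


def accuracy(actual, predicted):
    """Fraction of records whose predicted label equals the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def relative_absolute_error(actual, predicted_probs):
    """Absolute error of the model relative to a class-frequency baseline.

    actual          -- list of true labels (hypothetical, e.g. "normal")
    predicted_probs -- list of dicts mapping label -> predicted probability
    """
    classes = sorted(set(actual))
    n = len(actual)
    counts = Counter(actual)
    prior = {c: counts[c] / n for c in classes}  # baseline prediction

    model_error = 0.0
    baseline_error = 0.0
    for label, probs in zip(actual, predicted_probs):
        for c in classes:
            target = 1.0 if c == label else 0.0
            model_error += abs(probs.get(c, 0.0) - target)
            baseline_error += abs(prior[c] - target)

    if baseline_error == 0.0:
        raise ValueError("baseline error is zero (only one class present)")
    return model_error / baseline_error
```

In a cross-validation run, each fold returned by `k_fold_indices` would serve once as the test set while the remaining folds train the classifier, and the two measures would be computed over the pooled predictions. A value below 1.0 (100%) for `relative_absolute_error` means the classifier does better than the frequency baseline.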
License
Copyright (c) 2023 Muhasshanah, Mohammad Tohir, Dewi Andariya Ningsih, Neny Yuli Susanti, Astik Umiyah, Lia Fitria
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.