The Comparison of Deep Learning Models for Indonesian Political Hoax News Detection

Authors

  • Oktavia Citra Resmi Rachmawati, Department of Information and Computer Engineering
  • Zakha Maisat Eka Darmawan, Department of Creative Multimedia Technology

DOI:

https://doi.org/10.21512/commit.v18i2.10929

Keywords:

Deep Learning Model, Political Hoax News Detection, Text Classification

Abstract

Indonesia is the world’s fourth most populous country and has a diverse sociopolitical landscape. Political fake news exacerbates existing social divisions and causes political polarization in Indonesian society. Hence, studying it as a specific challenge can contribute to broader discussions on the impact of fake news in different contexts. The researchers propose a hoax news detection system by developing deep learning models with various layers and training them on a data set preprocessed using term-frequency and token filtering to represent the most prominent words in each class. The researchers compare the layers with the potential for high performance in predicting the falsity of Indonesian political news by examining the models' training history plots, model specifications, and performance metrics from the classification report module. The deep learning models include One-Dimensional Convolutional Neural Networks (1D CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). The news data are obtained from the Kaggle site and contain 41,726 rows. Based on experiments with text data preprocessed into vectors and with parameters fixed before training, the results show that GRU achieves the highest accuracy, recall, precision, and F1 score. Although GRU is the model with the smallest file size, it is the slowest to generate predictions from news text. It also has a higher potential to overfit than a simple RNN because of its larger parameter count.
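
The preprocessing idea mentioned in the abstract (term-frequency counting followed by token filtering on the most prominent words in each class) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `top_terms_per_class` and `filter_tokens`, the whitespace tokenizer, the cutoff `k`, and the toy Indonesian sentences are all assumptions made for illustration only.

```python
from collections import Counter


def top_terms_per_class(docs, labels, k=2, stopwords=frozenset()):
    """Return the k most frequent tokens for each class label.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and not the paper's method.
    """
    counts = {}
    for text, label in zip(docs, labels):
        tokens = [
            tok for tok in text.lower().split()
            if tok.isalpha() and tok not in stopwords
        ]
        counts.setdefault(label, Counter()).update(tokens)
    return {
        label: [tok for tok, _ in counter.most_common(k)]
        for label, counter in counts.items()
    }


def filter_tokens(text, vocabulary):
    """Keep only the tokens of `text` that appear in `vocabulary`."""
    return [tok for tok in text.lower().split() if tok in vocabulary]


# Toy data invented for this example (not taken from the Kaggle data set).
docs = [
    "pemilu curang curang hoaks",
    "curang hoaks hoaks hoaks",
    "pemilu resmi resmi kpu",
    "resmi kpu kpu kpu",
]
labels = ["hoax", "hoax", "valid", "valid"]

top = top_terms_per_class(docs, labels, k=2)
vocabulary = {tok for terms in top.values() for tok in terms}

print(top)
print(filter_tokens("Pemilu curang kata kpu", vocabulary))
```

In a fuller pipeline, the filtered tokens would then be mapped to integer indices and padded into fixed-length vectors before being fed to a 1D CNN, LSTM, or GRU layer; that step is omitted here.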

Author Biographies

Oktavia Citra Resmi Rachmawati, Department of Information and Computer Engineering

The Electronic Engineering Polytechnic Institute of Surabaya

Zakha Maisat Eka Darmawan, Department of Creative Multimedia Technology

The Electronic Engineering Polytechnic Institute of Surabaya

Published

2024-08-21