Combating Hoax and Misinformation in Indonesia Using Machine Learning: What is Missing and Future Directions

Dwinanda Kinanti Suci Sekarhati

doi:10.21512/emacsjournal.v6i2.11556

Authors

Dwinanda Kinanti Suci Sekarhati Universitas Bina Nusantara

Keywords:

Hoax, Misinformation, Detection Process, Machine Learning

Abstract

According to surveys conducted by several organizations in Indonesia in 2022 and 2023, covering 10,000 respondents aged 13 to 70, 56% of respondents encountered hoaxes and misinformation mainly on social media and online media platforms, and 45% doubted their ability to distinguish true information from hoaxes. Most hoax and false-information researchers in Indonesia also still face challenges, for example with datasets and detection methods. This research uses a systematic literature review based on PICOC, inclusion-exclusion rules, and a quality checklist. The results, based on 20 papers, show that the use of data crawler applications, labelling, and text pre-processing are the major steps for improving datasets of more than 10,000 records. Some advanced methodologies already exist for detecting hoaxes and misinformation in text form, such as graph-based learning and special architecture designs, yet only a small number address detection in media form. The recommendations cover dataset improvement steps, literature, and methodologies for media form.

Author Biography

Dwinanda Kinanti Suci Sekarhati, Universitas Bina Nusantara

Computer Science Department, School of Computer Science
