Combating Hoax and Misinformation in Indonesia Using Machine Learning: What is Missing and Future Directions
DOI:
https://doi.org/10.21512/emacsjournal.v6i2.11556
Keywords:
Hoax, Misinformation, Detection Process, Machine Learning
Abstract
According to surveys conducted by several organizations in Indonesia in 2022 and 2023, covering 10,000 respondents aged 13 to 70, 56% of respondents mainly encountered hoaxes and misinformation on social media and online media platforms, and 45% doubted their ability to distinguish true information from hoaxes. Most researchers working on hoaxes and false information in Indonesia also still face challenges, such as with datasets and detection methods. This research uses a systematic literature review based on PICOC, inclusion-exclusion rules, and a quality checklist. The results, based on 20 papers, show that the use of data crawler applications, labelling, and text pre-processing are the major steps for improving datasets with more than 10,000 data items. Some advanced methodologies already exist for detecting hoaxes and misinformation in text form, such as graph-based learning and special architecture designs, yet only a small number address detection in media form. The recommendations cover dataset improvement steps, literature, and methodologies for media form.
Copyright (c) 2024 Engineering, MAthematics and Computer Science Journal (EMACS)
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.