Comparison of IndoBERT and SVM Algorithm to Perform Aspect Based Sentiment Analysis using Hierarchical Dirichlet Process

Authors

  • Sheila Prima Octarini, Bina Nusantara University
  • Alfi Yusrotis Zakiyyah, Bina Nusantara University
  • Kartika Purwandari, Bina Nusantara University

DOI:

https://doi.org/10.21512/emacsjournal.v7i3.13493

Keywords:

Hierarchical Dirichlet Process, SVM, IndoBERT, SMOTE, Aspect Based Sentiment Analysis

Abstract

This study analyzes the performance of SVM and IndoBERT models for aspect-based sentiment analysis of fashion reviews on the Tokopedia e-commerce platform. Because the original data are imbalanced, the study applies the SMOTE oversampling technique. Aspects are determined with the Hierarchical Dirichlet Process (HDP) model, which yields satisfactory results with an adequate coherence score. The HDP model effectively grouped related terms into aspects such as “Material,” “Shipping,” and “Colour,” enhancing interpretability in sentiment classification. The comparison between SVM and IndoBERT shows that SVM performs better overall: IndoBERT achieved an accuracy of 87%, precision of 91%, recall of 93%, and F1-score of 92%, while SVM attained an accuracy of 96%, precision of 100%, recall of 92%, and F1-score of 96%. SVM therefore leads on accuracy, precision, and F1-score, with IndoBERT only marginally ahead on recall. The SVM model was consequently chosen for implementation on a website that lets users view aspect-based sentiment analysis of e-commerce products. The website enables users to analyze product sentiments interactively, providing actionable insights for both sellers and customers to assess product quality and service satisfaction more efficiently.
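The reported scores can be sanity-checked with a short calculation. The sketch below recomputes F1 as the harmonic mean of the precision and recall values quoted in the abstract. It assumes those figures describe a single class or a single averaged pair, which the abstract does not state; with macro-averaged metrics the identity need not hold exactly. The function and variable names are illustrative and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract, converted from rounded percentages to fractions.
# Assumption: each pair belongs to the same class or the same averaging scheme.
reported = {
    "IndoBERT": {"precision": 0.91, "recall": 0.93, "f1": 0.92},
    "SVM": {"precision": 1.00, "recall": 0.92, "f1": 0.96},
}


def recompute_f1(scores: dict) -> dict:
    """Return F1 per model, recomputed from precision and recall, rounded to 2 places."""
    return {
        model: round(f1_score(s["precision"], s["recall"]), 2)
        for model, s in scores.items()
    }


recomputed = recompute_f1(reported)
for model, f1 in recomputed.items():
    print(f"{model}: recomputed F1 = {f1:.2f}, reported F1 = {reported[model]['f1']:.2f}")
```

Under that assumption, the recomputed values agree with the reported F1-scores after rounding to whole percentages, so the precision, recall, and F1 figures in the abstract are internally consistent.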

Author Biographies

Sheila Prima Octarini, Bina Nusantara University

Computer Science Department, School of Computer Science

Alfi Yusrotis Zakiyyah, Bina Nusantara University

Computer Science Department, School of Computer Science

Kartika Purwandari, Bina Nusantara University

Computer Science Department, School of Computer Science

Published

2025-09-30

How to Cite

Octarini, S. P., Zakiyyah, A. Y., & Purwandari, K. (2025). Comparison of IndoBERT and SVM Algorithm to Perform Aspect Based Sentiment Analysis using Hierarchical Dirichlet Process. Engineering, MAthematics and Computer Science Journal (EMACS), 7(3), 363–370. https://doi.org/10.21512/emacsjournal.v7i3.13493

Section

Articles