Fine-Tuning Hybrid Deep Learning for Sentiment Analysis of Indonesian Product Reviews

Arwin Halim; Roni Yunis; Erlina Halim

doi:10.21512/commit.v20i1.13838

Keywords:

Convolutional Neural Network (CNN), Hybrid Deep Learning, Long Short-Term Memory (LSTM), Resampling Method, Tree-Structured Parzen Estimator (TPE)

Abstract

This research builds a hybrid deep learning model for sentiment analysis of Indonesian e-commerce product reviews, which express customers' opinions. The main challenges in this domain are non-standard language and highly imbalanced sentiment classes, both of which hinder accurate classification. Most existing Indonesian sentiment analysis studies rely on relatively small, balanced datasets and primarily use attention mechanisms, ensemble models, or sequential fusion methods. In this research, a large-scale dataset of Indonesian product reviews, consisting of review text and the corresponding product ratings, is collected from the largest e-commerce site in the country. After preprocessing, semantic features are extracted using a pre-trained Indonesian Bidirectional Encoder Representations from Transformers (IndoBERT) model. The features are then fed into a hybrid model that combines Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) layers through parallel feature-level fusion. Model hyperparameters are optimized using the Tree-Structured Parzen Estimator (TPE), while data imbalance is addressed through resampling methods. Regularization strategies are applied to mitigate overfitting, and the model is evaluated using stratified k-fold cross-validation. The selected hyperparameters are validated with learning curves, which are stable and follow a consistent trend. The results show that the hybrid CNN-LSTM model, combined with the Support Vector Machine Synthetic Minority Oversampling Technique (SVMSMOTE), achieves superior performance in distinguishing positive and negative reviews. It reaches a Receiver Operating Characteristic Area Under the Curve (ROC AUC) score of 92.48%, outperforming the baseline and conventional machine learning models. The results also indicate good generalization ability, with consistent scores across folds and a very low standard deviation of 0.0009.
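The evaluation protocol described in the abstract (stratified k-fold cross-validation, scored by ROC AUC and summarized by the mean and standard deviation across folds) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The toy data, the `centroid_scores` stand-in scorer, and all function names are assumptions introduced here for illustration; in the paper the scorer would be the IndoBERT-based CNN-LSTM model.

```python
import random
import statistics


def roc_auc(labels, scores):
    """ROC AUC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs that keep the class ratio in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the indices of each class round-robin across the folds.
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx


def centroid_scores(features, labels, train_idx, test_idx):
    """Hypothetical stand-in for the trained model: a 1-D nearest-centroid scorer."""
    pos_mean = statistics.mean(features[i] for i in train_idx if labels[i] == 1)
    neg_mean = statistics.mean(features[i] for i in train_idx if labels[i] == 0)
    # Higher score means closer to the positive centroid than to the negative one.
    return [abs(features[i] - neg_mean) - abs(features[i] - pos_mean)
            for i in test_idx]


def evaluate(features, labels, k=5, seed=0):
    """Return one ROC AUC value per fold."""
    aucs = []
    for train_idx, test_idx in stratified_k_fold(labels, k, seed):
        # Assumption: any resampling step (such as SVMSMOTE) would be applied
        # to the training indices only, so the test fold stays untouched.
        scores = centroid_scores(features, labels, train_idx, test_idx)
        aucs.append(roc_auc([labels[i] for i in test_idx], scores))
    return aucs


# Toy imbalanced data (assumed): 90 positive and 10 negative samples.
rng = random.Random(42)
labels = [1] * 90 + [0] * 10
features = [rng.gauss(1.0, 1.0) if y == 1 else rng.gauss(-1.0, 1.0) for y in labels]

fold_aucs = evaluate(features, labels, k=5)
print("mean ROC AUC:", statistics.mean(fold_aucs))
print("std across folds:", statistics.stdev(fold_aucs))
```

Stratification matters here because the sentiment classes are highly imbalanced: without it, a fold could contain very few negative reviews, which would make the per-fold ROC AUC noisy or undefined. The sketch uses the sample standard deviation; the abstract does not say which variant the reported 0.0009 refers to.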

Author Biographies

Arwin Halim, Universitas Mikroskil

Department of Informatics Engineering, Faculty of Informatics

Roni Yunis, Universitas Mikroskil

Department of Information Systems, Faculty of Informatics

Erlina Halim, Universitas Mikroskil

Department of Informatics Engineering, Faculty of Informatics
