Advancing Indonesian Audio Emotion Classification: A Comparative Study Using IndoWaveSentiment
DOI:
https://doi.org/10.21512/emacsjournal.v7i2.13415
Keywords:
Speech Emotion Recognition, Indonesian speech, IndoWaveSentiment, ensemble learning, acoustic features
Abstract
This study addresses a critical gap in Indonesian Speech Emotion Recognition (SER) by evaluating machine learning models on the IndoWaveSentiment dataset, a novel corpus of 300 high-fidelity recordings in which native speakers express five emotions (neutral, happy, surprised, disgusted, disappointed). The research aims to identify the best-performing classification techniques and acoustic features for Indonesian SER, given the language's distinct linguistic characteristics and the scarcity of annotated resources. Six models (Logistic Regression, KNN, Gradient Boosting, Random Forest, Naive Bayes, and SVC) were trained on 45 acoustic features, including spectral contrast, MFCCs, and zero crossing rate, extracted using Librosa. Random Forest performed best (90% accuracy), followed by Gradient Boosting (85%) and Logistic Regression (75%), and spectral contrast (contrast2, contrast7) and MFCC1 emerged as the most discriminative features. The findings highlight the efficacy of ensemble methods in capturing nuanced emotional cues in Indonesian speech, with results that exceed those reported in prior studies on locally sourced datasets. Practical applications include customer service analytics and mental health tools, although the dataset's controlled recording conditions and fixed sentence structure call for caution in real-world deployment. Future work should expand the dataset to include regional dialects and spontaneous speech, and should explore hybrid architectures such as CNN-LSTMs. This study establishes foundational benchmarks for Indonesian SER and advocates culturally informed models to enhance human-computer interaction in underrepresented linguistic contexts.
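The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the feature matrix is synthetic (generated with scikit-learn's `make_classification`) and only mirrors the stated shape of 300 recordings, 45 features, and five emotion classes. The hyperparameters, the 5-fold cross-validation protocol, the feature scaling, and the choice of `GaussianNB` as the Naive Bayes variant are all assumptions, so the scores it produces say nothing about IndoWaveSentiment.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic stand-in for the real feature table
# (300 samples x 45 acoustic features, 5 emotion labels).
X, y = make_classification(
    n_samples=300, n_features=45, n_informative=10, n_redundant=5,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# The six model families named in the abstract, with default-like
# settings (assumed). Distance- and margin-based models get scaling.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Naive Bayes": GaussianNB(),  # ASSUMPTION: Gaussian variant
    "SVC": make_pipeline(StandardScaler(), SVC()),
}

# ASSUMPTION: stratified 5-fold cross-validation as the evaluation protocol.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} mean accuracy = {acc:.3f}")

# Rank features by Random Forest impurity-based importance, one common
# way to arrive at a "most discriminative features" list.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top_features = np.argsort(rf.feature_importances_)[::-1][:3]
print("Top 3 feature indices:", top_features.tolist())
```

With real data, `X` would hold the Librosa-derived features and the indices in `top_features` would map back to named columns such as contrast2 or MFCC1.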
License
Copyright (c) 2025 Muhammad Rizki Nur Majiid, Karli Eka Setiawan, Prayoga Yudha Pamungkas, Taufiq Annas, Nicholas Lorenzo Setiawan

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License - Share Alike that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
USER RIGHTS
All articles published Open Access will be immediately and permanently free for everyone to read and download. We are continuously working with our author communities to select the best license options, currently defined for this journal as Creative Commons Attribution-ShareAlike (CC BY-SA).