Comparative Analysis of LSTM and Bi-LSTM for Classifying Indonesian New Translation Bible Texts Using Word2Vec Embedding
DOI:
https://doi.org/10.21512/commit.v19i2.12015

Keywords:
Text Classification, Indonesian New Translation Bible, Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (Bi-LSTM), Word2Vec

Abstract
The research compares the classification accuracy of Long Short-Term Memory (LSTM) and Bidirectional LSTM (Bi-LSTM) architectures in classifying Indonesian New Translation Bible texts with Word2Vec embeddings. The main objective is to examine how these deep learning models handle complex, context-rich Indonesian biblical text classification, particularly distinguishing the Old Testament from the New Testament. The dataset contains 31,102 labeled verses, with Word2Vec embeddings computed using both the Continuous Bag of Words (CBOW) and Skip-Gram methods. The models are evaluated using accuracy, precision, recall, and F1-score. The results show that Bi-LSTM outperforms LSTM in all cases, with the best accuracy of 92.31% obtained using Skip-Gram embeddings at a vector size of 300, compared to 91.94% for LSTM. The model demonstrates its ability to resolve semantic ambiguities in context-rich texts such as the Bible. The research contributes to the text classification discipline by providing empirical evidence of the advantages of Bi-LSTM for biblical text processing and by identifying the optimal combination of model architecture and word embedding method. The analysis also suggests that future studies should explore other embeddings, such as GloVe (Global Vectors) and FastText, or transformer-based models such as Bidirectional Encoder Representations from Transformers (BERT), to further improve classification accuracy.
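The pipeline described in the abstract can be illustrated with a minimal sketch: train Word2Vec on the verse corpus, feed the frozen 300-dimensional vectors into a Bi-LSTM binary classifier (Old Testament vs. New Testament), and report accuracy, precision, recall, and F1-score. This is not the authors' released code; the file name "verses.csv", its column names, the maximum sequence length, and all hyperparameters other than the 300-dimensional vectors are illustrative assumptions.

# Minimal sketch, assuming a CSV of labeled verses (not the authors' code).
import numpy as np
import pandas as pd
import tensorflow as tf
from gensim.models import Word2Vec
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("verses.csv")                    # assumed columns: "text", "label" (0 = OT, 1 = NT)
tokens = [t.lower().split() for t in df["text"]]

# Train Word2Vec on the corpus itself: sg=1 -> Skip-Gram (best in the paper), sg=0 -> CBOW.
w2v = Word2Vec(sentences=tokens, vector_size=300, window=5, min_count=1, sg=1)

# Map words to integer indices (0 is reserved for padding) and build the embedding matrix.
vocab = {w: i + 1 for i, w in enumerate(w2v.wv.index_to_key)}
emb = np.zeros((len(vocab) + 1, 300))
for w, i in vocab.items():
    emb[i] = w2v.wv[w]

max_len = 50                                      # assumed maximum verse length in tokens
X = tf.keras.preprocessing.sequence.pad_sequences(
    [[vocab[w] for w in sent] for sent in tokens], maxlen=max_len)
y = df["label"].to_numpy()
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Bi-LSTM classifier on top of the frozen Word2Vec embeddings;
# swap the Bidirectional wrapper for a plain LSTM(64) to get the unidirectional baseline.
emb_layer = tf.keras.layers.Embedding(emb.shape[0], 300, trainable=False)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_len,)),
    emb_layer,
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
emb_layer.set_weights([emb])                      # load the pretrained Word2Vec vectors
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_tr, y_tr, epochs=5, batch_size=64, validation_split=0.1)

# Accuracy, precision, recall, and F1-score, the metrics reported in the paper.
pred = (model.predict(X_te) > 0.5).astype(int).ravel()
print(classification_report(y_te, pred, digits=4))

Freezing the embedding layer keeps the comparison between LSTM and Bi-LSTM focused on the recurrent architecture rather than on fine-tuned word vectors; making it trainable is an equally plausible variant.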
License
Copyright (c) 2025 Marthen Sattu Sambo, Suryasatriya Trihandaru, Didit Budi Nugroho, Hanna Arini Parhusip

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
a. Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License that allows others to share the work with an acknowledgment of the work's authorship and its initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
USER RIGHTS
All articles published Open Access will be immediately and permanently free for everyone to read and download. We are continuously working with our author communities to select the best choice of license options, currently being defined for this journal as follows: Creative Commons Attribution-Share Alike (CC BY-SA)