American Sign Language Translation to Display the Text (Subtitles) using a Convolutional Neural Network

Muhammad Fajar Ramadhan; Samsuryadi Samsuryadi; Anggina Primanita

doi:10.21512/emacsjournal.v6i3.11904

Keywords:

American Sign Language, DenseNet201, DenseNet201 PyTorch, Translation, Subtitles

Abstract

Sign language combines hand gestures, postures, and facial expressions. American Sign Language (ASL) is one of the most widely used and most researched sign languages because it is easier to implement and more common in daily use. A growing body of ASL research aims to make it easier for people with speech impairments to communicate with others. This research increasingly draws on computer vision, so that anyone can understand ASL with the help of machine learning, and sign language translation technology continues to develop, particularly for ASL using Convolutional Neural Networks. This study uses the DenseNet201 and DenseNet201 PyTorch architectures to translate ASL and then display the translation as written text on a monitor screen. Four train-test data splits are compared: 90:10, 80:20, 70:30, and 60:30. The best results were obtained with DenseNet201 PyTorch on the 70:30 train-test split, with an accuracy of 0.99732, precision of 0.99737, recall (sensitivity) of 0.99732, specificity of 0.99990, F1-score of 0.99731, and error of 0.00268. The translation of ASL into written form was carried out successfully and evaluated using ROUGE-1 and ROUGE-L, resulting in a precision of 0.14286, recall (sensitivity) of 0.14286, and F1-score.
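
The classification metrics reported above can all be derived from a multi-class confusion matrix. The following is a minimal sketch of that calculation, not the authors' evaluation code. It assumes support-weighted averaging over classes, an assumption that is at least consistent with the reported recall equalling the reported accuracy. The function name `classification_metrics` and the 3-class example matrix are hypothetical and serve only as illustration.

```python
def classification_metrics(confusion):
    """Compute support-weighted metrics from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i and
    whose predicted class is j. Weighted averaging is an assumption here;
    the abstract does not state which averaging scheme was used.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    accuracy = correct / total

    precision = recall = specificity = f1 = 0.0
    for k in range(n):
        tp = confusion[k][k]                              # true positives for class k
        fn = sum(confusion[k]) - tp                       # class k samples predicted as other classes
        fp = sum(confusion[i][k] for i in range(n)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp                         # everything else
        support = tp + fn
        weight = support / total                          # share of samples in class k

        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / support if support else 0.0
        s = tn / (tn + fp) if tn + fp else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0

        precision += weight * p
        recall += weight * r
        specificity += weight * s
        f1 += weight * f

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "error": 1.0 - accuracy,  # error is the complement of accuracy
    }


# Hypothetical 3-class example (rows: true class, columns: predicted class).
example = [
    [9, 1, 0],
    [0, 10, 0],
    [0, 1, 9],
]
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.5f}")
```

Under this weighting, recall always equals accuracy and error is one minus accuracy, which matches the relationship between the reported values (0.99732 and 0.00268).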

Author Biographies

Muhammad Fajar Ramadhan, Sriwijaya University

Master's Study Program - Computer Science

Samsuryadi Samsuryadi, Sriwijaya University

Master's Study Program - Computer Science

Anggina Primanita, Sriwijaya University

Master's Study Program - Computer Science
