YOLOv8-Based Distance Estimation for Blind Navigation: Performance Comparison of OpenCV and Coordinate Attention Techniques
Keywords: Computer Vision, YOLOv8, OpenCV, Coordinate Attention Weighting, Blind People

Abstract
Blindness poses a significant challenge for assistive technologies, particularly for navigation, which requires accurate distance perception to enable effective mobility for the visually impaired. This research addresses the issue by evaluating and comparing the performance of the YOLOv8 model integrated with OpenCV against the Coordinate Attention Weighting (CAW) technique for distance estimation in blind navigation systems. The main objective is to improve distance estimation accuracy without the need for additional sensors. Initially, YOLOv8 with OpenCV yields suboptimal results, prompting efforts to enhance its performance so that it surpasses CAW while remaining sensor-free. The research therefore integrates YOLOv8 with OpenCV as a baseline and applies CAW for advanced feature attention in the distance estimation process. It also incorporates mathematical formulations for camera calibration and depth estimation, using techniques such as triangulation and reprojection to refine object distance prediction. The results show that the improved YOLOv8 + OpenCV pipeline significantly outperforms the original YOLOv8 + OpenCV, reducing the Mean Squared Error (MSE) across all distance intervals (0-1 m, 1-2 m, 2-3 m, 3-4 m, and 4-5 m). YOLOv8 + CAW also improves on the original YOLOv8 + OpenCV but does not surpass the improved OpenCV integration. These findings demonstrate the potential of refined computer vision techniques for achieving accurate, sensor-free distance estimation, enhancing real-time navigation systems for the blind, and pave the way for further advances in accessible and reliable navigation technologies for the visually impaired.
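To give a concrete sense of how a sensor-free, monocular distance pipeline of this kind can be assembled, the Python sketch below illustrates one common approach: a focal length recovered with OpenCV camera calibration, the pinhole relation Z = f * H / h applied to a YOLOv8 bounding box, and MSE reported per distance interval. This is an illustrative sketch under assumptions, not the paper's exact formulation; the known object height, the calibration inputs, and all function names are hypothetical.

```python
# Minimal sketch of monocular distance estimation from a YOLOv8 bounding box
# using the pinhole camera model and an OpenCV-calibrated focal length.
# Illustrative only: the known object height, the demo numbers, and the
# per-interval MSE report are assumptions, not the paper's exact pipeline.

import numpy as np
import cv2


def focal_length_from_calibration(object_points, image_points, image_size):
    """Recover the vertical focal length (in pixels) with cv2.calibrateCamera."""
    _, camera_matrix, _, _, _ = cv2.calibrateCamera(
        object_points, image_points, image_size, None, None
    )
    return camera_matrix[1, 1]  # f_y: vertical focal length in pixels


def estimate_distance_m(box_xyxy, focal_px, known_height_m):
    """Pinhole model: Z = f * H_real / h_pixels."""
    x1, y1, x2, y2 = box_xyxy
    box_height_px = max(y2 - y1, 1e-6)
    return focal_px * known_height_m / box_height_px


def mse_per_interval(pred_m, true_m, edges=(0, 1, 2, 3, 4, 5)):
    """Mean Squared Error grouped by ground-truth distance interval (0-1 m ... 4-5 m)."""
    pred_m, true_m = np.asarray(pred_m), np.asarray(true_m)
    report = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (true_m >= lo) & (true_m < hi)
        if mask.any():
            report[f"{lo}-{hi} m"] = float(np.mean((pred_m[mask] - true_m[mask]) ** 2))
    return report


if __name__ == "__main__":
    # Hypothetical values: a 1.7 m tall person detected by YOLOv8 as a 450 px
    # high box, with a calibrated focal length of 800 px.
    distance = estimate_distance_m((320, 100, 480, 550), focal_px=800.0, known_height_m=1.7)
    print(f"Estimated distance: {distance:.2f} m")
    print(mse_per_interval([distance], [3.0]))
```

In a fuller implementation, the paper's triangulation and reprojection steps would replace the single known-height assumption used here for brevity.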
Copyright (c) 2025 Erwin Syahrudin, Ema Utami, Anggit Dwi Hartanto

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.