The Development of Indoor Object Recognition Tool for People with Low Vision and Blindness

Authors

  • Rhio Sutoyo, Bina Nusantara University
  • Andry Chowanda, Bina Nusantara University

DOI:

https://doi.org/10.21512/comtech.v8i2.3763

Keywords:

object recognition, computer vision, tool, blindness, low vision

Abstract

The purpose of this research was to develop methods and algorithms that could serve as the underlying base for building an object recognition tool. The method implemented in this research consisted of initial problem identification, testing and development of methods and algorithms, image database modeling, system development, and training and testing. As a result, the system performs indoor object recognition with 93.46% accuracy. Although the system achieves relatively high accuracy in recognizing objects, it is still limited to single-object detection and cannot recognize objects in real time.
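The abstract does not detail the recognition pipeline. As a rough illustration of the "training and testing" step, the sketch below assumes that each image in the database has already been reduced to a fixed-length feature descriptor and that a support vector machine is used as the classifier; the scikit-learn calls, the train_and_evaluate helper, and all parameter values are illustrative assumptions rather than the authors' implementation.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def train_and_evaluate(features: np.ndarray, labels: np.ndarray) -> float:
    """Train a classifier on per-image descriptors and report test accuracy.

    features: (n_images, n_dims) array of per-image feature descriptors.
    labels:   (n_images,) array of indoor object class ids (door, chair, ...).
    """
    # Hold out part of the image database for testing, mirroring the
    # "training and testing" step of the described method.
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, random_state=0, stratify=labels)

    # An RBF-kernel SVM is one plausible choice for this kind of pipeline.
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    clf.fit(X_train, y_train)

    # Accuracy = correctly classified test images / total test images.
    return accuracy_score(y_test, clf.predict(X_test))

Under a held-out split like this, the reported 93.46% would correspond to the fraction of test images whose predicted class matches the ground-truth label.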

Author Biographies

Rhio Sutoyo, Bina Nusantara University

Computer Science Department, School of Computer Science

Andry Chowanda, Bina Nusantara University

Computer Science Department, School of Computer Science

Published

2017-06-30

Issue

Vol. 8 No. 2 (2017)

Section

Articles