Convolutional Neural Network Using Kalman Filter for Human Detection and Tracking on RGB-D Video

Jovin Angelico; Ken Ratri Retno Wardani

doi:10.21512/commit.v12i2.4890

Authors

Jovin Angelico, Institut Teknologi Harapan Bangsa Bandung
Ken Ratri Retno Wardani, Institut Teknologi Harapan Bangsa Bandung

Keywords:

Convolutional Neural Network, Human Detection, Tracking, RGB-D, Kalman Filter

Abstract

Computer-vision human detection is still being improved in both accuracy and computation time. In low-light conditions, detection accuracy is usually low. This research uses additional information besides the RGB channels, namely a depth map that gives each object's distance from the camera. It integrates a Cascade Classifier (CC) to localize candidate objects, a Convolutional Neural Network (CNN) to classify each candidate as human or non-human, and a Kalman filter to track human movement. For training and testing, two RGB-D datasets with different points of view and lighting conditions are used. Both datasets were filtered to remove images containing heavy noise or occlusion so that training is more focused. With these techniques integrated, detection and tracking accuracy reaches 77.7%. Using the Kalman filter improves computational efficiency by 41%.
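
To make the tracking stage concrete, the following is a minimal sketch of a constant-velocity Kalman filter applied to one coordinate of a detected person (for example, the x position of a bounding-box centre). It is not the authors' implementation: the abstract does not give the state model, so the constant-velocity model, the noise values q and r, and the names AxisKalman and track_axis are all assumptions made for illustration. Frames with no detection are represented by None and are filled in by the filter's prediction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis (hypothetical helper)."""
    pos: float            # estimated position (pixels)
    vel: float = 0.0      # estimated velocity (pixels per frame)
    p00: float = 1.0      # 2x2 state covariance, stored element by element
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0
    q: float = 0.01       # process noise added to the diagonal (assumed value)
    r: float = 1.0        # measurement noise variance (assumed value)

    def predict(self, dt: float = 1.0) -> float:
        """Propagate the state one step: x = F x, P = F P F^T + Q."""
        self.pos += dt * self.vel
        a = self.p00 + dt * self.p10
        b = self.p01 + dt * self.p11
        self.p00 = a + dt * b + self.q
        self.p01 = b
        self.p10 = self.p10 + dt * self.p11
        self.p11 = self.p11 + self.q
        return self.pos

    def update(self, z: float) -> float:
        """Correct the prediction with a measured position z (H = [1, 0])."""
        innovation = z - self.pos
        s = self.p00 + self.r          # innovation variance
        k0 = self.p00 / s              # Kalman gain for position
        k1 = self.p10 / s              # Kalman gain for velocity
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        p00, p01 = self.p00, self.p01
        self.p00 = (1 - k0) * p00      # P = (I - K H) P
        self.p01 = (1 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01
        return self.pos


def track_axis(measurements: List[Optional[float]]) -> List[Optional[float]]:
    """Filter a sequence of per-frame positions; None means no detection."""
    kf: Optional[AxisKalman] = None
    estimates: List[Optional[float]] = []
    for z in measurements:
        if kf is None:
            # Start the filter at the first available detection.
            if z is not None:
                kf = AxisKalman(pos=z)
            estimates.append(z)
            continue
        kf.predict()
        estimates.append(kf.update(z) if z is not None else kf.pos)
    return estimates
```

Running one such filter per coordinate gives a smoothed track. Whether the reported efficiency gain comes from skipping detection on some frames or from another use of the prediction is not stated in the abstract, so the None handling here is only one plausible arrangement.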

Author Biographies

Jovin Angelico, Institut Teknologi Harapan Bangsa Bandung

Informatics Engineering

Ken Ratri Retno Wardani, Institut Teknologi Harapan Bangsa Bandung

Informatics Engineering
