End-to-End Steering Angle Prediction for Autonomous Car Using Vision Transformer

Ilvico Sonata; Yaya Heryadi; Antoni Wibowo; Widodo Budiharto

doi:10.21512/commit.v17i2.8425

Authors

Ilvico Sonata, Bina Nusantara University
Yaya Heryadi, Bina Nusantara University
Antoni Wibowo, Bina Nusantara University
Widodo Budiharto, Bina Nusantara University

Keywords:

Steering Angle Prediction, Autonomous Car, Vision Transformer (ViT)

Abstract

Demand for safe and comfortable autonomous cars is driving their continued development. That development relies heavily on deep learning to determine the steering angle appropriate to the road conditions the car faces. This research proposes a Vision Transformer (ViT) model that predicts the steering angle from images captured by a front-facing camera on an autonomous car. The model is trained on a public dataset recorded on streets around Rancho Palos Verdes and San Pedro, California, consisting of 45,560 images, each labeled with a steering angle value. The proposed model is then compared with existing models on the same dataset. The experimental results show that the proposed model is more accurate, with an MSE of 2,991 compared with 5,358 for the CNN-based model and 4,065 for the combined CNN-LSTM model. These results suggest that the ViT model can replace the existing CNN and CNN-LSTM models for predicting the steering angle of an autonomous car.
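The comparison above rests on mean squared error (MSE) between predicted and labeled steering angles, and a ViT works on an image split into fixed-size patches. The following is a minimal sketch of both ideas, not the authors' implementation. The function names, the sample angles, and the 224x224 input with 16x16 patches are illustrative assumptions that do not come from the paper.

```python
from typing import Sequence


def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average of the squared differences between predicted and labeled angles."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def count_patches(height: int, width: int, patch_size: int) -> int:
    """Number of non-overlapping square patches a ViT would turn into tokens."""
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    return (height // patch_size) * (width // patch_size)


# Hypothetical steering angles (assumed values, not taken from the dataset).
labels = [0.0, 2.0, -4.0, 1.0]
predictions = [1.0, 2.0, -2.0, 0.0]
mse = mean_squared_error(predictions, labels)

# Assumed input size and patch size; the paper's abstract does not state them.
num_patches = count_patches(224, 224, 16)

print(f"MSE: {mse}, patches per image: {num_patches}")
```

A lower MSE means the predicted angles sit closer to the labels on average, which is the sense in which the abstract ranks the three models.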

Author Biographies

Ilvico Sonata, Bina Nusantara University

Computer Science Department, BINUS Graduate Program - Doctor of Computer Science

Yaya Heryadi, Bina Nusantara University

Computer Science Department, BINUS Graduate Program - Doctor of Computer Science

Antoni Wibowo, Bina Nusantara University

Computer Science Department, BINUS Graduate Program - Doctor of Computer Science

Widodo Budiharto, Bina Nusantara University

Computer Science Department, School of Computer Science
