Phishing Site Detection Classification Model Using Machine Learning Approach

Authors

  • Yohan Muliono, Bina Nusantara University
  • Muhammad Amar Ma’ruf, Bina Nusantara University
  • Zakiyyah Mutiara Azzahra, Bina Nusantara University

DOI:

https://doi.org/10.21512/emacsjournal.v5i2.9951

Keywords:

Phishing, Machine Learning, Cyber Crime, KNN, Decision Tree

Abstract

Phishing is a long-standing form of cybercrime that continues to claim many victims. This research attempts to prevent phishing by extracting the attributes found on phishing websites. The study uses a hybrid method that combines an allowlist and a denylist as part of a classification system. It uses 18 features, drawn from the address bar, abnormal requests, and the source code (HTML and JavaScript), to identify a phishing site, and the authors determine a benchmark for each feature. The study validates the status code, detects 52 URL shortening service domains, and then evaluates these abnormalities with a binary classification system. The algorithms that give good results are Decision Tree and K-Nearest Neighbor (KNN). Performance is evaluated in terms of Precision, Recall, and F-Measure; the Decision Tree algorithm has the highest accuracy, 97.62%, and the fastest computation time, 0.00894 seconds. The Decision Tree is therefore superior in both accuracy and computation time for detecting phishing URLs.
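As a rough illustration of the pipeline the abstract describes, the following Python sketch shows how an allowlist/denylist check might sit in front of a feature-based binary classifier, and how Precision, Recall, and F-Measure can be computed. It is not the authors' implementation. The list contents, the four address-bar features, the 75-character length threshold, and all function names are assumptions made for illustration; the paper's 18 features, its benchmarks, and its 52 shortening-service domains are not reproduced here.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical lists for illustration only; not the lists used in the paper.
ALLOWLIST = {"example.com"}
DENYLIST = {"phish.example.net"}
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co"}


def host_of(url):
    """Return the lowercase hostname of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def is_ip_host(host):
    """True if the hostname is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def address_bar_features(url):
    """Extract a few binary address-bar features (1 = suspicious)."""
    host = host_of(url)
    return {
        "uses_shortener": int(host in SHORTENERS),
        "has_at_symbol": int("@" in url),
        "ip_as_host": int(is_ip_host(host)),
        "long_url": int(len(url) > 75),  # assumed threshold
    }


def classify(url, model):
    """Hybrid check: lists decide first, the model handles unknown hosts.

    `model` is any callable mapping a feature dict to 0 (legitimate) or
    1 (phishing); it stands in for a trained Decision Tree or KNN.
    """
    host = host_of(url)
    if host in ALLOWLIST:
        return 0
    if host in DENYLIST:
        return 1
    return model(address_bar_features(url))


def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F-measure for the positive (phishing) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For example, `classify("http://bit.ly/abc", lambda f: int(any(f.values())))` sends an unlisted host to the stand-in model, which flags it because the shortener feature is set. In practice the callable would wrap a classifier trained on labeled URLs.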

Author Biographies

Yohan Muliono, Bina Nusantara University

Cyber Security Program, Computer Science Department, School of Computer Science

Muhammad Amar Ma’ruf, Bina Nusantara University

Cyber Security Program, Computer Science Department, School of Computer Science

Zakiyyah Mutiara Azzahra, Bina Nusantara University

Cyber Security Program, Computer Science Department, School of Computer Science

Published

2023-05-31

Section

Articles