Phishing Site Detection Classification Model Using Machine Learning Approach
DOI: https://doi.org/10.21512/emacsjournal.v5i2.9951
Keywords: Phishing, Machine Learning, Cyber Crime, KNN, Decision Tree
Abstract
Phishing is a long-standing cybercrime, and many people still fall victim to it. This research aims to prevent phishing by extracting the attributes found on phishing websites. The study uses a hybrid method that combines an allowlist and a denylist as part of a classification system. It uses 18 features to identify a phishing site, drawn from the address bar, abnormal requests, and the source code (HTML and JavaScript); for each feature, the authors define a benchmark. The study validates the status code and detects 52 URL shortening service domains, then evaluates these abnormalities with a binary classification system. The algorithms that perform well are Decision Tree and K-Nearest Neighbor (KNN). Their performance is evaluated in terms of Precision, Recall, and F-Measure. The Decision Tree algorithm achieves the highest accuracy, 97.62%, and the fastest computation time, 0.00894 seconds, making it superior in both accuracy and computation time for detecting phishing URLs.
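The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract names neither the 18 features, their benchmarks, nor the 52 shortener domains, so everything below is an assumption made for illustration: the three-entry shortener set, the allowlist and denylist entries, the 75-character length threshold, and the function names are all hypothetical. A simple rule-based fallback stands in for the trained Decision Tree or KNN classifier; it is not the authors' model.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical, abbreviated lists (the study uses 52 shortener domains, not listed in the abstract).
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co"}
ALLOWLIST = {"example.com"}
DENYLIST = {"phish.example.net"}


def host_is_ip(host):
    """Return True if the host is a literal IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def address_bar_features(url):
    """Extract a few illustrative address-bar features (assumed, not the paper's 18)."""
    host = (urlparse(url).hostname or "").lower()
    return {
        "uses_ip": host_is_ip(host),
        "has_at": "@" in url,
        "long_url": len(url) > 75,  # illustrative benchmark
        "shortener": host in SHORTENERS,
        "many_subdomains": host.count(".") > 3,
    }


def classify(url):
    """Hybrid check: lists first, then a rule-based stand-in for a trained classifier.

    Returns 1 for phishing, 0 for legitimate.
    """
    host = (urlparse(url).hostname or "").lower()
    if host in ALLOWLIST:
        return 0
    if host in DENYLIST:
        return 1
    return int(any(address_bar_features(url).values()))


def precision_recall_f1(y_true, y_pred):
    """Compute Precision, Recall, and F-Measure for the positive (phishing) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

In a fuller version, the feature dictionary would be converted to a numeric vector and passed to a trained binary classifier, and `precision_recall_f1` would be applied to that classifier's predictions on a held-out test set.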
License
Copyright (c) 2023 Engineering, Mathematics and Computer Science (EMACS) Journal
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.