Abstract
Organizations spend tremendous resources guarding against and recovering from cybersecurity attacks by online hackers who gain access to sensitive and valuable user data. Many cyber infiltrations are accomplished through phishing attacks, in which users are tricked into interacting with web pages designed to look like legitimate ones. Because humans are so susceptible to being tricked, automated methods of differentiating between phishing websites and their authentic counterparts are needed as an extra line of defense. The aim of this research is to develop such methods using various approaches to categorize websites. Specifically, we have developed a system that uses machine learning techniques to classify websites based on their URLs. We used four classifiers: a decision tree, a Naïve Bayesian classifier, a support vector machine (SVM), and a neural network. The classifiers were tested on a data set containing 1,353 real-world URLs, each of which could be categorized as a legitimate, suspicious, or phishing site. The experimental results show that the classifiers distinguished real websites from fake ones over 90% of the time.
Description
This article was initially published in the International Journal of Advanced Computer Science and Applications, under a Creative Commons Attribution 4.0 International License.
Publisher
International Journal of Advanced Computer Science and Applications
Date of publication
Summer 8-1-2019
Language
English
Persistent identifier
http://hdl.handle.net/10950/1862
Document Type
Article
Recommended Citation
Kulkarni, Arun D. and Brown, Leonard L. III, "Phishing Websites Detection using Machine Learning" (2019). Computer Science Faculty Publications and Presentations. Paper 20.
http://hdl.handle.net/10950/1862
Publisher Citation
https://thesai.org/Publications/ViewPaper?Volume=10&Issue=7&Code=IJACSA&SerialNo=2
Included in
Business Law, Public Responsibility, and Ethics Commons, Computer and Systems Architecture Commons