Abstract

Phishing is one of the most significant threats in cyber security. It is a form of social engineering that uses e-mails linking to malicious websites to solicit personal information, and the number of phishing e-mails is growing at an alarming rate. In this paper we propose a novel machine learning approach that classifies phishing websites with Convolutional Neural Networks (CNNs) using URL-based features. A CNN consists of a stack of convolution and pooling layers followed by a fully connected layer. CNNs accept images as input and perform both feature extraction and classification. Many CNN models are available today. To avoid the vanishing gradient problem, recent CNNs use a cross-entropy loss function with Rectified Linear Units (ReLU). To use a CNN, we convert feature vectors into images. To evaluate our approach, we use a dataset consisting of 1,353 real-world URLs classified into three categories: legitimate, suspicious, and phishing. The images representing the feature vectors are classified using a simple CNN. We developed MATLAB scripts to convert the vectors into images and to implement the simple CNN model. The classification accuracy obtained was 86.5 percent.
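
The abstract does not say how a feature vector is turned into an image, so the following is only a minimal sketch of one possible conversion, not the authors' MATLAB implementation. It is written in Python for illustration. The function name `vector_to_image`, the assumed feature range of -1 to 1, the zero padding, and the square grayscale layout are all assumptions introduced here.

```python
import math


def vector_to_image(features, lo=-1.0, hi=1.0):
    """Map a feature vector to a square grayscale image (a list of pixel rows).

    Hypothetical helper: the value range [lo, hi] and the square layout are
    assumptions for illustration, not details taken from the paper.
    """
    if not features:
        raise ValueError("features must be non-empty")
    # Smallest square grid that can hold every feature value.
    side = math.ceil(math.sqrt(len(features)))
    # Clamp each value to [lo, hi], then scale it to an intensity in [0, 255].
    pixels = [
        round((min(max(v, lo), hi) - lo) / (hi - lo) * 255)
        for v in features
    ]
    # Pad with black pixels so the values fill the whole grid.
    pixels += [0] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Example: nine features become a 3 x 3 image.
image = vector_to_image([1, 0, -1, 1, 1, -1, 0, 1, -1])
```

Under these assumptions, each URL yields one small image that could then be passed to a CNN classifier with three output classes.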

Description

This article is published in (IJACSA) International Journal of Advanced Computer Science and Applications, under a Creative Commons Attribution License (CC-BY): http://creativecommons.org/licenses/by/4.0/.

Publisher

The Science and Information Organization

Date of publication

2023

Language

English

Persistent identifier

http://hdl.handle.net/10950/4224

Document Type

Article
