PhishScanner: A Web-Based Tool for Detection of Phishing Websites Using Machine Learning
DOI: https://doi.org/10.17010/ijcs/2024/v9/i1/173693
Keywords: GBC, KNN, Logistic Regression, Naive Bayes, Random Forest, Support Vector Machine (SVM).
Paper Submission Date: December 25, 2023; Paper sent back for Revision: January 7, 2024; Paper Acceptance Date: January 10; Paper Published Online: February 5, 2024.
Abstract
Phishing is a common attack that tricks individuals into revealing personal information through fake websites. The main goal is to steal sensitive data, including social media logins, bank account numbers, passwords, and usernames. The attacker sets up a website that is visually and semantically similar to the legitimate one. Over the years there have been many phishing attacks, and many victims have lost large sums of money. Attackers frequently use fake emails or websites to trick victims into clicking malicious links or downloading malicious attachments, which can give the attacker access to the victim’s login details and account information. Phishing attacks have grown more sophisticated, making it harder for a typical user to tell whether a link in an email or message is authentic. Several methods exist for detecting such websites, including the blacklist approach and the content-based approach. Detection through traditional methods such as blacklists can be unreliable because registering new domains has become easier and maintaining a comprehensive, current database is difficult. A web application runs on a web server and lets an internet user interact with it through a graphical user interface in a web browser. Therefore, we developed a web application whose primary objective is to detect phishing URLs using machine learning models that analyze features extracted from the website. After reviewing various classification algorithms, we adopted the Random Forest algorithm because of its high accuracy of 0.965. Random Forest is a well-known machine learning algorithm that makes predictions using an ensemble of decision trees, and it has proven effective in a wide range of classification tasks.
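To make the idea of "features extracted from the website" concrete, the following is a minimal sketch of lexical feature extraction from a URL, using only the Python standard library. The function name `extract_url_features` and the specific features chosen here are assumptions for illustration; the abstract does not list the feature set actually used by PhishScanner.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few lexical URL features (illustrative, not the paper's feature set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is often treated as suspicious.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "uses_https": parsed.scheme == "https",
        "host_is_ip": host_is_ip,
        # Browsers ignore everything before "@" in the authority part,
        # so "trusted.com@evil.example" actually points at evil.example.
        "has_at_symbol": "@" in url,
        "hyphens_in_host": host.count("-"),
        # Number of labels beyond "name.tld"; not meaningful for IP hosts.
        "subdomain_depth": 0 if host_is_ip else max(host.count(".") - 1, 0),
    }


# Hypothetical example URL, not taken from the paper's dataset.
features = extract_url_features("http://192.168.10.5/secure-login/update.php")
```

A dictionary like this could be converted to a numeric vector and passed to a classifier such as scikit-learn's `RandomForestClassifier`; that step is omitted here because the paper's training data and hyperparameters are not given in the abstract.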