A Comparitive Study of E-Mail Spam Detection using Various Machine Learning Techniques
Due to the rise in the use of messaging and mailing services, spam detection tasks are of much greater importance than before. In such a set of communications, efficient classification is a comparatively onerous job. For an addressee or any email that the user does not want to have in his inbox, spam can be defined as redundant or trash email. After pre-processing and feature extraction, various machine learning algorithms were applied to a Spam base dataset from the UCI Machine Learning repository in order to classify incoming emails into two categories: spam and non-spam. The outcomes of various algorithms have been compared. This paper used random forest, naive bayes, support vector machine (SVM), logistic regression, and the k nearest (KNN) machine learning algorithm to successfully classify email spam messages. The main goal of this study is to improve the prediction accuracy of spam email filters.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.