EMAIL SPAM DETECTION USING NAÏVE BAYES ALGORITHM

Ankit Pradhan
2017
BSc.CSIT
Semester 7

Electronic mail has become a convenient and inexpensive means of communication regardless of distance. However, the increasing volume of unsolicited email is dramatically reducing productivity, so reliable anti-spam filters are needed to separate such messages from legitimate ones. This project uses Naïve Bayes classification, which has been suggested as an effective engine for picking out spam and separating it from ham (legitimate mail). The algorithm was trained on the Enron dataset, a well-known spam/legitimate email dataset. I have developed the filter as a Web extension that takes the emails a user uploads or receives and returns the predicted probability that a given mail is spam, and therefore the degree to which it is spam. The resulting prediction of spam or ham reflects the training dataset. Experimental results were collected on the Enron dataset, consisting of 52,076 mails in total, including both spam and ham. The accuracy obtained with the Naïve Bayes classifier is 98.32%.
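
The classification step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the abstract does not say which Naïve Bayes variant or features were used, so the sketch assumes a multinomial model over word counts with Laplace smoothing. The names `tokenize` and `NaiveBayesSpamFilter` are hypothetical, and the four training messages are invented toy data, not taken from the Enron dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed feature extraction)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with Laplace smoothing (an assumed variant)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, messages, labels):
        """Count word occurrences per class and messages per class."""
        for text, label in zip(messages, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def _log_joint(self, tokens, label):
        """Return log P(label) + sum of log P(token | label)."""
        total_docs = sum(self.doc_counts.values())
        log_p = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are skipped
                count = self.word_counts[label][tok]
                log_p += math.log((count + self.alpha) / denom)
        return log_p

    def spam_probability(self, text):
        """Return the posterior probability that the message is spam."""
        tokens = tokenize(text)
        log_spam = self._log_joint(tokens, "spam")
        log_ham = self._log_joint(tokens, "ham")
        # Subtract the maximum before exponentiating to avoid underflow.
        m = max(log_spam, log_ham)
        p_spam = math.exp(log_spam - m)
        p_ham = math.exp(log_ham - m)
        return p_spam / (p_spam + p_ham)

    def predict(self, text):
        """Label the message as spam when its spam probability is at least 0.5."""
        return "spam" if self.spam_probability(text) >= 0.5 else "ham"


# Invented toy training data, for illustration only.
train_messages = [
    "win free money now",
    "claim your free prize today",
    "meeting agenda for monday",
    "please review the attached report",
]
train_labels = ["spam", "spam", "ham", "ham"]

spam_filter = NaiveBayesSpamFilter().fit(train_messages, train_labels)
print(spam_filter.predict("free money prize"))
print(spam_filter.predict("monday meeting report"))
```

Working in log space keeps the product of many small word probabilities from underflowing on long emails, and the smoothing constant keeps a word seen in only one class from forcing the other class's probability to zero.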

E-mail Classification
Feature Extraction
Naïve Bayes Classifier
Enron Dataset
ham
spam
not spam
Weighted Average
Pearson Coefficient
Prediction
Recommendation
