EMAIL SPAM DETECTION USING NAÏVE BAYES ALGORITHM

Ankit Pradhan

2017

BSc.CSIT

Semester 7


Electronic mail has become a convenient and inexpensive means of communication regardless of distance. However, the increasing volume of unsolicited email is dramatically reducing productivity, so there is a need for reliable anti-spam filters that separate such messages from legitimate ones. This project uses Naïve Bayesian classification, which has been suggested to be an effective engine for picking out spam mails and separating them from hams. The algorithm was trained using the Enron dataset, a well-known spam/legitimate email dataset. I have developed this filter as a Web extension that takes the emails a user uploads or receives and returns the predicted probability that a given mail is spam or ham, that is, the degree to which it is spam. The prediction reflects the training data set. Experimental results were collected using the Enron dataset, consisting of a total of 52076 mails including both spam and ham. The accuracy obtained using the Naïve Bayes classifier is 98.32%.
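To illustrate how a Naïve Bayes filter can turn word counts into a spam probability, here is a minimal sketch in Python. It is not the project's actual code: the whitespace tokenizer, the Laplace (add-one) smoothing, the function names, and the four toy messages are all assumptions made for illustration, and the toy data stands in for the Enron dataset.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def train(messages):
    # messages: list of (text, label) pairs, label is "spam" or "ham".
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def spam_probability(text, model):
    # Returns P(spam | text) under the naive independence assumption.
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Log prior: fraction of training mails with this label.
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Laplace smoothing keeps unseen word/label pairs above zero.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalize in log space to avoid underflow on long mails.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

# Hypothetical toy training set, for illustration only.
toy_mails = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
]
model = train(toy_mails)
p_spam = spam_probability("free money", model)
p_ham_like = spam_probability("meeting report", model)
```

With this toy data, `p_spam` comes out above 0.5 and `p_ham_like` below 0.5, so a threshold of 0.5 would label the first message spam and the second ham. The returned value can also be read directly as the degree to which a mail is spam.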

E-mail Classification

Feature Extraction

Naïve Bayes Classifier

Enron Dataset

ham

spam

not spam

Weighted Average

Pearson Coefficient

Prediction

Recommendation
