LEXICAL BASED MALICIOUS URL DETECTION USING NEURAL NETWORK

Bipin Paudel
2017
BSc.CSIT
Semester 7

With the advancement of the Internet and technology, the number of users connected to the Internet through some kind of electronic device has grown exponentially. The Internet has also been used for various illicit activities, which has led to a growth in criminal activity such as phishing, dissemination of malware, financial fraud, spam-advertised e-commerce, and more. Because the URL is the entry point for all of these activities, it can be analyzed to classify URLs as benign or malicious. In the past, URL classification has been carried out using lexical features, network traffic, hosting information, the content of the URL, and other strategies. Except for lexical features, these techniques require time-intensive lookups, which introduce delay into the classification process and therefore cannot be used in a real-time system. This paper describes the classification of URLs based on their lexical features, a lightweight approach to detecting malicious URLs. Analysis is carried out to determine the effectiveness of a system that detects malicious URLs by lexical analysis, which in turn can be used in a real-time system. URL classification was carried out using a neural network on a dataset of 57570 URLs. Because feature extraction is the major task that determines the efficiency of the system, different approaches were used to extract features. Finally, the results obtained with the different feature extraction approaches were compared. An accuracy of 92.7% was obtained on test data that was part of the total dataset.
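To illustrate what "lexical features" can mean in practice, the following is a minimal Python sketch that derives a few numeric features from the URL string alone, with no network lookups. The function name `lexical_features` and the particular features chosen (lengths, character counts, an IP-address check) are assumptions made for illustration; the abstract does not list the features or extraction approaches actually used in the paper, and the neural network itself is not shown.

```python
import ipaddress
from urllib.parse import urlsplit


def _is_ip_address(host: str) -> bool:
    """Return True if the host string is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def lexical_features(url: str) -> dict:
    """Compute illustrative lexical features from the URL text only.

    The feature set here is hypothetical, not the one from the paper.
    """
    # urlsplit only finds the host when a scheme is present, so add one
    # for parsing if it is missing; counts below use the original string.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "path_length": len(parts.path),
        "num_dots": url.count("."),
        "num_hyphens": url.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "host_is_ip": int(_is_ip_address(host)),
    }


if __name__ == "__main__":
    for sample in ("http://example.com/login", "http://192.168.0.1/a-b"):
        print(sample, lexical_features(sample))
```

Each URL is reduced to a fixed-length vector of numbers, which is the kind of input a neural network classifier could be trained on. Because every value comes from the string itself, extraction stays cheap enough for real-time use.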