Bonfring International Journal of Data Mining

Impact Factor: 0.245 | Calculated by International Scientific Indexing (ISI) based on the International Citation Report (ICR)


Improving Efficiency of Apriori Algorithms for Sequential Pattern Mining

Alpa Reshamwala and Dr. Sunita Mahajan


Abstract:

Computer systems are exposed to a growing number and variety of security threats as the Internet has expanded in recent years, and detecting network intrusions effectively has become an important security technique. Many intrusions are not composed of single events but of a series of attack steps taken in chronological order: an intrusion is a multi-step process in which a number of events must occur sequentially to launch a successful attack. Analyzing the order in which events occur can therefore improve attack detection accuracy and reduce false alarms. Intrusion detection using sequential pattern mining is a research topic in the field of information security. Sequential pattern mining is used to discover frequent sequential patterns in an event dataset. Sequential pattern mining algorithms can be broadly classified into Apriori-based algorithms, pattern-growth-based algorithms, and combinations of both. The first class relies on the Apriori property, and the second uses a pattern-growth approach. The major drawback of Apriori-based algorithms is that they require multiple scans of the database to generate maximal patterns. This paper presents a simulation study of two algorithms, the original AprioriALL algorithm and a modified AprioriALL algorithm that optimizes processing by including set theory techniques, on a network intrusion dataset from KDD Cup 1999. Experimental results show that the modified algorithm shrinks the dataset and scans the database at most twice. Because the interestingness of the itemsets increases as the dataset shrinks, the algorithm yields efficient sequences with high associativity. As the database is reduced, the time taken to mine sequences also falls, making the modified algorithm faster than the original Apriori-based algorithm.
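To make the Apriori-based idea in the abstract concrete, the following is a minimal sketch of level-wise frequent sequence mining over ordered event lists. It is not the authors' modified AprioriALL algorithm and does not reproduce their set theory optimization. The function names, the `sessions` data, the event labels, and the `min_support` threshold are all hypothetical and chosen only for illustration. Each level recounts support over every sequence, which is the repeated-scan cost the abstract identifies as the main drawback of this family.

```python
def is_subsequence(pattern, sequence):
    """Return True if the events of `pattern` occur in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `event in remaining` advances the iterator, so order is enforced.
    return all(event in remaining for event in pattern)


def support(pattern, sequences):
    """Count how many sequences contain `pattern` as a subsequence (one database scan)."""
    return sum(is_subsequence(pattern, seq) for seq in sequences)


def frequent_sequences(sequences, min_support):
    """Level-wise (Apriori-style) mining of frequent event sequences.

    Assumption: each sequence is a list of single events, not itemsets.
    """
    events = sorted({event for seq in sequences for event in seq})
    current = [(e,) for e in events if support((e,), sequences) >= min_support]
    frequent_events = [p[0] for p in current]
    result = {}
    while current:
        for pattern in current:
            result[pattern] = support(pattern, sequences)
        # Candidate generation: extend each frequent k-pattern by one frequent event.
        candidates = [p + (e,) for p in current for e in frequent_events]
        # Apriori pruning: a candidate can only be frequent if its suffix is frequent.
        candidates = [c for c in candidates if c[1:] in result]
        current = [c for c in candidates if support(c, sequences) >= min_support]
    return result


# Hypothetical event sessions; the labels are illustrative, not KDD Cup 1999 fields.
sessions = [
    ["probe", "login_fail", "login_fail", "root_shell"],
    ["probe", "login_fail", "root_shell"],
    ["http", "probe", "root_shell"],
    ["http", "ftp"],
]
patterns = frequent_sequences(sessions, min_support=3)
for pattern, count in sorted(patterns.items()):
    print(pattern, count)
```

In this toy data the only frequent multi-step pattern at a threshold of 3 is `probe` followed by `root_shell`, which illustrates why the order of events, rather than their mere presence, carries the detection signal.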

Keywords: Data mining, Sets, Sequence data, Time series, Intrusion detection system, DoS attacks

Volume: 4 | Issue: 1

Pages: 01-06

Issue Date: March 2014

DOI: 10.9756/BIJDM.4774



This Journal is an Open Access Journal to Facilitate the Research Community