A Frequent Web Access Path Mining Algorithm Based on the Apriori Algorithm
Frequent web access paths are often used to optimize the structure and page organization of a website. The Apriori algorithm is a well-known algorithm for mining frequent itemsets. In this paper, we adapt the Apriori algorithm to mine frequent web access paths and improve its performance. In the Apriori algorithm, when the support counts of the candidate 2-sequences in C2 are computed, the amount of scanning grows very quickly as the number of web page sequences increases. We therefore introduce an array that marks whether each web path in the database has been matched, and we compress the web transaction database by pruning it. This speeds up scanning throughout the whole process. Finally, an example is given to demonstrate the validity of our algorithm.
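The idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the details of the marking array or the pruning rule, so the sketch assumes that a path is marked as unmatched (and skipped in later scans) once it contains no candidate k-sequence, and that a frequent k-sequence is a contiguous run of k pages found in at least `min_support` paths. The names `mine_frequent_paths`, `contiguous_subsequences`, `matched`, and `min_support` are hypothetical.

```python
from collections import Counter


def contiguous_subsequences(path, k):
    """Return the distinct length-k contiguous page runs in one path."""
    return {tuple(path[i:i + k]) for i in range(len(path) - k + 1)}


def mine_frequent_paths(paths, min_support):
    """Apriori-style mining of frequent contiguous access paths (a sketch).

    paths: list of page-ID sequences (assumed to be maximal forward paths).
    min_support: minimum number of paths that must contain a sequence.
    Returns a dict mapping each frequent sequence (tuple) to its support.
    """
    # Assumption: matched[i] becomes False once path i contains no
    # candidate, so it cannot support any longer sequence either.
    matched = [True] * len(paths)
    frequent = {}

    # Level 1: count how many paths contain each single page.
    counts = Counter()
    for path in paths:
        counts.update(contiguous_subsequences(path, 1))
    level = {seq: c for seq, c in counts.items() if c >= min_support}

    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join step: extend a by the last page of b when a's suffix
        # equals b's prefix (for k = 2 this pairs every frequent page).
        candidates = {a + b[-1:] for a in level for b in level
                      if a[1:] == b[:-1]}
        counts = Counter()
        for i, path in enumerate(paths):
            if not matched[i]:
                continue  # pruned: skipped in this and all later scans
            hits = contiguous_subsequences(path, k) & candidates
            if hits:
                counts.update(hits)
            else:
                matched[i] = False  # compress the database
        level = {seq: c for seq, c in counts.items() if c >= min_support}
    return frequent


example_paths = [
    ["A", "B", "C", "D"],
    ["A", "B", "C"],
    ["B", "C", "D"],
    ["A", "E"],
]
result = mine_frequent_paths(example_paths, min_support=2)
print(result[("B", "C")])  # 3
```

In this toy run the path A, E is dropped during the 2-sequence scan because E is not a frequent page, so the later scans touch fewer paths. Skipping such paths is safe under the stated assumptions because every candidate (k+1)-sequence contains a frequent k-sequence as its prefix.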
Frequent web access path; Apriori algorithm; Frequent k-sequence; Maximum forward path
Shuhua Gu, Yutao Ma
College of Computer, China University of Geosciences, Wuhan 430074, China; State Key Lab of Software Engineering, Wuhan University, Wuhan 430072, China (Member, ACM)
International conference
Wuhan
English
2007-09-21 (date first made available online on the Wanfang platform; not the paper's publication date)