An Algorithm for Mining Approximate Frequent Itemsets Over Data Streams
Mining frequent itemsets over data streams is much more difficult than over traditional data models, because a data stream has the following characteristics: an unbounded volume of data, a rapid arrival rate of records, and an uncontrollable arrival order of records. A novel algorithm based on Lossy Counting is devised to mine frequent itemsets. A logarithmic tilted time window with an attenuation coefficient is adopted to emphasize the importance of new data. A multilayer count queue model is designed both to avoid counter overflow and to answer top-K itemset queries quickly using an index table.
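For orientation, the classic single-item Lossy Counting procedure, which the abstract names as the basis of the proposed algorithm, can be sketched in Python as follows. This is a minimal illustration of the baseline technique only, not the authors' algorithm: it counts single items rather than itemsets and omits the logarithmic tilted time window, the attenuation coefficient, and the multilayer count queue. The names `lossy_count`, `frequent_items`, `epsilon`, and `support` are assumptions introduced here for illustration.

```python
import math


def lossy_count(stream, epsilon):
    """Classic Lossy Counting over single items (illustrative sketch).

    Returns (counts, n), where counts maps item -> [f, delta]:
    f is the count observed since the item was (re)inserted, and delta
    is the maximum number of occurrences that may have been missed.
    """
    width = math.ceil(1 / epsilon)  # bucket width
    counts = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            counts[item] = [1, bucket - 1]
        # At each bucket boundary, prune entries that cannot be frequent.
        if n % width == 0:
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
            for key in stale:
                del counts[key]
    return counts, n


def frequent_items(counts, n, support, epsilon):
    """Items whose stored count is at least (support - epsilon) * n."""
    threshold = (support - epsilon) * n
    return {k for k, (f, _) in counts.items() if f >= threshold}
```

The stored count of an item undercounts its true frequency by at most `epsilon * n`, which is why the query threshold is lowered by `epsilon`. Smaller values of `epsilon` tighten this error bound at the cost of keeping more counters in memory.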
data stream; frequent itemsets; logarithmic tilted time window
Na Su, Zhehui Wu
Department of Information Engineering, ShanDong University of Science and Technology, Taian, ShanDong; College of Information Science and Engineering, ShanDong University of Science and Technology, Qingdao
International conference
Xi'an
English
1444-1447
2011-12-23 (date first made available online on the Wanfang platform; this is not the paper's publication date)