Frequent Items Mining on Data Stream Using Hash-Table and Heap

Abstract:

Most existing algorithms for mining frequent items on data streams do not emphasize the importance of recent data items. We present an algorithm that detects items whose frequency counts exceed a user-specified threshold. The algorithm uses a hash table L and a heap to record potentially frequent items. It detects ε-approximate frequent items on a data stream using O(|L| + ε⁻¹) memory space, with a processing time of O(log ε⁻¹) per data item. Experimental results on several artificial and real datasets show that our algorithm has higher precision, requires less memory, and consumes less computation time than other similar methods.
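
This record does not describe the paper's algorithm in detail, so the following is only a minimal sketch of the general idea, under stated assumptions: a Space-Saving-style counter (a swapped-in, well-known technique, not necessarily the authors' method) that keeps at most ⌈1/ε⌉ items in a hash table, uses a min-heap to find the weakest item to evict, and applies exponential time fading so that recent items weigh more. The class name `FadingFrequentItems`, the `decay` parameter, and the rescaling constant are hypothetical.

```python
import heapq
import math


class FadingFrequentItems:
    """Space-Saving-style counter with exponential time fading (illustrative only)."""

    def __init__(self, epsilon, decay=0.99):
        self.capacity = math.ceil(1 / epsilon)  # track at most 1/epsilon items
        self.decay = decay    # assumed fading factor per arrival, 0 < decay <= 1
        self.scale = 1.0      # weight of the newest arrival; grows as decay**-t
        self.total = 0.0      # scaled sum of all arrival weights
        self.table = {}       # hash table: item -> scaled count
        self.heap = []        # min-heap of (scaled count, item); may hold stale pairs

    def _rebuild_heap(self):
        self.heap = [(c, i) for i, c in self.table.items()]
        heapq.heapify(self.heap)

    def _pop_min(self):
        # Skip stale heap entries until one agrees with the hash table.
        while True:
            count, item = heapq.heappop(self.heap)
            if self.table.get(item) == count:
                return count, item

    def add(self, item):
        self.scale /= self.decay  # newer arrivals weigh more than older ones
        if self.scale > 1e100:    # renormalize so the scaled counts never overflow
            self.table = {i: c / self.scale for i, c in self.table.items()}
            self.total /= self.scale
            self.scale = 1.0
            self._rebuild_heap()
        self.total += self.scale
        if item in self.table:
            count = self.table[item] + self.scale
        elif len(self.table) < self.capacity:
            count = self.scale
        else:
            smallest, victim = self._pop_min()  # evict the weakest tracked item
            del self.table[victim]
            count = smallest + self.scale       # inherit its count (an overestimate)
        self.table[item] = count
        heapq.heappush(self.heap, (count, item))
        if len(self.heap) > 4 * self.capacity:  # drop stale pairs to bound memory
            self._rebuild_heap()

    def frequent(self, threshold):
        # Items whose faded count reaches threshold * faded stream length.
        return {i for i, c in self.table.items() if c >= threshold * self.total}


def run(stream, epsilon, decay, threshold):
    counter = FadingFrequentItems(epsilon, decay)
    for x in stream:
        counter.add(x)
    return counter.frequent(threshold)
```

For example, `run(["x"] * 5 + ["y"] * 5, 0.1, 0.5, 0.5)` reports only `"y"`, because the five older `"x"` arrivals have faded, whereas with `decay=1.0` (no fading) both items reach half of the stream. Storing counts in a common scaled unit keeps the heap order valid as time passes, since fading multiplies every count by the same factor.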

Keywords: data mining; data stream; frequent items; time fading model; hash table; heap

Authors: Zhang Shan, Chen Ling, Tu Li

Affiliations: Department of Computer Science, Yangzhou University, Yangzhou, China; Institute of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, N

Conference type: International conference

Conference name: 2009 IEEE International Conference on Intelligent Computing and Intelligent Systems

Conference location: Shanghai

Conference language: English

Pages: 141-145

Online publication date: 2009-11-20 (date first posted on the Wanfang platform; not the paper's publication date)
