Conference Topic

Distributed Frequent Items Detection on Uncertain Data

Frequent items detection is a valuable technique in many applications, such as network monitoring, network intrusion detection, and worm virus detection. The technique has been well studied on deterministic databases. However, it is a new task on emerging uncertain databases, especially in distributed environments. This paper proposes a new definition of frequent items on uncertain data. Based on this definition, a polynomial-time algorithm is proposed that efficiently answers queries in a centralized environment. Furthermore, this work designs communication-efficient algorithms for retrieving the top-k items with the largest probability from distributed sites. In each round of transmission, the algorithms compute an upper bound and filter out as much data as possible that has no chance of influencing the query result. Extensive experiments show that the algorithms process queries correctly and reduce communication cost effectively on various data sets.

distributed query processing; frequent item; uncertain data; top-k query

Shuang Wang; Guoren Wang; Jitong Chen

Software College, Northeastern University, Shenyang 110819, China; College of Information Science and Engineering, Northeastern University, Shenyang 110819, China; Software College, Northeastern University, Shenyang 110819, China

International conference

6th International Conference on Advanced Data Mining and Applications (ADMA 2010)

Chongqing

English

509-520

2010-11-19 (date first made available online on the Wanfang platform; not the paper's publication date)