Conference Topic

EMD-DSJoin: Efficient Similarity Join Over Probabilistic Data Streams Based on Earth Mover's Distance

  Similarity joins on probabilistic data play a vital role in many practical applications, such as sensor reading monitoring and object tracking based on multiple video sources. The Earth Mover's Distance (EMD), proposed in computer vision, is more effective at returning similar probabilistic data because it is more consistent with human perception of similarity. However, the cubic time complexity of EMD hampers its wide application, especially in the analysis of fast incoming data streams. In this paper we make, to the best of our knowledge, the first attempt to address the EMD similarity join over data streams under sliding window semantics. We first design an efficient and effective index framework, named B+ Forests Index, which facilitates data pruning and offers a suitable strategy for dealing with out-of-order data. We then propose an EMD similarity join algorithm, named EMD-DSJoin, based on the proposed index framework. We perform extensive experiments on real-world datasets and verify the effectiveness and efficiency of our proposal.

EMD; Data stream; Sliding window; Similarity join; EMD-DSJoin

Jia Xu, Jiazhen Zhang, Chao Song, Qianzhen Zhang, Pin Lv, Taoshen Li, Ningjiang Chen

School of Computer, Electronics and Information in Guangxi University, Guangxi, China; Guangxi Colleges

International conference

18th International Asia-Pacific Web Conference

Suzhou

English

42-54

2016-09-23 (date first made available online on the Wanfang platform; not the paper's publication date)