Collecting Valuable Information from Fast Text Streams

Abstract:

Collecting valuable information from fast text streams has become a challenging task. In this work, we propose a method that gathers useful information effectively and efficiently. First, we maintain an analyzer based on the Trie structure and a dynamic N-Gram tokenizer; second, unlike the traditional search-engine principle, we treat documents as queries by building indexes over the whole query base. Experimental results show that, compared with conventional methods, the method adapts well, has low latency, and provides high-quality support for complex query combinations.
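The idea of indexing the query base and running each incoming document against it can be illustrated with a minimal sketch. This is not the authors' implementation: the class `QueryIndex`, its method names, the restriction to AND-only queries, and the plain substring scan that stands in for the dynamic N-Gram tokenizer are all assumptions made for illustration.

```python
from collections import defaultdict


class QueryIndex:
    """Hypothetical index over standing queries; documents are matched against it."""

    def __init__(self):
        self.trie = {}                            # nested dicts, one level per character
        self.term_to_queries = defaultdict(set)   # term -> ids of queries that use it
        self.query_terms = {}                     # query id -> set of required terms

    def add_query(self, query_id, terms):
        """Register a conjunctive (AND) query over the given terms."""
        self.query_terms[query_id] = set(terms)
        for term in terms:
            node = self.trie
            for ch in term:
                node = node.setdefault(ch, {})
            node["$end"] = term                   # marks the end of a complete term
            self.term_to_queries[term].add(query_id)

    def terms_in(self, text):
        """Return every indexed term occurring in text.

        Assumption: the trie is walked from each character position,
        which yields variable-length matches without a fixed N.
        """
        found = set()
        for start in range(len(text)):
            node = self.trie
            for ch in text[start:]:
                node = node.get(ch)
                if node is None:
                    break
                if "$end" in node:
                    found.add(node["$end"])
        return found

    def match(self, text):
        """Return the sorted ids of queries whose terms all occur in text."""
        found = self.terms_in(text)
        candidates = set()
        for term in found:
            candidates |= self.term_to_queries[term]
        return sorted(q for q in candidates if self.query_terms[q] <= found)


index = QueryIndex()
index.add_query("q1", ["stream", "trie"])
index.add_query("q2", ["n-gram"])
index.add_query("q3", ["stream", "latency"])
matches = index.match("a trie over a fast text stream")
```

In this sketch each document costs one pass over its text plus a subset check for the candidate queries only, so the work per document depends on the terms it contains rather than on the size of the query base.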

Keywords: Fast Text Stream; Information Collection; Trie; N-Gram

Authors: Baoyuan Qi, Gang Ma, Zhongzhi Shi, Wei Wang

Affiliations: Key Lab of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing 100190; Key Lab of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing 100190; Beijing Lexo Technologies Co., Ltd., Beijing 100080, China

Conference type: International conference

Conference name: 8th International Conference on Intelligent Information Processing (2014 IFIP International Conference on Intelligent Information Processing)

Conference location: Hangzhou

Conference language: English

Pages: 96-105

Online publication date: 2014-10-01 (date first posted on the Wanfang platform; not the paper's publication date)
