CLUSTERING WEB SEARCH RESULTS USING SEMANTIC INFORMATION

Abstract:

Clustering web search results helps users find relevant information quickly. The suffix tree clustering (STC) algorithm is well suited to clustering web documents. This paper proposes an improved web search results clustering algorithm based on STC. It uses latent semantic indexing to help find common, descriptive, and meaningful topic phrases for the final document clusters. Using semantic information to cluster web snippets makes search engine results easier to browse and helps users quickly find the web information they are interested in. Experimental evaluation shows that clustering web search results with the improved suffix tree algorithm achieves better cluster label quality and snippet assignment precision.

Keywords: Latent semantic indexing; Singular value decomposition; Suffix tree clustering

Authors: HAN WEN, GUO-SHUN HUANG, ZHAO LI

Affiliations: School of Science, Foshan University, Foshan 528000, China; School of Computer Science and Engineering, South China University of Technology, Guangzhou 510641,

Conference type: International conference

Conference name: 2009 International Conference on Machine Learning and Cybernetics

Conference location: Baoding

Conference language: English

Pages: 1504-1509

Online publication date: 2009-07-12 (date first made available on the Wanfang platform; not the paper's publication date)
