A Topic Detection and Tracking method combining NLP with Suffix Tree Clustering
A topic detection and tracking method that combines semantic analysis with the Suffix Tree Clustering (STC) algorithm is presented. NLP-based feature selection is introduced to select nouns, verbs, and named entities as the input to STC. To address topic drift, the vector space model (VSM) of each cluster is formed from the key words extracted from the nodes of the suffix tree by a mutual information algorithm. After cluster similarity computation and topic detection and tracking, semantic analysis is applied to filter out words with the same meaning and to analyze the semantic structure of the words in each cluster label. Finally, a content-relevant description is generated for each topic. Experiments showed that the method can detect and track topics in news articles effectively.
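The cluster VSM and similarity steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact mutual information formula, the weighting scheme, or the similarity measure, so the document-frequency form of pointwise mutual information, the cosine similarity, the function names `pmi_keywords` and `cosine`, the `top_k` parameter, and the toy clusters are all assumptions made for illustration. Suffix tree construction and the semantic filtering step are omitted.

```python
import math
from collections import Counter


def pmi_keywords(cluster_docs, all_docs, top_k=5):
    """Weight each term of a cluster by pointwise mutual information (assumed form).

    cluster_docs must be a subset of all_docs; each document is a list of tokens.
    Returns a sparse vector {term: PMI weight} holding the top_k terms.
    """
    n_all = len(all_docs)
    p_c = len(cluster_docs) / n_all
    df_all = Counter(t for doc in all_docs for t in set(doc))
    df_cluster = Counter(t for doc in cluster_docs for t in set(doc))
    scores = {}
    for term, n_tc in df_cluster.items():
        # PMI(t, c) = log( P(t, c) / (P(t) * P(c)) ), estimated from document counts
        p_tc = n_tc / n_all
        p_t = df_all[term] / n_all
        scores[term] = math.log(p_tc / (p_t * p_c))
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_k])


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy clusters (hypothetical data): two about the same event, one unrelated.
cluster_a = [["earthquake", "rescue", "team"], ["earthquake", "rescue", "aid"]]
cluster_b = [["earthquake", "aid", "donation"], ["earthquake", "donation", "rescue"]]
cluster_c = [["market", "stock", "rise"], ["market", "bank", "rate"]]
corpus = cluster_a + cluster_b + cluster_c

vec_a = pmi_keywords(cluster_a, corpus)
vec_b = pmi_keywords(cluster_b, corpus)
vec_c = pmi_keywords(cluster_c, corpus)

# A later cluster whose similarity to an earlier one exceeds some chosen
# threshold could be treated as a continuation of the same topic.
print("sim(a, b) =", cosine(vec_a, vec_b))
print("sim(a, c) =", cosine(vec_a, vec_c))
```

In this sketch, clusters that share key words receive a positive similarity while clusters with disjoint key words receive zero; the threshold for linking clusters into one tracked topic would have to be tuned on real data.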
cluster; STC; semantic analysis; topic detection and tracking; mutual information
Yaohong JIN
Institute of Chinese Information Processing, Beijing Normal University, Beijing, P. R. China
International conference
Hangzhou
English
227-230
2012-03-23 (date first posted online on the Wanfang platform; not the paper's publication date)