Conference Topic

A Weakly Supervised Optimize Method in Latent Semantic Indexing

Latent Semantic Indexing (LSI) is an effective feature extraction method that has been applied to many text learning tasks, such as text clustering and information retrieval. This paper analyzes in detail how term co-occurrences influence the LSI mapping and proposes a method named pseudo document, which strengthens beneficial term co-occurrences by adding heuristic knowledge to the text collection so that the LSI mapping becomes more reasonable. Experimental results show that the pseudo document method can effectively improve the performance of patent retrieval.
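The idea of strengthening term co-occurrences with pseudo documents can be illustrated with a minimal sketch. It is not the authors' implementation: the toy collection, the pseudo documents, and the helper name `cooccurrence` are all hypothetical, and the sketch only counts document-level co-occurrences rather than computing the LSI mapping itself. It relies on one known fact: the LSI term mapping is derived from the singular value decomposition of the term-document matrix, so it depends on which terms appear together in documents.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count, for each unordered term pair, how many documents contain both terms."""
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.split()))  # unique terms, sorted so pairs have a fixed order
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


# Toy collection (hypothetical, not taken from the paper).
docs = [
    "engine turbine blade",
    "motor rotor blade",
    "engine fuel pump",
]

# Hypothetical pseudo documents encoding the heuristic that
# "engine" and "motor" are related terms.
pseudo_docs = ["engine motor", "engine motor"]

before = cooccurrence(docs)
after = cooccurrence(docs + pseudo_docs)

print("engine/motor before:", before[("engine", "motor")])
print("engine/motor after: ", after[("engine", "motor")])
```

In this toy setting "engine" and "motor" never share a document until the pseudo documents are appended, while pairs not mentioned in the pseudo documents keep their original counts. How the paper selects the heuristic knowledge and weights the pseudo documents is not stated in this record.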

Latent Semantic Indexing; Term Co-occurrence; Pseudo Document; Patent Retrieval

Duo JI, Dongbo GUO, Dongfeng CAI, Yu BAI

Knowledge Engineering Research Center, Shenyang Institute of Aeronautical Engineering, Shenyang, China

International conference

International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE 2009)

Dalian

English

1-7

2009-09-24 (date first made available online on the Wanfang platform; not the paper's publication date)