A Weakly Supervised Optimize Method in Latent Semantic Indexing
Latent Semantic Indexing (LSI) is an effective feature extraction method that has been applied to many text learning tasks, such as text clustering and information retrieval. This paper thoroughly analyses the influence of term co-occurrences on the LSI mapping and proposes a method, called pseudo document, that strengthens beneficial term co-occurrences by adding heuristic knowledge to the text collection, making the LSI mapping more reasonable. Experimental results show that the pseudo document method can effectively improve the performance of patent retrieval.
Latent Semantic Indexing; Term Co-occurrence; Pseudo Document; Patent Retrieval
Duo JI, Dongbo GUO, Dongfeng CAI, Yu BAI
Knowledge Engineering Research Center, Shenyang Institute of Aeronautical Engineering, Shenyang, China
International conference
Dalian
English
1-7
2009-09-24 (date first posted online on the Wanfang platform; does not represent the paper's publication date)