Conference Topic

Chinese Query Correction Based on the Combination of Statistics and Characteristics

In a retrieval system, query correction is an important aid to query efficiency. In this paper, a candidate set is generated for each term of the query string according to the characteristics of the Chinese language. The candidate sets are then cross-combined to form a grid of candidates. A candidate ranking model is built from features that combine the n-gram statistical model with phonetic similarity, query term hits, and n-gram similarity; the features are balanced against one another, and the candidates are sorted to obtain the optimal correction. Experiments show that this query correction model, based on the combination of statistics and characteristics, achieves higher correction accuracy and recall.
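The pipeline summarized in the abstract (per-term candidate sets, cross combination into a grid, feature-based ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Dice-coefficient form of n-gram similarity, the toy hit scores, and the weights are all assumptions made for illustration. The phonetic-similarity and n-gram language-model features are left out because they would need pinyin and corpus data that the record does not provide.

```python
from itertools import product


def char_ngrams(text, n=2):
    """Return the set of character n-grams (the whole string if shorter than n)."""
    if len(text) < n:
        return {text}
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a, b, n=2):
    """Dice coefficient over character n-grams (an assumed similarity measure)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def build_grid(candidate_sets):
    """Cross-combine the per-term candidate sets into full candidate queries."""
    return ["".join(combo) for combo in product(*candidate_sets)]


def rank_candidates(query, candidate_sets, features, weights):
    """Sort candidate queries by a weighted sum of feature scores, best first."""
    def score(candidate):
        return sum(weights[name] * fn(query, candidate)
                   for name, fn in features.items())
    return sorted(build_grid(candidate_sets), key=score, reverse=True)


# Hypothetical normalized hit scores standing in for a real query log.
TOY_HITS = {"北京大学": 1.0, "北京大雪": 0.1}

features = {
    "ngram_sim": ngram_similarity,
    "hits": lambda query, candidate: TOY_HITS.get(candidate, 0.0),
}
weights = {"ngram_sim": 0.3, "hits": 0.7}  # assumed weights, not from the paper

ranked = rank_candidates("北京大雪", [["北京"], ["大雪", "大学"]], features, weights)
print(ranked)
```

In this toy setup the hit score outweighs surface similarity to the typed query, so the frequently searched candidate is ranked ahead of the query as typed. How the paper actually balances its features is not stated in the abstract.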

query correction; n-gram statistical model; phonetic similarity; n-gram similarity

SONG Ling, XU Bai, XIE Peng Yu, CHEN Li Fang

School of Computer and Electronics Information, Guangxi University, Nanning, China; School of Computer and Electronics Information, Guangxi University, Nanning, China; School of Computer and Electronics Information, Guangxi University, Nanning, China

International conference

The 13th IEEE Joint International Computer Science and Information Technology Conference (JICSIT 2011)

Chongqing

English

1050-1053

2011-08-20 (date first posted online on the Wanfang platform; not the paper's publication date)