Chinese Query Correction Based on the Combination of Statistics and Characteristics
In a retrieval system, query correction is an important aid to improving query efficiency. In this paper, a candidate set is generated for each term of the query string according to the characteristics of the Chinese language. The candidate sets are then cross-combined to form a lattice of candidate queries. A candidate ranking model is built from a feature set that combines an n-gram statistical model score, phonetic similarity, query term hits, and n-gram similarity. Once the model has been set up and balanced, the candidates are sorted to obtain the optimal correction. Experiments show that this query correction model, based on the combination of statistics and characteristics, achieves higher correction accuracy and recall.
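The pipeline outlined in the abstract (per-term candidate sets, cross combination into a lattice, feature-based ranking) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation. The candidate table, pinyin table, counts, query-log hits, and feature weights are all hypothetical toy values. The scoring is a plain weighted sum, since the abstract does not say how the features are combined.

```python
import math
from itertools import product

# Hypothetical toy resources; none of these values come from the paper.
CANDIDATES = {"北": ["北", "背"], "惊": ["惊", "京", "经"]}   # per-term candidate sets
PINYIN = {"北": "bei", "背": "bei", "惊": "jing", "京": "jing", "经": "jing"}
BIGRAM_COUNTS = {("北", "京"): 120, ("背", "景"): 40, ("北", "经"): 1}
UNIGRAM_COUNTS = {"北": 200, "背": 90, "惊": 30, "京": 150, "经": 80}
QUERY_HITS = {"北京": 500, "背景": 60}                      # assumed query-log hits
WEIGHTS = {"lm": 1.0, "phonetic": 1.0, "hits": 0.5, "ngram": 0.5}  # assumed weights
VOCAB_SIZE = len(UNIGRAM_COUNTS)


def lm_score(text):
    """Log-probability under an add-one smoothed character bigram model."""
    total = 0.0
    for prev, cur in zip(text, text[1:]):
        numerator = BIGRAM_COUNTS.get((prev, cur), 0) + 1
        denominator = UNIGRAM_COUNTS.get(prev, 0) + VOCAB_SIZE
        total += math.log(numerator / denominator)
    return total


def phonetic_similarity(original, candidate):
    """Fraction of positions whose pinyin matches the original query."""
    if not original:
        return 1.0
    same = sum(PINYIN.get(a, a) == PINYIN.get(b, b)
               for a, b in zip(original, candidate))
    return same / len(original)


def padded_bigrams(text):
    """Character bigrams of the text with boundary markers added."""
    padded = "^" + text + "$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def ngram_similarity(original, candidate):
    """Dice coefficient over padded character bigrams."""
    a, b = padded_bigrams(original), padded_bigrams(candidate)
    return 2 * len(a & b) / (len(a) + len(b))


def score(original, candidate):
    """Weighted sum of the four features named in the abstract."""
    return (WEIGHTS["lm"] * lm_score(candidate)
            + WEIGHTS["phonetic"] * phonetic_similarity(original, candidate)
            + WEIGHTS["hits"] * math.log(1 + QUERY_HITS.get(candidate, 0))
            + WEIGHTS["ngram"] * ngram_similarity(original, candidate))


def correct(query):
    """Cross-combine the per-term candidate sets and rank the lattice."""
    candidate_sets = [CANDIDATES.get(ch, [ch]) for ch in query]
    lattice = ["".join(combo) for combo in product(*candidate_sets)]
    return sorted(lattice, key=lambda c: score(query, c), reverse=True)


ranked = correct("北惊")
print(ranked[0])
```

With these toy values the mistyped query "北惊" yields a lattice of six candidates. In a real system the weights would be tuned on held-out data, and the lattice would be pruned because the cross product grows exponentially with query length.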
query correction; n-gram statistical model; phonetic similarity; n-gram similarity
SONG Ling XU Bai XIE Peng Yu CHEN Li Fang
School of Computer and Electronics Information, Guangxi University, Nanning, China; School of Computer and Electronics Information, Guangxi University, Nanning, China; School of Computer and Electronics Information, Guangxi University, Nanning, China
International conference
Chongqing
English
1050-1053
2011-08-20 (date first made available online on the Wanfang platform; not the publication date of the paper)