Research on Improved Algorithm for Chinese Word Segmentation Based on Markov Chain
Chinese word segmentation is an important technique for Chinese web data mining. After surveying existing Chinese word segmentation methods, this paper proposes an improved algorithm. The algorithm updates its dictionary using a Two-way Markov Chain and segments text with an improved Forward Maximum Matching method based on word-frequency statistics. Simulation shows that the algorithm segments a given text quickly and accurately.
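For orientation, the following is a minimal sketch of the plain Forward Maximum Matching baseline that the abstract refers to, not the paper's improved, frequency-based variant, whose details are not given in this record. The function name, the `max_word_len` parameter, and the toy dictionary are assumptions made for illustration only.

```python
def forward_maximum_match(text, dictionary, max_word_len=4):
    """Greedy left-to-right segmentation that prefers the longest dictionary word.

    `dictionary` is any container supporting `in` (a set here); `max_word_len`
    is an assumed cap on candidate length, not a value taken from the paper.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary matches.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, for illustration only.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
segments = forward_maximum_match("研究生命起源", toy_dictionary)
print(segments)
```

On this toy input the greedy rule picks 研究生 first and leaves 命 stranded, which is the kind of ambiguity that word-frequency information is typically used to resolve.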
Pang Baomao, Shi Haoshan
College of Electronic Information, Northwest Polytechnical University, Xian 710072, China The Air Forc College of Electronic Information, Northwest Polytechnical University, Xian 710072, China
International conference
The Fifth International Conference on Information Assurance and Security
Xi'an
English
236-238
2009-08-18 (date first posted online on the Wanfang platform; not the paper's publication date)