Research of an Improved Algorithm for Chinese Word Segmentation Dictionary Based on Double-Array Trie Tree
A Chinese word segmentation dictionary based on the Double-Array Trie Tree offers high search efficiency, but dynamic insertion consumes a lot of time. This paper presents an improved algorithm, iDAT, based on the Double-Array Trie Tree for Chinese word segmentation dictionaries. After initializing the original dictionary, we apply a hash process to the index values of the empty sequences in the base array. The resulting hash table stores, for each empty sequence, the sum of the empty sequences that precede it. The algorithm also adopts the jump strategy of the Sunday single-pattern matching algorithm. At the cost of a slight and reasonable increase in space, iDAT reduces the average time complexity of dynamic insertion in the Trie Tree. Practical results show that it performs well.
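For readers unfamiliar with the Sunday jump mentioned in the abstract, the following is a minimal sketch of the standard Sunday single-pattern matching algorithm. It is not the authors' iDAT code: the abstract does not say how the jump is applied inside the double-array structure, and the function name `sunday_search` is hypothetical.

```python
def sunday_search(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1.

    Standard Sunday algorithm (illustrative sketch, not the paper's iDAT code).
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    # For each pattern character, distance from its last occurrence
    # to one position past the end of the pattern.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        nxt = pos + m  # character just past the current window
        if nxt >= n:
            break
        # Jump so that text[nxt] aligns with its last occurrence in the
        # pattern, or skip past it entirely if it does not occur there.
        pos += shift.get(text[nxt], m + 1)
    return -1
```

On a mismatch, the algorithm inspects the character immediately after the current window and can skip up to one position more than the pattern length, which is where the time saving comes from.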
Double-Array Trie Tree; Time Complexity; Word Segmentation Dictionary
Wenchuan Yang, Jian Liu, Miao Yu
Beijing University of Posts and Telecommunications, Beijing, 100876, China
International conference
Second CCF Conference, NLPCC 2013 (2nd Conference on Natural Language Processing and Chinese Computing)
Chongqing
English
355-362
2013-11-15 (date first made available online on the Wanfang platform; not the paper's publication date)