An Efficient Method of Language Identification Using LVQ Network
This paper presents a new method for language identification. An LVQ (learning vector quantization) network designed for language identification is introduced. The presence of particular characters and words, together with statistical information on word lengths, is used as the feature vector. The new classification technique is faster than the conventional N-gram based classification approach while achieving a similar correct classification rate. In an identification experiment with 8 Roman-alphabet languages, the LVQ network achieved a 97.6% correct classification rate on 500 bytes of text and was five times faster than the N-gram based approach.
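The idea in the abstract can be illustrated with a minimal sketch of LVQ1, the basic form of learning vector quantization, applied to a toy two-language problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the marker characters and words, the two word-length statistics, the single prototype per language, the learning rate, and the helper names `extract_features`, `train_lvq1`, and `classify`.

```python
import random

# Hypothetical marker sets; the paper's actual characters and words are not given.
MARKER_CHARS = "äöüß"
MARKER_WORDS = ["the", "and", "der", "und"]


def extract_features(text, chars=MARKER_CHARS, words=MARKER_WORDS):
    """Feature vector: character presence, word presence, word-length statistics."""
    lowered = text.lower()
    tokens = lowered.split()
    token_set = set(tokens)
    char_feats = [1.0 if c in lowered else 0.0 for c in chars]
    word_feats = [1.0 if w in token_set else 0.0 for w in words]
    lengths = [len(t) for t in tokens] or [0]
    mean_len = sum(lengths) / len(lengths) / 10.0  # roughly scaled to [0, 1]
    short_frac = sum(1 for n in lengths if n <= 3) / len(lengths)
    return char_feats + word_feats + [mean_len, short_frac]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(protos, x):
    """Index of the prototype closest to x."""
    return min(range(len(protos)), key=lambda i: sq_dist(protos[i][0], x))


def train_lvq1(samples, epochs=30, lr=0.2, seed=0):
    """LVQ1 with one prototype per label, initialised from the first sample of each label."""
    rng = random.Random(seed)
    first = {}
    for vec, label in samples:
        first.setdefault(label, list(vec))
    protos = [[vec, label] for label, vec in first.items()]
    order = list(samples)
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        rng.shuffle(order)
        for x, label in order:
            winner = protos[nearest(protos, x)]
            # Pull the winner toward x if its label matches, push it away otherwise.
            sign = 1.0 if winner[1] == label else -1.0
            winner[0] = [p + sign * rate * (xi - p) for p, xi in zip(winner[0], x)]
    return protos


def classify(protos, text):
    return protos[nearest(protos, extract_features(text))][1]


# Toy training data, invented for this sketch.
TRAIN = [
    ("the cat and the dog sat on the mat", "en"),
    ("she read the book and went home", "en"),
    ("der hund und die katze schlafen über dem tisch", "de"),
    ("der mann und die frau gehen nach hause", "de"),
]
protos = train_lvq1([(extract_features(t), lang) for t, lang in TRAIN])

print(classify(protos, "the boy and the girl play"))
print(classify(protos, "der junge und das kind"))
```

Classification needs only one feature extraction pass and a distance computation per prototype, which is consistent with the speed advantage over N-gram profile comparison that the abstract reports, though this sketch makes no attempt to reproduce the reported figures.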
Han Xiao, Lei Yu, Kai Chen
School of Information Engineering, Beijing University of Posts and Telecommunications, China
International conference
9th International Conference on Signal Processing (ICSP08)
Beijing
English
2008-10-26 (date first posted online on the Wanfang platform; not the paper's publication date)