A Data-driven Approach for Cross Transformation Between Mongolian texts
This paper presents a data-driven approach to transforming text between the different graphic forms (scripts) of Mongolian. With the proposed approach, texts can be transcribed or translated between closely related languages, such as the Mongolian graphic texts used in different regions and countries, as well as Altaic family languages like Uygur Turkic and Kazakh. The approach is implemented with DP (dynamic programming) matching, supported by knowledge-based sequence matching that draws on a multilingual dictionary and by a data-driven component that uses a target-language corpus. Experimental results show that the proposed method achieves 86.4% transformation accuracy (F-measure) from NM (Cyrillic) to TM (Traditional Mongolian), which is mainly used in Inner Mongolia, and 91.1% from NM to Todo, which is mainly used in the Xinjiang region of China.
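The abstract names DP matching against a dictionary but does not give the recurrence. As a rough illustration only, the sketch below uses plain Levenshtein edit distance as a stand-in for the DP matching step and picks the closest entry from a candidate list. It assumes both strings have already been mapped to a common symbol set; the function names `dp_distance` and `best_match` and the placeholder strings are hypothetical and are not taken from the paper.

```python
def dp_distance(src: str, tgt: str) -> int:
    """Levenshtein distance between two symbol sequences via dynamic programming."""
    rows, cols = len(src) + 1, len(tgt) + 1
    # cost[i][j] = minimum number of edits that turn src[:i] into tgt[:j]
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        cost[i][0] = i  # delete every symbol of src[:i]
    for j in range(cols):
        cost[0][j] = j  # insert every symbol of tgt[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = cost[i - 1][j - 1] + (src[i - 1] != tgt[j - 1])
            deletion = cost[i - 1][j] + 1
            insertion = cost[i][j - 1] + 1
            cost[i][j] = min(substitution, deletion, insertion)
    return cost[-1][-1]


def best_match(word: str, candidates: list[str]) -> str:
    """Return the candidate closest to `word`; ties go to the earliest candidate."""
    return min(candidates, key=lambda c: dp_distance(word, c))


# Placeholder strings, not real Mongolian data.
print(best_match("abcd", ["zzzz", "abxd", "abcdefg"]))
```

A real system along the lines the abstract describes would presumably weight substitutions using script-specific knowledge and rank candidates with statistics from the target-language corpus; uniform edit costs are used here purely for brevity.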
Mongolian texts; cross-language transformation; DP; data-driven approach
Dawa Yidemucao, Muheyat Niyazbek, Ayjarken Amantay
School of Information Science and Engineering, Xinjiang University, Urumqi, China
International conference
2013 2nd International Conference on Science and Social Research (ICSSR2013)
Beijing
English
375-380
2013-07-13 (date first posted online on the Wanfang platform; this is not the paper's publication date)