Conference topic

Classification of the Structure of Square Hmong Characters and Analysis of Its Statistical Properties

  Analyzing the structural characteristics of characters lays an informational foundation for the intelligent processing of square Hmong characters. Building on such an analysis, this paper defines the linearization of square Hmong characters and the division of their structures into equivalence classes, and proposes a decision algorithm for determining the structure equivalence class of a character. Using this algorithm, the structures of square Hmong characters are divided into eight equivalence classes. An analysis of statistical properties of the square Hmong characters appearing in practical documents, including cumulative probability distribution, complexity, and information entropy, shows the following: first, more than 90% of these characters are composed of two components, and more than 80% of them have a left-right, top-bottom, or lower-left-enclosed structure; second, the mean number of components in a square Hmong character is slightly greater than 2; third, the information entropy of the structure of square Hmong characters lies within the interval (1.19, 2.16). The results indicate that the square Hmong characters appearing frequently in practical documents tend toward simple structures.
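  The three statistics named in the abstract (cumulative probability distribution, mean number of components, and information entropy) can be illustrated with a minimal Python sketch. The function names and any counts passed to them are hypothetical and are not taken from the paper. The abstract does not state the logarithm base used for the entropy, so base 2 (bits) is assumed here.

```python
import math
from itertools import accumulate


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the empirical distribution given by raw counts.

    Assumption: base-2 logarithm; the abstract does not state the base.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]  # skip empty classes
    return -sum(p * math.log2(p) for p in probs)


def cumulative_distribution(counts):
    """Cumulative probabilities with classes ordered from most to least frequent."""
    total = sum(counts)
    ordered = sorted(counts, reverse=True)
    return [running / total for running in accumulate(ordered)]


def mean_components(component_counts):
    """Mean number of components per character.

    component_counts maps a component count (2, 3, ...) to the number of
    character occurrences that have that many components.
    """
    total = sum(component_counts.values())
    return sum(k * n for k, n in component_counts.items()) / total
```

  With eight structure classes, the entropy under this base-2 assumption can be at most 3 bits (all classes equally likely), and it falls as usage concentrates on a few classes.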

Information entropy; Probability distribution; Square Hmong character; Statistical analysis

Li-Ping Mo, Kai-Qing Zhou, Liang-Bin Cao, Wei Jiang

College of Information Science and Engineering, Jishou University, Jishou 416000, Hunan, China

International conference

2018 International Conference on Natural Language Processing and Chinese Computing (NLPCC 2018)

Hohhot

English

154-165

2018-08-26 (date first posted online on the Wanfang platform; this is not the publication date of the paper)