The Spoken/Written Language Classification of English Sentences with Bilingual Information
Chinese speakers often have difficulty telling spoken English from written English, a distinction that is important for learning and using the language. To alleviate this problem, we propose to automatically classify English sentences with bilingual information into these two categories. Based on text categorization technology, we explore a variety of features, including words, statistics, and their combinations, and find that a classification accuracy of nearly 95% can be achieved in the open test using Chinese characters + sentence length + average syllable number, or other similar combinations.
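As an illustration of the feature combination named in the abstract (Chinese characters + sentence length + average syllable number), the following is a minimal sketch of how such features might be extracted from one bilingual sentence pair. The function names, the feature keys, and the vowel-group syllable heuristic are all assumptions made for this example and are not taken from the paper, which does not describe its extraction procedure here.

```python
import re
from collections import Counter


def count_syllables(word):
    # Assumed heuristic: approximate syllables by counting runs of vowels.
    # This is a rough estimate, not the paper's method.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def extract_features(english, chinese):
    # Hypothetical helper: builds one feature dictionary per sentence pair.
    words = re.findall(r"[A-Za-z']+", english)
    features = Counter()

    # Chinese-character features: one count per CJK character in the
    # Chinese side of the pair (punctuation is skipped).
    for ch in chinese:
        if "\u4e00" <= ch <= "\u9fff":
            features["zh_char=" + ch] += 1

    # Statistical features computed on the English side.
    features["sentence_length"] = len(words)
    features["avg_syllables"] = (
        sum(count_syllables(w) for w in words) / len(words) if words else 0.0
    )
    return dict(features)


if __name__ == "__main__":
    print(extract_features("I gotta go now.", "我得走了。"))
```

The resulting dictionaries could then be passed to any standard text classifier; which classifier the authors used is not stated in this record.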
Text categorization; Sentence classification; Spoken and written language; Bilingual sentences
Kuan Li, Zhongyang Xiong, Yufang Zhang, Xiaohua Liu, Ming Zhou, Guanghua Zhang
College of Computer Science, Chongqing University, Chongqing, China; Microsoft Research Asia, Beijing, China
International conference
Second CCF Conference, NLPCC 2013 (2nd Conference on Natural Language Processing and Chinese Computing)
Chongqing
English
370-377
2013-11-15 (date first posted online on the Wanfang platform; not the paper's publication date)