Conference special topic

Quantitative evaluation of writing styles based on text analysis: methods and case study

This paper introduces mathematical metrics that index the writing style of literary works, based on a series of text classification techniques. Four Chinese translations of Maupassant's classic Boule de Suif (Ball of Fat) serve as a case study to illustrate the inherent popularity, conformity, and unique stylistic choices of the language used by different translators. Character frequency entropy (CFE), developed from a modified Zipf-Mandelbrot principle, is used to evaluate inherent popularity. The diction of phrasal materials and their clustering indices are then scrutinized with a critical Chinese Word Segmentation (CWS) parser to evaluate each writer's conformity to conventional language. Sentence length and its dispersion are calculated to reveal a habit of loose or compact syntax. The full analysis of the sample texts, from zi (character) through ci (word) to ju (sentence), gives a panorama of the linguistic style of the translators involved.
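As a rough illustration of the character-level and sentence-level measures named in the abstract, the sketch below computes a plain Shannon entropy over character frequencies and the mean and standard deviation of sentence lengths. It is not the paper's method: the record does not give the CFE formula or the modified Zipf-Mandelbrot form, so ordinary Shannon entropy stands in for it here. The function names `char_entropy` and `sentence_length_stats`, the punctuation set used to split sentences, and the choice to ignore only whitespace are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def char_entropy(text):
    """Shannon entropy (bits per character) of the character distribution.

    Assumption: a stand-in for the paper's CFE, whose exact formula
    is not given in this record. Whitespace is ignored.
    """
    chars = [c for c in text if not c.isspace()]
    total = len(chars)
    if total == 0:
        return 0.0
    counts = Counter(chars)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def sentence_length_stats(text):
    """Return (mean, population standard deviation) of sentence lengths.

    Assumption: sentences end at common Chinese or Western terminal
    punctuation, and length is counted in characters.
    """
    sentences = [s.strip() for s in re.split(r"[。！？!?.]+", text) if s.strip()]
    lengths = [len(s) for s in sentences]
    if not lengths:
        return 0.0, 0.0
    mean = sum(lengths) / len(lengths)
    variance = sum((x - mean) ** 2 for x in lengths) / len(lengths)
    return mean, math.sqrt(variance)
```

Applied to each translation in turn, the two functions would give one entropy value and one (mean, dispersion) pair per translator, which could then be compared side by side.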

Text analysis; Writing style; Word frequency entropy; Chinese Word Segmentation; Sentence indexing

Jingmei Zhang, Guangzhou Zeng, Jingxiang Zhang

School of Computer Science & Technology, Shandong University; School of Information Management, Shandon; School of Information Science & Engineering, University of Jinan, Jinan, China

International conference

2011 6th Joint International Information Technology and Artificial Intelligence Conference (IEEE ITAIC 2011)

Chongqing

English

181-185

2011-08-20 (date first made available online on the Wanfang platform; not the paper's publication date)