Quantitative evaluation of writing styles based on text analysis: methods and case study
This paper introduces mathematical metrics that index the writing style of literary works, built on a series of text classification techniques. Four Chinese translations of Maupassant's classic Boule de Suif (Ball of Fat) serve as a case study to illustrate the inherent popularity, conformity, and distinctive stylistic choices of each translator's language. Character frequency entropy (CFE), developed from a modified Zipf-Mandelbrot principle, is used to evaluate inherent popularity. The diction of phrasal materials and their clustering indices are then scrutinized with a critical parser for Chinese Word Segmentation (CWS) to evaluate the writers' conformity to conventional language. Sentence length and dispersion are calculated to reveal whether a translator habitually favors a loose or a compact syntax. The full analysis of the sample texts, from zi (character) through ci (word) to ju (sentence), gives a panorama of the linguistic style of the translators involved.
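The character-level and sentence-level measurements named in the abstract can be illustrated with a minimal sketch. The abstract does not give the modified Zipf-Mandelbrot formulation behind CFE, so the code below substitutes plain Shannon entropy over character frequencies as a stand-in, and it assumes that sentence length is counted in characters and that dispersion means the population standard deviation. The function names `char_entropy` and `sentence_lengths` and the sample text are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def char_entropy(text):
    """Shannon entropy, in bits per character, of the non-whitespace characters.

    Stand-in for the paper's CFE; the modified Zipf-Mandelbrot form is not
    given in the abstract, so this is an assumption.
    """
    chars = [c for c in text if not c.isspace()]
    counts = Counter(chars)
    total = len(chars)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def sentence_lengths(text):
    """Lengths, in characters, of sentences split on common end marks."""
    parts = re.split(r"[。！？!?.]+", text)
    return [len(p.strip()) for p in parts if p.strip()]


# Hypothetical sample text, not taken from any of the translations.
sample = "今天天气好。我们去公园散步吧！你觉得呢？"
lengths = sentence_lengths(sample)

print("entropy (bits/char):", char_entropy(sample))
print("sentence lengths:", lengths)
print("mean length:", mean(lengths))
print("dispersion (population std dev):", pstdev(lengths))
```

Under these assumptions, a text drawing on a wider and more even range of characters scores a higher entropy, and a larger dispersion of sentence lengths indicates a more variable, looser syntax. Word-level (ci) analysis would additionally need a Chinese word segmenter, which is left out here.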
Text analysis; Writing style; Word frequency entropy; Chinese Word Segmentation; Sentence indexing
Jingmei Zhang, Guangzhou Zeng, Jingxiang Zhang
School of Computer Science & Technology, Shandong University; School of Information Management, Shandon; School of Information Science & Engineering, University of Jinan, Jinan, China
International conference
Chongqing
English
181-185
2011-08-20 (date first posted online on the Wanfang platform; not the paper's publication date)