A quasi word-based compression method of English text using byte-oriented coding scheme
In this paper we present ERecode, a universal compression algorithm for English text. The proposed scheme highlights the importance of preprocessing for English text: it uses one- or two-byte code values to recode the 511 most commonly used English words, symbol sequences, and ASCII codes according to their occurrence frequency. Used as a preprocessing step for English text ahead of popular compression utilities, ERecode improves their compression ratio by 0.89% to 19.65%. The proposed method is also applicable to text files in other languages.
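The general idea of frequency-ranked byte recoding can be illustrated with a minimal sketch. It is not ERecode itself: the abstract does not give the code layout or the 511-entry table, so the byte ranges below (`ONE_BYTE_BASE`, `ESCAPE`), the tokenizer, and the function names `build_table`, `encode`, and `decode` are all assumptions made for illustration. The sketch assumes 7-bit ASCII input, gives the most frequent words a single byte in the range 0x80 to 0xFE, and gives the next most frequent words a two-byte code introduced by 0xFF.

```python
import re
from collections import Counter

# Hypothetical code layout (not taken from the paper):
ONE_BYTE_BASE = 0x80                     # first one-byte word code
ESCAPE = 0xFF                            # prefix byte of a two-byte word code
ONE_BYTE_SLOTS = ESCAPE - ONE_BYTE_BASE  # 127 one-byte codes
TWO_BYTE_SLOTS = 256                     # ESCAPE followed by any byte value

# A token is either a run of letters or a single non-letter character,
# so joining all tokens reproduces the input exactly.
TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]")


def build_table(sample: str) -> list[str]:
    """Rank multi-letter words by frequency (ties broken alphabetically)."""
    counts = Counter(t for t in TOKEN.findall(sample) if len(t) > 1)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:ONE_BYTE_SLOTS + TWO_BYTE_SLOTS]


def encode(text: str, table: list[str]) -> bytes:
    """Replace table words with byte codes; copy everything else as ASCII."""
    index = {w: i for i, w in enumerate(table)}
    out = bytearray()
    for tok in TOKEN.findall(text):
        i = index.get(tok)
        if i is None:
            out += tok.encode("ascii")   # raises on non-ASCII input
        elif i < ONE_BYTE_SLOTS:
            out.append(ONE_BYTE_BASE + i)
        else:
            out += bytes([ESCAPE, i - ONE_BYTE_SLOTS])
    return bytes(out)


def decode(data: bytes, table: list[str]) -> str:
    """Invert encode(): expand byte codes back into words."""
    out = []
    pos = 0
    while pos < len(data):
        b = data[pos]
        if b < ONE_BYTE_BASE:            # plain ASCII character
            out.append(chr(b))
            pos += 1
        elif b == ESCAPE:                # two-byte word code
            out.append(table[ONE_BYTE_SLOTS + data[pos + 1]])
            pos += 2
        else:                            # one-byte word code
            out.append(table[b - ONE_BYTE_BASE])
            pos += 1
    return "".join(out)
```

In a preprocessing pipeline of the kind the abstract describes, the output of `encode` would then be passed to a general-purpose compressor, and `decode` would be applied after decompression. Because the table must be shared between both sides, a fixed, precomputed word list is the natural choice.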
CHANG Wei-ling YUN Xiao-chun FANG Bin-xing WANG Shu-peng LI Shu-hao
Research Centre of Computer Network and Information Security Technology, Harbin Institute of Technology, Harbin 150001, China; Institute of Computing Technology, Chinese Academy of Science, Beijing 100080, China
International conference
The Ninth International Conference on Web-Age Information Management (WAIM 2008)
Zhangjiajie
English
2008-07-20 (date first posted online on the Wanfang platform; this is not the paper's publication date)