Apply Language Nature Rhythm to Large Scale Duplicated Text Detection
Detecting duplicated content in large-scale Web text is an urgent problem. This paper proposes a duplicated-text detection algorithm based on language rhythm. It extracts the natural rhythm marked by punctuation in the text and builds a rhythm comparison matrix to perform duplication detection for each paragraph. Unlike other approaches, which are based on word analysis, the algorithm has high accuracy and low complexity.
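The abstract does not say how the rhythm is encoded or how the matrix is scored, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a paragraph's rhythm is the sequence of segment lengths (in non-space characters) between punctuation marks, and that two rhythms are compared by a longest-common-subsequence ratio. The names `rhythm`, `rhythm_similarity`, and `compare_matrix` are hypothetical.

```python
import re

# Assumption: these Western and CJK punctuation marks delimit rhythm segments.
PUNCT = re.compile(r"[,.;:!?，。；：！？]")


def rhythm(paragraph):
    """Return the lengths (non-space characters) of the segments between punctuation."""
    segments = PUNCT.split(paragraph)
    return [len("".join(s.split())) for s in segments if s.strip()]


def rhythm_similarity(a, b):
    """Score two rhythm sequences in [0, 1] with an LCS ratio (an assumed measure)."""
    if not a or not b:
        return 0.0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return 2 * prev[-1] / (len(a) + len(b))


def compare_matrix(doc_a, doc_b):
    """Build the paragraph-by-paragraph similarity matrix for two documents."""
    rhythms_b = [rhythm(q) for q in doc_b]
    return [
        [rhythm_similarity(rhythm(p), rq) for rq in rhythms_b]
        for p in doc_a
    ]


if __name__ == "__main__":
    doc_a = ["Hello, world. Bye!"]
    doc_b = ["Jolly, think. Now!", "A completely different paragraph without pauses"]
    print(rhythm(doc_a[0]))          # [5, 5, 3]
    print(compare_matrix(doc_a, doc_b))
```

Because only segment lengths are compared, no word segmentation or dictionary lookup is needed, which is consistent with the low complexity the abstract claims. The same property means that unrelated paragraphs with identical punctuation spacing score as matches in this sketch, so a real system would need an additional check.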
Duplicated text detection; language nature rhythm; punctuation
Chen Fan, Feng Zhiyong, Zhao Geng
School of Computer Science and Technology, Tianjin University Information Science & Technology Depar School of Computer Science and Technology, Tianjin University Hebei University of Technology Tianjin, China
International conference
Shenyang
English
635-640
2011-11-22 (date first posted online on the Wanfang platform; does not represent the paper's publication date)