Conference topic

Detection of Simple Plagiarism in Computer Science Papers

Plagiarism is the use of the language and thoughts of another work and the representation of them as one's own original work. Various levels of plagiarism exist in many domains, and in academic papers in particular. Therefore, diverse efforts are made to identify plagiarism automatically. In this research, we developed software capable of detecting simple plagiarism. We built a corpus (C) containing 10,100 academic papers in computer science written in English, and two test sets of papers randomly chosen from C. A wide variety of baseline methods was developed to identify identical or similar papers; several of these methods are novel. The experimental results and their analysis yield interesting findings; in particular, some of the novel methods are among the best predictive methods.

Yaakov HaCohen-Kerner, Aharon Tayeb, Natan Ben-Dror

Department of Computer Science, Jerusalem College of Technology (Machon Lev)

International conference

The 23rd International Conference on Computational Linguistics

Beijing

English

421-429

2010-08-01 (date first made available online on the Wanfang platform; not the paper's publication date)