Conference Topic

Near-replicas of Web Pages Eliminating Repetitive Algorithms Based on MD5

  The development of the internet and the exponential growth of network information have produced a large number of duplicated pages on the web, which reduces retrieval recall and precision and lowers retrieval efficiency. The accuracy of web page content therefore influences the quality of a search engine. Building on a structured description of page text, this paper proposes an improved MD5-based algorithm for eliminating near-replica web pages. Experiments show that the method improves both recall and precision.

structured web; MD5; eliminating repetitive of Web pages; eliminating repetitive algorithm

Junya Yan, Xiaohui Ma, Wenjuan Zhao

Business College of Shanxi University, Taiyuan, Shanxi, China

International conference

2012 2nd International Conference on Materials Science and Information Technology (MSIT2012)

Xi'an

English

1752-1756

2012-08-24 (date first posted online on the Wanfang platform; not the paper's publication date)