A Full Distributed Web Crawler Based on Structured Network
Distributed Web crawlers have recently received increasing attention from researchers. A fully decentralized crawler, with no centralized managing server, is an interesting architectural paradigm for building large-scale information collection systems because of its scalability, failure resilience, and increased node autonomy. This paper presents a novel fully distributed Web crawler system based on a structured network, together with a distributed crawling model that is applied in the system and improves its performance. Important issues such as task assignment and scalability are discussed. Finally, an experimental study is used to verify the advantages of the system, and the results are comparatively satisfactory.
Web crawling; full distributed; structured network
Kunpeng Zhu, Zhiming Xu, Xiaolong Wang, Yuming Zhao
Intelligent Technology and Natural Language Processing Lab., School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
International conference
4th Asia Information Retrieval Symposium (AIRS 2008)
Harbin
English
478-483
2008-01-16 (date first posted online on the Wanfang platform; not the paper's publication date)