Conference topic

THE DESIGN AND IMPLEMENTATION OF THE CRAWLER-INAR

This paper discusses the design and implementation of Inar, a web crawler written in C++ that runs on Linux. Inar is a single-threaded crawler based on asynchronous I/O and is currently under development. The paper describes the architecture of the crawler and discusses the design and function of each of its components in detail. For design problems that we met in practice, such as the design of the URL queues and of the hash algorithm, we propose our solutions.

Crawler; single thread; asynchronous I/O; web

YU-XIN DING, XIAO-LONG WANG, LE-BIN LIN, QI ZHANG, YONG-HUI WU

Department of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China

International conference

2006 International Conference on Machine Learning and Cybernetics (IEEE 5th Conference on Machine Learning and Cybernetics)

Dalian

English

4527-4530

2006-08-13 (date first made available online on the Wanfang platform; not the publication date of the paper)