Conference Topic

Web Page Data Collection Based on Multithread

Web data collection is the process by which a crawler gathers semi-structured, large-scale, and redundant data from the web, including web content, web structure, and web usage data. It is often used for information extraction, information retrieval, search engines, and web data mining. This paper introduces the principle of web data collection and discusses related topics such as page download, encoding problems, update strategy, and static and dynamic pages. Multithread technology is described, and a multithread mode for web data collection is proposed. Web data collection with multithreading can achieve better resource utilization, better average response time, and better performance.
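As a rough illustration of the multithread mode the abstract refers to, the sketch below downloads several pages concurrently with a thread pool. It is not the author's implementation: the names `fetch_page` and `collect_pages`, the worker count, and the timeout are all assumptions made for this example, and the record does not describe the paper's actual design.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_page(url, timeout=10.0):
    """Download one page and return its raw bytes (hypothetical default fetcher)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def collect_pages(urls, fetch=fetch_page, max_workers=4):
    """Fetch pages concurrently.

    Returns a dict mapping each distinct URL to the downloaded bytes,
    or to the exception raised while fetching it.
    """
    # Drop duplicate URLs while keeping first-seen order.
    url_list = list(dict.fromkeys(urls))
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every download up front so worker threads overlap network waits.
        futures = {url: pool.submit(fetch, url) for url in url_list}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:  # record the failure instead of aborting the crawl
                results[url] = exc
    return results
```

Because page download is dominated by network waiting, several threads can keep requests in flight at once, which is one plausible reading of the resource utilization and response time benefits the abstract claims. Passing a different `fetch` callable makes the sketch usable without network access.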

web page; data collection; multithread

Wentao Liu

School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan, Hubei Province 430023, China

International conference

2013 2nd International Conference on Computer Science and Electronics Engineering (ICCSEE2013)

Hangzhou

English

2024-2027

2013-03-22 (date first posted online on the Wanfang platform; not the paper's publication date)