Based-on Vector Space Model Research and Implementation of Web Filtering
This paper proposes and implements a new information filtering method that applies a two-level filtering strategy, combining URL-based filtering with content filtering. Content filtering is executed only when the requested URL appears in neither the white URL list nor the black URL list, and the URL lists are updated according to the result of the content filtering step. In this way, the method has both the real-time characteristic of URL filtering and the comprehensive characteristic of content filtering. The web page filtering system captures HTTP packets using the Winsock 2 SPI, extracts web page content by applying the newly proposed method, and represents text with the vector space model. The paper proposes a two-stage filtering algorithm based on the vector space model and implements a system built on it. Experimental results show that the system has good filtering accuracy and performance.
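The two-level strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the class name `TwoLevelFilter`, the `fetch_text` callback, keying the lists by host name, the raw term-frequency weighting, and the 0.3 threshold are all assumptions made for the example. Packet capture through the Winsock 2 SPI and the page content extraction step are outside its scope.

```python
import math
import re
from collections import Counter
from urllib.parse import urlsplit


def term_vector(text):
    """Represent text as a sparse term-frequency vector (vector space model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TwoLevelFilter:
    """Hypothetical two-level filter: URL lists first, content check second."""

    def __init__(self, profile_text, threshold=0.3):
        # Assumption: unwanted content is described by one profile vector.
        self.profile = term_vector(profile_text)
        self.threshold = threshold
        self.white = set()  # hosts already judged acceptable
        self.black = set()  # hosts already judged unacceptable

    def allow(self, url, fetch_text):
        host = urlsplit(url).netloc.lower()
        # Level 1: fast URL lookup, no page content needed.
        if host in self.white:
            return True
        if host in self.black:
            return False
        # Level 2: content filtering only for hosts in neither list.
        similarity = cosine(term_vector(fetch_text(url)), self.profile)
        blocked = similarity >= self.threshold
        # Feed the content decision back into the URL lists.
        (self.black if blocked else self.white).add(host)
        return not blocked
```

In this sketch a later request to a host that has already been classified is answered from the lists alone, which is where the real-time behaviour of URL filtering would come from, while unseen hosts still receive the more thorough content check.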
URL filtering; content filtering; vector space model
LU Qingmei, ZHANG Hongliang
Department of Electronic Science and Technology, North University of China, Taiyuan, Shanxi, China, 030051
International conference
8th International Symposium on Test and Measurement
Chongqing
English
3590-3593
2009-08-01 (date first posted online on the Wanfang platform; does not represent the paper's publication date)