Conference topic

DESIGN AND IMPLEMENTATION OF WEB HOT-TOPIC TALK MINING BASED ON SCALE-FREE NETWORK

Mining Web hot-topic talks is an important branch of Web text mining. Traditional mining systems for Web hot-topic talks assume that every web page is equally important. However, the complex network formed by Web hot-topic talks on the Internet is not homogeneous, because it has the scale-free characteristic, so this assumption is not reasonable for such a network. In this paper, the topology of this complex network is first analyzed and shown to have the scale-free characteristic. A mining system based on the scale-free topology is then designed, and its main modules are introduced. The workflow of the system is presented, and the implementations of two core modules, the site-topology analysis module and the module that distributes proportions among the web pages, are described in detail. Finally, the merits and shortcomings of the system are discussed, and the paper is summarized.

Web hot-topic talk; data mining; scale-free network; power-law distribution

SEN QIN; GUAN-ZHONG DAI; YAN-LING LI

College of Automation, Northwestern Polytechnical University, Xi'an, 710072, China

International conference

2006 International Conference on Machine Learning and Cybernetics (IEEE 5th Machine Learning and Cybernetics Forum)

Dalian

English

1184-1189

2006-08-13 (date first made available online on the Wanfang platform; not the paper's publication date)