The research of Self-Organizing Maps based on Document Collections

Abstract:

Web text mining is a new topic in knowledge discovery research. It aims to help people discover knowledge from large quantities of semi-structured or unstructured text on the web. Several approaches, including pure and hybrid information retrieval (IR) methods, have been proposed to tackle this problem. Among them, combining the Self-Organizing Map (SOM) method with the principles of the vector-space model appears to be a promising alternative to traditional purely IR-based methods in this problem domain. The encoded documents are organized on another self-organizing map, a document map, on which nearby locations contain similar documents. Special consideration is given to the computation of very large document maps, which is possible with general-purpose computers if the dimensionality of the word category histograms is first reduced with a random mapping method and if computationally efficient algorithms are used in computing the SOMs.
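The pipeline named in the abstract (word category histograms, random mapping, document map) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy histograms, the Gaussian projection matrix, and the linearly decaying learning rate and neighborhood width are all assumptions chosen for brevity.

```python
import math
import random


def random_mapping(vectors, out_dim, seed=0):
    """Project each vector to out_dim dimensions with a fixed random Gaussian matrix."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    # Scaling by 1/sqrt(out_dim) keeps squared vector lengths unchanged in expectation.
    scale = 1.0 / math.sqrt(out_dim)
    R = [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)] for _ in range(out_dim)]
    return [[sum(r[j] * v[j] for j in range(in_dim)) for r in R] for v in vectors]


def best_matching_unit(weights, x):
    """Return the (row, col) of the map unit whose weight vector is closest to x."""
    best, best_d = None, float("inf")
    for pos, w in weights.items():
        d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d < best_d:
            best, best_d = pos, d
    return best


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood (illustrative only)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    weights = {(r, c): [rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                     # learning rate decays toward 0
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # neighborhood shrinks toward 0.5
            bmu = best_matching_unit(weights, x)
            for pos, w in weights.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for j in range(dim):
                    w[j] += lr * h * (x[j] - w[j])
            step += 1
    return weights


# Toy word category histograms (hypothetical data): two topics over 6 categories.
histograms = [
    [5, 4, 3, 0, 0, 0],
    [4, 5, 4, 0, 1, 0],
    [0, 0, 1, 5, 4, 3],
    [0, 1, 0, 4, 5, 4],
]
reduced = random_mapping(histograms, out_dim=3)   # dimensionality reduction step
som = train_som(reduced, rows=3, cols=3)          # the "document map"
positions = [best_matching_unit(som, x) for x in reduced]
print(positions)
```

Each document is placed at the map unit closest to its reduced histogram, so documents with similar histograms tend to land on nearby units. A large-scale system would replace the exhaustive best-matching-unit search with the kind of computationally efficient algorithms the abstract mentions.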

Keywords: Data mining; Document Collections; SOM; WEBSOM

Authors: Yi Ding, Xian Fu

Affiliation: College of Computer Science and Technology, Hubei Normal University, Huangshi, China

Conference type: International conference

Conference name: The 2012 International Conference on Frontiers of Advanced Materials and Engineering Technology (FAMET 2012)

Conference location: Xiamen

Conference language: English

Pages: 1232-1235

Online publication date: 2012-01-04 (date first posted on the Wanfang platform; not necessarily the paper's publication date)
