Research on KNN Text Classification Based on Rough Sets and Correlation Analysis
With the rapid development of network information technology, text, as a basic information carrier, is growing exponentially. Existing text classification methods cannot retrieve information from these vast resources in a timely and accurate manner. To address this problem, the paper proposes a new text categorization method: a KNN algorithm based on rough sets and correlation analysis. First, the concept of a rough set is introduced. In the vector space of the training texts, the space of each category is divided into certain and uncertain areas. A text vector that falls in a certain area is assigned its category directly. A text vector that falls in an uncertain area is assigned a category by a KNN text classification algorithm based on correlation analysis. Experimental results show that the KNN text classification algorithm based on rough sets and correlation analysis greatly improves the efficiency and accuracy of text categorization and can meet the requirements of processing large amounts of text data.
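The two-stage idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the certain and uncertain areas are constructed or which correlation measure is used, so everything specific here is an assumption made for illustration: the certain area of a category is taken to be a ball around its centroid whose radius is the distance to the nearest training vector of another category, and "correlation analysis" is interpreted as the Pearson correlation between term-weight vectors. The names `fit_regions` and `classify` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def fit_regions(X, y):
    """Assumed region rule: per category, a centroid plus a radius equal to
    the distance from that centroid to the nearest other-category vector."""
    groups = defaultdict(list)
    for vec, label in zip(X, y):
        groups[label].append(vec)
    regions = {}
    for label, vecs in groups.items():
        dim = len(vecs[0])
        centroid = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        others = [v for v, l in zip(X, y) if l != label]
        radius = min((euclidean(centroid, v) for v in others), default=math.inf)
        regions[label] = (centroid, radius)
    return regions


def classify(x, X, y, regions, k=3):
    """Return (label, area). A vector inside exactly one category's ball is
    'certain'; otherwise fall back to KNN ranked by Pearson correlation."""
    hits = [label for label, (c, r) in regions.items() if euclidean(x, c) < r]
    if len(hits) == 1:
        return hits[0], "certain"
    ranked = sorted(zip(X, y), key=lambda p: pearson(x, p[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0], "uncertain"


# Toy term-weight vectors (made up for illustration only).
X_train = [[5, 1, 0], [4, 0, 1], [5, 0, 0], [0, 1, 5], [1, 0, 4], [0, 0, 5]]
y_train = ["sports", "sports", "sports", "finance", "finance", "finance"]
regions = fit_regions(X_train, y_train)

print(classify([5, 0, 1], X_train, y_train, regions))  # inside one ball only
print(classify([3, 0, 2], X_train, y_train, regions))  # overlap, KNN decides
```

Under these assumptions, only vectors that land in an overlap (or outside every ball) pay the cost of the KNN search, which is one plausible reading of how the method could reduce classification time.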
text classification; k-nearest neighbor; correlation analysis; rough set
Guo Aizhang, Yang Tao
Qilu University of Technology, Jinan 250353, China
International conference
Guiyang
English
485-488
2015-08-18 (date first posted online on the Wanfang platform; not the paper's publication date)