Conference Topic

An Iterative Multi-Strategy Approach to Classification

In some knowledge discovery tasks that involve classification, the number of classes is not known in advance and may vary with the application context. For example, in a remotely sensed imagery dataset, the number of land cover types is not known beforehand and may change with the analytic objective (e.g., treating forest as one class, or as two classes by distinguishing deciduous from coniferous). Furthermore, classification methods differ in their strengths and weaknesses, so each method works well only with data of certain classes. It is therefore desirable to combine multiple classifiers to make the best use of their strengths in a given classification task. This paper presents an iterative methodology that combines clustering, classification, and domain knowledge to obtain enhanced classification results. It first uses clustering techniques and clustering evaluation metrics to determine the number of clusters in the data. The metrics include the sum of squared errors, a skewness measure, and a separation-cohesion index. It then iteratively trains several classifiers and uses their predictions to obtain optimal classification results. At each iteration, the classes predicted by the most accurate classifier are kept if its accuracy exceeds the required threshold, and training datasets for the remaining classes are obtained by incorporating domain knowledge. The methodology is demonstrated on two satellite imagery datasets.
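The iterative step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy classifiers (nearest centroid and 1-nearest-neighbour), the use of per-class validation accuracy to decide which classes to keep, the `refine_training` hook standing in for the domain-knowledge step, and all sample data are assumptions made for illustration only.

```python
import math
from collections import defaultdict

def fit_centroid(samples):
    """Toy nearest-centroid classifier; returns a predict(x) function."""
    groups = defaultdict(list)
    for x, label in samples:
        groups[label].append(x)
    centroids = {lab: tuple(sum(c) / len(c) for c in zip(*pts))
                 for lab, pts in groups.items()}
    return lambda x: min(centroids, key=lambda lab: math.dist(x, centroids[lab]))

def fit_1nn(samples):
    """Toy 1-nearest-neighbour classifier; returns a predict(x) function."""
    return lambda x: min(samples, key=lambda s: math.dist(x, s[0]))[1]

def accuracies(predict, samples):
    """Return (overall accuracy, {class: per-class accuracy})."""
    hits, totals = defaultdict(int), defaultdict(int)
    for x, label in samples:
        totals[label] += 1
        hits[label] += predict(x) == label
    overall = sum(hits.values()) / len(samples)
    return overall, {lab: hits[lab] / totals[lab] for lab in totals}

def iterative_classify(train, valid, fitters, threshold,
                       refine_training=lambda samples, classes: samples,
                       max_iter=5):
    """Keep classes the best classifier handles well; retry the rest.

    refine_training is a hypothetical placeholder for the step that
    rebuilds training data for the remaining classes from domain knowledge.
    """
    remaining = {label for _, label in train}
    assigned = {}  # class label -> name of the classifier kept for it
    for _ in range(max_iter):
        sub_train = [s for s in train if s[1] in remaining]
        sub_valid = [s for s in valid if s[1] in remaining]
        if not remaining or not sub_valid:
            break
        sub_train = refine_training(sub_train, remaining)
        scores = {name: accuracies(fit(sub_train), sub_valid)
                  for name, fit in fitters.items()}
        best = max(scores, key=lambda name: scores[name][0])
        # Assumption: a class is kept when its accuracy exceeds the threshold.
        kept = {lab for lab, acc in scores[best][1].items() if acc > threshold}
        if not kept:
            break
        for lab in kept:
            assigned[lab] = best
        remaining -= kept
    return assigned, remaining

# Invented two-band pixel values, for illustration only.
train = [((0.0, 0.0), "water"), ((1.0, 0.5), "water"),
         ((5.0, 5.0), "forest"), ((6.0, 5.5), "forest"),
         ((10.0, 0.0), "urban"), ((11.0, 1.0), "urban")]
valid = [((0.5, 0.2), "water"), ((5.5, 5.2), "forest"), ((10.5, 0.4), "urban")]
fitters = {"centroid": fit_centroid, "1nn": fit_1nn}

assigned, remaining = iterative_classify(train, valid, fitters, threshold=0.9)
print(assigned, remaining)
```

With these well-separated toy classes, every class passes the threshold in the first iteration. On harder data, the classes left in `remaining` would be passed back through `refine_training` and retried with all classifiers in the next iteration.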

knowledge discovery; clustering; classification; iterative method

Honglei Zhu, Hongwei Zhu

Clark Labs, Clark University, 950 Main Street, Worcester, MA 01610, USA; College of Business and Public Administration, Old Dominion University, Norfolk, VA 23529, USA

International conference

The 9th International Symposium on Knowledge and Systems Sciences, The 4th Asia-Pacific International Conference on Knowledge Management

Guangzhou

English

15-21

2008-12-11 (date first posted online on the Wanfang platform; not the publication date of the paper)