Conference paper

High Dimensional Sparse Data Clustering Algorithm Based on Concept Feature Vector (CABOCFV)

Finding clusters of data objects in high dimensional space is challenging, especially because such data can be sparse and highly skewed. This paper focuses on using concept lattices to solve the high dimensional sparse data clustering problem. Concept lattice theory is an effective tool for data analysis and knowledge processing: it integrates the concept intent (attributes) and concept extent (objects) and describes the hierarchical relationships among concept nodes. Constructing a concept lattice is itself a process of concept clustering, but because the lattice is complete it produces a huge number of concept nodes, and concept nodes whose extent is too large or too small are of no interest. This paper proposes an effective high dimensional sparse data Clustering Algorithm Based On Concept Feature Vector (CABOCFV), which reduces the redundancy of concept construction using the 'Concept Sparse Feature Distance' and the 'Concept Feature Vector', and introduces an effective noise recognition strategy. The CABOCFV clustering algorithm is insensitive to the input order of data objects and scans the database only once. Experiments show that CABOCFV is effective and efficient for high dimensional sparse data clustering.
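To make the terms intent and extent concrete, the following is a minimal Python sketch of standard formal concept analysis on a tiny sparse binary data set. It is not the CABOCFV algorithm: the abstract does not define the Concept Sparse Feature Distance or the Concept Feature Vector, so neither is implemented here. The toy data, the function names, and the extent-size thresholds are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each object maps to the sparse set of attributes it has.
context = {
    "o1": {"a", "b"},
    "o2": {"a", "b", "c"},
    "o3": {"c"},
    "o4": {"a"},
}

def common_attributes(objects, context):
    """Intent: the attributes shared by every object in `objects`."""
    objects = list(objects)
    if not objects:
        # By convention, the empty object set shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objects))

def matching_objects(attributes, context):
    """Extent: the objects that have every attribute in `attributes`."""
    return {o for o, attrs in context.items() if attributes <= attrs}

def all_concepts(context):
    """Brute-force enumeration of all (extent, intent) pairs.

    Exponential in the number of objects; meant only for toy inputs.
    """
    concepts = set()
    names = sorted(context)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            intent = common_attributes(subset, context)
            extent = matching_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

concepts = all_concepts(context)

# Hypothetical thresholds: drop concepts whose extent is too small or too large.
MIN_EXTENT, MAX_EXTENT = 2, 3
filtered = {c for c in concepts if MIN_EXTENT <= len(c[0]) <= MAX_EXTENT}

for extent, intent in sorted(filtered, key=lambda c: sorted(c[0])):
    print(sorted(extent), sorted(intent))
```

Even on four objects the full enumeration yields concepts at both extremes (one object, or all objects), which illustrates why an approach that avoids building every concept node could be attractive for high dimensional sparse data.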

Clustering Analysis; High Dimensional Data; Concept Lattice Construction

Sen Wu, Shujuan Gu, Xuedong Gao

School of Economics and Management, University of Science and Technology Beijing (USTB), Beijing, P.R. China

International conference

2008 IEEE International Conference on Service Operations and Logistics, and Informatics (IEEE/SOLI'2008)

Beijing

English

2008-10-12 (date first posted online on the Wanfang platform; not the paper's publication date)