Ensemble Non-negative Matrix Factorization for Clustering Biomedical Documents
Searching and mining biomedical literature databases, such as MEDLINE, is the main way biomedical researchers generate scientific hypotheses. By grouping similar documents together, clustering techniques can help users find documents of interest efficiently. Because non-negative matrix factorization (NMF) can effectively capture the latent semantic space through a factorization that is non-negative in both the basis and the weights, it has been used to cluster general text documents. Considering the stochastic nature of NMF with respect to initialization, we propose to use ensemble NMF for biomedical document clustering. The performance of ensemble NMF was evaluated by clustering a large number of datasets generated from the TREC Genomics track dataset. The experimental results show that our method significantly outperforms the classical clustering algorithms bisecting k-means, k-means, and hierarchical clustering on most of the datasets.
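The idea of combining several randomly initialized NMF runs can be illustrated with a minimal sketch. The abstract does not say how the individual runs are merged, so everything below is an assumption made for illustration: the multiplicative update rule, the co-association (co-clustering frequency) matrix, the 0.5 threshold, the connected-components consensus step, the toy document-term matrix, and all function names (`nmf`, `nmf_labels`, `ensemble_nmf`). It should not be read as the authors' actual procedure.

```python
import random

def matmul(A, B):
    """Multiply two dense matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Approximate X (documents x terms) as W * H with non-negative factors,
    using multiplicative updates from a seeded random start."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def nmf_labels(X, k, seed):
    """Assign each document to the latent factor with the largest weight."""
    W, _ = nmf(X, k, seed=seed)
    return [max(range(k), key=lambda a: row[a]) for row in W]

def ensemble_nmf(X, k, runs=10, threshold=0.5):
    """Assumed consensus scheme: link two documents if they share a cluster
    in more than `threshold` of the runs, then take connected components."""
    n = len(X)
    co = [[0.0] * n for _ in range(n)]
    for seed in range(runs):  # one NMF run per random initialization
        labels = nmf_labels(X, k, seed)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1.0 / runs
    parent = list(range(n))  # union-find over documents

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if co[i][j] > threshold:
                parent[find(i)] = find(j)
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]

# Toy term-count matrix (hypothetical): documents 0-2 and 3-5 share no terms.
X = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [3, 3, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 1],
    [0, 0, 0, 1, 2, 3],
    [0, 0, 0, 3, 2, 2],
]
labels = ensemble_nmf(X, k=2)
```

Because the consensus step in this sketch uses thresholded connected components, the number of final clusters is not forced to equal k; a different consensus function, such as hierarchical clustering on the co-association matrix, would be an equally plausible choice.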
Shanfeng Zhu, Wei Yuan, Fei Wang
School of Computer Science and Technology, Fudan University, Shanghai 200433, China; Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China
International conference
The Second International Symposium on Optimization and Systems Biology (OSB08)
Lijiang, Yunnan
English
358-364
2008-10-31 (date first posted online on the Wanfang platform; not necessarily the paper's publication date)