Dimensionality Reduction for Text Using LLE

Abstract:

Dimensionality reduction is a necessary preprocessing step in many fields of information processing, such as information retrieval, pattern recognition, and data compression. Its goal is to discover the representative or discriminative information residing in raw data. Locally linear embedding (LLE), one of the effective manifold learning algorithms, addresses this problem by computing low-dimensional, neighborhood-preserving embeddings of high-dimensional data. The embedding is derived from the symmetries of locally linear reconstructions, and in the implementation its computation involves an eigenproblem. Since LLE was proposed, it has been applied only to image data, because it originated in face recognition. However, the curse of dimensionality is a widespread problem. We therefore apply the algorithm to text processing. In this paper, we briefly introduce LLE and analyze its advantages and potential disadvantages, and then discuss the relationship between LSI and LLE within the graph embedding framework from a theoretical point of view. Finally, experimental results are shown on the Reuters21578 and TDT2 datasets.
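As a rough illustration of the procedure the abstract describes (neighborhood search, locally linear reconstruction, then an eigenproblem), the following is a minimal NumPy sketch. It is not the authors' implementation: the function name `lle`, its parameters, the regularization term, and the random stand-in for a document-term matrix are all assumptions made for the example.

```python
import numpy as np

def lle(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Illustrative LLE sketch: returns (embedding, weight matrix)."""
    n = X.shape[0]

    # Step 1: nearest neighbors from pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each point from its neighbors,
    # constrained so that each row of W sums to one.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]      # neighbors centered on x_i
        G = Z @ Z.T                     # local Gram matrix
        trace = np.trace(G)
        # Regularize G (an assumption of this sketch) so the solve is stable.
        G = G + np.eye(n_neighbors) * reg * (trace if trace > 0 else 1.0)
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()

    # Step 3: eigenproblem on M = (I - W)^T (I - W); eigh sorts eigenvalues
    # in ascending order, and the first (near-zero) eigenvector is skipped.
    I_minus_W = np.eye(n) - W
    M = I_minus_W.T @ I_minus_W
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:n_components + 1], W

# Hypothetical stand-in for a small document-term matrix (30 docs, 50 terms).
rng = np.random.default_rng(0)
docs = rng.random((30, 50))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # unit-length rows
embedding, weights = lle(docs, n_neighbors=5, n_components=2)
```

The dense distance matrix and dense eigendecomposition keep the sketch short; a real text collection would call for sparse matrices and a sparse eigensolver.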

Authors: Chuan HE, Zhe DONG, Ruifan LI, Yixin ZHONG

Affiliation: School of Information Engineering, Beijing University of Posts and Telecommunications, Beijing, China

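Conference type: International conference

Conference name: The 2008 IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE 2008)

Conference location: Beijing

Conference language: English

Online publication date: 2008-10-19 (date first posted on the Wanfang platform; does not represent the paper's publication date)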
