Conference Topic

Learning Semantic Similarity for Multi-label Text Categorization

  Multi-label text categorization is a supervised learning task in which a document is associated with multiple labels simultaneously. Current multi-label text categorization approaches are limited when labelled text data is expensive and scarce but unlabelled text data is abundant, because they cannot exploit information from the unlabelled data. To address this problem, we learn word semantic similarity from the unlabelled text data using deep learning, and then incorporate the learned similarity into current multi-label text categorization approaches. Experiments on the Slashdot and Tmc2007 datasets demonstrate that the proposed method greatly improves the performance of current multi-label text categorization approaches.

Li Li, Mengxiang Wang, Longkai Zhang, Houfeng Wang

Key Laboratory of Computational Linguistics, Peking University, Ministry of Education, Beijing, China

International conference

Chinese Lexical Semantics 15th Workshop (CLSW 2014) (第十五届汉语词汇语义学国际研讨会)

Macau

English

260-269

2014-06-09 (date first posted online on the Wanfang platform; not the paper's publication date)