Conference Topic

Active Learning for Cross-Lingual Sentiment Classification

  Cross-lingual sentiment classification aims to predict the sentiment orientation of a text in one language (the target language) with the help of resources from another language (the source language). However, current cross-lingual performance is usually far from satisfactory because of large differences in linguistic expression and social culture. In this paper, we propose performing active learning for cross-lingual sentiment classification, where only a small number of samples are actively selected and manually annotated, so that reasonable performance can be achieved for the target language in a short time. The challenge is that there are normally many more labeled samples in the source language than in the target language. As a result, the small number of labeled target-language samples is swamped by the abundance of labeled source-language samples, which greatly reduces their impact on cross-lingual sentiment classification. To address this issue, we propose a data quality control approach that selects high-quality samples from the source language. Specifically, we propose two kinds of data quality measurements, intra- and extra-quality measurements, from the certainty and similarity perspectives. Empirical studies verify the appropriateness of our active learning approach to cross-lingual sentiment classification.

Shoushan Li, Rong Wang, Huanhuan Liu, Chu-Ren Huang

Natural Language Processing Lab, School of Computer Science and Technology, Soochow University, China; Natural Language Processing Lab, School of Computer Science and Technology, Soochow University, China; CBS, The Hong Kong Polytechnic University, Hong Kong

International conference

Second CCF Conference, NLPCC 2013 (Second Conference on Natural Language Processing and Chinese Computing)

Chongqing

English

236-246

2013-11-15 (date first posted online on the Wanfang platform; not the paper's publication date)