Conference Topic

A Dynamic Programming Model for Text Segmentation Based on Min-Max Similarity

Text segmentation has a wide range of applications, such as information retrieval, question answering and text summarization. In recent years, the use of semantics has proven effective in improving the performance of text segmentation. In particular, for finding subtopic boundaries, previous efforts have focused on either maximizing the lexical similarity within a segment or minimizing the similarity between adjacent segments. However, no optimal solution has been attempted that simultaneously achieves maximum within-segment similarity and minimum between-segment similarity. In this paper, a domain-independent model based on min-max similarity (MMS) is proposed to fill this void. Dynamic programming is adopted to achieve global optimization of the segmentation criterion function. Comparative experimental results on a real corpus show that the MMS model outperforms previous segmentation approaches.
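
The abstract does not give the criterion function, so the following is only a minimal sketch of the general idea, not the MMS model itself. It assumes a precomputed sentence-by-sentence similarity matrix, scores a segmentation as the sum of mean within-segment similarities minus the mean similarities between adjacent segments, and subtracts a fixed per-segment `penalty` to discourage over-segmentation. The function names, the scoring formula and the penalty are all assumptions made for illustration.

```python
from typing import List, Tuple


def block_sums(sim: List[List[float]]) -> List[List[float]]:
    """2D prefix sums so that any rectangular block sum costs O(1)."""
    n = len(sim)
    pre = [[0.0] * (n + 1) for _ in range(n + 1)]
    for r in range(n):
        for c in range(n):
            pre[r + 1][c + 1] = sim[r][c] + pre[r][c + 1] + pre[r + 1][c] - pre[r][c]
    return pre


def block_mean(pre: List[List[float]], r0: int, r1: int, c0: int, c1: int) -> float:
    """Mean similarity over rows [r0, r1) and columns [c0, c1)."""
    total = pre[r1][c1] - pre[r0][c1] - pre[r1][c0] + pre[r0][c0]
    return total / ((r1 - r0) * (c1 - c0))


def segment(sim: List[List[float]], penalty: float = 0.5) -> List[Tuple[int, int]]:
    """Return half-open sentence ranges maximizing the assumed criterion.

    ASSUMPTION: the criterion (within mean - adjacent between mean - penalty
    per segment) is illustrative and is not taken from the paper.
    """
    n = len(sim)
    if n == 0:
        return []
    pre = block_sums(sim)
    neg_inf = float("-inf")
    # best[i][j]: best score for sentences [0, j) whose last segment is [i, j)
    best = [[neg_inf] * (n + 1) for _ in range(n + 1)]
    back = [[-1] * (n + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for i in range(j):
            within = block_mean(pre, i, j, i, j) - penalty
            if i == 0:
                best[i][j] = within
                continue
            for k in range(i):  # previous segment is [k, i)
                between = block_mean(pre, k, i, i, j)
                cand = best[k][i] + within - between
                if cand > best[i][j]:
                    best[i][j] = cand
                    back[i][j] = k
    # Pick the best start of the final segment, then follow back-pointers.
    i = max(range(n), key=lambda s: best[s][n])
    j = n
    segments = []
    while True:
        segments.append((i, j))
        if i == 0:
            break
        i, j = back[i][j], i
    segments.reverse()
    return segments
```

On a toy matrix of six sentences in which sentences 0 to 2 and 3 to 5 are similar to each other (0.9) but dissimilar across the two groups (0.1), `segment(sim, penalty=0.5)` returns `[(0, 3), (3, 6)]`. The table has O(n²) states with O(n) transitions each, so the sketch runs in O(n³) time.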

text segmentation; within-segment similarity; between-segment similarity; segment lengths; similarity weighting; dynamic programming

Na Ye, Jingbo Zhu, Yan Zheng, Matthew Y. Ma, Huizhen Wang, Bin Zhang

Institute of Computer Software and Theory, Northeastern University, Shenyang 110004, China; IPVALUE Management Inc., 991 Rt. 22 West, Bridgewater, NJ 08807, USA; Institute of Computer Applications, Northeastern University, Shenyang 110004, China

International conference

4th Asia Information Retrieval Symposium (AIRS 2008)

Harbin

English

141-152

2008-01-16 (date first posted online on the Wanfang platform; not the paper's publication date)