Conference Topic

MSBGA: A MULTI-DOCUMENT SUMMARIZATION SYSTEM BASED ON GENETIC ALGORITHM

The multi-document summarizer using genetic algorithm-based sentence extraction (MSBGA) treats summarization as an optimization problem in which the optimal summary is chosen from the set of candidate summaries formed by combining sentences from the original articles. To solve this NP-hard optimization problem, MSBGA adopts a genetic algorithm, which can search for the optimal summary globally. The evaluation function employs four features that reflect the criteria of a good summary: satisfactory length, high coverage, high informativeness, and low redundancy. To improve the accuracy of term frequency, MSBGA employs a novel method, TFS, which takes word sense into account when calculating term frequency. Experiments on DUC04 data show that our strategy is effective: the ROUGE-1 score is only 0.55% lower than that of the best participant in DUC04.
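The abstract's idea of summarization as a search over sentence subsets can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `fitness`, `evolve`), the way the four criteria are combined, and all parameter values are assumptions made for illustration, and plain term frequency stands in for TFS because the record does not describe how word senses are resolved.

```python
import random
from collections import Counter

def tokenize(sentence):
    # Naive whitespace tokenizer (an assumption; the paper's preprocessing is not given).
    return [w.lower().strip(".,;:!?") for w in sentence.split() if w.strip(".,;:!?")]

def fitness(mask, sents, doc_tf, max_words):
    """Hypothetical score: coverage + informativeness - redundancy, 0 if too long."""
    chosen = [s for bit, s in zip(mask, sents) if bit]
    words = [w for s in chosen for w in s]
    if not words or len(words) > max_words:
        return 0.0  # empty or over-length summaries are rejected
    uniq = set(words)
    coverage = len(uniq) / len(doc_tf)                      # share of vocabulary covered
    informativeness = sum(doc_tf[w] for w in uniq) / sum(doc_tf.values())
    redundancy = 1.0 - len(uniq) / len(words)               # share of repeated words
    return coverage + informativeness - redundancy

def evolve(sentences, max_words=20, pop_size=30, generations=60, p_mut=0.05, seed=0):
    # Assumes at least two non-empty sentences. Each individual is a boolean
    # mask over the sentences: True means the sentence is in the summary.
    rng = random.Random(seed)
    sents = [tokenize(s) for s in sentences]
    doc_tf = Counter(w for s in sents for w in s)
    n = len(sents)

    def score(mask):
        return fitness(mask, sents, doc_tf, max_words)

    pop = [[rng.random() < 0.3 for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]                                     # elitism: keep the best mask
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)          # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)                       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return [s for bit, s in zip(best, sentences) if bit], score(best)

if __name__ == "__main__":
    docs = [
        "Genetic algorithms search large spaces of candidate solutions.",
        "A summary should cover the main topics of the documents.",
        "Genetic algorithms search spaces of candidate solutions.",
        "Redundant sentences waste the limited summary length.",
        "Term frequency estimates how informative a word is.",
    ]
    summary, value = evolve(docs)
    print(value)
    for line in summary:
        print(line)
```

The hard length cutoff and the unweighted sum are the simplest choices that exercise all four criteria; a weighted combination or a soft length penalty would fit the same skeleton.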

Multi-document summarization; genetic algorithm; MSBGA; term frequency with sense (TFS)

YAN-XIANG HE; DE-XI LIU; DONG-HONG JI; HUA YANG; CHONG TENG

School of Computer, Wuhan University, Wuhan 430079, P.R. China; School of Computer, Wuhan University, Wuhan 430079, P.R. China; School of Physics, Xiangfan University, Center for Study of Language and Information, Wuhan University, Wuhan 430079, P.R. China; Institute for School of Computer, Wuhan University, Wuhan 430079, P.R. China; Center for Study of Language and Inform

International conference

2006 International Conference on Machine Learning and Cybernetics (IEEE 5th Forum on Machine Learning and Cybernetics)

Dalian

English

2659-2664

2006-08-13 (date first posted online on the Wanfang platform; does not represent the paper's publication date)