A Robust Transcription System for Soccer Video Database

Abstract:

This paper presents a robust approach to transcribing soccer video databases. Spoken information is extracted from the audio channels of the video and transcribed using a canonical speech recognition system. Because soccer videos vary in both speech quality and content, the transcription system faces three main problems: noisy data, interference from foreign terms, and emotional variation in speech prosody. One solution is proposed for each problem: a noise reduction scheme, a cross-lingual transliteration model, and an advanced acoustic modeling technique, respectively. The proposed methods are evaluated experimentally on the Vietnamese AFF Suzuki-cup database, which consists of over 14 hours of video. In the best case, the system reaches an accuracy rate of 83.3%.

Authors: Nhut M. Pham, Duc A. Duong, Quan H. Vu

Affiliation: University of Science, VNU-HCM, Vietnam

Conference type: International conference

Conference name: 10th China Annual Conference on Virtual Reality (第十届中国虚拟现实年会)

Conference location: Shanghai

Conference language: English

Pages: 1066-1072

Online publication date: 2010-10-20 (date first posted on the Wanfang platform; not the paper's publication date)
