Conference Special Topic

Automatic Speech Sentence Segmentation from Multi-paragraph Databases

A speech sentence is the input to automatic phonetic segmentation or transcription. This paper discusses our efforts on automatic speech sentence segmentation from multi-paragraph speech databases, with the aim of building Text-To-Speech (TTS) system speech corpora automatically. We present a) a system for automatic speech sentence segmentation from broadcast audio based on the forced alignment technique, which also uses a checking mechanism based on speech recognition; b) an iterative algorithm to improve the system; and c) a music detector based on a combination of a Variable Duration Hidden Markov Model (VDHMM) and a Gaussian Mixture Model (GMM). Experiments show that the improved system achieves a Sentence Accurate Rate (SAR) of 98.93% and generates 646 correct sentences, compared with an SAR of 97.85% and 155 correct sentences for the original system.

speech process; automatic speech sentence segmentation; multi-paragraph speech database

ZHANG Wei; PANG Minhui; DU Ranran; LIU Yayu

Department of Computer Science and Technology, Ocean University of China, Qingdao 266100, China

International conference

2010 International Conference on Measuring Technology and Mechatronics Automation (ICMTMA 2010)

Changsha

English

721-724

2010-03-13 (date first posted online on the Wanfang platform; not the paper's publication date)