Two Stage Concatenation Speech Synthesis for Embedded Devices

Abstract:

Although high-quality TTS engines based on concatenation speech synthesis have been developed and successfully applied in many products (such as call center and information inquiry systems), the limited memory and computational power of many embedded devices, such as most low-tier cellular phones, hinder their implementation. Taking into account speech quality, memory storage, computational complexity, and the reusability of the CELP-based vocoder module (generally resident on the DSP of almost all cellular phones), this paper describes a practical two-stage concatenation speech synthesis scheme for low-tier phone applications. In the two-stage framework, the back-end processing of the TTS engine is divided into two phases, parameter concatenation and waveform synthesis, which are carried out by the MCU and the DSP of the mobile phone, respectively. Furthermore, a novel four-case smooth concatenation method is proposed to concatenate speech units smoothly and efficiently.
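The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Frame` layout, the `UNIT_DB` inventory, the function names, and the placeholder decoder are all assumptions made for illustration, and the four-case smoothing step is omitted because the abstract does not specify it.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame:
    """One vocoder frame of encoded parameters (hypothetical layout)."""
    params: Tuple[int, ...]


# Hypothetical unit inventory: unit name -> pre-encoded parameter frames.
UNIT_DB: Dict[str, List[Frame]] = {
    "ni": [Frame((1, 2)), Frame((3, 4))],
    "hao": [Frame((5, 6))],
}


def concatenate_parameters(units: List[str]) -> List[Frame]:
    """Stage 1 (MCU side): join stored parameter frames; no waveform work."""
    stream: List[Frame] = []
    for name in units:
        stream.extend(UNIT_DB[name])
    return stream


def synthesize_waveform(stream: List[Frame], samples_per_frame: int = 4) -> List[int]:
    """Stage 2 (DSP side): stand-in for the resident CELP decoder.

    A real decoder would rebuild the excitation and run synthesis
    filtering; here each frame only yields placeholder samples.
    """
    pcm: List[int] = []
    for frame in stream:
        pcm.extend([sum(frame.params)] * samples_per_frame)
    return pcm


frames = concatenate_parameters(["ni", "hao"])
pcm = synthesize_waveform(frames)
```

In this sketch the first stage only moves compact parameter frames around, so the memory and compute cost on the MCU stays small, while all sample-level work is left to the decoder that the phone's DSP already hosts.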

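Author: Yue Dong-jian

Affiliation: School of Computer Engineering and Science, Shanghai University, Shanghai 200072, P.R. China

Conference type: International conference

Conference name: The 10th China Annual Conference on Virtual Reality (第十届中国虚拟现实年会)

Conference location: Shanghai

Conference language: English

Pages: 1652-1656

Online publication date: 2010-10-20 (date first posted on the Wanfang platform; not the paper's publication date)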
