Two Stage Concatenation Speech Synthesis for Embedded Devices
Although high-quality TTS engines based on concatenation speech synthesis have been developed and successfully applied in many products (such as call center and information inquiry systems), the limited memory and computational power of many embedded devices, such as most low-tier cellular phones, hinder their implementation. Taking into account speech quality, memory footprint, computational complexity, and the reusability of the CELP-based vocoder module (generally resident on the DSP of almost all cellular phones), this paper describes a practical two-stage concatenation speech synthesis scheme for low-tier phone applications. In the two-stage framework, the back-end processing of the TTS engine is divided into two phases, parameter concatenation and waveform synthesis, which are carried out by the MCU and the DSP of the mobile phone, respectively. Furthermore, a novel four-case smooth concatenation method is proposed to smooth the joins between speech units efficiently.
Yue Dong-jian
School of Computer Engineering and Science, Shanghai University, Shanghai 200072, P.R. China
International conference
Shanghai
English
1652-1656
2010-10-20 (date first made available online on the Wanfang platform; not the paper's publication date)