An Iterative Approach to Model Merging for Speech Pattern Discovery

Abstract:

This paper introduces a novel approach to automatically discovering recurrent speech patterns in a multi-speaker corpus without a priori knowledge. The approach operates on sub-word acoustic units: it iteratively concatenates the most likely pair of adjacent sub-word units into a longer acoustic unit until the proposed stop criterion is satisfied. Among the resulting acoustic units, those with the most stable number of occurrences are selected as the lexicon. The approach has been applied to automatically discover English words in the TIDIGIT corpus. Experimental results, measured by F1 score, show that the approach can effectively detect and extract the recurrent patterns. The technique can be used for lexicon generation from an unknown speech corpus or for audio content summarization.
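The iterative merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each utterance has already been transcribed into a sequence of sub-word unit labels, it uses the raw adjacent-pair count as a stand-in for "most likely", and it uses a hypothetical `min_count` threshold in place of the paper's stop criterion, which the abstract does not specify. The function names are illustrative, and the selection of units with stable occurrence counts is not shown.

```python
from collections import Counter


def count_pairs(sequences):
    """Count adjacent unit pairs across all utterances."""
    pairs = Counter()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    return pairs


def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged unit."""
    merged, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + "+" + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def iterative_merge(sequences, min_count=2, max_iters=100):
    """Repeatedly merge the most frequent adjacent pair of units.

    Assumption: stopping when the best pair occurs fewer than `min_count`
    times is a placeholder, not the stop criterion proposed in the paper.
    """
    sequences = [list(s) for s in sequences]
    merges = []
    for _ in range(max_iters):
        pairs = count_pairs(sequences)
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:
            break
        sequences = [merge_pair(s, pair) for s in sequences]
        merges.append(pair)
    return sequences, merges
```

Calling `iterative_merge` on a list of label sequences returns the rewritten sequences together with the ordered list of merged pairs, so the longer units that emerge can be inspected as lexicon candidates.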

Authors: Lei Wang, Eng Siong Chng, Haizhou Li

Affiliations: Nanyang Technological University, Singapore; Nanyang Technological University, Singapore; Institute for Infocomm Research, Singapore

Conference type: International conference

Conference name: 2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)

Conference location: Xi'an

Conference language: English

Pages: 1-5

Online publication date: 2011-10-18 (the date the record first went online on the Wanfang platform, not the paper's publication date)
