Removing fillers to induce semantic classes for a Chinese dialogue system
In this paper, we introduce an unsupervised method for removing fillers from spoken dialogues semi-automatically, based on their probability distribution. Disfluencies such as fillers and repairs often make a sentence ill-formed, longer, and harder to process. This paper focuses on fillers rather than repairs. We compute the unigram and bigram distributions of fillers on our Chinese voice search data and find that, using only these distributions, fillers fall within the first 1% of all words. We offer a new perspective on the distribution of fillers and a new measure for detecting them in a natural dialogue corpus.
filler detection; filler distribution; spoken dialogues
Yali Li, Yonghong Yan
ThinkIT Laboratory, Institute of Acoustics, Chinese Academy of Sciences, Beijing, China
International conference
The 2nd IEEE International Conference on Advanced Computer Control (ICACC 2010)
Shenyang
English
163-166
2010-03-27 (date first posted on the Wanfang platform; not the paper's publication date)