Speaker Recognition through Nonstationary Vector AR Model

Abstract:

Time-varying characteristic frequencies are extracted from the mean Mel cepstrum, which is obtained by Fourier analysis of speech signals. Two of these frequencies are selected, and the Mel cepstrum values at each one form a time series. Using time series preprocessing and mathematical statistics, the deterministic and stochastic components of each series are separated. The two stochastic components together form a bivariate time series. To further extract the speaker's speech signal parameters, a time-varying parameter vector AR (TVPVAR) model is established and analyzed. Speakers are then identified based on both the stochastic components and the residuals of the TVPVAR model. Speech is sampled from 10 speakers, 100 utterances per speaker, and each utterance is selected in turn to be recognized. Experiments show that the recognition rate based on the residuals of the TVPVAR model (99.6%) is higher than the recognition rate based on the stochastic components (98.7%). This indicates that the TVPVAR model is effective for analyzing autocovariance-nonstationary vector time series.
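The pipeline described in the abstract (fit a time-varying vector AR model to a bivariate series, take the residuals, and identify the speaker by Mahalanobis distance) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state how the TVPVAR coefficients vary with time or which residual statistics are compared, so the following are assumptions made only for illustration: an order-1 model whose coefficient matrices are polynomial in normalized time, a hypothetical feature vector built from the residual covariance, and synthetic data in place of Mel cepstrum series. All function names are hypothetical.

```python
import numpy as np

def tvpvar_residuals(x, degree=1):
    """Least-squares residuals of x_t = A(tau) x_{t-1} + e_t, where
    A(tau) = A_0 + A_1 tau + ... is polynomial in normalized time tau.
    The polynomial form is an assumption, not taken from the paper."""
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    tau = np.linspace(0.0, 1.0, T - 1)[:, None]   # normalized time for t = 1..T-1
    lagged = x[:-1]                                # x_{t-1}
    Z = np.hstack([lagged * tau**k for k in range(degree + 1)])
    Y = x[1:]                                      # x_t
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return Y - Z @ coef

def residual_features(res):
    """Hypothetical feature vector: log-variances and correlation of residuals."""
    cov = np.cov(res, rowvar=False)
    corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    return np.array([np.log(cov[0, 0]), np.log(cov[1, 1]), corr])

def fit_speaker(feature_rows):
    """Mean and inverse covariance of one speaker's training features."""
    F = np.asarray(feature_rows)
    return F.mean(axis=0), np.linalg.inv(np.cov(F, rowvar=False))

def mahalanobis(f, mean, inv_cov):
    diff = f - mean
    return float(np.sqrt(diff @ inv_cov @ diff))

def identify(f, models):
    """Return the speaker whose model is nearest in Mahalanobis distance."""
    return min(models, key=lambda name: mahalanobis(f, *models[name]))

def simulate(rng, scale, T=200):
    """Synthetic bivariate series with a slowly drifting AR coefficient
    (a stand-in for the stochastic components of real speech)."""
    x = np.zeros((T, 2))
    for t in range(1, T):
        a = 0.2 + 0.5 * t / T
        x[t] = a * x[t - 1] + rng.normal(scale=scale, size=2)
    return x

rng = np.random.default_rng(0)
models = {}
for name, scale in {"speaker_a": 1.0, "speaker_b": 3.0}.items():
    feats = [residual_features(tvpvar_residuals(simulate(rng, scale)))
             for _ in range(20)]
    models[name] = fit_speaker(feats)

# Classify one held-out utterance generated with speaker_b's noise scale.
test_feature = residual_features(tvpvar_residuals(simulate(rng, 3.0)))
predicted = identify(test_feature, models)
print(predicted)
```

The two synthetic speakers differ only in noise scale, so this example is far easier than the 10-speaker task in the paper. It shows the shape of the computation and says nothing about the reported recognition rates.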

Keywords: nonstationary time series; TVPVAR model; speaker recognition; Mahalanobis distance

Authors: Xingxing Lu, Wanchun Fei

Affiliations: College of Textile and Clothing Engineering, Soochow University, Suzhou, 215006, China; College of Textile and Clothing Engineering, Soochow University, Suzhou, 215006, China National Engi

Conference type: International conference

Conference name: 2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2011)

Conference location: Shanghai

Conference language: English

Pages: 465-469

Online publication date: 2011-07-26 (date first posted on the Wanfang platform; not the paper's publication date)
