Conference topic

Effect of Clinical Depression on Automatic Speaker Identification

This study investigates the effect of a clinical environment on speaker recognition rates. Two sets of speakers were used: a clinical set containing speech recordings of 70 clinically depressed speakers and a control set containing recordings of 68 non-depressed speakers. MFCC features were used to build statistical models of the speakers with four modeling methods: GMM_EM, GMM_K-means, GMM_LBG, and LBG_ITVQ. In all cases, recognition rates were lower for the depressed speakers (60%-71%) than for the non-depressed speakers (79%-89%). This work also analyzes the performance of VQ-based Gaussian modeling; the results suggest that GMM-EM gives the highest recognition rates, although the performance of GMM-ITVQ is comparable. Experiments with between 1 and 1024 Gaussian mixtures indicate that adding more mixtures increases model complexity, spreads the data more thinly across the mixtures, and thus degrades the recognition rate. The results also suggest that the amount of training and test speech can strongly affect recognition rates.
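As a rough illustration of the GMM-based identification described in the abstract, the sketch below fits one diagonal-covariance Gaussian mixture per speaker with plain EM and assigns a test utterance to the speaker whose model gives the highest average log-likelihood. It is not the authors' implementation: the feature vectors are synthetic stand-ins for MFCC frames, and every name and parameter in it (`fit_gmm_em`, `identify`, `n_mix`, `var_floor`, the two toy speakers) is an assumption made for the example.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def frame_log_terms(x, model):
    """Per-mixture log(weight) + log density for one frame."""
    weights, means, variances = model
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def fit_gmm_em(frames, n_mix, n_iter=10, seed=0, var_floor=1e-3):
    """Fit a diagonal GMM with EM; means start at randomly chosen frames."""
    rng = random.Random(seed)
    dim = len(frames[0])
    model = ([1.0 / n_mix] * n_mix,
             [list(f) for f in rng.sample(frames, n_mix)],
             [[1.0] * dim for _ in range(n_mix)])
    for _ in range(n_iter):
        # E-step: responsibility of each mixture for each frame.
        resp = []
        for x in frames:
            logs = frame_log_terms(x, model)
            norm = log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means and (floored) variances.
        weights, means, variances = [], [], []
        for k in range(n_mix):
            nk = sum(r[k] for r in resp) + 1e-10  # guard against empty mixtures
            mean = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                    for d in range(dim)]
            var = [max(var_floor,
                       sum(r[k] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / nk)
                   for d in range(dim)]
            weights.append(nk / len(frames))
            means.append(mean)
            variances.append(var)
        model = (weights, means, variances)
    return model


def avg_log_likelihood(frames, model):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(log_sum_exp(frame_log_terms(x, model)) for x in frames) / len(frames)


def identify(frames, models):
    """Return the name of the speaker model that best explains the frames."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))


# Synthetic 4-dimensional "features" standing in for MFCC frames (assumption).
rng = random.Random(42)


def synthetic_frames(center, n):
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


models = {
    "speaker_a": fit_gmm_em(synthetic_frames([0.0] * 4, 200), n_mix=2),
    "speaker_b": fit_gmm_em(synthetic_frames([3.0] * 4, 200), n_mix=2),
}
test_a = synthetic_frames([0.0] * 4, 50)
test_b = synthetic_frames([3.0] * 4, 50)
print(identify(test_a, models), identify(test_b, models))
```

In this toy setup, raising `n_mix` while keeping the number of training frames fixed leaves fewer frames per mixture, which is the data-thinning effect the abstract refers to; the variance floor is one common guard against the resulting degenerate components.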

speaker recognition; depression; clinical environment; GMM; GMM-VQ

Sheeraz Memon, Namunu C. Maddage, Margaret Lech, Nicholas Allen

School of Electrical and Computer Engineering, RMIT University, Melbourne, Australia; Department of Psychology, The University of Melbourne, Melbourne, Australia

International conference

The 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009)

Beijing

English

1-4

2009-06-11 (date the record first went online on the Wanfang platform; not the paper's publication date)