I-vector Based Speaker Gender Recognition

Abstract:

Automatic gender recognition is becoming increasingly important in a range of potential applications. Many state-of-the-art gender recognition approaches based on a variety of biometrics, such as face, body shape, and voice, have been proposed recently. Among them, relying on voice is suboptimal because of significant variations in pitch, emotion, and noise in real-world speech. Inspired by the speaker recognition approaches based on the i-vector representation in NIST SRE, we believe that the i-vector contains gender information as part of a speaker's characteristics, and that it works for gender recognition as well as for speaker recognition in complex environments. We therefore apply total variability space analysis to gender classification and propose i-vector based discrimination for speaker gender recognition. Experimental results on the TIMIT corpus and the NUST603_2014 database show that the proposed i-vector based speaker gender recognition improves performance to as much as 99.9% and surpasses the pitch method and the UBM-SVM baseline subsystems in terms of accuracy.

Keywords: speech processing; gender recognition; i-vector; mel frequency cepstrum coefficient

Authors: Minghe Wang, Ying Chen, Zhenmin Tang, Erhua Zhang

Affiliation: School of Computer Science and Engineering, Nanjing University of Science and Technology (NUST), Nanjing, China

Conference type: International conference

Conference name: 2015 IEEE Advanced Information Technology, Electronic and Automation Control Conference (IAEAC 2015)

Conference location: Chongqing

Conference language: English

Pages: 729-732

Online publication date: 2015-12-19 (date first posted on the Wanfang platform; not the paper's publication date)
