Towards human-like production and binaural localization of speech sounds in humanoid robots

Abstract:

We present a prototype of a humanoid robot head equipped with human-like speech sound localization and production systems, designed for a new generation of robots that should autonomously evolve language and other cognitive skills. Like the human auditory apparatus, the robot head contains a binaural sensor system based on a frequency-domain binaural model. This enables the robot to detect and locate a speaker autonomously on the basis of the produced speech signals. In humans, however, the temporal regularity of incoming sounds is analyzed on different time scales, with the millisecond range giving rise to the sensation of pitch and periods on the order of seconds giving rise to the sensation of rhythm. In addition, unlike for humans, detecting and localizing multiple sound signals is a nontrivial problem for machine audition. We therefore discuss a possible implementation of human-like spatiotemporal processing of sounds in single-source and multisource scenarios. Our future goals are to adequately combine the constructed speech synthesis and physical audio systems, and to establish an algorithm for detailed spatiotemporal localization of both single and concurrent speech sound sources, with roughly human-like temporal and spatial processing capabilities.

Keywords: binaural speech localization production humanoid

Authors: Robert Wolff, Mario Lasseck, Manfred Hild, Oscar Vilarroya, Tarik Hadzibeganovic

Author affiliations: Labor für Neurorobotik, Institut für Informatik, Humboldt-Universität zu Berlin, Berlin, Germany; Unitat de Recerca en Neurociència Cognitiva, Departament de Psiquiatria i Medicina Legal, Universitat

Conference type: International conference

Conference name: The 3rd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2009)

Conference location: Beijing

Conference language: English

Pages: 1-4

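Online publication date: 2009-06-11 (the date the record first appeared on the Wanfang platform; it does not represent the paper's publication date)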
