Multimodal Dialog System for Kyoto Sightseeing Guide
We propose a client-server dialog system that provides tourist information about Kyoto. The system, which we call a “proactive dialog system,” aims to present acceptable information within an acceptable time. We developed two prototype systems. The first is designed for mobile use; it was implemented on the iPhone, and the application is publicly available in the App Store. The second is designed for multimodal information integration on a large display panel. It detects non-verbal information during the dialog, such as changes in the user’s gaze and facial direction as well as head gestures, and recommends suitable information. Both prototype clients connect to a common server module. This server module uses a weighted finite-state transducer (WFST) whose input is user concept tags and whose output is system action tags. We implemented a dialog scenario that presents sightseeing information on the system. We designed the system’s behavior to resemble human behavior. One of the most enduring problems in spoken dialog systems research is realizing dialog as natural as that between humans. One direction researchers have pursued is the use of spontaneous nonverbal and paralinguistic information. We therefore collected a human-human dialog corpus and semi-automatically designed a scenario that handles the dialog in response to the user’s input so as to accomplish the task efficiently. In particular, we focus on users’ verbal feedback and their non-verbal feedback in the form of nods. This paper presents an outline of the proposed system and its functions. We then present the results of an evaluation of image processing techniques for estimating facial direction from a camera, for use in the multimodal spoken dialog system on a large display panel. Experiments consisting of 100 sessions with 80 subjects were conducted to evaluate the system’s efficiency.
The system’s efficiency is particularly clear when the dialog contains recommendations.
Hideki Kashioka, Teruhisa Misu, Etsuo Mizukami, Yoshinori Shiga, Kentaro Kayama, Chiori Hori, Hisashi Kawai
National Institute of Information and Communications Technology (NICT), Kyoto, Japan.
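International conference
2011 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2011)
Xi'an
English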
1-6
2011-10-18 (date first made available online on the Wanfang platform; not the paper's publication date)