A Non-myopic Approach Based on Reinforcement Learning for Multiple Moving Targets Search

Abstract:

Myopic information-based approaches, which maximize the information gain of a single observation opportunity, are effective for searching for multiple moving targets in ocean surveillance by space-based sensors. A non-myopic approach based on reinforcement learning is developed to maximize information gain over the long term. Reinforcement learning improves the control policy toward an optimal one and learns system behavior through trial-and-error interaction with a dynamic environment. System states are characterized by the expected information gain, action-value functions are estimated by the online SARSA(λ) algorithm, and the parameterized control policy is approximated by neural networks. Simulations show that, after sufficient training, the non-myopic approach can outperform the myopic approach.
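
As a rough illustration of SARSA(λ), the standard on-policy temporal-difference method with eligibility traces that the abstract appears to refer to, the following is a minimal tabular sketch. It is not the paper's implementation: the paper characterizes states by expected information gain and approximates the policy with neural networks, whereas this sketch assumes a hypothetical five-state chain, a lookup table for action values, and invented names (`sarsa_lambda`, `step`, `policy`) and parameter values.

```python
import random


def sarsa_lambda(n_states=5, n_actions=2, episodes=200, alpha=0.1,
                 gamma=0.95, lam=0.8, epsilon=0.1, max_steps=1000, seed=0):
    """Tabular SARSA(lambda) with accumulating traces on a toy chain (assumed setup)."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]

    def step(state, action):
        # Hypothetical dynamics: action 1 moves right, any other action moves left.
        nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        done = nxt == n_states - 1
        return nxt, (1.0 if done else 0.0), done

    def policy(state):
        # Epsilon-greedy over the current estimates, with random tie-breaking.
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        best = max(q[state])
        return rng.choice([a for a in range(n_actions) if q[state][a] == best])

    for _ in range(episodes):
        # Eligibility traces are reset at the start of every episode.
        e = [[0.0] * n_actions for _ in range(n_states)]
        s = 0
        a = policy(s)
        for _ in range(max_steps):
            s2, r, done = step(s, a)
            a2 = policy(s2)
            # On-policy TD error: bootstraps from the action actually chosen next.
            target = r if done else r + gamma * q[s2][a2]
            delta = target - q[s][a]
            e[s][a] += 1.0
            for i in range(n_states):
                for j in range(n_actions):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam
            if done:
                break
            s, a = s2, a2
    return q


if __name__ == "__main__":
    for state, values in enumerate(sarsa_lambda()):
        print(state, [round(v, 3) for v in values])
```

The non-myopic character comes from the discounted bootstrapped target: each update credits an action with rewards that arrive several steps later, rather than only the immediate one.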

Keywords: optimal search theory, reinforcement learning, multiple moving targets, maritime surveillance satellite

Authors: Yifan Xu, Yuejin Tan, Zhenyu Lian, Renjie He

Affiliation: College of Information System and Management, National University of Defense Technology, Changsha, Hunan 410073, PRC

Conference type: International conference

Conference: 2010 IEEE International Conference on Information and Automation (ICIA 2010)

Conference location: Harbin

Conference language: English

Pages: 1-6

Online publication date: 2010-06-20 (date first posted on the Wanfang platform; not the paper's publication date)
