Conference Paper

Learn to Coordinate with Generic Non-Stationary Opponents

Learning to coordinate with non-stationary opponents is a major challenge for adaptive agents. Most previous research has investigated only restricted classes of such dynamic opponents. The main contribution of this paper is twofold: (i) a class of generic non-stationary opponents is introduced, whose mixed strategies change with little regularity. It is shown that independent reinforcement learners (ILs), which have neither prior knowledge nor opponent models, cannot coordinate well with this type of opponent. (ii) A new exploration strategy, the DAE (Detect and Explore) mechanism, is tailored for ILs in such coordination tasks. This mechanism allows the ILs to dynamically detect changes in the opponent's behavior and to adjust their learning rate and exploration temperature accordingly. It is shown that ILs using this strategy are still able to converge in self-play and to coordinate well with the non-stationary opponents.
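The abstract only outlines the mechanism, so a minimal Python sketch of one way such a detect-and-adjust loop could work is given below. It assumes a stateless Q-learner with Boltzmann exploration and a simple two-window reward-drift test; the class name DAELearner, the thresholds, and the decay schedules are illustrative assumptions, not the paper's actual DAE algorithm.

```python
# Hypothetical sketch in the spirit of the DAE mechanism described in the
# abstract. Not the authors' algorithm: the drift test, thresholds, and
# parameter schedules below are assumptions made for illustration.
import math
import random
from collections import deque

class DAELearner:
    def __init__(self, n_actions, alpha=0.1, temp=1.0, window=50, threshold=0.3):
        self.q = [0.0] * n_actions
        self.alpha = alpha          # learning rate
        self.temp = temp            # Boltzmann exploration temperature
        self.window = window        # size of each reward window used for detection
        self.threshold = threshold  # drift threshold (assumed, not from the paper)
        self.old = deque(maxlen=window)  # older rewards
        self.new = deque(maxlen=window)  # most recent rewards

    def act(self):
        # Boltzmann (softmax) action selection over current Q-values.
        prefs = [math.exp(q / self.temp) for q in self.q]
        total = sum(prefs)
        r, acc = random.random() * total, 0.0
        for a, p in enumerate(prefs):
            acc += p
            if r <= acc:
                return a
        return len(prefs) - 1

    def update(self, action, reward):
        # Standard stateless Q-learning update for an independent learner.
        self.q[action] += self.alpha * (reward - self.q[action])
        if len(self.new) == self.window:
            self.old.append(self.new.popleft())
        self.new.append(reward)
        self._detect()

    def _detect(self):
        # Crude drift test: compare mean reward over the two windows.
        if len(self.old) < self.window or len(self.new) < self.window:
            return
        drift = abs(sum(self.new) / self.window - sum(self.old) / self.window)
        if drift > self.threshold:
            # Opponent behavior seems to have changed: re-open exploration by
            # raising the learning rate and temperature, then let both decay
            # again as learning re-converges.
            self.alpha = 0.5
            self.temp = 1.0
            self.old.clear()
        else:
            # Otherwise cool down gradually so play can converge in self-play.
            self.temp = max(0.05, self.temp * 0.999)
            self.alpha = max(0.01, self.alpha * 0.999)
```

The key design point this sketch tries to capture is that a single static exploration schedule cannot serve both goals named in the abstract: low temperature is needed for convergence in self-play, while a detected change must temporarily restore high exploration.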

Reinforcement learning; Coordination game; Non-stationary opponent; Exploration strategy

ZHANG Kaifu

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

International Conference

Fifth IEEE International Conference on Cognitive Informatics

Beijing

English

558-565

2006-07-17 (date the paper first appeared on the Wanfang platform; not necessarily its actual publication date)