Research on Improvement of Model-Free Average Reward Reinforcement Learning and Its Simulation Experiment

Abstract:

Traditional reinforcement learning emphasizes the independent learning of a single agent. For a Multi-Agent System (MAS), this paper considers the relationship between independent learning and group learning and presents a hybrid algorithm based on average reward reinforcement learning. The modified algorithm still centers on independent learning. To select an action that reflects information about the multi-agent environment, the learning agent combines the current environmental state with its observations and its prediction of the other agents' actions when choosing an action. The advantage of this design is that the agent not only learns the optimal policy through autonomous learning but, as a member of the MAS, also has its learning process integrated into the whole multi-agent environment. The RoboCup simulation league (2D) is a typical multi-agent system. By applying the new method to the training of a player, we demonstrate the feasibility and validity of the algorithm.
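
The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the textbook tabular form of R-learning (an average reward method) and, as a purely hypothetical design, a frequency-count predictor for one other agent's action that is used as an extra index into the value table. The class name `HybridRLearner`, the parameters `value_lr`, `rho_lr`, and `epsilon`, and the action labels in the usage note are all illustrative assumptions.

```python
import random
from collections import defaultdict


class HybridRLearner:
    """Tabular R-learning sketch whose values are also indexed by a
    predicted action of one other agent (hypothetical design)."""

    def __init__(self, actions, other_actions,
                 value_lr=0.1, rho_lr=0.01, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.other_actions = list(other_actions)
        self.value_lr = value_lr    # step size for the relative values R
        self.rho_lr = rho_lr        # step size for the average reward rho
        self.epsilon = epsilon      # exploration probability
        self.rho = 0.0              # running estimate of the average reward
        self.R = defaultdict(float)     # (state, predicted, action) -> value
        self.counts = defaultdict(int)  # (state, other_action) -> count
        self.rng = random.Random(seed)

    def observe_other(self, state, other_action):
        # Record what the other agent was seen doing in this state.
        self.counts[(state, other_action)] += 1

    def predict_other(self, state):
        # Assumed predictor: the most frequently observed action so far
        # (ties resolve to the first entry of other_actions).
        return max(self.other_actions, key=lambda b: self.counts[(state, b)])

    def best_value(self, state, predicted):
        return max(self.R[(state, predicted, a)] for a in self.actions)

    def choose(self, state):
        # Epsilon-greedy over the values conditioned on the prediction.
        predicted = self.predict_other(state)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions), predicted
        greedy = max(self.actions, key=lambda a: self.R[(state, predicted, a)])
        return greedy, predicted

    def update(self, state, predicted, action, reward, next_state):
        # Value update: R <- R + value_lr * (r - rho + max R(s', .) - R).
        best_next = self.best_value(next_state, self.predict_other(next_state))
        key = (state, predicted, action)
        self.R[key] += self.value_lr * (
            reward - self.rho + best_next - self.R[key])
        # rho is adjusted only when the updated entry is the greedy one.
        best_here = self.best_value(state, predicted)
        if self.R[key] == best_here:
            self.rho += self.rho_lr * (
                reward - self.rho + best_next - best_here)
```

A typical loop under these assumptions would call `choose(state)` to obtain an action together with the prediction it was based on, act in the environment, call `observe_other(state, seen_action)` with what the other agent actually did, and then call `update(state, predicted, action, reward, next_state)`. Passing the prediction back explicitly keeps the update tied to the same table entry that drove the choice, even if the counts have changed in between.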

Keywords: Multi-agent system; Reinforcement learning; R-learning; RoboCup

Authors: Wei Chen, Zhenkun Zhai, Xiong Li, Jing Guo, Jie Wang

Affiliation: Faculty of Automation, Guangdong University of Technology, Guangzhou 510006

Conference type: International conference

Conference name: 2009 Chinese Control and Decision Conference

Conference location: Guilin, Guangxi

Conference language: English

Pages: 4933-4936

Online publication date: 2009-06-17 (date first posted on the Wanfang platform; not the paper's publication date)
