A MULTI-AGENT REINFORCEMENT LEARNING ALGORITHM USING ACTOR-CRITIC METHODS
This paper investigates a new algorithm for multi-agent reinforcement learning. We propose a multi-agent learning algorithm that extends single-agent Actor-Critic methods to the multi-agent setting. To realize the algorithm, we introduce the value of an agent's temporal best-response strategy in place of the value of an equilibrium, and the algorithm uses linear programming to compute Q-values. When a game has multiple Nash equilibria, the mixed equilibrium is reached. Our learning algorithm works within the very general framework of n-player, general-sum stochastic games, and learns both the game structure and its associated optimal policy.
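The abstract does not spell out the linear-programming step. As a rough illustration only, the sketch below computes a maximin (best-response-robust) stage-game value from a Q-matrix via an LP and feeds it into a temporal-difference backup. The maximin formulation, the names stage_game_value and td_backup, and the use of scipy.optimize.linprog are assumptions for illustration, not the paper's actual implementation.

```python
# Hedged sketch: stage-game value via linear programming, loosely in the spirit
# of the LP step mentioned in the abstract. Not the authors' implementation.
import numpy as np
from scipy.optimize import linprog

def stage_game_value(Q):
    """Return (value, mixed_strategy) for the row player of payoff matrix Q.

    Solves max_pi min_j sum_i pi_i * Q[i, j] as an LP with variables
    x = [pi_1, ..., pi_m, v], maximizing v.
    """
    m, n = Q.shape
    # Objective: minimize -v, i.e. maximize v.
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every opponent action j:  v - sum_i pi_i * Q[i, j] <= 0.
    A_ub = np.hstack([-Q.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Mixed-strategy probabilities sum to one.
    A_eq = np.zeros((1, m + 1))
    A_eq[0, :m] = 1.0
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * m + [(None, None)]  # v is unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[-1], res.x[:m]

def td_backup(Q_next, reward, gamma=0.95):
    """Temporal-difference target that backs up the stage-game value,
    loosely mirroring the 'value of the temporal best-response strategy'."""
    v_next, _ = stage_game_value(Q_next)
    return reward + gamma * v_next

if __name__ == "__main__":
    # Matching-pennies-like stage game: value 0 with a uniform mixed strategy.
    Q_next = np.array([[1.0, -1.0], [-1.0, 1.0]])
    print(td_backup(Q_next, reward=0.0))
```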
Multi-agent reinforcement learning; Actor-critic methods; Temporal best-response strategy; Nash equilibrium
Chun-Gui Li, Meng Wang, Qing-Neng Yuan
Department of Computer Engineering, Guangxi University of Technology, Liuzhou 545006, P.R. China
International conference
2008 International Conference on Machine Learning and Cybernetics
Kunming
English
878-882
2008-07-12 (date first posted on the Wanfang platform; does not represent the paper's publication date)