Multi-agent Cooperation by Q-learning in Continuous Action Domain
In this paper we propose Q-learning with a continuous action space and extend the algorithm to a multi-agent system. Conventional Q-learning requires a pre-defined, discrete state space, which is impractical because both the states of a real-world environment and the actions taken in it are continuous. The algorithm uses a concept similar to the SRV (Stochastic Real-Valued unit) to train the action in each state. The SRV, however, may converge to a local solution without ever having reached the optimal one. To overcome this drawback, Q-learning with the SRRV (Stochastic Recording Real-Valued unit) is proposed, and it is shown that the SRRV converges more quickly.
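The abstract does not give the update rules, so the following is only a minimal sketch of how a Q-learning update might be paired with an SRV-like unit that holds one real-valued action per state. Everything named here (`SRVState`, `select_action`, `update`, the fixed `sigma`, the step sizes `alpha` and `beta`, and the toy reward in the demo) is an assumption made for illustration and is not taken from the paper. The recording mechanism of the SRRV and the multi-agent extension are not sketched, because the abstract does not describe them.

```python
import random


class SRVState:
    """Hypothetical per-state unit: a Gaussian over a real-valued action."""

    def __init__(self, mu=0.0, sigma=0.5):
        self.mu = mu        # mean of the action distribution
        self.sigma = sigma  # exploration width (kept fixed in this sketch)
        self.q = 0.0        # value estimate for acting in this state


def select_action(unit, rng):
    """Sample a continuous action around the current mean."""
    return rng.gauss(unit.mu, unit.sigma)


def update(unit, action, reward, next_value, alpha=0.1, beta=0.1, gamma=0.9):
    """One temporal-difference step (assumed form, not the paper's rule).

    The value estimate moves toward the TD target, and the mean action
    moves toward the sampled action when the outcome was better than
    expected, away from it when the outcome was worse.
    """
    td_error = reward + gamma * next_value - unit.q
    unit.mu += beta * td_error * (action - unit.mu) / unit.sigma
    unit.q += alpha * td_error
    return td_error


if __name__ == "__main__":
    # Toy single-state task (an assumption): reward peaks at action 0.5.
    rng = random.Random(0)
    unit = SRVState()
    for _ in range(2000):
        a = select_action(unit, rng)
        r = 1.0 - (a - 0.5) ** 2
        update(unit, a, r, next_value=0.0)
    print(unit.mu, unit.q)
```

Because the mean only follows sampled actions that beat the current estimate, a unit of this kind can settle near a locally good action, which is the drawback the abstract attributes to the SRV.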
Q-learning; Stochastic Real-Valued Unit
Kao-Shing Hwang (Member, IEEE), Yu-Hong Lin, Chia-Yue Lo
Electrical Engineering, National Chung Cheng University, 168 University Rd., Min-Hsiung, Chia-Yi, Taiwan, ROC
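International conference
Wuhan
English
2008-11-01 (date first posted online on the Wanfang platform; does not represent the paper's publication date)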