Battlefield Agent Alliance Decision-Making Two Layer Reinforcement Learning Algorithm
Against the background of Agent Alliance combat deduction, we present a two-layer reinforcement learning algorithm, referred to as the TLRL algorithm, designed for the special requirements of studying Agents' offensive and defensive decision-making in a battlefield simulation environment. The algorithm model is divided into two layers: the first is the global decision-making Agent, called the Commandant Agent, which learns from the environment as well as from the actions of both enemies and friends; the second consists of the Servant Agents, which optimize their actions by receiving local environment feedback. Finally, war situation deduction carried out on TBS, the simulation platform we set up, shows the fast convergence and effectiveness of this algorithm.
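The two-layer structure described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's TLRL algorithm, whose update rules are not given in this record. It assumes plain tabular Q-learning at both layers, a toy two-state environment, and hypothetical names throughout (`QLearner`, `train`, `SITUATIONS`, `GOOD_ORDER`, `GOOD_ACTION`, the orders and actions, and the reward values).

```python
import random
from collections import defaultdict

# Hypothetical toy environment, not taken from the paper.
SITUATIONS = ["enemy_weak", "enemy_strong"]                      # global states
GOOD_ORDER = {"enemy_weak": "attack", "enemy_strong": "defend"}  # assumed global reward rule
GOOD_ACTION = {"attack": "advance", "defend": "hold"}            # assumed local reward rule


class QLearner:
    """Tabular Q-learning agent with epsilon-greedy exploration."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def best(self, state):
        """Greedy action for a state under the current Q-table."""
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def act(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best(state)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def train(steps=5000, n_servants=3, seed=0):
    env = random.Random(seed)
    # Upper layer: one global decision-maker observing the overall situation.
    commandant = QLearner(["attack", "defend"], seed=seed + 1)
    # Lower layer: servants learn from local feedback only; gamma=0 treats
    # that feedback as immediate (a simplifying assumption of this sketch).
    servants = [
        QLearner(["advance", "hold"], gamma=0.0, seed=seed + 2 + i)
        for i in range(n_servants)
    ]
    situation = env.choice(SITUATIONS)
    for _ in range(steps):
        order = commandant.act(situation)
        global_reward = 1.0 if order == GOOD_ORDER[situation] else -1.0
        next_situation = env.choice(SITUATIONS)
        commandant.update(situation, order, global_reward, next_situation)
        for servant in servants:
            # Each servant's state is the order issued by the commandant.
            action = servant.act(order)
            local_reward = 1.0 if action == GOOD_ACTION[order] else -1.0
            servant.update(order, action, local_reward, order)
        situation = next_situation
    return commandant, servants


if __name__ == "__main__":
    commandant, servants = train()
    for s in SITUATIONS:
        print(s, "->", commandant.best(s))
    for o in ("attack", "defend"):
        print(o, "->", [sv.best(o) for sv in servants])
```

In this sketch the only coupling between the layers is that the Commandant's order serves as each Servant's state; how the paper actually passes information between the Commandant Agent and the Servant Agents is not stated in the abstract.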
battlefield; agent alliance; decision-making; reinforcement learning
Xie Zhi-jun, Dong Chao-yang, Yang Fei, Chen Wei
School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; School of Aerospace Science and Engineering, Beihang University, Beijing, China
International conference
Taiyuan
English
174-178
2010-10-22 (date first posted online on the Wanfang platform; does not represent the paper's publication date)