
Hierarchical State Representation and Q-Learning for Agent-Based Herding

A primary challenge of agent-based policy learning in complex and uncertain environments is computational complexity that escalates with the size of the task space and the number of agents. Nonetheless, there is ample evidence in the natural world that high-functioning social mammals learn to solve complex problems with ease. This ability to solve computationally intractable problems stems in part from brain circuits for hierarchical representation of state and action spaces, and from learned policies arising from these representations. Using such mechanisms for state representation and action abstraction, we constrain state-action choices in reinforcement learning in order to improve learning efficiency and generalization of learned policies within a single-agent herding task. We show that satisficing, generalizable policies emerge that reduce computational cost and/or memory requirements.
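
The abstract describes tabular Q-learning over a hierarchically abstracted state space with constrained action choices. The sketch below is only a generic illustration of that idea under assumed details, not the authors' implementation: the distance/bearing bins, the action set, and the allowed_actions constraint rule are all hypothetical stand-ins for the herding-task specifics.

    import random
    from collections import defaultdict

    # Hypothetical coarse state abstraction: map a raw (distance, bearing)
    # observation of the herd onto a small discrete grid, shrinking the
    # state space before tabular Q-learning is applied.
    def abstract_state(distance, bearing_deg):
        dist_bin = min(int(distance // 2.0), 4)    # 5 coarse distance bins
        bear_bin = int((bearing_deg % 360) // 45)  # 8 coarse bearing bins
        return (dist_bin, bear_bin)

    ACTIONS = ["approach", "flank_left", "flank_right", "hold"]

    def allowed_actions(state):
        # Action abstraction: constrain the choice set by state, e.g.
        # forbid "approach" when already adjacent (hypothetical rule).
        dist_bin, _ = state
        return ACTIONS[1:] if dist_bin == 0 else ACTIONS

    def epsilon_greedy(Q, state, epsilon=0.1):
        acts = allowed_actions(state)
        if random.random() < epsilon:
            return random.choice(acts)
        return max(acts, key=lambda a: Q[(state, a)])

    def q_learning_step(Q, state, action, reward, next_state,
                        alpha=0.1, gamma=0.95):
        # Standard one-step Q-learning update, restricted to the
        # actions permitted in the abstracted next state.
        best_next = max(Q[(next_state, a)] for a in allowed_actions(next_state))
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

    # Example: one transition in the abstracted space.
    Q = defaultdict(float)
    s = abstract_state(distance=6.3, bearing_deg=120.0)
    a = epsilon_greedy(Q, s)
    s_next = abstract_state(distance=5.1, bearing_deg=110.0)
    q_learning_step(Q, s, a, reward=-1.0, next_state=s_next)

Because both the state bins and the per-state action sets are coarse, the Q-table stays small, which is the mechanism by which such abstraction reduces computational and memory cost.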

Markov decision process; reinforcement learning; hierarchical state representation; robotic herding

Tao Mao, Laura E. Ray

Thayer School of Engineering, Dartmouth College, Hanover, NH 03755, U.S.A.

International Conference

2011 3rd International Conference on Computer and Automation Engineering (ICCAE 2011)

Chongqing

English

129-135

2011-01-21 (date the record was first posted on the Wanfang platform; not necessarily the paper's publication date)