Research on the Algorithm of Interval Numbers Reinforcement Learning

Abstract:

To address the difficulty of representing environmental information and expert experience with precise values during agent reinforcement learning, this paper proposes an interval-number Q-learning algorithm that extends the traditional Q-learning algorithm, which is based on precise numerical information. The paper first describes reinforcement learning in which the environmental information is given as interval numbers, and then presents the steps of the Q-learning algorithm and two principles for defining the optimal strategy, both based on interval-number information. The algorithm is characterized by the fact that the utility function and the reward signal can be expressed as interval numbers, and the two principles show that the proposed algorithm converges. Within the algorithm, combining the Boltzmann mechanism with experiential inference is suggested as a way to expand exploration effectively and so avoid local optima. This helps the learning machine accumulate experience and maintain the stability of good strategies.

Keywords: Machine Learning; Interval Numbers; Q-learning; Agent; Intelligent System

Authors: Guoqiang Xiong, Quan Pan, Hongcai Zhang

Affiliations: School of Business Administration, Xian University of Technology, Xian, P.R. China, 710054; Northwestern Polytechnical University, Xian, P.R. China, 710072

Conference type: International conference

Conference name: 2006 International Symposium on Distributed Computing and Applications to Business, Engineering and Science

Conference location: Hangzhou

Conference language: English

Pages: 791-794

Online publication date: 2006-10-12 (the date the record first appeared online on the Wanfang platform; not the paper's publication date)
