Conference Topic

Research on the Algorithm of Interval Numbers Reinforcement Learning

In agent reinforcement learning, it is generally difficult to represent environmental information and expert experience with precise values. To address this problem, this paper proposes an interval-number Q-learning algorithm that follows the idea of the traditional Q-learning algorithm, which is based on precise numerical information. The paper first describes reinforcement learning in which the environmental information is given as interval numbers, and then presents the steps of the Q-learning algorithm and two principles for defining the optimal strategy, both based on interval-number information. The algorithm is characterized by the fact that the utility function and the reward signal in reinforcement learning can be expressed as interval numbers, and the two principles show that the proposed algorithm converges. Within the algorithm, combining the Boltzmann mechanism with experiential inference is suggested as a way to expand exploration effectively and thereby avoid local optima. This helps the learner accumulate experience and maintain the stability of good strategies.
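The abstract does not give the update rule or the two principles, so the following is only a minimal sketch of what a Q-learning step over interval-valued rewards and utilities could look like. It assumes the standard Q-learning update applied endpoint by endpoint, ranks intervals by their midpoint (an assumption made here, not the paper's principles), and uses plain Boltzmann (softmax) selection without the experiential inference component. All function names and parameter values are hypothetical.

```python
import math
import random
from collections import defaultdict

# An interval number is represented as a (lower, upper) tuple.

def add(x, y):
    """Interval addition: endpoints add component-wise."""
    return (x[0] + y[0], x[1] + y[1])

def scale(c, x):
    """Multiply an interval by a non-negative scalar c."""
    return (c * x[0], c * x[1])

def midpoint(x):
    """Assumed ranking key for intervals (not taken from the paper)."""
    return 0.5 * (x[0] + x[1])

def best_action(q, state, actions):
    """Greedy action under the midpoint ranking."""
    return max(actions, key=lambda a: midpoint(q[(state, a)]))

def boltzmann_probs(q, state, actions, temperature):
    """Boltzmann (softmax) probabilities over interval midpoints."""
    mids = [midpoint(q[(state, a)]) for a in actions]
    top = max(mids)  # subtract the maximum for numerical stability
    weights = [math.exp((m - top) / temperature) for m in mids]
    total = sum(weights)
    return [w / total for w in weights]

def choose_action(q, state, actions, temperature, rng=random):
    """Sample an action according to the Boltzmann probabilities."""
    probs = boltzmann_probs(q, state, actions, temperature)
    return rng.choices(actions, weights=probs, k=1)[0]

def update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One Q-learning step in which the reward and Q-values are intervals."""
    greedy = best_action(q, next_state, actions)
    target = add(reward, scale(gamma, q[(next_state, greedy)]))
    q[(state, action)] = add(scale(1 - alpha, q[(state, action)]),
                             scale(alpha, target))
    return q[(state, action)]

# Usage: all Q-values start as the degenerate interval [0, 0].
q_table = defaultdict(lambda: (0.0, 0.0))
actions = ["left", "right"]
update(q_table, "s0", "left", (1.0, 2.0), "s1", actions)
next_action = choose_action(q_table, "s0", actions, temperature=1.0)
```

A higher temperature makes the selection closer to uniform, which widens exploration; a lower temperature makes it closer to greedy. How the paper couples this with experiential inference is not stated in the abstract and is left out of the sketch.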

Machine Learning; Interval Numbers; Q-learning; Agent; Intelligent System

Guoqiang Xiong, Quan Pan, Hongcai Zhang

School of Business Administration, Xian University of Technology, Xian, P.R. China, 710054; Northwestern Polytechnical University, Xian, P.R. China, 710072

International Conference

2006 International Symposium on Distributed Computing and Applications to Business, Engineering and Science

Hangzhou

English

791-794

2006-10-12 (date first made available online on the Wanfang platform; not the paper's publication date)