Model-free Adaptive Dynamic Programming for Unknown Systems
In this paper, we present online model-free adaptive critic (AC) schemes based on approximate dynamic programming (ADP) to solve optimal control problems in both the discrete-time and continuous-time domains. In the discrete-time case, it is shown that the proposed ADP algorithm in fact solves the underlying Generalized Algebraic Riccati Equation (GARE) of the corresponding optimal control problem or zero-sum game. In the continuous-time domain, an ADP scheme is introduced to solve the underlying ARE of the optimal control problem. It is shown that this continuous-time ADP scheme is in fact a Quasi-Newton method for solving the ARE. In both time domains, the adaptive critic algorithms are easy to initialize, since the initial policies are not required to be stabilizing.
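For illustration, the following is a minimal Python/NumPy sketch of a model-free Q-learning value iteration for a standard discrete-time LQR problem, in the spirit of the discrete-time scheme summarized above (not the paper's exact algorithm): the quadratic Q-function kernel H is fitted from one-step simulated transitions by least squares, and the feedback gain is improved from H alone, so no knowledge of the system matrices is used and the zero initial gain need not be stabilizing. The example system, cost weights, iteration counts, and helper names (vech_basis, unvech) are illustrative assumptions.

```python
# Minimal sketch of model-free Q-learning value iteration for discrete-time LQR.
# All numerical values below are illustrative assumptions, not from the paper.
import numpy as np

np.random.seed(0)

# Simulator-side dynamics: hidden from the learner, which only observes (x, u, x_next).
A = np.array([[1.0, 0.1],
              [0.0, 1.1]])          # open-loop unstable example system (assumed)
B = np.array([[0.0],
              [0.1]])
Qc = np.eye(2)                      # state weighting
Rc = np.eye(1)                      # control weighting
n, m = B.shape
p = n + m

def vech_basis(z):
    """Quadratic basis such that z @ H @ z == vech_basis(z) @ H[triu] for symmetric H."""
    rows, cols = np.triu_indices(len(z))
    scale = np.where(rows == cols, 1.0, 2.0)    # off-diagonal entries appear twice
    return np.outer(z, z)[rows, cols] * scale

def unvech(v, size):
    """Rebuild a symmetric matrix from its upper-triangular entries."""
    H = np.zeros((size, size))
    H[np.triu_indices(size)] = v
    return H + H.T - np.diag(np.diag(H))

H = np.zeros((p, p))                # Q-function kernel, initialized at zero
K = np.zeros((m, n))                # initial gain: not required to be stabilizing
for j in range(150):                # value-iteration sweeps
    Phi, targets = [], []
    for _ in range(50):             # one-step transitions under exploratory inputs
        x = np.random.randn(n)
        u = np.random.randn(m)
        x_next = A @ x + B @ u
        u_next = -K @ x_next        # current policy evaluated at the next state
        z = np.concatenate([x, u])
        z_next = np.concatenate([x_next, u_next])
        Phi.append(vech_basis(z))
        # Target: stage cost plus the previous Q-function at (x_next, u_next)
        targets.append(x @ Qc @ x + u @ Rc @ u + z_next @ H @ z_next)
    vechH, *_ = np.linalg.lstsq(np.array(Phi), np.array(targets), rcond=None)
    H = unvech(vechH, p)
    K = np.linalg.solve(H[n:, n:], H[n:, :n])   # greedy update: u = -H_uu^{-1} H_ux x

print("learned feedback gain K =\n", K)
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```

Because only measured transitions (x, u, x_next) enter the least-squares fit, the same loop runs when A and B are unknown; under standard stabilizability and detectability assumptions the learned kernel converges to the solution of the discrete-time Riccati equation.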
Approximate dynamic programming; Adaptive critics; Q-learning; Policy iterations; Optimal control; Zero-sum games.
Murad Abu-Khalaf, Frank L. Lewis, Asma Al-Tamimi, Draguna Vrabie
Automation and Robotics Research Institute, The University of Texas at Arlington, 7300 Jack Newell Blvd. S, Ft. Worth, Texas 76118-7115
International conference
Xiamen
English
105-114
2006-07-27 (date first posted on the Wanfang platform; not necessarily the paper's publication date)