ANALYSIS OF UCT ALGORITHM POLICIES IN IMPERFECT INFORMATION GAME

Abstract:

For the problem of mini-max tree search, the Upper Confidence Bound (UCB) algorithm for the multi-armed bandit problem has been extended to the UCT algorithm (UCB applied to Trees). UCT has shown advantages in search trees with high branching factors and has achieved great success in several domains, such as Go programs. In this paper, an exploration-exploitation balance factor (EBF) is introduced as an important parameter in UCT policies. Using a known domain, the game of Siguo, the performance of differently parameterized UCT policies is compared and analyzed. Some hypotheses about the causes of the observed problems are then presented. Finally, a method is suggested for adopting and parameterizing UCT policies according to the type and characteristics of the game problem.
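
To illustrate how a balance factor can steer UCT's choice of child node, here is a minimal sketch based on the standard UCB1 rule. It assumes the EBF plays the role of the exploration constant c in UCB1; the abstract does not give the paper's exact definition, and the names ucb_score, select_child, and stats are hypothetical.

```python
import math


def ucb_score(mean_reward, child_visits, parent_visits, c):
    """Standard UCB1 value; c is assumed to correspond to the EBF."""
    if child_visits == 0:
        # Unvisited children are tried first.
        return math.inf
    return mean_reward + c * math.sqrt(math.log(parent_visits) / child_visits)


def select_child(stats, c):
    """Return the index of the child with the highest UCB1 score.

    stats is a list of (total_reward, visits) pairs, one per child.
    """
    parent_visits = sum(visits for _, visits in stats)
    best_index, best_score = 0, -math.inf
    for i, (total, visits) in enumerate(stats):
        mean = total / visits if visits else 0.0
        score = ucb_score(mean, visits, parent_visits, c)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

With c = 0 the rule is purely greedy and picks the child with the best average reward; a larger c shifts the choice toward children that have been visited less often.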

Keywords: Computer game; Imperfect information; UCT; Monte-Carlo sampling; Exploration-exploitation

Authors: Jiajia Zhang, Xuan Wang, Ling Yang, Jia Ji, Dongsheng Zhi

Affiliation: Intelligence Computing Research Center, Harbin Institute of Technology Shenzhen Graduate School, C302, HIT Campus, Shenzhen University Town, NanShan District, XiLi, Shenzhen 518055, China

Conference type: International conference

Conference name: 2012 2nd IEEE International Conference on Cloud Computing and Intelligence Systems (IEEE CCIS2012)

Conference location: Hangzhou

Conference language: English

Pages: 168-173

Online publication date: 2012-10-30 (the date the record first appeared on the Wanfang platform, not the paper's publication date)
