ISBN: 9781538626191 (print)
In recent years, policy gradient methods in reinforcement learning have attracted wide attention for their good convergence properties. At the same time, hyperparameter tuning remains a concern. Building on the advantages of the Actor-Critic (AC) structure, this article studies the Natural-Gradient Actor-Critic (NAC) algorithm in the discounted-reward model and then proposes the Natural-Gradient Actor-Critic with ADADELTA (A-NAC) algorithm. ADADELTA is used to adapt the learning rate in the actor network, which further improves the convergence speed of the NAC algorithm. Simulation results show that NAC and A-NAC learn more efficiently and converge faster than regular-gradient AC methods.
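The abstract does not spell out how ADADELTA enters the actor update, so the following is only a minimal sketch of the standard ADADELTA rule applied as an ascent step on actor parameters. The function names (`init_state`, `adadelta_ascent_step`), the decay `rho`, the constant `eps`, and the toy objective are all illustrative assumptions, not taken from the paper; in an NAC setting the `direction` argument would presumably be the natural-gradient estimate supplied by the critic.

```python
import math


def init_state(n):
    # Running averages of squared directions and squared steps (hypothetical layout).
    return {"avg_sq_dir": [0.0] * n, "avg_sq_step": [0.0] * n}


def adadelta_ascent_step(theta, direction, state, rho=0.95, eps=1e-6):
    """Return actor parameters after one ADADELTA-scaled ascent step.

    theta and direction are lists of floats of equal length; state is
    updated in place. No global learning rate is needed: the step size is
    the ratio of two running root-mean-square values.
    """
    new_theta = []
    for i, (t, d) in enumerate(zip(theta, direction)):
        # Accumulate the squared ascent direction.
        state["avg_sq_dir"][i] = rho * state["avg_sq_dir"][i] + (1 - rho) * d * d
        # Scale the direction by RMS(previous steps) / RMS(directions).
        rms_step = math.sqrt(state["avg_sq_step"][i] + eps)
        rms_dir = math.sqrt(state["avg_sq_dir"][i] + eps)
        step = rms_step / rms_dir * d
        # Accumulate the squared step for use in the next call.
        state["avg_sq_step"][i] = rho * state["avg_sq_step"][i] + (1 - rho) * step * step
        # Ascent: move along the direction (policy gradients maximize return).
        new_theta.append(t + step)
    return new_theta


# Toy usage: maximize J(theta) = -(theta - 3)^2, whose gradient is -2 (theta - 3).
theta = [0.0]
state = init_state(1)
for _ in range(200):
    grad = [-2.0 * (theta[0] - 3.0)]
    theta = adadelta_ascent_step(theta, grad, state)
```

Because the step size is derived from running statistics rather than a hand-set learning rate, a scheme of this kind removes one hyperparameter from the actor, which is consistent with the hyperparameter-tuning concern raised in the abstract.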