arXiv

Fully Spiking Actor Network with Intra-layer Connections for Reinforcement Learning

Authors: Chen, Ding; Peng, Peixi; Huang, Tiejun; Tian, Yonghong

Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; Department of Computer Science and Technology, Peking University, Beijing 100871, China; Network Intelligence Research, PengCheng Laboratory, Shenzhen 518066, China; School of Electronics Computer Engineering, Peking University, Shenzhen 518055, China

Publication: arXiv

Year / Volume / Issue: 2024

Indexed in:

Subject: Reinforcement learning

Abstract: With the help of special neuromorphic hardware, spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with lower energy consumption. Combining SNNs with deep reinforcement learning (DRL) offers a promising energy-efficient approach to realistic control tasks. In this paper, we focus on tasks in which the agent must learn multi-dimensional deterministic policies for control, which are very common in real scenarios. Recently, the surrogate gradient method has been used to train multi-layer SNNs, allowing them to achieve performance comparable to that of the corresponding deep networks on such tasks. Most existing spike-based RL methods take the firing rate as the output of the SNN and convert it into a continuous action (i.e., the deterministic policy) through a fully-connected (FC) layer. However, because the firing rate is a fractional value rather than a binary spike, it introduces floating-point matrix operations into the FC layer, so the whole SNN cannot be deployed directly on neuromorphic hardware. To develop a fully spiking actor network without any floating-point matrix operations, we draw inspiration from the non-spiking interneurons found in insects and use the membrane voltage of non-spiking neurons to represent the action. Before the non-spiking neurons, multiple neuron populations are introduced, each decoding one dimension of the action. Since each population decodes a single action dimension, we argue that the neurons within a population should be connected in both the time domain and the space domain. Hence, intra-layer connections are used in the output populations to enhance their representation capacity. This mechanism is widespread in animals and has been shown to be effective. Finally, we propose a fully spiking actor network with intra-layer connections (ILC-SAN). Extensive experimental results demonstrate that the proposed method outperforms state-of-the-art methods on continuous control tasks from OpenAI Gym.
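The abstract does not give the neuron equations, so the following is only a rough illustration of the two ideas it names, not the authors' implementation. It assumes a simple leaky integrate-and-fire population with hard reset, lateral (intra-layer) weights applied to the previous step's spikes, and a leaky non-spiking neuron whose final membrane voltage is read out as one action dimension. The function names, the `decay` and `threshold` values, and the exact update rules are all hypothetical.

```python
import numpy as np

def run_population(inputs, lateral, threshold=1.0, decay=0.5):
    """Simulate one output population with intra-layer connections (assumed dynamics).

    inputs:  (T, N) input current per time step and neuron
    lateral: (N, N) hypothetical intra-layer weight matrix
    Returns a (T, N) binary spike array.
    """
    T, N = inputs.shape
    v = np.zeros(N)         # membrane voltages
    prev = np.zeros(N)      # spikes emitted at the previous time step
    out = np.zeros((T, N))
    for t in range(T):
        # Feedforward input plus lateral input from the previous step's spikes.
        v = decay * v + inputs[t] + lateral @ prev
        prev = (v >= threshold).astype(float)
        v = v * (1.0 - prev)  # hard reset for neurons that fired
        out[t] = prev
    return out

def decode_action(spikes, weights, decay=0.5):
    """Read one action dimension from a non-spiking neuron (assumed dynamics).

    spikes:  (T, N) binary spike array from one population
    weights: (N,) hypothetical synaptic weights onto the non-spiking neuron
    Returns the membrane voltage after the last time step.
    """
    v = 0.0
    for s_t in spikes:
        # No threshold and no reset: the neuron integrates but never fires.
        v = decay * v + float(weights @ s_t)
    return v
```

Under these assumptions, an action with several dimensions would use one population and one non-spiking neuron per dimension. Because the spikes are binary, the input to the non-spiking neuron at each step is just the sum of the weights of the neurons that fired.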
