Hardware-Friendly Actor-Critic Reinforcement Learning Through Modulation of Spike-Timing-Dependent P
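Disclaimer: the original paper is identified by the title above. If this post infringes any rights, please contact the author and it will be taken down.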
IEEE TRANSACTIONS ON COMPUTERS, VOL. 66, NO. 2, FEBRUARY 2017
Abstract
Index Terms—Reinforcement learning, spiking neural network, hardware neural network, spike-timing-dependent plasticity, and actor-critic network
1 INTRODUCTION
2 BACKGROUND
3 STDP AS A MEASURE OF GRADIENT
4 REINFORCEMENT LEARNING WITH STDP
5 HARDWARE ARCHITECTURE
6 CONCLUSION AND FUTURE WORK