Hardware-Friendly Actor-Critic Reinforcement Learning Through Modulation of Spike-Timing-Dependent P


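Disclaimer: the original paper is the one named in the title. In case of any infringement, please contact the author and this post will be taken down.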

IEEE TRANSACTIONS ON COMPUTERS, VOL. 66, NO. 2, FEBRUARY 2017

Abstract

Index Terms—Reinforcement learning, spiking neural network, hardware neural network, spike-timing-dependent plasticity, and actor-critic network

1 INTRODUCTION

2 BACKGROUND

3 STDP AS A MEASURE OF GRADIENT

4 REINFORCEMENT LEARNING WITH STDP

5 HARDWARE ARCHITECTURE

6 CONCLUSION AND FUTURE WORK
