56, 2015-03-24, Graduate School of Science and Engineering, Hosei University (法政大学大学院理工学・工学研究科)
Goto and Shibata (2010) proposed a reinforcement learning algorithm that uses a recurrent neural network trained with Back Propagation Through Time (BPTT). The algorithm autonomously acquires prediction functions for tasks that are difficult to accomplish without them. To verify its effectiveness, they used episodic tasks, in which the starting and terminal states can be given explicitly. In the real world, however, there are many continuous tasks for which the starting and terminal states cannot be specified explicitly. This study evaluates the previous method on continuous tasks and presents a new method that uses Real Time Recurrent Learning (RTRL), which allows the neural network to learn in real time. The results indicate that the previous method performs well even on continuous tasks, whereas the new RTRL-based method performs worse than the previous method on both continuous and episodic tasks.
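To illustrate why RTRL suits real-time learning, the following is a minimal sketch of the standard RTRL sensitivity recursion for a small fully recurrent tanh network. It is not the network or the reinforcement learning setup used in this study. The function names `rtrl_gradient` and `sequence_loss`, the choice of unit 0 as the output unit, and the squared-error loss are all assumptions made for illustration.

```python
import math


def sequence_loss(W, xs, targets):
    """Sum of 0.5 * (h_0(t) - d(t))^2 over the sequence (assumed loss)."""
    n = len(W)
    h = [0.0] * n
    total = 0.0
    for x, d in zip(xs, targets):
        z = h + list(x)  # previous unit outputs followed by external inputs
        h = [math.tanh(sum(w * zl for w, zl in zip(row, z))) for row in W]
        total += 0.5 * (h[0] - d) ** 2
    return total


def rtrl_gradient(W, xs, targets):
    """Gradient of sequence_loss with respect to W, computed forward in time.

    W is an n x (n + m) weight matrix (list of lists) for n units and
    m external inputs. P[k][i][j] holds d h_k / d W[i][j] and is updated
    at every step, so no history of past states has to be stored.
    """
    n = len(W)
    m = len(xs[0])
    h = [0.0] * n
    P = [[[0.0] * (n + m) for _ in range(n)] for _ in range(n)]
    grad = [[0.0] * (n + m) for _ in range(n)]
    for x, d in zip(xs, targets):
        z = h + list(x)
        new_h = [math.tanh(sum(W[k][l] * z[l] for l in range(n + m)))
                 for k in range(n)]
        new_P = [[[0.0] * (n + m) for _ in range(n)] for _ in range(n)]
        for k in range(n):
            fprime = 1.0 - new_h[k] ** 2  # derivative of tanh
            for i in range(n):
                for j in range(n + m):
                    # influence of W[i][j] through the previous unit outputs
                    recurrent = sum(W[k][l] * P[l][i][j] for l in range(n))
                    # direct influence: W[i][j] feeds unit i only
                    direct = z[j] if i == k else 0.0
                    new_P[k][i][j] = fprime * (recurrent + direct)
        h, P = new_h, new_P
        err = h[0] - d  # error on the assumed output unit
        for i in range(n):
            for j in range(n + m):
                grad[i][j] += err * P[0][i][j]
    return grad
```

In this sketch the gradient is accumulated over the whole sequence so that it can be checked against the loss. An online learner would instead apply a weight update from `err * P[0][i][j]` at every step, which is what makes RTRL usable on continuous tasks with no explicit terminal state. The price is computational: updating the sensitivities costs on the order of n^4 operations per step for n fully connected units, whereas BPTT is cheaper per step but must store and unroll a history of past states.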