Into AI

A Paper a Week - Week 18

Posted on June 8, 2024

Introduction: For the eighteenth post in this series, I read “Policy Gradient Methods for Reinforcement Learning with Function Approximation” by Sutton et al. This paper presents the famous “policy gradient theorem”, which is the basis for policy gradient methods in reinforcement learning. The authors show that the gradient of a... [Read More]

Tags:

week18
paper

A Paper a Week - Week 17

Deep Reinforcement Learning with Double Q-learning

Posted on May 18, 2024

Introduction: For the seventeenth post in this series, I read “Deep Reinforcement Learning with Double Q-learning” by van Hasselt et al. This paper introduces an algorithm called Double DQN, which extends DQN with Double Q-learning to address the overestimation bias of Q-learning. The authors... [Read More]

Tags:

week17
paper

A Paper a Week - Week 16

Monte-Carlo Tree Search and Rapid Action Value Estimation in Computer Go - Part 2

Posted on May 11, 2024

Introduction: For the sixteenth post in this series, I continued reading “Monte-Carlo Tree Search and Rapid Action Value Estimation in Computer Go” by Sylvain Gelly and David Silver. This week I’ll be covering the rest of the paper, which discusses the application of MCTS to the game of Go and the introduction of... [Read More]

Tags:

week16
paper

A Paper a Week - Week 15

Monte-Carlo Tree Search and Rapid Action Value Estimation in Computer Go

Posted on April 13, 2024

Introduction: For the fifteenth post in this series, I read “Monte-Carlo Tree Search and Rapid Action Value Estimation in Computer Go” by Sylvain Gelly and David Silver. The paper surveys the Monte-Carlo Tree Search (MCTS) algorithm and its application to the game of Go (pre-AlphaGo). However, because the paper is... [Read More]

Tags:

week15
paper

A Paper a Week - Week 14

Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search, by Rémi Coulom

Posted on April 6, 2024

Introduction: For the fourteenth post in this series, I read “Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search” by Rémi Coulom. The paper mainly focuses on the fundamentals of the Monte-Carlo Tree Search algorithm and how to implement it efficiently for a 9x9 Go-playing agent.... [Read More]

Tags:

week14
paper