A Geometric Nash Approach in Tuning the Learning Rate in Q-Learning Algorithm

Abstract

This paper proposes a geometric approach for estimating the learning rate α in Q-learning. We establish a systematic framework for choosing α, with the aim of improving learning efficiency and stability. Our results show a relationship between the learning rate and the angle between two vectors: T (the total time steps in each learning episode) and R (the reward for each episode). Together, the angular bisector of T and R and the concept of Nash equilibrium provide insight into estimating α so that the algorithm minimizes losses arising from the exploration-exploitation trade-off.
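
To make the geometric quantities concrete, the following is a minimal sketch, not the paper's method. It assumes T and R are plain per-episode lists of equal length and computes the angle between them and the unit vector along their angular bisector. The abstract does not state how α is derived from these quantities, so no such mapping is attempted here. The helper names (`angle_between`, `bisector_direction`, `q_update`) and the nested-dict Q-table are hypothetical choices for illustration; `q_update` is the standard tabular Q-learning update, shown only to indicate where α enters.

```python
import math


def _norm(v):
    """Euclidean length of a vector given as a list of numbers."""
    return math.sqrt(sum(x * x for x in v))


def angle_between(t, r):
    """Angle in radians between vectors t and r (assumed non-zero)."""
    nt, nr = _norm(t), _norm(r)
    if nt == 0.0 or nr == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    dot = sum(a * b for a, b in zip(t, r))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (nt * nr)))
    return math.acos(cos_theta)


def bisector_direction(t, r):
    """Unit vector along the angular bisector of t and r.

    The bisector direction is the sum of the two unit vectors.
    """
    nt, nr = _norm(t), _norm(r)
    if nt == 0.0 or nr == 0.0:
        raise ValueError("bisector is undefined for a zero vector")
    s = [a / nt + b / nr for a, b in zip(t, r)]
    ns = _norm(s)
    if ns == 0.0:
        raise ValueError("bisector is undefined for antiparallel vectors")
    return [x / ns for x in s]


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard tabular Q-learning update; q is a dict of dicts (assumed layout)."""
    best_next = max(q[next_state].values())
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]
```

For example, `angle_between([1, 0], [0, 1])` gives π/2, and `bisector_direction([2, 0], [0, 3])` points along the diagonal regardless of the two vectors' lengths, since both are normalized before they are summed.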
