DRL meets DSA Networks: Convergence Analysis and Its Application to System Design

Abstract

In dynamic spectrum access (DSA) networks, secondary users (SUs) need to opportunistically access primary users' (PUs) radio spectrum without causing significant interference. Since the interaction between the SU and the PU systems are limited, deep reinforcement learning (DRL) has been introduced to help SUs to conduct spectrum access. Specifically, deep recurrent Q network (DRQN) has been utilized in DSA networks for SUs to aggregate the information from the recent experiences to make spectrum access decisions. DRQN is notorious for its sample efficiency in the sense that it needs a rather large number of training data samples to tune its parameters which is a computationally demanding task. In our recent work, deep echo state network (DEQN) has been introduced to DSA networks to address the sample efficiency issue of DRQN. In this paper, we analytically show that DEQN comparatively requires less amount of training samples than DRQN to converge to the best policy. Furthermore, we introduce a method to determine the right hyperparameters for the DEQN providing system design guidance for DEQN-based DSA networks. Extensive performance evaluation confirms that DEQN-based DSA strategy is the superior choice with regard to computational power while outperforming DRQN-based DSA strategies.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…