LLM-Driven Large-Scale Spectrum Access
Abstract
Efficient spectrum management in massive-scale wireless networks is increasingly challenged by exploding action spaces and the computational intractability of traditional optimization. This study proposes a Large-Scale LLM-Driven Spectrum Access (LSA) framework built on Group Relative Policy Optimization (GRPO). To avoid the computational collapse caused by ultra-long prompts in large-scale scenarios, we develop a hierarchical state serialization mechanism that combines global environment statistics with localized critical constraints, enabling the LLM to perform high-dimensional reasoning within a bounded context window. Simulation results under strictly time-bounded inference protocols show that the code-driven paradigm eliminates the supervised fine-tuning (SFT) cold-start bottleneck and leverages direct execution feedback to achieve superior scaling behavior. The framework maintains robust spectral utility and generalizes across varying network scales, consistently outperforming non-deterministic heuristics and surpassing partitioned classical solvers in ultra-dense regimes under matched compute budgets.
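The hierarchical state serialization idea can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `serialize_state`, the input dictionaries, the choice of statistics, and the prompt layout are all hypothetical, chosen only to show how a fixed set of global statistics plus a capped list of critical links keeps prompt length bounded as the network grows.

```python
from statistics import mean


def serialize_state(interference, demands, max_local=3):
    """Build a bounded-length prompt from per-link measurements.

    All names and fields here are illustrative assumptions:
    interference maps link_id -> interference level,
    demands maps link_id -> requested rate.
    """
    levels = list(interference.values())

    # Global layer: a fixed number of summary statistics,
    # independent of how many links the network has.
    global_part = (
        f"links={len(levels)} "
        f"mean_interf={mean(levels):.2f} "
        f"max_interf={max(levels):.2f}"
    )

    # Local layer: only the max_local most interfered links are spelled out.
    worst = sorted(interference, key=interference.get, reverse=True)[:max_local]
    local_part = "; ".join(
        f"link {k}: interf={interference[k]:.2f}, demand={demands[k]}"
        for k in worst
    )

    return f"GLOBAL: {global_part}\nCRITICAL: {local_part}"
```

In this sketch the number of entries in the prompt depends on `max_local` rather than on the number of links, so a 5-link network and a 10,000-link network produce prompts of comparable size.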