Large Language Models Can Perform Automatic Modulation Classification via Discretized Self-supervised Candidate Retrieval

Abstract

Identifying wireless modulation schemes is essential for cognitive radio, but standard supervised models often degrade under distribution shift, and training domain-specific wireless foundation models from scratch is computationally prohibitive. Large Language Models (LLMs) offer a promising training-free alternative via in-context learning, yet feeding raw floating-point signal statistics into an LLM overwhelms it with numerical noise and exhausts its token budget. We introduce DiSC-AMC, a framework that reformulates Automatic Modulation Classification (AMC) as an LLM reasoning task by combining aggressive feature discretization with nearest-neighbor retrieval over self-supervised embeddings. By mapping continuous features to coarse symbolic tokens, DiSC-AMC aligns abstract signal patterns with LLM reasoning capabilities and reduces prompt length by over 50\%. In parallel, a DINOv2 visual encoder is used to retrieve the k most similar labeled exemplars (kNN retrieval), which provides highly relevant, query-specific context rather than generic class averages. On a 10-class benchmark, a fine-tuned 7B-parameter LLM using DiSC-AMC achieves 83.0\% in-distribution accuracy (-10\,to\,+10\,dB) and 82.5\% out-of-distribution (OOD) accuracy (-11\,to\,-15\,dB), outperforming supervised baselines. Comprehensive ablations on vanilla LLMs demonstrate the token efficiency of DiSC-AMC: a training-free 7B LLM achieves 71\% accuracy using only a 0.5K-token prompt, surpassing a 200B-parameter baseline that relies on a 2.9K-token prompt. Furthermore, similarity-based exemplar retrieval outperforms naive class-average selection by over 20\%. Finally, we identify a fundamental limitation of this pipeline: at extreme OOD noise levels (-30\,dB), the underlying self-supervised representations collapse, degrading retrieval quality and reducing classification to random chance.
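
The two ingredients named above, coarse discretization of continuous features and similarity-based exemplar retrieval, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bin count, the token alphabet `BIN_TOKENS`, the quantile binning rule, the cosine similarity measure, the prompt layout, and all function names are assumptions made for illustration, and the toy two-dimensional vectors stand in for the DINOv2 embeddings used in the paper.

```python
import math
from bisect import bisect_right

# Hypothetical token alphabet; the abstract does not specify the real vocabulary.
BIN_TOKENS = ["A", "B", "C", "D", "E"]

def fit_bin_edges(values, n_bins=len(BIN_TOKENS)):
    """Return n_bins - 1 interior edges splitting `values` into roughly equal-count bins."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * i) // n_bins] for i in range(1, n_bins)]

def discretize(value, edges):
    """Map one continuous feature value to a coarse symbolic token."""
    return BIN_TOKENS[bisect_right(edges, value)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_top_k(query_emb, bank, k=2):
    """Return the k bank entries whose embeddings are most similar to the query.

    `bank` is a list of (embedding, tokens, label) tuples.
    """
    ranked = sorted(bank, key=lambda item: cosine(query_emb, item[0]), reverse=True)
    return ranked[:k]

def build_prompt(query_tokens, exemplars):
    """Assemble a short in-context prompt from retrieved exemplars and the query."""
    lines = [f"Example: {' '.join(tokens)} -> {label}" for _, tokens, label in exemplars]
    lines.append(f"Query: {' '.join(query_tokens)} -> ?")
    return "\n".join(lines)

# Toy data: one scalar feature per signal and a 2-D stand-in embedding.
edges = fit_bin_edges([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
bank = [
    ([1.0, 0.0], [discretize(1.5, edges)], "BPSK"),
    ([0.0, 1.0], [discretize(8.2, edges)], "QPSK"),
    ([0.9, 0.1], [discretize(2.4, edges)], "BPSK"),
]
neighbors = retrieve_top_k([1.0, 0.0], bank, k=2)
prompt = build_prompt([discretize(1.9, edges)], neighbors)
print(prompt)
```

Replacing each floating-point value with a single symbol is what shortens the prompt, and ranking exemplars by similarity to the query is what makes the context query-specific instead of a fixed per-class average.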
