Symbol Distributions in Semantic Communications: A Source-Channel Equilibrium Perspective
Abstract
Semantic communication systems often use end-to-end neural networks to map input data into continuous symbols. These symbols, which are essentially neural network features, have fixed dimensions and often exhibit heavy-tailed distributions. However, the mechanism behind this distributional shape remains underexplored due to the end-to-end nature of encoder training, hindering systematic analysis and design. In this paper, we propose a parametric model for semantic symbol distributions. We model end-to-end training as inducing two coupled pressures on the symbol distribution: a source pressure that favors power allocation minimizing the average description cost, and a channel pressure that favors distributions with higher channel utilization. Under surrogate objectives that capture these effects, we obtain a Student's t-distribution as a model for the semantic symbols. Experiments on image-based semantic systems show that the model closely predicts how the shape parameter varies with (i) explicit symbol rate control and (ii) dataset entropy variability. Furthermore, enforcing a target symbol distribution via regularization (e.g., a Gaussian prior) improves training convergence, which is consistent with our hypothesis.
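To make the heavy-tail claim concrete, the following minimal sketch compares a zero-mean Student's t density with a unit-variance Gaussian density. It is an illustration only, not the paper's implementation: the function names, the `scale` parameter, and the reading of the "shape parameter" as the degrees of freedom `nu` are assumptions introduced here.

```python
import math


def student_t_pdf(x, nu, scale=1.0):
    """Density of a zero-mean Student's t with `nu` degrees of freedom.

    `nu` stands in for the shape parameter (an assumption of this sketch);
    `scale` is a hypothetical spread parameter added for illustration.
    """
    # Work in log space so large `nu` does not overflow the gamma function.
    log_norm = (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(scale)
    )
    log_kernel = -(nu + 1) / 2 * math.log1p((x / scale) ** 2 / nu)
    return math.exp(log_norm + log_kernel)


def gaussian_pdf(x, sigma=1.0):
    """Density of a zero-mean Gaussian with standard deviation `sigma`."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    # Small nu gives far more mass in the tails than a Gaussian;
    # large nu makes the two densities nearly indistinguishable.
    for nu in (3, 10, 1000):
        ratio = student_t_pdf(5.0, nu) / gaussian_pdf(5.0)
        print(f"nu={nu}: t/Gaussian density ratio at x=5 is {ratio:.3g}")
```

A small `nu` corresponds to heavy tails, while letting `nu` grow recovers the Gaussian case, which is the limit relevant to the Gaussian prior mentioned as a regularization target.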