Avoiding normalization uncertainties in deep learning architectures for end-to-end communication
Abstract
Recently, deep learning is considered to optimize the end-to-end performance of digital communication systems. The promise of learning a digital communication scheme from data is attractive, since this makes the scheme adaptable and precisely tunable to many scenarios and channel models. In this paper, we analyse a widely used neural network architecture and show that the training of the end-to-end architecture suffers from normalization errors introduced by an average power constraint. To solve this issue, we propose a modified architecture: shifting the batch slicing after the normalization layer. This approach meets the normalization constraints better, especially in the case of small batch sizes. Finally, we experimentally demonstrate that our modified architecture leads to significantly improved performance of trained models, even for large batch sizes where normalization constraints are more easily met.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.