Mathematical Foundations of Polyphonic Music Generation via Structural Inductive Bias

Joonwon Seo

Abstract

This monograph addresses the "Missing Middle" problem in AI music generation: the challenge of producing coherent, phrase-level musical structure. Using Beethoven's piano sonatas as a case study, I introduce the Smart Embedding architecture, a factorized representation grounded in the empirically verified independence of pitch and hand attributes (NMI = 0.167). The architecture achieves a 48.3% reduction in embedding parameters while improving validation loss by 9.47%. Theoretically, I establish formal guarantees through information theory, Rademacher complexity analysis (yielding a 28.09% tighter generalization bound), and a category-theoretic interpretation. These results are further supported by Singular Value Decomposition analysis and a blind expert listening study (N = 53). Together, these results form a dual contribution that combines architectural innovation with mathematical rigor, offering a principled framework for building more efficient, stable, and interpretable generative models for complex sequential data.
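
As a rough illustration of the independence check mentioned in the abstract, the following is a minimal sketch of how a normalized mutual information (NMI) score between two discrete attributes could be computed. The abstract does not say which NMI normalization the monograph uses; the geometric-mean form below, the function names, and the toy pitch and hand data are all assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(xs, ys):
    """NMI = I(X;Y) / sqrt(H(X) * H(Y)).

    The geometric-mean normalization is an assumption; other variants
    (arithmetic mean, min, max) exist. Returns 0.0 if either entropy is 0.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Mutual information summed over observed (x, y) pairs only.
    mi = sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0
    return mi / math.sqrt(hx * hy)


# Hypothetical toy data: MIDI pitch numbers and a hand label
# (0 = left, 1 = right). Not taken from the monograph's dataset.
pitches = [60, 62, 64, 65, 60, 62, 64, 65]
hands = [0, 0, 0, 0, 1, 1, 1, 1]
print(normalized_mutual_information(pitches, hands))
```

Under this definition, a score near 0 indicates that knowing one attribute says little about the other, and a score of 1 indicates that each attribute determines the other. A low score is the kind of evidence that would motivate embedding the two attributes separately rather than jointly.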
