Phase transitions for the noisy transformer model in arbitrary dimension
Abstract
We study the McKean--Vlasov free energy on the unit sphere associated with the unnormalized self-attention (USA) model for noisy transformer dynamics. We prove a sharp global-minimizer dichotomy in every dimension $d \ge 2$. There is a unique $\beta_*(d) > 0$ such that
\begin{equation*}
\frac{I_{d/2+1}(\beta_*(d))}{I_{d/2}(\beta_*(d))} = \frac{1}{d},
\end{equation*}
where $I_\nu$ is the modified Bessel function of the first kind. For $0 < \beta \le \beta_*(d)$, the uniform density remains the unique global minimizer up to the linear-stability threshold
\begin{equation*}
K_\#^{(d)}(\beta) = \frac{\beta^{d/2}}{2^{d/2}\,\Gamma(d/2)\,I_{d/2}(\beta)},
\end{equation*}
and the phase transition is continuous. For $\beta > \beta_*(d)$, the uniform density is not globally minimizing at $K_\#^{(d)}(\beta)$, so the critical coupling satisfies $K_c < K_\#^{(d)}(\beta)$ and the transition is discontinuous. This result generalizes the authors' recent $d = 2$ work arXiv:2604.16288 to arbitrary dimension. The proof uses the sharp Beckner--Onofri/logarithmic Hardy--Littlewood--Sobolev (HLS) inequality on the sphere, together with a Funk--Hecke/Bessel coefficient computation and a degree-two quartic obstruction.
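As a rough numerical illustration (not taken from the paper), the threshold $\beta_*(d)$ can be approximated by solving $I_{d/2+1}(\beta)/I_{d/2}(\beta) = 1/d$ with bisection, and the linear-stability threshold can be evaluated as $\beta^{d/2} / (2^{d/2}\,\Gamma(d/2)\,I_{d/2}(\beta))$. The sketch below assumes that reading of the two formulas in the abstract, relies on the uniqueness of the root stated there, and uses the standard power series for $I_\nu$ so that it needs only the Python standard library. The function names `bessel_i`, `bessel_ratio`, `beta_star`, and `k_sharp` are hypothetical and chosen for this example.

```python
import math


def bessel_i(nu, x, tol=1e-16, max_terms=500):
    """Modified Bessel function of the first kind I_nu(x) for x > 0.

    Uses the power series sum_k (x/2)^(2k+nu) / (k! * Gamma(k+nu+1)),
    which is adequate for the moderate arguments used here.
    """
    half = 0.5 * x
    term = half ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(1, max_terms):
        # Ratio of consecutive series terms: (x/2)^2 / (k * (k + nu)).
        term *= half * half / (k * (k + nu))
        total += term
        if term < tol * total:
            break
    return total


def bessel_ratio(d, beta):
    """Ratio I_{d/2+1}(beta) / I_{d/2}(beta) from the abstract's equation."""
    return bessel_i(0.5 * d + 1.0, beta) / bessel_i(0.5 * d, beta)


def beta_star(d, iters=100):
    """Approximate the root of bessel_ratio(d, beta) = 1/d by bisection.

    Assumption: the ratio is below 1/d to the left of the root and above it
    to the right, consistent with the uniqueness claimed in the abstract.
    """
    target = 1.0 / d
    lo, hi = 1e-8, 1.0
    while bessel_ratio(d, hi) < target:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bessel_ratio(d, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def k_sharp(d, beta):
    """Linear-stability threshold, as read from the abstract (an assumption)."""
    nu = 0.5 * d
    return beta ** nu / (2.0 ** nu * math.gamma(nu) * bessel_i(nu, beta))


if __name__ == "__main__":
    for d in (2, 3, 4, 8):
        b = beta_star(d)
        print(f"d={d}: beta_* ~ {b:.6f}, K_sharp(beta_*) ~ {k_sharp(d, b):.6f}")
```

Bisection is used here only because it needs no derivative and no external solver; a library routine such as a Bessel function from a scientific package would be a natural replacement for the hand-written series at large arguments.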