Thermodynamic Diffusion Inference with Minimal Digital Conditioning

Abstract

Diffusion-model inference and overdamped Langevin dynamics are formally identical. A physical substrate that encodes the score function therefore equilibrates to the correct output by thermodynamics alone, requiring no digital arithmetic during inference and potentially achieving a 10,000× reduction in energy relative to a GPU. Two fundamental barriers have until now prevented this equivalence from being realized at production scale: non-local skip connections, which locally coupled analog substrates cannot represent, and input conditioning, in which the coupling constants carry roughly 2,600× too little signal to anchor the system to a specific input. We resolve both obstacles. Hierarchical bilinear coupling encodes U-Net skip connections as rank-k inter-module interactions derived directly from the singular structure of the encoder and decoder Gram matrices, requiring only O(Dk) physical connections instead of O(D²). A minimal digital interface, consisting of a 4-dimensional bottleneck encoder together with a 16-unit transfer network and totalling 2,560 parameters, overcomes the conditioning barrier. When evaluated on activations drawn from a trained denoising U-Net, the complete system attains a decoder cosine similarity of 0.9906 against an oracle upper bound of 1.0000, while preserving theoretical net energy savings of approximately 107× over GPU inference. These results constitute the first demonstration of trained-weight, production-scale thermodynamic diffusion inference.
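
The link between a score function and overdamped Langevin dynamics can be illustrated with a small numerical sketch. This is not the paper's system: it assumes a fixed one-dimensional Gaussian target with a closed-form score, whereas a diffusion model uses a learned, noise-level-dependent score. The names `gaussian_score` and `langevin_sample`, the step size, and the chain count are all hypothetical choices made for illustration. The update is the standard Euler-Maruyama discretization of dx = ∇log p(x) dt + √2 dW, whose stationary distribution is p.

```python
import numpy as np

def gaussian_score(x, mu, sigma):
    """Score (gradient of the log-density) of N(mu, sigma^2). Illustrative target only."""
    return -(x - mu) / sigma**2

def langevin_sample(score_fn, x0, step, n_steps, rng):
    """Euler-Maruyama discretization of overdamped Langevin dynamics."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Drift along the score, plus thermal noise scaled by sqrt(2 * step).
        x = x + step * score_fn(x) + np.sqrt(2.0 * step) * noise
    return x

rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.5                      # assumed target parameters
x0 = 3.0 * rng.standard_normal(5000)      # 5000 independent chains, arbitrary start
samples = langevin_sample(
    lambda x: gaussian_score(x, mu, sigma), x0, step=0.01, n_steps=2000, rng=rng
)
# The empirical mean and standard deviation should land close to mu and sigma.
print(samples.mean(), samples.std())
```

In this sketch the digital loop plays the role that physical relaxation would play in a thermodynamic substrate: once the score is encoded, the chains settle into the target distribution without any further guidance. The finite step size introduces a small bias in the sampled variance, which shrinks as `step` decreases.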
