Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models
Abstract
Vision-Language-Action (VLA) models reach high success rates on clean inputs but collapse under small adversarial perturbations: a 16/255 PGD attack drops OpenVLA-7B's LIBERO success from 95% to under 5%. Whether this trade-off has a theoretical floor was an open question. We prove that it does. For any VLA policy, capability I(;) and robustness I(;)-I(;δ) sum to at most H()+I(X;), the task entropy plus adversarial channel capacity. The proof reduces to two applications of the Data Processing Inequality. The pixel-level bound is loose by 103 nats and serves as a ceiling guarantee; an encoder-specific corollary tightens it by over an order of magnitude, into a regime where realized capability already consumes 5 to 9% of the budget. We validate the main theorem with zero violations across 308 cells: 252 closed-form Gaussian-VLA, 48 OpenVLA-7B+LIBERO+PGD (4 suites × 4 × 3 seeds), 4 Square-Attack, and 4 multi-step (T=10). A complementary measurability inequality disc disc further holds across 144 cross-architecture cells spanning OpenVLA, OpenVLA-OFT (continuous-L1), and SmolVLA (flow-matching). The same construction yields three label-free diagnostics: a pre-flight encoder ceiling, a defense-forensics probe that localizes input-side vs. language-model intervention, and a head-agnostic robustness ratio comparable across discrete-token, L1-regression, and flow-matching policies. Together these provide the cross-setting axis that defense and architecture comparisons currently lack.
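The abstract states that the proof rests on the Data Processing Inequality (DPI). As a generic illustration of that inequality only, and not the paper's construction, the sketch below checks numerically that I(X;Z) ≤ I(X;Y) for random discrete Markov chains X → Y → Z. The alphabet size, the number of trials, and all function names are assumptions chosen for this example.

```python
import math
import random


def mutual_information(joint):
    """Return I(A;B) in nats for a joint distribution given as joint[a][b]."""
    pa = [sum(row) for row in joint]          # marginal of A
    pb = [sum(col) for col in zip(*joint)]    # marginal of B
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0.0:
                mi += p * math.log(p / (pa[a] * pb[b]))
    return mi


def random_distribution(n, rng):
    """Draw a strictly positive probability vector of length n."""
    weights = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def random_channel(n_in, n_out, rng):
    """Draw a row-stochastic matrix: one output distribution per input symbol."""
    return [random_distribution(n_out, rng) for _ in range(n_in)]


def dpi_gap(rng, n=4):
    """Return I(X;Y) - I(X;Z) for a random Markov chain X -> Y -> Z."""
    px = random_distribution(n, rng)
    p_y_given_x = random_channel(n, n, rng)
    p_z_given_y = random_channel(n, n, rng)
    joint_xy = [[px[x] * p_y_given_x[x][y] for y in range(n)] for x in range(n)]
    # Z depends on X only through Y, so marginalise Y out to get p(x, z).
    joint_xz = [
        [sum(joint_xy[x][y] * p_z_given_y[y][z] for y in range(n)) for z in range(n)]
        for x in range(n)
    ]
    return mutual_information(joint_xy) - mutual_information(joint_xz)


rng = random.Random(0)  # fixed seed so the run is reproducible
gaps = [dpi_gap(rng) for _ in range(200)]
min_gap = min(gaps)
print(f"smallest I(X;Y) - I(X;Z) over 200 chains: {min_gap:.6f}")
```

The smallest gap should be non-negative up to floating-point error, since processing Y into Z cannot add information about X.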