Linear Recurrent Neural Networks as Time-Delay Embeddings
Abstract
Sequence models, and particularly Linear Recurrent Neural Networks (LRNNs) of the form $h_{k+1} = W h_k + y_k + b$, are widely applicable in time-series analysis for dynamical systems, yet because they are black-box algorithms, little is known about why they perform well. In this work, we leverage Takens' embedding theorem, which provides conditions under which partially observed time series organized into delay-coordinate vectors can faithfully represent the original system's dynamics, as a theoretical framework for explaining how and why sequence models preserve and reconstruct dynamical systems. For LRNNs, concatenating output states into delay-coordinate vectors gives rise to a "delay" matrix $M_{n,m} \in \mathbb{C}^{nm \times (n+1)m}$: a block matrix consisting of identity matrices $I \in \mathbb{R}^{m \times m}$ repeated $n$ times along the main diagonal and weight matrices $W \in \mathbb{C}^{m \times m}$ repeated $n$ times along the super-diagonal. $M_{n,m}$ relates the delay coordinates of the input time series to those of the LRNN output states, and, for $M_{n,m}$ to be an embedding, it must be full row-rank. We provide explicit conditions for $M_{n,m}$ to be full row-rank and prove that the condition number of $M_{n,m}$ and the determinant of $M_{n,m} M_{n,m}^*$, both measures of embedding stability, are bounded independent of $n$, at least for certain ranges of $W$'s singular values: namely, when σ(W) 1. This result explains why the spectrum of $W$ for trained LRNNs tends to converge to within the unit circle.
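The block structure of the delay matrix described above can be checked numerically. The following is a minimal sketch, not the authors' code: the helper names `delay_matrix` and `condition_number` are hypothetical, and $W$ is a random real matrix rescaled so that its largest singular value is 0.5, a value chosen here purely for illustration. It builds $M_{n,m}$ for several $n$ and prints its shape, numerical rank, and condition number.

```python
import numpy as np


def delay_matrix(W: np.ndarray, n: int) -> np.ndarray:
    """Build the nm x (n+1)m block matrix with I on the main block
    diagonal and W on the block super-diagonal (hypothetical helper)."""
    m = W.shape[0]
    M = np.zeros((n * m, (n + 1) * m), dtype=complex)
    identity = np.eye(m)
    for i in range(n):
        rows = slice(i * m, (i + 1) * m)
        M[rows, i * m:(i + 1) * m] = identity      # main-diagonal block
        M[rows, (i + 1) * m:(i + 2) * m] = W       # super-diagonal block
    return M


def condition_number(M: np.ndarray) -> float:
    """Ratio of the largest to the smallest singular value of M."""
    s = np.linalg.svd(M, compute_uv=False)  # returned in descending order
    return float(s[0] / s[-1])


# Assumption: a random 3 x 3 weight matrix, rescaled so that its
# spectral norm (largest singular value) equals 0.5.
rng = np.random.default_rng(0)
m = 3
A = rng.standard_normal((m, m))
W = 0.5 * A / np.linalg.norm(A, 2)

for n in (2, 8, 32):
    M = delay_matrix(W, n)
    print(n, M.shape, np.linalg.matrix_rank(M), condition_number(M))
```

With this particular scaling, the singular values of $M_{n,m}$ lie between 0.5 and 1.5 for every $n$ (a perturbation of the block $[I \; 0]$ by a matrix of norm 0.5), so the printed condition number should not grow with $n$. Rescaling `W` so that its singular values approach 1 is a simple way to explore how the conditioning changes.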