Beyond All-to-All: Causal-Aligned Transformer with Dynamic Structure Learning for Multivariate Time Series Forecasting

Abstract

Most existing multivariate time series forecasting methods adopt an all-to-all paradigm: the histories of all variables are fed into a single unified model that predicts all of their future values, without distinguishing the role each variable plays. This undifferentiated paradigm makes it difficult to identify variable-specific causal influences and often entangles causally relevant information with spurious correlations. To address this limitation, we propose an all-to-one forecasting paradigm that predicts each target variable separately. Specifically, we first construct a Structural Causal Model from observational data. Then, for each target variable, we partition the historical sequence into four subsegments according to the inferred causal structure: endogenous, direct causal, collider causal, and spurious correlation. We further propose the Causal Decomposition Transformer (CDT), which integrates a dynamic causal adapter that learns causal structures initialized from the inferred graph, enabling imperfect causal discovery to be corrected during training. Motivated by causal theory, we also apply a projection-based output constraint to mitigate collider-induced bias and improve robustness. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of CDT.
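
To make the four-way partition concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not define the four groups precisely, so the sketch assumes one plausible reading: "endogenous" is the target's own history, "direct causal" covers the target's parents in the inferred graph, "collider causal" covers variables that share a child with the target, and "spurious correlation" covers every remaining variable. The function name `partition_variables`, the adjacency-matrix convention, and the toy graph are all hypothetical.

```python
import numpy as np


def partition_variables(adj, target):
    """Split variable indices into four groups relative to `target`.

    Assumed convention: adj[i, j] is truthy when the inferred graph has an edge i -> j.
    The group definitions are illustrative assumptions, not the paper's.
    """
    adj = np.asarray(adj, dtype=bool)
    others = [i for i in range(adj.shape[0]) if i != target]

    # Assumed "direct causal": parents of the target.
    direct = [i for i in others if adj[i, target]]

    # Children of the target, each a potential collider node.
    children = np.flatnonzero(adj[target])
    children = children[children != target]

    # Assumed "collider causal": non-parents that share a child with the target.
    collider = [
        i for i in others
        if i not in direct and adj[i, children].any()
    ]

    # Assumed "spurious correlation": everything left over
    # (under this simplification that includes the target's descendants).
    spurious = [i for i in others if i not in direct and i not in collider]

    return {
        "endogenous": [target],
        "direct_causal": direct,
        "collider_causal": collider,
        "spurious": spurious,
    }


# Toy graph with 5 variables: 1 -> 0, 0 -> 3, 2 -> 3; variable 4 is isolated.
adj = np.zeros((5, 5), dtype=int)
adj[1, 0] = 1
adj[0, 3] = 1
adj[2, 3] = 1

groups = partition_variables(adj, target=0)
print(groups)
```

In an all-to-one setup, each group's indices would select the columns of the history that feed the corresponding input segment for that target. How CDT encodes those segments is not specified in the abstract.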
