CCLSTM: Coupled Convolutional Long-Short Term Memory Network for Occupancy Flow Forecasting

Abstract

Predicting future states of dynamic agents is a fundamental task in autonomous driving. An expressive representation for this purpose is Occupancy Flow Fields, which provide a scalable and unified format for modeling motion, spatial extent, and multi-modal future distributions. While recent methods have achieved strong results using this representation, they often depend on high-quality vectorized inputs, which are unavailable or difficult to generate in practice, and the use of transformer-based architectures, which are computationally intensive and costly to deploy. To address these issues, we propose Coupled Convolutional LSTM (CCLSTM), a lightweight, end-to-end trainable architecture based solely on convolutional operations. Without relying on vectorized inputs or self-attention mechanisms, CCLSTM effectively captures temporal dynamics and spatial occupancy-flow correlations using a compact recurrent convolutional structure. Despite its simplicity, CCLSTM achieves state-of-the-art performance on occupancy flow metrics and, as of this submission, ranks \(1st\) in all metrics on the 2024 Waymo Occupancy and Flow Prediction Challenge leaderboard.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…