LaGen: Towards Autoregressive LiDAR Scene Generation

Abstract

Generative world models for autonomous driving (AD) are of great value in applications such as data augmentation, closed-loop simulation, and safety-critical scenario evaluation. Unlike the widely studied image modality, in this work we explore generative world models for LiDAR data. Existing generation methods for LiDAR predominantly focus on single-frame generation or lack the capacity for interactive simulation, while existing prediction approaches require multiple frames of historical input and can only deterministically predict multiple frames at once. Both paradigms fail to support long-horizon interactive generation. To this end, we introduce LaGen, which, to the best of our knowledge, is the first autoregressive framework capable of generating long-horizon LiDAR scenes in a frame-by-frame, interactive manner. LaGen takes a single-frame input as a starting point and effectively uses bounding box information as conditions to generate high-fidelity 4D scenes. In addition, we introduce a scene decoupling estimation module to enhance the model's interactive generation capability for object-level content, as well as a noise modulation module to mitigate error accumulation during long-horizon generation. We extensively evaluate LaGen's performance in controlled data generation and long-horizon scene generation on the nuScenes dataset. The experimental results demonstrate that LaGen achieves state-of-the-art performance, especially on later frames. The code is publicly available at: https://github.com/szzhou88/LaGen.
