Spatiotemporal Feature Alignment and Weighted Fusion in Collaborative Perception Enabled by Network Synchronization and Age of Information
Abstract
Collaborative perception in the Internet of Vehicles (IoV) aggregates multi-vehicle observations for broader scene coverage and improved decision-making. However, fusion quality degrades under spatiotemporal heterogeneity arising from unsynchronized clocks, communication delays, and motion variations across vehicles. Prior work mitigates these effects through spatial transformations or fixed time-offset corrections, overlooking the time-varying clock drifts and delays that cause persistent feature misalignment. To overcome these limitations, we propose a spatiotemporal feature alignment and weighted fusion framework. Specifically, network synchronization is designed to continuously compensate for clock state differences between vehicles and establish a common time reference, onto which all feature timestamps can be mapped. After synchronization, to account for how fresh each received feature is relative to its generation time, its Age of Information (AoI) is determined by estimating the network delay from the feature size and link quality. Our spatiotemporal feature alignment then projects the vehicles' features into a common spatial coordinate frame and corrects them to a synchronized fusion instant using their AoIs, so that all features describe the scene coherently. Furthermore, because synchronization and alignment quality vary, we estimate their uncertainties and integrate them with AoI to generate feature weights for efficient fusion, prioritizing fresh, reliable feature regions. Simulations show consistent perception accuracy improvements over strong baselines under clock drifts and link delays.
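The flow described in the abstract (map timestamps to a common clock, estimate delay, compute AoI, derive fusion weights) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the affine clock model with `offset` and `skew`, the delay estimate as feature size divided by link rate, the choice of the latest expected arrival as the fusion instant, and the exponential weighting with a `decay` parameter and a scalar uncertainty per vehicle.

```python
import math


def to_common_time(local_ts, offset, skew=0.0):
    """Map a local timestamp to the common reference.

    Assumed affine clock model: local = (1 + skew) * common + offset.
    """
    return (local_ts - offset) / (1.0 + skew)


def estimate_delay(feature_bits, link_rate_bps):
    """Assumed delay model: transmission time only (size / rate)."""
    return feature_bits / link_rate_bps


def fusion_weights(aois, uncertainties, decay=5.0):
    """Hypothetical weighting: fresher, less uncertain features weigh more."""
    raw = [math.exp(-decay * a) / (1.0 + u) for a, u in zip(aois, uncertainties)]
    total = sum(raw)
    return [r / total for r in raw]


# Two hypothetical vehicles: (local timestamp s, clock offset s, bits, link rate bps)
vehicles = [
    (10.020, 0.020, 1_000_000, 50_000_000),
    (9.950, -0.010, 1_000_000, 10_000_000),
]

# Generation times expressed on the common clock
gen_times = [to_common_time(ts, off) for ts, off, _, _ in vehicles]

# Expected arrival time of each feature at the fusing vehicle
arrivals = [
    g + estimate_delay(bits, rate)
    for g, (_, _, bits, rate) in zip(gen_times, vehicles)
]

# Assumed fusion instant: when the last feature is expected to arrive
fusion_instant = max(arrivals)

# AoI: time elapsed between feature generation and the fusion instant
aois = [fusion_instant - g for g in gen_times]

# Equal uncertainty for both vehicles, so only freshness drives the weights
weights = fusion_weights(aois, uncertainties=[0.1, 0.1])
print(aois, weights)
```

In this toy setup the feature with the smaller AoI receives the larger weight. The paper's framework weights feature regions and estimates uncertainties from synchronization and alignment quality, which this scalar sketch does not attempt to model.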