WiFo-M2: Empower Wireless Communications With Plug-and-Play Environment Sensing via Foundation Model
Abstract
The emerging convergence of next-generation wireless networks and agentic artificial intelligence (AI) is inspiring a new vision: embodied intelligent network entities utilize environmental sensing to refine their physical-layer (PHY) actions. Despite a growing body of preliminary work, prevailing small and task-specific AI models require extensive manual design of data pre-processing, network architecture, and fine-tuning, leaving them tightly coupled to particular PHY actions, system configurations, and deployment scenarios. To address this, we propose a paradigm shift with WiFo-M2, a foundation model that enables environment sensing to be easily integrated into PHY actions, delivering universal performance gains. To extract generalizable out-of-band (OOB) channel-aware features from environment sensing, we introduce ContraSoM, a contrastive pre-training strategy. Once pre-trained, WiFo-M2 infers future OOB channel-aware features from historical sensory data and strengthens feature robustness via modality-specific data augmentation. Experiments show that WiFo-M2 improves the performance of diverse PHY actions and demonstrates strong generalization to unseen scenarios.
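The abstract names ContraSoM as a contrastive pre-training strategy but does not state its objective. As a rough illustration of the general idea only, the sketch below implements a generic symmetric InfoNCE loss, a common contrastive objective, and is not claimed to be the ContraSoM loss itself. It assumes that each sensing feature vector is paired with one channel feature vector from the same sample. The names `info_nce`, `sensing_feats`, `channel_feats`, and `temperature` are hypothetical and do not come from the paper.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(sensing_feats, channel_feats, temperature=0.1):
    """Generic symmetric InfoNCE loss over a batch of paired feature vectors.

    Assumption (not from the paper): row i of sensing_feats and row i of
    channel_feats come from the same sample and form the positive pair;
    all other rows in the batch act as negatives.
    """
    s = [l2_normalize(v) for v in sensing_feats]
    c = [l2_normalize(v) for v in channel_feats]
    n = len(s)

    # logits[i][j] = cosine similarity between sensing i and channel j,
    # divided by the temperature.
    logits = [
        [sum(a * b for a, b in zip(s[i], c[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def mean_cross_entropy(rows):
        # Cross-entropy with the diagonal entry as the target class,
        # using the log-sum-exp trick for numerical stability.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_denom = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_denom - row[i]
        return total / len(rows)

    # Average the sensing-to-channel and channel-to-sensing directions.
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (mean_cross_entropy(logits) + mean_cross_entropy(transposed))


if __name__ == "__main__":
    sensing = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    mismatched = [[0.0, 1.0], [1.0, 0.0]]
    print(info_nce(sensing, matched) < info_nce(sensing, mismatched))
```

With this objective, correctly paired features yield a lower loss than mismatched ones, which is the property that lets a contrastive encoder learn channel-aware features from sensing inputs.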